

John Carmack
@ID_AA_Carmack
AGI at Keen Technologies, former CTO Oculus VR, Founder Id Software and Armadillo Aerospace
Joined August 2010
285 Following    1.6M Followers
I always lost performance when I tried to use silu/gelu activations in my RL value networks, and I finally understand why. If the pre-activation values are small, the smooth curve through zero is basically a linear activation, destroying the representation power of the network. You need a batch/layer/rms norm on the preactivations to put them in the range the smooth activations are designed for. Internal norms generally hurt performance on our RL tasks, but combining them with a smooth activation at least works basically as well as a raw relu (but slower). So, not actually a win, but the lightbulb of understanding was good!
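A minimal sketch of the point about small pre-activations, assuming PyTorch (the layer sizes and the `ValueBlock` name are illustrative, not Carmack's actual network): for inputs near zero, silu(x) = x * sigmoid(x) is approximately 0.5x, i.e. essentially linear, so a stack of such layers collapses toward an affine map; normalizing the pre-activations rescales them into the curved part of the activation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# For small pre-activations, SiLU is nearly linear: silu(x) = x * sigmoid(x) ~= 0.5 * x.
x = torch.linspace(-0.05, 0.05, 5)
print(F.silu(x))   # almost exactly 0.5 * x
print(0.5 * x)

class ValueBlock(nn.Module):
    """Hypothetical value-network block: normalize pre-activations, then apply SiLU."""
    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)
        self.norm = nn.RMSNorm(dim)  # PyTorch >= 2.4; nn.LayerNorm is a common alternative
        self.act = nn.SiLU()

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        pre = self.linear(h)             # pre-activations, possibly tiny in scale
        return self.act(self.norm(pre))  # norm puts them in the O(1) range SiLU expects

block = ValueBlock(dim=64)
out = block(torch.randn(8, 64) * 0.01)   # even tiny inputs now reach the nonlinear regime
```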