#1679 post — AI DeepMind (@AI

TGStat

Qidiruv uchun matnni kiriting

Ilg‘or kanal qidiruvi

Uzbek

Sayt tili

Russian English Uzbek
Saytga kirish

Katalog

Kanal va guruhlar katalogi Kanallar qidiruvi
Kanal/guruh qo‘shish
Reytinglar

Kanallar reytingi Guruhlar reytingi Postlar reytingi
Brendlar va shaxslar reytingi
Analitika
Postlarda qidiruv
Telegram'ni kuzatish

AI DeepMind

4 Oct, 11:11

Telegram'da ochish Ulashish Shikoyat qilish

Were RNNs All We Needed?

Interesting work on reviving RNNs. arxiv.org/abs/2410.01201 -- in general the fact that there are many recent architectures coming from different directions that roughly match Transformers is proof that architectures aren't fundamentally important in the curve-fitting paradigm (aka deep learning)

Curve-fitting is about embedding a dataset on a curve. The critical factor is the dataset, not the specific hard-coded bells and whistles that constrain the curve's shape. As long as your curve is sufficiently expressive all architectures will converge to the same performance in the large-data regime.

#مقاله #ایده_جذاب

🔸 مطالب بیشتر 👇👇

✅ @AI_DeepMind
🔸 @AI_Person

Were RNNs All We Needed?

The scalability limitations of Transformers regarding sequence length have renewed interest in recurrent sequence models that are parallelizable during training. As a result, many novel recurrent...