NEW PAPER: Unifying Probabilistic Learning in Transformers
What if attention, diffusion, reasoning and training were all the same thing?
Our paper proposes a novel, unified way of understanding AI — and it looks a lot like quantum mechanics.
Intelligent models should not be a melting pot of disparate structures. This work aims to take a first step toward unifying those ideas: next-token prediction, diffusion, attention, reasoning, test-time training… Can objects that seem so different all arise from the same framework? The paper includes a novel, exact derivation and explanation of attention. More interesting still, the framework (and so AI) appears to be an approximation of a quantum system.
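For readers who want the baseline object in front of them: below is a minimal sketch of standard scaled dot-product attention, the textbook formulation the paper claims to re-derive exactly. This is not the paper's derivation or framework; all names and shapes here are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Textbook attention: Attn(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    return softmax(scores) @ V

# Toy usage: 4 query tokens attending over 6 key/value tokens of width 8.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal(s) for s in [(4, 8), (6, 8), (6, 8)])
print(attention(Q, K, V).shape)  # (4, 8)
```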
What do you think about the work? Please let me know; I'm eager for thoughts on the content or the ideas!
submitted by /u/LahmacunBear