Energy-Based Transformers [video]

cs702 18 hours ago

I would read the blog post by the lead author instead of watching this video:

https://alexiglad.github.io/blog/2025/ebt/

Also, see:

https://www.reddit.com/r/MachineLearning/comments/1lu1ia0/r_...

programjames 12 hours ago

TL;DR: Train an "energy" model that checks whether an output is correct (rather than just outputting something), then use gradient descent on the output to find good ones. All built on transformers.
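
A toy sketch of that idea, not the paper's method: the quadratic `energy` function below is a hypothetical stand-in for a learned transformer, and `predict`, `step_size`, and `num_steps` are names made up for illustration. It only shows the inference loop, where the output is treated as a variable and nudged downhill on the energy.

```python
# Toy stand-in for a learned energy model: low energy means "y fits x".
# Hypothetical quadratic energy; a real model would be a trained network.
def energy(x, y):
    return (y - 2.0 * x) ** 2


def energy_grad_y(x, y):
    # Analytic derivative of the energy with respect to the candidate output y.
    return 2.0 * (y - 2.0 * x)


def predict(x, y_init=0.0, step_size=0.1, num_steps=100):
    # Inference: start from a guess and run gradient descent over y,
    # keeping the input x and the energy function fixed.
    y = y_init
    for _ in range(num_steps):
        y -= step_size * energy_grad_y(x, y)
    return y


if __name__ == "__main__":
    x = 3.0
    y = predict(x)
    print(f"prediction: {y:.4f}, energy: {energy(x, y):.2e}")
```

With a real model the gradient would come from autodiff rather than a hand-written derivative, and running more descent steps is one way to spend more compute on a harder input.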

tripplyons 15 hours ago

I've seen some of that channel's videos before, and many of them contain errors. I haven't read the Energy-Based Transformers paper yet, so I can't say for sure if this video contains any errors, but be careful.