Yannic Kilcher

311K subscribers

I make videos about machine learning research papers, programming, issues of the AI community, and the broader impact of AI in society.

Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
BitChute: https://www.bitchute.com/channel/yannic-kilcher
LinkedIn: https://www.linkedin.com/in/ykilcher
BiliBili: https://space.bilibili.com/2017636191

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Traditional X-Mas Stream
2:33:37

While streaming Minecraft gameplay, ML researcher Yannic Kilcher discusses how recursive self-improvement in AI faces practical exploration limits similar to those in reinforcement learning, and notes the field's shift from fundamental research to market-driven product development focused on coding and image generation applications.

3 months ago · 6 points
TiDAR: Think in Diffusion, Talk in Autoregression (Paper Analysis)
47:02

TiDAR accelerates autoregressive LLM inference by utilizing idle GPU capacity during memory-bound phases to pre-draft future tokens via diffusion, then verifying them through autoregressive rejection sampling to maintain exact output quality without auxiliary model overhead.
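The verification step described here follows the standard speculative-decoding recipe: each drafted token is accepted with probability min(1, p_target/p_draft), and on the first rejection a replacement is drawn from the renormalized residual distribution, which preserves the target model's exact output distribution. A minimal sketch of that rejection-sampling loop, with toy per-token probability vectors standing in for real model outputs (names and shapes are illustrative, not the paper's implementation):

```python
import random

def verify_drafts(draft_tokens, draft_probs, target_probs, rng=random.random):
    """Accept each drafted token with prob min(1, p_target / p_draft);
    on the first rejection, resample from the residual distribution
    max(0, p - q) and stop. Returns the list of kept tokens."""
    accepted = []
    for tok, q, p in zip(draft_tokens, draft_probs, target_probs):
        q_tok, p_tok = q[tok], p[tok]
        if rng() < min(1.0, p_tok / q_tok):
            accepted.append(tok)          # token survives verification
        else:
            # residual distribution: max(0, p - q), renormalized
            residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
            z = sum(residual)
            r, acc = rng() * z, 0.0
            for t, w in enumerate(residual):
                acc += w
                if r < acc:
                    accepted.append(t)    # corrected token from residual
                    break
            break  # everything after the first rejection is discarded
    return accepted
```

Because accepted prefixes match what greedy-free autoregressive sampling from the target would have produced, the speedup comes purely from verifying several drafted tokens in one forward pass.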

3 months ago · 10 points
Titans: Learning to Memorize at Test Time (Paper Analysis)
32:31

This analysis of Google's Titans paper explores an architecture that extends context windows by using a 2-layer MLP as a neural memory module that learns to compress and retrieve long-range information at test time, though the reviewer notes it reinvents some existing linear attention concepts while offering genuine innovation in adaptive memory.
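The core idea of a memory module that "learns at test time" can be illustrated with a small MLP whose weights are updated by gradient descent on a reconstruction ("surprise") loss for each incoming key–value pair. This is a toy NumPy sketch under that reading of the paper; the dimensions, learning rate, and loss are illustrative, and the real Titans memory has additional machinery (momentum, forgetting) not shown here:

```python
import numpy as np

class NeuralMemory:
    """Toy 2-layer MLP memory, updated by gradient descent at test time
    on the loss ||M(k) - v||^2 (the constant factor 2 is folded into lr)."""

    def __init__(self, d, hidden=32, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (hidden, d))
        self.W2 = rng.normal(0.0, 0.1, (d, hidden))
        self.lr = lr

    def read(self, k):
        """Retrieve: forward pass through the memory MLP."""
        return self.W2 @ np.tanh(self.W1 @ k)

    def write(self, k, v):
        """Memorize: one gradient step so M(k) moves toward v."""
        h = np.tanh(self.W1 @ k)
        err = self.W2 @ h - v                     # prediction error ("surprise")
        grad_W2 = np.outer(err, h)                # dL/dW2
        grad_h = self.W2.T @ err                  # backprop into hidden layer
        grad_W1 = np.outer(grad_h * (1 - h**2), k)  # tanh' = 1 - tanh^2
        self.W2 -= self.lr * grad_W2
        self.W1 -= self.lr * grad_W1
```

Repeatedly writing the same (k, v) pair drives `read(k)` toward `v`, which is the sense in which the module compresses long-range information into its weights rather than into a KV cache.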

3 months ago · 7 points
[Paper Analysis] The Free Transformer (and some Variational Autoencoder stuff)
40:10

The Free Transformer extends decoder architectures by introducing latent variables at the start of generation to capture global sequence decisions (like sentiment), replacing the implicit inference required by standard token-level sampling with explicit conditioning that simplifies learning and improves coherence.
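The contrast with standard decoding can be made concrete: instead of letting global properties emerge implicitly token by token, one latent is sampled once and conditions every step. A toy NumPy sketch of that pattern (the function and parameter names are illustrative; the actual paper trains the latent with a VAE-style objective, which is not shown):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def generate(steps, base_logits, latent_proj, rng):
    """Toy decoder: sample one global latent z up front, then condition
    every step's token distribution on it (explicit global conditioning)."""
    z = rng.standard_normal(latent_proj.shape[1])  # sampled once per sequence
    out = []
    for _ in range(steps):
        logits = base_logits + latent_proj @ z     # z biases every step
        tok = rng.choice(len(logits), p=softmax(logits))
        out.append(int(tok))
    return out
```

Because `z` is fixed for the whole sequence, a global decision (e.g. "this review is positive") is made once rather than being re-inferred from the partial output at every sampling step.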

5 months ago · 8 points