The Information Bottleneck

By: Ravid Shwartz-Ziv & Allen Roush

Two AI researchers, Ravid Shwartz-Ziv and Allen Roush, discuss the latest trends, news, and research in generative AI, LLMs, GPUs, and cloud systems.

2025 Ravid Shwartz-Ziv & Allen Roush
Science
Episodes
  • AI for Science with Qichao Hu (Molecular Universe / SES AI)
    Jun 29 2026
    1 hr and 1 min
  • Infrastructure for AI at Scale - With Benny Chen (Fireworks AI)
    Jun 24 2026

    We talk a lot on this show about RL, agents, and the move between pre-training and post-training, but not enough about the layer everything actually runs on. Benny Chen, co-founder of Fireworks AI, one of the largest inference platforms around, walks us through what it takes to serve models at scale: sourcing GPUs, writing the kernels, the runtime, and the routing layer that lets a customer hit one endpoint and forget the rest.

    We talk about why the real bottleneck is power, not chips, and why that favors Nvidia and Google; why MoE keeps winning even when dense models look better on paper; and why he'd rather run fungible capacity at 95% than specialized chips at 60%. We also cover quantization limits, where RL efficiency has to go next, and his case that AI is still under-hyped. Finally, we get into cross-region training, sparse autoencoders and why interpretability hasn't taken off in open source, whether open models can close the gap, and a frank read on Anthropic's go-to-market.

    Timeline

    • 00:00 — Intro: the part of AI nobody talks about
    • 01:20 — What "infrastructure for AI" actually means: the layers, from GPUs up to routing
    • 02:59 — Why not just buy your own GPUs and do it yourself?
    • 05:17 — The scale Fireworks runs at
    • 06:35 — Hardware inflation, GPU costs, and the real risk hiding in commit duration
    • 10:14 — Nvidia vs AMD vs TPUs, and why power is the bottleneck
    • 11:57 — Mixing GPU types and generations; fungibility vs. specialization
    • 14:22 — Once you have the GPUs, what's the next layer to build?
    • 17:04 — Dense vs. MoE, and why the hardware picks the winner
    • 21:07 — Quantization: is FP4 the floor? TurboQuant and INT vs. FP
    • 24:28 — How tied are the algorithms to the hardware?
    • 25:12 — DeepSeek, DeepGEMM, and next-token prediction as reconstruction loss
    • 28:50 — Why RL is still wildly inefficient compared to pre-training
    • 30:08 — Speculative decoding, AI-generated kernels, and auto-research
    • 34:00 — The AGI question: why text gets automated but vision may stay expensive
    • 37:07 — Hype check: why Benny thinks AI is still under-hyped
    • 41:28 — Training vs. inference at the infrastructure level
    • 44:12 — Scaling across data centers: cross-region training with Cursor
    • 45:40 — Sparse autoencoders, interpretability, and why open source is human-constrained
    • 49:04 — Will open models catch up — on quality and on compute?
    • 51:41 — Are we plateauing? Opus 4.7 vs. 4.6 and the coming data wars
    • 54:41 — Physical limits, HBM, and whether chips keep getting faster
    • 58:17 — The belief about inference everyone gets wrong
    • 59:31 — Anthropic, mythos, and a frank take on go-to-market
    • 1:04:41 — Wrap-up

    Music:

    • "Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.

    About: The Information Bottleneck is hosted by Ravid Shwartz-Ziv and Allen Roush, featuring in-depth conversations with leading AI researchers about the ideas shaping the future of machine learning.

    1 hr and 6 mins
  • Broken Peer Review, AI, and Worms — with Oded Rechavi
    Jun 21 2026

    Oded Rechavi is a biologist at Tel Aviv University and the co-founder of QED, a company building AI to review scientific work. He's also spent years studying worms.

    We start with what's wrong with peer review and grant funding: why it takes years to publish, why reviewers are often your own competitors, and why the whole thing is locked to an economic model that rewards publishing more papers, not better ones. Oded explains why he doesn't call QED "peer review" at all, and what it would take to actually validate science instead of just stamping it.

    Then we get into the biology. The adult C. elegans hermaphrodite has exactly 959 somatic cells, every one of them named, and a fully mapped nervous system. Oded's lab studies how a worm's experiences get passed to its offspring through RNA rather than DNA, meaning that what happens to a worm in its lifetime can change its descendants. We also talk about using ancient DNA to reassemble the Dead Sea Scrolls, what AI can and can't do for biology, and why he wants to build an "Ironman suit" for researchers rather than replace them.

    00:00 Intro
    01:35 Why scientific publishing is broken
    04:02 Years to publish, and what it costs science
    07:20 Bad reviewers, conflicts of interest, and the money
    10:47 Why preprints don't fix it
    15:37 How AI conferences handle review
    22:07 Conferences vs. journals: does slow review help?
    25:22 Building QED: review, not peer review
    30:02 Tracking a paper from idea to submission
    33:11 What writing a grant actually involves
    35:00 The ERC reviewer crisis
    37:06 Tailoring feedback to your field
    41:48 Switching to biology
    44:30 Every cell has a name: inside C. elegans
    46:28 Inheritance without DNA
    48:16 What the worm "thinks" changes its offspring
    51:58 Reassembling the Dead Sea Scrolls with ancient DNA
    56:07 Psychedelics and worms
    58:36 Can AI run the research itself?
    1:04:49 Automation vs. validation
    1:07:12 The origin of life
    1:08:49 Why people reject AI-written work
    1:16:18 Will humans still have a role?
    1:17:39 Wrap-up

    1 hr and 18 mins