Why AI Inference Costs Are Crashing 40 Percent This Year

Lucas and Luna unpack the dramatic collapse in AI inference pricing, using the latest numbers from NVIDIA, AMD, and Super Micro to show how hardware competition is reshaping the economics of running large language models. They trace the shift from training to inference, explain why inference costs have dropped roughly 40 percent year-over-year, and discuss what that means for startups, cloud margins, and the next wave of AI applications. Along the way, they touch on the surprising rally in Super Micro Computer and what the 'death of the GPU shortage' means for the entire supply chain. A focused, data-driven conversation for anyone trying to understand where the AI market is heading in mid-2026.

#AIInference #InferenceCosts #NVIDIA #AMD #SuperMicro #GPU #LLM #CloudComputing #AIHardware #TechInvesting #Semiconductors #AIPricing #DataCenters #AIStartups #Technology #FexingoBusiness #BusinessPodcast #ChatGPTAndBeyond

Keep every episode free: buymeacoffee.com/fexingo
