How Datadog Rebuilt Its Observability Pipeline for 100 Trillion Events Daily

Episode 79 of The CTO Podcast dives into the engineering behind Datadog's core pipeline. Hosts Lucas and Luna unpack how Datadog re-architected its ingestion, processing, and storage layers to handle over 100 trillion events per day by mid-2026. They explore the shift from a monolithic intake to a sharded, stream-oriented architecture, the decision to build custom compression rather than use off-the-shelf codecs, and how the team maintained sub-second query latencies while scaling throughput by 10x over three years. Along the way, they discuss tradeoffs between consistency and availability, the role of probabilistic data structures for sampling, and why Datadog eventually rewrote parts of its query engine in Rust. This episode offers a concrete look at what it takes to keep observability observant when the data never stops growing. Perfect for engineering leaders and senior architects wrestling with scale.

#Datadog #Observability #DataPipeline #Architecture #Engineering #Scalability #StreamProcessing #Compression #Rust #QueryEngine #APM #Telemetry #CloudInfrastructure #BigData #SRE #BusinessAndTechnology #FexingoBusiness #BusinessPodcast

Keep every episode free: buymeacoffee.com/fexingo