Episodes

  • Custom AI Software Development: What Your Business Needs to Know
    Jun 12 2026

    Most businesses are running AI tools that handle surface-level tasks — and if those tools disappeared tomorrow, little would change. The companies pulling ahead aren't using fancier off-the-shelf software; they're building AI systems shaped entirely around their own data, rules, and workflows. This episode of Development draws on the full guide to custom AI software development to walk through everything a business needs to know before, during, and after building a tailored AI solution.

    Here's what the episode covers:

    • What "custom AI" actually means — and why it's defined by your problem shaping the solution, not a vendor tweaking settings on your behalf.
    • When to go custom vs. off-the-shelf — custom starts paying off the moment quality, privacy, or workflow fit become decisive, especially in specialized or regulated domains.
    • How to scope your first project — start with one stubborn workflow, capture baseline numbers, and define the smallest version of success that would make people genuinely cheer.
    • Data readiness and infrastructure — why clean, well-curated data consistently outperforms massive messy datasets, and how to build pipelines that are dependable rather than heroic.
    • Model selection and architecture — why bigger isn't better, when classic ML methods still win, and how retrieval-augmented generation (RAG) keeps outputs grounded in facts you trust.
    • Operations, safety, cost, and team structure — from reproducible training pipelines and rollback plans to compliance-by-design, budget guardrails, and the small cross-functional team that actually ships.

    The episode closes with a practical readiness checklist — covering problem clarity, data access, team alignment, security requirements, and budget — and a clear call to start thin, measure what matters, and let evidence drive every upgrade. More from the show: if you're interested in avoiding subtle engineering pitfalls, check out Five PHP Mistakes That Quietly Wreck Your Codebase.

    DEV

    8 mins
  • Five PHP Mistakes That Quietly Wreck Your Codebase
    Jun 11 2026

    PHP makes it easy to move fast, and that's precisely where the trouble starts. The same flexibility that lets teams ship quickly creates a gravitational pull toward shortcuts that look harmless in the moment but compound into serious problems over months and years. This episode of Development draws on the five PHP mistakes that quietly wreck your codebase to walk through the patterns that trip up even experienced teams — and the disciplined habits that keep codebases clean, secure, and maintainable.

    The episode covers five distinct failure modes, each with concrete fixes:

    • Silencing errors without logging them — Suppressing warnings to keep output clean is reasonable; letting those warnings vanish into the void is not. The fix is environment-aware configuration: display errors locally, log everything in staging and production, and set up alerts so recurring issues don't pile up unnoticed.
    • Mixing business logic with presentation — PHP's templating roots make it tempting to drop database queries directly into view files, especially under deadline pressure. Once that pattern takes hold, the codebase becomes difficult to navigate for everyone. A consistent separation-of-concerns pattern — MVC, ADR, or otherwise — enforced by documentation and code review, is the antidote.
    • Neglecting server-side input validation — Client-side checks are a convenience, not a security boundary. SQL injection, XSS, and parameter tampering remain real threats, and the downstream cost of a breach — lost trust, corrupted data, emergency patches — far outweighs the cost of rigorous, context-aware validation from the start.
    • Reinventing solved problems — PHP's standard library and the Composer ecosystem cover an enormous range of well-tested functionality. Custom implementations often quietly skip the edge-case handling that established packages have spent years getting right. A "package first, custom second" culture, backed by a vetted internal dependency list and a commitment to keeping packages updated, closes this gap.
    • Weak version control and missing documentation — Vague commit messages, long-lived branches, and undocumented intent are predictable consequences of shipping under pressure. The episode frames good commit discipline as a "tour guide mentality": future teammates — including your future self — should be able to reconstruct the reasoning behind any change from the history and comments alone.
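
    The server-side validation point is language-independent, so it can be sketched outside PHP. What follows is a minimal illustration in Python, not the episode's own code: the `users` table and the `validate_username` and `find_user` helpers are hypothetical names chosen for the example. In PHP the same idea is usually expressed with PDO prepared statements.

```python
import sqlite3


def validate_username(raw: str) -> str:
    """Check input against a conservative allow-list instead of trying to 'clean' it."""
    name = raw.strip()
    allowed = all(c.isalnum() or c == "_" for c in name)
    if not (3 <= len(name) <= 32) or not allowed:
        raise ValueError("username must be 3-32 letters, digits, or underscores")
    return name


def find_user(conn: sqlite3.Connection, raw_username: str):
    """Look up a user; the value is validated first, then bound as a parameter."""
    name = validate_username(raw_username)
    # The ? placeholder keeps the value out of the SQL text entirely,
    # so it is never parsed as part of the query.
    cursor = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (name,)
    )
    return cursor.fetchone()
```

    Validation and parameter binding do different jobs here: the allow-list rejects malformed input early with a clear error, while the bound parameter protects the query even if a validation rule is later loosened.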

    The throughline across all five mistakes is the same: small, consistent habits compound. None of the fixes require a framework migration or a full rewrite — just deliberate practice applied repeatedly over time. If you want to go deeper, the full written breakdown is worth bookmarking. And if you enjoyed this one, don't miss the recent episode on Why Businesses Are Building Private LLMs Instead of Renting Them for another look at how technical architecture decisions play out in the real world.

    DEV

    8 mins
  • Why Businesses Are Building Private LLMs Instead of Renting Them
    Jun 10 2026

    The convenience of public AI APIs is hard to argue with — until the moment it isn't. This episode of Development examines the growing enterprise movement away from rented, third-party models and toward privately owned, custom-built LLMs, drawing on the case for building versus renting large language models. For organizations where data sensitivity, regulatory exposure, or product reliability is on the line, the calculus is shifting fast.

    The episode walks through the full decision landscape — from the initial appeal of public APIs to the structural reasons they break down at enterprise scale, and from model selection all the way through agentic deployment. Here's what's covered:

    • Why public APIs create real risk: Proprietary data leaving your network, vendor-controlled rate limits and policy changes, and outages that become your problem to absorb.
    • Data sovereignty as the accelerating factor: Tightening regulations in finance, healthcare, law, and defense are making third-party API routing legally untenable for sensitive workloads — not just inadvisable.
    • What a private LLM actually means: Owning the model weights, controlling the inference pipeline, keeping every prompt and response inside your own perimeter, and maintaining full audit logs.
    • Model selection and open-source options: How to choose between models like LLaMA 3, Mistral, and Falcon — and why a smaller, domain-fine-tuned model often outperforms a large generic one for specific use cases.
    • Data integration strategies: The difference between full fine-tuning, retrieval-augmented generation (RAG), and lightweight techniques like LoRA/QLoRA — and why keeping that data pipeline refreshed and auditable matters as much as the initial build.
    • The agentic layer: How orchestration frameworks can turn a private LLM from a question-answering tool into an agent that reasons through multi-step tasks, queries internal systems, and takes real action — a distinction that's critical for workflow automation.

    The episode also looks at real-world traction in legal (contract review with citations), financial services (compliance flagging), healthcare (clinical support within secure perimeters), and enterprise SaaS (internal documentation assistants that actually know the product). The throughline: the organizations getting the most from AI right now are treating it as infrastructure they own — not a subscription they hope stays stable.

    For more on managing the complexity that comes with running LLMs at scale, check out the Development episode on Token Budgeting Strategies for Long-Context LLM Apps.

    DEV

    8 mins
  • Token Budgeting Strategies for Long-Context LLM Apps
    Jun 4 2026

    Context windows keep growing, but bigger doesn't mean better — or cheaper. This episode of Development tackles one of the most consequential engineering challenges in building LLM-powered applications: deciding deliberately what goes into each prompt, what gets left out, and how to manage the cumulative cost of every token you send. Drawing on the token budgeting strategies for long-context LLM apps article from DEV, the episode moves from first principles to concrete, production-tested patterns you can start applying today.

    The episode explains why even frontier models with million-token windows don't solve the problem on their own — and then walks through seven strategies that separate well-optimized apps from ones that blow budgets, return degraded output, or stall entirely:

    • Summarize before you send — distill large documents down to their relevant essence, either manually or by routing text through a cheaper summarization model, before it reaches your main prompt.
    • Chunk and retrieve — break documents into semantically coherent pieces, store them in a vector database, and pull only the chunks that match the user's query via similarity search — the foundation of retrieval-augmented generation (RAG).
    • Relevancy checks — gate content with an embedding similarity score, a lightweight classifier, or a pre-filter prompt so only material that clears a relevancy threshold makes it into the final request.
    • External memory for conversation history — store chat history in a database and retrieve only the most recent or relevant exchanges per turn, using rolling summaries for older context to prevent history from ballooning across a session.
    • Lean prompt engineering — audit and trim system prompts ruthlessly; verbose, repetitive instructions compound in cost across every API call and often dilute output quality.
    • Real-time token monitoring — instrument token counts from day one, set alerts for spikes, and add guardrails on user-submitted content length before an unexpected bill forces the conversation.
    • Sequential processing for unavoidable full-context tasks — when the content genuinely can't be condensed, use a model with a larger limit or process the material in passes, feeding each round's summary into the next.
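
    Several of the strategies above (chunk and retrieve, relevancy checks, and a hard token budget) can be combined in one small selection step. The sketch below is illustrative only and makes loud assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, `count_tokens` counts whitespace-separated words rather than using a real tokenizer, and the `min_score` and `budget` defaults are arbitrary values picked for the example.

```python
import math
import re
from collections import Counter


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real app would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_chunks(query, chunks, min_score=0.2, budget=50):
    """Return (picked_chunks, tokens_used) for one prompt.

    Chunks are ranked by similarity to the query, dropped if they fall
    below the relevancy threshold, and skipped if they would exceed the
    remaining token budget.
    """
    q = embed(query)
    scored = sorted(
        ((cosine(q, embed(c)), c) for c in chunks),
        key=lambda pair: pair[0],
        reverse=True,
    )
    picked, used = [], 0
    for score, chunk in scored:
        if score < min_score:
            break  # everything after this is even less relevant
        cost = count_tokens(chunk)
        if used + cost > budget:
            continue  # too big for what is left; try a smaller chunk
        picked.append(chunk)
        used += cost
    return picked, used
```

    Returning the token count alongside the chunks makes it straightforward to log per-request usage, which is the raw material for the monitoring and alerting the list also recommends.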

    The episode closes by walking through a concrete end-to-end example — a developer documentation assistant — to show how these strategies layer together into a prompt pipeline that is tight, cost-effective, and accurate. The core takeaway: the cost gap between a naively built LLM app and a well-optimized one can be an order of magnitude at scale, and none of the fixes require exotic tooling — just intentional design.

    DEV

    7 mins
  • Java in 2025: Still Worth It, or Time to Move On?
    Jun 4 2026

    Java has a reputation problem — not because of what it is today, but because of what it used to be. This episode of Development takes a clear-eyed look at the Java ecosystem in 2025, drawing on the full analysis of whether Java is still worth it to separate outdated assumptions from current engineering reality. Whether you're defending an existing Java investment or weighing it as a greenfield choice, the picture is more nuanced — and more favorable — than the meme threads suggest.

    Here's what this episode covers:

    • The "legacy" label problem: How Java's heavyweight, XML-heavy, slow-starting past earned it a reputation that has outlived the actual technical reality by years.
    • Ecosystem depth and JVM versatility: Maven Central's half-million-plus artifacts, polyglot JVM language support (Kotlin, Scala, Clojure, and more), and the commercial and community backing that newer languages simply can't match.
    • Modern language features and developer experience: Records, pattern matching, switch expressions, and local type inference have meaningfully reduced Java's infamous verbosity — and IntelliJ IDEA remains one of the strongest development environments available for any language.
    • Framework and infrastructure transformation: Spring Boot, Quarkus, Micronaut, and Helidon have made container-first, serverless-ready Java services the norm, not the exception — dramatically cutting cold-start times and memory footprints in the process.
    • GraalVM Native Image: How ahead-of-time compilation to statically linked binaries brings Java startup times into the millisecond range and memory usage competitive with Go or Rust — a fundamental shift, not an incremental one.
    • Where Java fits — and where it doesn't: High-throughput microservices, regulated industries, JVM-native data pipelines, and legacy modernization are Java's strong suits; ultra-low-memory edge devices, real-time 3D engines, and static CLI tools are better served elsewhere.

    The episode closes with a practical framework for making the Java decision rigorously: benchmark startup costs, profile memory under realistic load, audit library availability, and honestly gauge developer morale — because a team that resents its stack ships slower regardless of how good the runtime is. More from the show: if you're thinking carefully about framework choices in adjacent ecosystems, the earlier episode Flask vs. Django: Choosing the Right Python Web Framework covers similar decision-making territory from a Python perspective.

    DEV

    7 mins
  • Flask vs. Django: Choosing the Right Python Web Framework
    Jun 4 2026

    Choosing between Flask and Django is one of the most common decisions in Python web development — and also one of the most misunderstood. This episode of Development breaks down the core philosophies behind both frameworks, drawing on the Flask vs. Django framework comparison guide to give developers a practical, context-driven way to think about the choice. Rather than declaring a winner, the episode argues that the right framework depends entirely on what you're building and who's building it.

    Here's what the episode covers:

    • Batteries included vs. bring your own: Django's all-in-one philosophy versus Flask's minimal core — and why that foundational difference still defines both frameworks today.
    • Where Flask shines: Why its simplicity and transparency appeal to developers building microservices, APIs, or anything that demands fine-grained control over the stack.
    • What Django gets right out of the box: The built-in ORM, auto-generated admin panel, authentication, and migrations that can save teams weeks on feature-rich applications.
    • Team size and experience: How Flask's low overhead suits solo developers and small teams, while Django's consistent conventions reduce onboarding friction as teams scale.
    • Project complexity and timelines: When Django's pre-built plumbing actually makes it the faster choice for MVPs — and when Flask's flexibility earns its keep in specialized, loosely coupled architectures.
    • Ecosystem and long-term maintenance: How each framework handles upgrades, third-party integrations, and the ongoing cost of keeping a codebase healthy over time.

    The episode closes with a practical suggestion: if you're genuinely undecided, prototype the same small feature in both frameworks and let the experience speak for itself. The right framework is the one that fits how your team thinks and lets your codebase grow without friction.

    DEV

    8 mins
  • Custom LLM Development: Securing and Customizing Your Private AI Stack
    May 30 2026
    Private large language models are rapidly becoming essential infrastructure for organizations that need AI capabilities without sacrificing control over their data. In this episode, we explore the full lifecycle of custom LLM development — from model selection and fine-tuning through deployment, agentic orchestration, and ongoing operations — based on a detailed breakdown published on the DEV.co blog.

    Public AI APIs from providers like OpenAI and Anthropic have made language models accessible to virtually any organization. But accessibility comes with trade-offs. You can't fully control latency. You can't inspect how your data is handled on the other side. Vendor-imposed rate limits, shifting usage policies, and hidden data-sharing risks create real constraints for companies operating in regulated industries or handling sensitive intellectual property. A private LLM eliminates those dependencies — it runs within your environment, on your infrastructure, under your rules.

    The demand is being driven by several converging forces. Data sovereignty laws in finance, healthcare, legal, and defense increasingly restrict where sensitive information can be processed. AI-native companies building products and decision pipelines around language models need performance guarantees that third-party APIs can't provide. And the open-source model ecosystem — led by LLaMA 3, Mistral, Mixtral, and Falcon — has matured to the point where self-hosted models can genuinely compete with proprietary offerings for many enterprise use cases.

    Model selection is the foundation of any private LLM project, and it involves more nuance than simply choosing the largest available model. Bigger doesn't always mean better — larger models carry higher hardware costs and longer inference times, and a smaller model carefully fine-tuned on domain-specific data often outperforms a generic large model at a fraction of the cost. Licensing terms also vary significantly across open-source models, with some imposing commercial use restrictions or attribution requirements that need to be evaluated before committing to a base architecture.

    Data integration and fine-tuning transform a general-purpose model into one that genuinely understands your business. This means ingesting internal documentation, knowledge bases, customer communications, and operational data to give the model contextual fluency. Full fine-tuning is one approach, but techniques like retrieval-augmented generation allow the model to look up relevant information on the fly without retraining. Lightweight adapter methods like LoRA and QLoRA offer another path — delivering significant performance gains with minimal computational overhead. The critical requirement is building a complete data pipeline that keeps the model's knowledge current and secure over time, not just a one-time import.

    Infrastructure and deployment is where many projects succeed or stall. The options range from fully on-premises installations that satisfy air-gapped compliance requirements to private cloud architectures that scale elastically with demand. Either way, the work includes GPU provisioning, container orchestration, access control, audit logging, and compliance certification — SOC 2, HIPAA, or whatever regulatory frameworks apply. Inference optimization is equally critical, because a model that takes several seconds to respond to every query will quickly lose user adoption regardless of its accuracy.

    The agentic AI layer is where private LLMs move from question-answering tools to genuine workflow engines. Orchestration frameworks like LangChain and AutoGen turn language models into agents that can reason through multi-step tasks, interact with APIs, query databases, generate reports, and route decisions through approval chains. This transforms the model from a text generator into the core engine of automated business processes — triaging support tickets, producing compliance documentation, extracting insights from unstructured data, and integrating with CRMs, ERPs, and existing enterprise systems.

    Industry applications span legal contract review and e-discovery, financial compliance and SEC filing analysis, clinical support tools under HIPAA, AI-powered documentation and onboarding in SaaS, and automated standard operating procedures in manufacturing. In each case, the private deployment model ensures that sensitive data never leaves the organization's controlled environment while still delivering the speed and intelligence advantages that language models provide.

    Engagement models for private LLM development range from fixed-scope proof-of-concept builds to full production deployments with ongoing LLMOps retainers covering model tuning, security updates, hallucination filtering, and prompt audits. Fully managed private LLM-as-a-service options are also available for organizations that want enterprise AI capabilities without managing the underlying infrastructure.

    To learn more about custom LLM ...
    9 mins
  • Open Source Software: Pros, Cons, and How to Choose for Custom Development
    May 25 2026
    In this episode, Alex and Molly walk through DEV.co's comprehensive guide on the pros and cons of using open source software for custom development projects. Whether you're a startup choosing your first tech stack or an enterprise evaluating alternatives to expensive proprietary licenses, this conversation covers the benefits, the risks, the history, and the practical decision-making framework you need.

    Open source software is not a niche alternative anymore. It is the foundation the modern internet is built on. Linux powers the world's top supercomputers, U.S. Air Traffic Control systems, and the vast majority of web and cloud servers. WordPress runs more websites than any other platform on earth. Firefox, GIMP, VLC, Magento, Apache OpenOffice — the list of open source tools that individuals and enterprises rely on every day is enormous. The question is no longer whether to use open source software. It is how to use it well.

    The episode begins with the fundamentals: what defines open source software and the four freedoms that underpin the movement — the freedom to run, study, redistribute, and improve software. From there, the conversation digs into the concrete benefits that make open source compelling for development teams and businesses of all sizes.

    Cost savings are the most immediately obvious advantage. Most open source software is free, which means no per-seat licensing fees, no recurring subscriptions, and dramatically lower software costs for teams of any size. The article notes that paying developers to create proprietary software from scratch can cost tens or hundreds of thousands of dollars, while customizing an existing open source project to meet your specific needs is almost always cheaper. For businesses running multiple software tools across entire teams, the savings can reach tens of thousands of dollars per year.

    Security is the second major benefit, and the episode addresses the common misconception that open source code being publicly visible makes it less secure. In practice, the opposite is often true. Because thousands of developers can inspect the code, vulnerabilities are discovered and patched faster than in proprietary software. Back doors and malware are harder to hide. And the community-driven model produces frequent updates and security fixes. The episode discusses the Equifax breach as a case study — the company blamed Apache Struts, but analysis showed the breach was likely caused by their own failure to apply an available patch, not by a flaw in the open source framework itself.

    Full customization is the third advantage. Unlike proprietary software, where you are locked into the vendor's feature set, open source gives you complete control over the code. You can remove features you don't need, add features you do, and redesign how the software functions. The episode walks through practical examples of how this works, from adding a time clock feature to a task management suite to transforming WordPress into a lead generation engine.

    Other benefits covered include extended backwards compatibility, strong community support, no subscription costs, the ability for entire teams to use software without licensing concerns, and the fact that open source is often cheaper for businesses to maintain long-term because improvements are crowdsourced rather than handled by a dedicated internal team.

    The conversation then shifts to the ten open source projects that run the world, including WordPress, Mozilla Firefox, GIMP, Magento 2, Apache OpenOffice, VLC Media Player, Linux, HandBrake, PDFCreator, and Pidgin. Each example illustrates how deeply embedded open source software is in daily operations across industries and use cases.

    The history of the open source movement provides important context. The episode traces the origins from Eric Raymond's influential essays The Cathedral and the Bazaar through Netscape's groundbreaking decision to release Navigator's source code in 1998, to the formal coining of the term "open source" by the Open Source Initiative. That history explains how the movement evolved from a developer philosophy into a mainstream approach to building and distributing software.

    The episode also covers the downsides honestly. Hackers can exploit organizations that fail to update and patch their open source software. Employees may resist switching from familiar proprietary brands. Not all open source projects have active communities or strong support. And many projects struggle with funding, which can lead to abandoned codebases and delayed updates. These are real risks, but they are manageable with proper evaluation and maintenance practices.

    The final section walks through four practical guidelines for choosing the right open source software: avoid building your business around any single application, review the history of releases and security patches, download only from trusted sources, and don't rely on unsupported or unmaintained projects.

    This episode is for ...
    12 mins