What happens when the model CAN'T fix it? Interview w/ software engineer Landon Gray [Podcast #213]

| Programming | March 27, 2026 | 11K views | 1:32:41

TL;DR

Software engineer Landon Gray explains that LLMs are merely 'raw fuel' requiring 'harnesses' (specialized tooling infrastructure) to produce reliable results, distinguishes AI engineering from data science and ML engineering, and argues developers must understand ML fundamentals to solve critical problems that models themselves cannot fix.

🔗 LLM Harnesses and Infrastructure

Harnesses are the true product differentiator

A 'harness' refers to the tooling, constraints, and infrastructure built around raw LLM outputs to structure results, reduce hallucinations, and constrain behavior for specific business needs.
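The episode doesn't prescribe an implementation, but a minimal version of the idea might look like the Python sketch below, where `call_model` is a hypothetical placeholder for a raw LLM API call and the required keys are invented for illustration: the harness demands structured output, validates it, and retries rather than passing unconstrained text downstream.

```python
import json

def call_model(prompt: str) -> str:
    """Hypothetical placeholder for a raw LLM API call; swap in a real SDK."""
    # Canned response so the sketch runs without a model behind it.
    return '{"answer": "42", "sources": ["example.com"]}'

REQUIRED_KEYS = {"answer", "sources"}

def harnessed_query(prompt: str, max_retries: int = 3) -> dict:
    """A thin harness: demand JSON, validate its shape, retry on bad output."""
    instruction = prompt + '\n\nRespond with JSON only: {"answer": "...", "sources": ["..."]}'
    for _ in range(max_retries):
        raw = call_model(instruction)
        try:
            result = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed output: retry instead of passing garbage downstream
        if isinstance(result, dict) and REQUIRED_KEYS <= result.keys():
            return result  # structured, validated result for the rest of the system
    raise RuntimeError(f"no valid model output after {max_retries} attempts")
```

Real harnesses layer far more on top (retrieval, guardrails, tool routing), but a validate-and-retry loop around the raw call is the basic pattern.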

Perplexity's competitive advantage

Perplexity likely uses models like Claude but delivers superior deep research capabilities through sophisticated harness layers that process and refine outputs beyond raw API calls.

Software beats retraining costs

While improving foundation models requires hundreds of millions of dollars in training costs, building harness software allows teams to iterate quickly and improve performance through ordinary code changes.

🧭 Defining AI Engineering

Three distinct data disciplines

Data science focuses on statistical algorithms and Bayesian methods; data engineering handles data plumbing and preparation (consuming 80% of effort); AI engineering applies software development skills to leverage existing models.

Job title confusion

The term 'AI Engineer' is inconsistently used by employers to describe both software developers who build with LLMs and ML engineers who train models, requiring careful reading of job descriptions.

The software engineer's entry point

AI engineering allows software developers to enter the field by leveraging existing coding strengths while gradually learning model fundamentals, rather than requiring immediate deep ML expertise.

🚧 When Models Can't Fix The Code

The inevitable bottleneck

Teams relying solely on AI-generated code eventually hit walls—such as latency bottlenecks or architectural constraints—where asking the model to fix the problem produces no solution.
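As a concrete (and hedged) illustration of that kind of wall, consider latency caused by repeated identical prompts: no rewording of the prompt makes the model faster, but one line of ordinary harness code does. `call_model` is again a hypothetical placeholder.

```python
import functools
import time

def call_model(prompt: str) -> str:
    """Hypothetical placeholder for a slow raw LLM API call."""
    time.sleep(1.0)  # stand-in for network and inference latency
    return f"response to: {prompt}"

@functools.lru_cache(maxsize=1024)
def call_model_cached(prompt: str) -> str:
    # Identical prompts are served from memory; the model cannot make this
    # optimization for you, only the code around it can.
    return call_model(prompt)

call_model_cached("summarize this ticket")  # ~1s: real call
call_model_cached("summarize this ticket")  # instant: cache hit
```

Recognizing which side of the API boundary a problem lives on is exactly the kind of judgment the model cannot supply about its own outputs.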

First principles prevent paralysis

Understanding how models work under the hood enables developers to research white papers and architect creative solutions when LLMs fail to diagnose complex system issues.

The accountability gap

Without foundational ML knowledge, teams cannot explain to leadership why critical performance issues persist or how to resolve them when AI tools reach their diagnostic limits.

Bottom Line

Build robust harness tooling around LLMs rather than treating AI as a magic black box, and invest in understanding ML fundamentals so you can architect solutions when problems inevitably arise that the model cannot fix itself.
