What happens when the model CAN'T fix it? Interview w/ software engineer Landon Gray [Podcast #213]

| Programming | March 27, 2026 | 11K views | 1:32:41

TL;DR

Software engineer Landon Gray explains that LLMs are merely 'raw fuel' requiring 'harnesses' (specialized tooling infrastructure) to produce reliable results, distinguishes AI engineering from data science and ML engineering, and argues developers must understand ML fundamentals to solve critical problems that models themselves cannot fix.

🔗 LLM Harnesses and Infrastructure

Harnesses are the true product differentiator

A 'harness' refers to the tooling, constraints, and infrastructure built around raw LLM outputs to structure results, reduce hallucinations, and constrain behavior for specific business needs.
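The episode doesn't prescribe an implementation, but a minimal version of the idea might look like the Python sketch below, where `call_model` is a hypothetical placeholder for a raw LLM API call and the required keys are invented for illustration: the harness demands structured output, validates it, and retries rather than passing unconstrained text downstream.

```python
import json

def call_model(prompt: str) -> str:
    """Hypothetical placeholder for a raw LLM API call; swap in a real SDK."""
    # Canned response so the sketch runs without a model behind it.
    return '{"answer": "42", "sources": ["example.com"]}'

REQUIRED_KEYS = {"answer", "sources"}

def harnessed_query(prompt: str, max_retries: int = 3) -> dict:
    """A thin harness: demand JSON, validate its shape, retry on bad output."""
    instruction = prompt + '\n\nRespond with JSON only: {"answer": "...", "sources": ["..."]}'
    for _ in range(max_retries):
        raw = call_model(instruction)
        try:
            result = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed output: retry instead of passing garbage downstream
        if isinstance(result, dict) and REQUIRED_KEYS <= result.keys():
            return result  # structured, validated result for the rest of the system
    raise RuntimeError(f"no valid model output after {max_retries} attempts")
```

Real harnesses layer far more on top (retrieval, guardrails, tool routing), but a validate-and-retry loop around the raw call is the basic pattern.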

Perplexity's competitive advantage

Perplexity likely uses models like Claude but delivers superior deep research capabilities through sophisticated harness layers that process and refine outputs beyond raw API calls.

Software beats retraining costs

While improving foundation models requires hundreds of millions of dollars in training costs, building harness software allows teams to iterate quickly and improve performance through ordinary code changes.

🧭 Defining AI Engineering

Three distinct data disciplines

Data science focuses on statistical algorithms and Bayesian methods; data engineering handles data plumbing and preparation (consuming 80% of effort); AI engineering applies software development skills to leverage existing models.

Job title confusion

The term 'AI Engineer' is inconsistently used by employers to describe both software developers who build with LLMs and ML engineers who train models, requiring careful reading of job descriptions.

The software engineer's entry point

AI engineering allows software developers to enter the field by leveraging existing coding strengths while gradually learning model fundamentals, rather than requiring immediate deep ML expertise.

🚧 When Models Can't Fix The Code

The inevitable bottleneck

Teams relying solely on AI-generated code eventually hit walls—such as latency bottlenecks or architectural constraints—where asking the model to fix the problem produces no solution.
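As a concrete (and hedged) illustration of that kind of wall, consider latency caused by repeated identical prompts: no rewording of the prompt makes the model faster, but one line of ordinary harness code does. `call_model` is again a hypothetical placeholder.

```python
import functools
import time

def call_model(prompt: str) -> str:
    """Hypothetical placeholder for a slow raw LLM API call."""
    time.sleep(1.0)  # stand-in for network and inference latency
    return f"response to: {prompt}"

@functools.lru_cache(maxsize=1024)
def call_model_cached(prompt: str) -> str:
    # Identical prompts are served from memory; the model cannot make this
    # optimization for you, only the code around it can.
    return call_model(prompt)

call_model_cached("summarize this ticket")  # ~1s: real call
call_model_cached("summarize this ticket")  # instant: cache hit
```

Recognizing which side of the API boundary a problem lives on is exactly the kind of judgment the model cannot supply about its own outputs.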

First principles prevent paralysis

Understanding how models work under the hood enables developers to research white papers and architect creative solutions when LLMs fail to diagnose complex system issues.

The accountability gap

Without foundational ML knowledge, teams cannot explain to leadership why critical performance issues persist or how to resolve them when AI tools reach their diagnostic limits.

Bottom Line

Build robust harness tooling around LLMs rather than treating AI as a magic black box, and invest in understanding ML fundamentals so you can architect solutions when problems inevitably arise that the model cannot fix itself.
