Daniel Guetta on the Guts of AI, Agentic AI & Why LLMs Hallucinate | The Real Eisman Playbook Ep 46

| Stock Investing | February 16, 2026 | 109K views | 1:02:40

TL;DR

Columbia Business School professor Daniel Guetta explains that large language models are fundamentally probabilistic autocomplete engines trained on internet-scale text: they convert words into numerical embeddings and predict the next word in a sequence. This makes hallucinations an inherent feature rather than a malfunction, especially when the model handles novel information.

🧠 AI Fundamentals: Two Distinct Paradigms 2 insights

Predictive AI dominates current value

Traditional machine learning (predictive AI) has generated business value since the 1980s through numerical pattern recognition for credit scoring, fraud detection, and pricing models like Zillow's Zestimate, which tweaks parameters against historical sales data.
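The "tweaking parameters against historical sales data" idea can be sketched as a tiny least-squares fit. This is a minimal illustration, not Zillow's actual model; the square-footage figures and prices below are invented.

```python
# Toy sketch of predictive AI: fit a price model to historical sales
# by choosing parameters that best match the data (numbers invented).
# price ≈ w * square_feet + b

history = [(1000, 300_000), (1500, 425_000), (2000, 550_000)]  # (sqft, price)

# Closed-form least-squares solution for slope w and intercept b.
n = len(history)
sx = sum(x for x, _ in history)
sy = sum(y for _, y in history)
sxx = sum(x * x for x, _ in history)
sxy = sum(x * y for x, y in history)
w = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b = (sy - w * sx) / n

print(round(w * 1750 + b))  # predicted price for a 1,750 sqft home -> 487500
```

The same parameter-tweaking idea scales up to many features and millions of rows, which is where most pre-GenAI business value has come from.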

Generative AI unlocks unstructured data

Unlike predictive AI, which requires Excel-style numerical inputs, GenAI and large language models process text and images through deep neural networks, overcoming the limitation of earlier systems that couldn't do arithmetic on text or analyze visual content.

⚙️ The Mechanics of LLMs 3 insights

Sophisticated autocomplete engines

LLMs generate responses one word at a time, choosing each next word by probability given everything that came before, which means the model reprocesses the entire conversation history for each new token it produces.
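The one-word-at-a-time loop can be sketched with a toy lookup table. Real models compute probabilities with a neural network over the full history; here the probabilities are invented and we greedily take the most likely word.

```python
# Toy autocomplete: the WHOLE sequence so far conditions the next word
# (probability table is invented for illustration).
next_word_probs = {
    ("the",): {"cat": 0.6, "dog": 0.4},
    ("the", "cat"): {"sat": 0.7, "ran": 0.3},
    ("the", "cat", "sat"): {"down": 0.9, "up": 0.1},
}

tokens = ["the"]
while tuple(tokens) in next_word_probs:
    probs = next_word_probs[tuple(tokens)]       # look up full history
    tokens.append(max(probs, key=probs.get))     # greedy: most likely word

print(" ".join(tokens))  # -> the cat sat down
```

Note that each step keys on the entire token tuple, which mirrors why LLM inference cost grows with conversation length.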

Embeddings convert language to mathematics

Words are transformed into numerical vectors (embeddings) with thousands of dimensions based on co-occurrence patterns in training data, causing conceptually similar words like 'king' and 'queen' to cluster together while unrelated terms like 'Pringle' and 'existentialism' remain distant.

Internet-scale training creates the model

Models learn these embedding values through machine learning algorithms that ingest vast portions of internet text, automatically adjusting numerical weights until the statistical patterns match the training corpus without human-defined rules.
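A drastically simplified stand-in for "learning patterns from the corpus without human-defined rules" is a bigram model: count which word follows which in the text, then normalize the counts into probabilities. The corpus below is invented and tiny.

```python
from collections import Counter, defaultdict

# A tiny "training corpus" standing in for internet-scale text.
corpus = "the cat sat on the mat the cat ran on the mat".split()

# "Training": count next-word frequencies, then normalize to probabilities.
# The patterns come entirely from the data, not hand-written rules.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

model = {w: {nxt: c / sum(f.values()) for nxt, c in f.items()}
         for w, f in counts.items()}

print(model["the"])  # -> {'cat': 0.5, 'mat': 0.5}
```

Real training replaces counting with gradient descent over billions of neural-network weights, but the principle is the same: adjust numbers until the model's statistics match the corpus.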

⚠️ Why Hallucinations Are Inevitable 2 insights

Pattern mimicry versus factual retrieval

LLMs hallucinate because they statistically predict plausible-sounding word sequences rather than retrieving verified facts, meaning they should be expected to generate incorrect information when the statistical patterns in their training data don't reflect reality.

The novelty problem

These models fail catastrophically with breaking news or novel events—such as when ChatGPT incorrectly stated Maduro was still in Venezuela immediately after he was removed—because they rely entirely on historical patterns absent from their training data.

Bottom Line

Treat LLMs as probabilistic pattern-matching autocomplete tools, not knowledge databases: implement verification workflows for factual accuracy, and recognize that their probabilistic nature makes hallucinations unavoidable, particularly with real-time or novel information.
