🔬 Training Transformers to solve 95% failure rate of Cancer Trials - Ron Alfa & Daniel Bear, Noetik

| Podcasts | April 20, 2026 | 565 views | 1:25:22

TL;DR

Noetik is tackling the 95% failure rate of cancer clinical trials by training transformers on proprietary multimodal patient tumor data to identify hidden biological subtypes and match therapies to responsive populations, moving beyond simplistic biomarkers and outdated cell lines.

🎯 The Patient Selection Problem 3 insights

Cancer trials fail due to patient mismatch, not drug quality

Ron Alfa argues that 90-95% of oncology drug failures stem from an inability to identify the patient subpopulations that respond, rather than from pharmacological flaws in the molecules themselves.

Traditional biomarkers are too simplistic for complex biology

Current clinical methods rely on single mutations or protein stains that miss the rich multimodal patterns determining therapeutic response.

True cancer subtypes remain largely unknown

Pathology-based classifications like 'lung cancer' mask multiple distinct functional subtypes that require data-driven approaches to identify properly.

🔬 Data Generation Strategy 3 insights

Building proprietary datasets from fresh human tumors

Noetik built an in-house lab to process thousands of patient samples into spatial arrays, rejecting decades-old immortalized cell lines that fail to reflect actual human tumor biology.

Intentional dataset design beats brute force collection

Following the example of the Protein Data Bank and ImageNet, Noetik curates high-quality multimodal data at scale rather than cobbling together existing public repositories.

Imaging enables scalable, information-dense profiling

Spatial array imaging captures many patients per slide, providing rich visual biological data at significantly lower cost than sequencing runs.

🤖 AI Architecture & Applications 3 insights

Self-supervised transformers learn biological subtypes without built-in bias

Models identify therapeutically relevant cancer subtypes directly from patient data without preconceptions about whether drivers are genetic, immune, or spatial.

Dual-use models for discovery and trial rescue

The same architecture supports two uses: reverse translation for target discovery, and retrospective analysis of failed Phase 2/3 trials to design better patient cohorts.

Scaling toward generalizable cancer models

With several hundred patients per major indication, models generalize across cancer types, though Daniel Bear notes biology remains complex and requires continued data expansion.

Bottom Line

Pharmaceutical companies must abandon outdated cell lines and simplistic single-biomarker approaches in favor of training transformers on large-scale, multimodal patient tumor datasets to identify true biological subtypes and match therapies to responsive populations before initiating trials.
