Coding Challenge Session: Local Browser Conversational Chatbot (STT, TTS, and more?)

Programming | October 06, 2025 | 9.14 thousand views | 2:16:32

TL;DR

Daniel Shiffman builds a local browser-based conversational chatbot using p5.js and Transformers.js, demonstrating how to run lightweight open-source AI models (Whisper for speech-to-text, Kokoro for text-to-speech) entirely in the browser without cloud dependencies.

🎨 Creative AI Philosophy

Prioritize local open-source over cloud AI

Shiffman emphasizes running models locally on consumer hardware rather than using closed cloud-based systems from big tech companies to maintain data privacy and agency.

AI for artistic expression, not productivity

The goal is demystifying AI through creative coding, social commentary, and weird art projects rather than building useful assistants.

Simple brains beat complex LLMs

You don't need a large language model: Markov chains, context-free grammars, or simple pattern matching can power a creative chatbot effectively.
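As an illustration of how small such a "brain" can be, here is a minimal word-level Markov chain sketch. It is not code from the video; the function names `buildChain` and `generate` and the training sentence are made up for this example, and the `rand` parameter exists only so the output can be made repeatable.

```javascript
// Build a word-level Markov chain: each word maps to the list of
// words that followed it somewhere in the training text.
function buildChain(text) {
  const words = text.split(/\s+/).filter(Boolean);
  const chain = new Map();
  for (let i = 0; i < words.length - 1; i++) {
    if (!chain.has(words[i])) chain.set(words[i], []);
    chain.get(words[i]).push(words[i + 1]);
  }
  return chain;
}

// Walk the chain from a start word, picking a random follower each step.
// `rand` is injectable (hypothetical parameter) so runs can be repeated.
function generate(chain, start, maxWords = 12, rand = Math.random) {
  const out = [start];
  let current = start;
  while (out.length < maxWords) {
    const followers = chain.get(current);
    if (!followers) break; // dead end: nothing ever followed this word
    current = followers[Math.floor(rand() * followers.length)];
    out.push(current);
  }
  return out.join(" ");
}

const chain = buildChain("the train goes choo choo and the train stops");
console.log(generate(chain, "the"));
```

Because followers are stored with repeats, words that follow more often are picked more often, which is all the "training" a chain like this needs.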

💻 Technical Architecture

Speech recognition with Whisper

Uses OpenAI's open-source Whisper model (the Tiny, English-only variant), converted to ONNX format so that Transformers.js can run it in the browser.
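Whisper models expect 16 kHz mono audio, while browser recordings are usually 44.1 or 48 kHz and may be stereo. The sketch below shows one simple way to prepare samples before transcription. It is an assumption about how the preparation step could look, not the code from the video; `toMono` and `resample` are hypothetical helper names, and the linear interpolation is deliberately naive (no anti-aliasing filter).

```javascript
// Average all channels into a single mono Float32Array.
function toMono(channels) {
  const length = channels[0].length;
  const mono = new Float32Array(length);
  for (const channel of channels) {
    for (let i = 0; i < length; i++) mono[i] += channel[i] / channels.length;
  }
  return mono;
}

// Resample with linear interpolation (simple, not anti-aliased).
function resample(samples, fromRate, toRate = 16000) {
  if (fromRate === toRate) return samples;
  const outLength = Math.round((samples.length * toRate) / fromRate);
  const out = new Float32Array(outLength);
  const step = fromRate / toRate;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.min(Math.floor(pos), samples.length - 1);
    const right = Math.min(left + 1, samples.length - 1);
    const frac = pos - left;
    out[i] = samples[left] * (1 - frac) + samples[right] * frac;
  }
  return out;
}

// One second of silent stereo audio at 48 kHz becomes 16000 mono samples.
const stereo = [new Float32Array(48000), new Float32Array(48000)];
console.log(resample(toMono(stereo), 48000).length);
```

In a browser, the channel arrays would typically come from `AudioBuffer.getChannelData()` and the source rate from `AudioBuffer.sampleRate`.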

Lightweight voice synthesis

Implements Kokoro TTS, a recent small-footprint open-weight text-to-speech model available on Hugging Face that runs fast on local hardware without cloud APIs.

Browser-based ML stack

Leverages the Hugging Face Transformers.js library imported via CDN to run ML pipelines directly in p5.js sketches using modern async/await patterns.
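A rough sketch of the async/await pattern with the Transformers.js `pipeline` function follows. The model ID `Xenova/whisper-tiny.en`, the CDN URL in the comments, and the wrapper name `createTranscriber` are assumptions for illustration, not details confirmed by the video. The `pipeline` function is passed in as an argument so the wrapper can be exercised with a stub outside the browser.

```javascript
// `pipeline` is injected so the same code works with the real
// Transformers.js import in a browser or with a stub elsewhere.
// The default model ID is an assumption, not taken from the video.
async function createTranscriber(pipeline, modelId = "Xenova/whisper-tiny.en") {
  // Loading is asynchronous: the model files are fetched on first use.
  const asr = await pipeline("automatic-speech-recognition", modelId);

  // `audio` is expected to be a 16 kHz mono Float32Array.
  return async function transcribe(audio) {
    const result = await asr(audio);
    return result.text.trim();
  };
}

// Possible browser usage (assumed CDN URL; pin a version in real projects):
//
//   const { pipeline } = await import(
//     "https://cdn.jsdelivr.net/npm/@huggingface/transformers"
//   );
//   const transcribe = await createTranscriber(pipeline);
//   console.log(await transcribe(samples));
```

Creating the transcriber once and reusing the returned function avoids reloading the model on every utterance.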

⚙️ Implementation Details

Push-to-talk interface

Creates a minimal UI where holding the mouse button starts recording (red) and releasing triggers transcription (green).
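The push-to-talk flow can be modeled as a tiny state machine. The sketch below is one possible shape, not the code from the video: `createPushToTalk` and its three callbacks (`startRecording`, `stopRecording`, `transcribe`) are hypothetical names, and the p5.js hookup is shown only in comments.

```javascript
// Push-to-talk state: "idle" -> "recording" -> "transcribing" -> "idle".
// The three callbacks are hypothetical hooks supplied by the sketch.
function createPushToTalk({ startRecording, stopRecording, transcribe }) {
  let state = "idle";
  return {
    get state() {
      return state;
    },
    // Red while the button is held, green otherwise.
    color() {
      return state === "recording" ? "red" : "green";
    },
    // Call when the mouse button goes down.
    press() {
      if (state !== "idle") return;
      state = "recording";
      startRecording();
    },
    // Call when the mouse button is released; resolves with the transcript.
    async release() {
      if (state !== "recording") return null;
      state = "transcribing";
      try {
        const audio = await stopRecording();
        return await transcribe(audio);
      } finally {
        state = "idle"; // always return to idle, even if transcription fails
      }
    },
  };
}

// Possible p5.js hookup (assumes `ptt` was created in setup()):
//
//   function mousePressed() { ptt.press(); }
//   function mouseReleased() { ptt.release().then((text) => console.log(text)); }
//   function draw() { background(ptt.color()); }
```

Ignoring a press while a transcription is still running keeps overlapping recordings from piling up.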

Model transparency checks

Stresses reading a model's card before using it, citing the "Model Cards for Model Reporting" framework by Margaret Mitchell and colleagues for transparent model documentation.

ONNX standard format

Uses models exported to the Open Neural Network Exchange (ONNX) format, which Transformers.js runs in the browser through ONNX Runtime.

Bottom Line

Build privacy-preserving voice interfaces entirely in the browser using lightweight open-source models like Whisper and Kokoro TTS to create artistic AI projects without cloud dependencies.
