🤖 AI & Machine Learning

Artificial intelligence, machine learning, and data science

Reinforcement Learning with Human Feedback (RLHF), Clearly Explained!!!

18:02

StatQuest with Josh Starmer

Reinforcement Learning with Human Feedback (RLHF) aligns large language models to produce helpful, polite responses by training a reward model on human preference comparisons and then using that model's scores to fine-tune the language model, addressing the overfitting and cost limitations of supervised fine-tuning.

about 1 year ago · 9 points

Reinforcement Learning with Neural Networks: Mathematical Details

25:01

StatQuest with Josh Starmer

This video provides a step-by-step mathematical walkthrough of policy gradient reinforcement learning, demonstrating how to derive gradients via the chain rule and use binary reward signals (+1/-1) to correct update directions when training neural networks without labeled data.

about 1 year ago · 6 points

Reinforcement Learning with Neural Networks: Essential Concepts

24:00

StatQuest with Josh Starmer

This video explains how policy gradients let a neural network train without known target values: the network guesses an action, observes the reward the environment returns, and uses that reward to correct the direction of the gradient descent update.

about 1 year ago · 9 points