Tutorials · January 6, 2026 · 12 min read
CS336 Notes: Lecture 4 - Mixture of Experts
Mixture of Experts (MoE): adding capacity without proportional compute, routing, load balancing, and what makes MoE stable.
machine-learning · transformers · stanford-cs336 · moe