Stanford MS&E435 Economics of the AI Supercycle

Topic 1: The Evolution of Model Training and Reasoning

The Shift to Transformers: The evolution of AI accelerated with deep learning and AlexNet, but truly took off with the Transformer architecture, which made language model training far more scalable than it was with earlier sequential architectures such as recurrent networks.
- Attention in the Transformer architecture: a mechanism that takes dot products between query and key vectors to produce similarity scores, applies the softmax function to turn those scores into weights that sum to 1, and then computes a weighted sum of the value vectors. The result replaces each word's original embedding with a new, context-aware representation.
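- The encoder turns the sequence of input words into vector representations (in the Transformer, one contextual vector per token rather than a single vector). The decoder converts those representations back into an output sequence.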
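The three attention steps above (dot-product scores, softmax, weighted sum of values) can be sketched in a few lines of plain Python. This is a minimal single-head illustration, not a production implementation: the toy embeddings are made up, the same vectors are reused as queries, keys, and values (a real Transformer applies learned projections first), and the division by the square root of the key dimension follows the original Transformer paper rather than anything stated in these notes.

```python
import math

def softmax(scores):
    """Turn a list of similarity scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def attention(queries, keys, values):
    """Scaled dot-product attention for one head, without masking."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Step 1: similarity of this query with every key.
        scores = [dot(q, k) / math.sqrt(d_k) for k in keys]
        # Step 2: softmax turns the scores into weights.
        weights = softmax(scores)
        # Step 3: weighted sum of the value vectors.
        out = [
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs

# Hypothetical 2-dimensional embeddings for a three-token sentence.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
updated = attention(tokens, tokens, tokens)
for before, after in zip(tokens, updated):
    print(before, "->", [round(x, 3) for x in after])
```

Each output row is a blend of all value vectors, weighted by how similar that token's query is to every key, which is what "updating the embedding with context" means in practice.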
Scaling Laws to Post-Training: Model capability improved predictably as parameters and training data were scaled up, a pattern described by "scaling laws." Post-training techniques like RLHF (Reinforcement Learning from Human Feedback) were then applied to make base models safe, aligned, and useful as chat assistants.
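To make "scaling laws" concrete, here is a small sketch of a power-law loss curve of the kind used in the scaling-law literature, where predicted loss falls as parameters and training tokens grow. The functional form is a common one, but every constant below is an illustrative placeholder, not a fitted value from any lab or paper.

```python
def predicted_loss(n_params, n_tokens, E=1.7, A=400.0, B=400.0, alpha=0.34, beta=0.28):
    """Power-law loss model: an irreducible term plus two terms that shrink
    as the parameter count and the number of training tokens increase.
    All constants are placeholders chosen for illustration only."""
    return E + A / n_params**alpha + B / n_tokens**beta

# Hypothetical model sizes, each trained on 20 tokens per parameter.
for n_params in (1e8, 1e9, 1e10, 1e11):
    n_tokens = 20 * n_params
    print(f"{n_params:.0e} params, {n_tokens:.0e} tokens -> loss {predicted_loss(n_params, n_tokens):.3f}")
```

The curve flattens toward the constant `E`: each tenfold increase in scale buys a smaller drop in loss, which is one reason attention shifted toward post-training.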
Emergent Reasoning Capabilities: Recent models achieve advanced reasoning through "test-time compute" and rigorous RL (Reinforcement Learning) environments. In essence, the model spends far more compute on a difficult problem at inference time instead of relying only on what it learned in pre-training. Chain-of-thought (CoT) reasoning, which breaks a problem into logical steps rather than jumping straight to a conclusion, was not explicitly programmed; it emerged as a natural behavior when models were given massive compute to solve constrained tasks.
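One simple way to see why extra inference compute helps is majority voting over several sampled answers (often called self-consistency). This is a swapped-in illustration, not the RL-trained chain-of-thought described above: `noisy_solver` is a hypothetical stand-in for one sampled reasoning chain that is right only 40% of the time, and the answer 42 is arbitrary.

```python
import random
from collections import Counter

def noisy_solver(rng, correct_answer, p_correct=0.4):
    """Hypothetical stand-in for one sampled reasoning chain."""
    if rng.random() < p_correct:
        return correct_answer
    # Wrong answers are spread over several nearby values.
    return rng.choice([correct_answer + d for d in (-2, -1, 1, 2)])

def majority_vote(answers):
    """Return the most common answer among the samples."""
    return Counter(answers).most_common(1)[0][0]

def solve_with_budget(rng, correct_answer, n_samples):
    """Spend more inference compute by sampling n_samples chains and voting."""
    answers = [noisy_solver(rng, correct_answer) for _ in range(n_samples)]
    return majority_vote(answers)

def accuracy(n_samples, trials=2000, seed=0):
    """Fraction of trials in which the voted answer is correct."""
    rng = random.Random(seed)
    hits = sum(solve_with_budget(rng, 42, n_samples) == 42 for _ in range(trials))
    return hits / trials

for n in (1, 5, 25):
    print(f"{n:>2} samples per problem -> accuracy {accuracy(n):.2f}")
```

Because wrong answers scatter while correct ones agree, accuracy rises with the sampling budget even though the underlying solver never changes.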

Topic 2: AI Bottlenecks and the "Data Wall"

Current and Future Bottlenecks: While past hurdles included compute limits and pre-training data, the current bottleneck is creating high-quality RL environments. The next major frontier is continual learning—allowing models to learn efficiently from sparse real-world interactions, similar to how a human learns immediately from touching a hot stove.
Running Out of Internet Text: AI labs have nearly exhausted available internet data for pre-training models. To push past this "data wall," they are scanning ancient books and investing heavily in synthetic data generation.

Topic 3: Why Code is the Ultimate AI Training Ground

Verifiable Rewards: Leading labs focus on coding and math because they allow for Reinforcement Learning with Verifiable Rewards (RLVR). Code can be compiled and tested deterministically, providing immediate, accurate feedback on whether the model succeeded.
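A verifiable reward can be as simple as "did the generated code pass the tests." The sketch below assumes a hypothetical setup in which the model must emit a function named `solve`; the test cases and candidate programs are made up. It uses `exec` for brevity, which is unsafe for untrusted code, so a real pipeline would run candidates in a sandbox with timeouts.

```python
def verifiable_reward(candidate_source, test_cases, func_name="solve"):
    """Return 1.0 if the candidate defines func_name and passes every test, else 0.0.

    WARNING: exec runs arbitrary code; this is for illustration only.
    """
    namespace = {}
    try:
        exec(candidate_source, namespace)  # "compile" step: syntax errors land in except
        func = namespace[func_name]
        for args, expected in test_cases:
            if func(*args) != expected:
                return 0.0  # deterministic failure signal
    except Exception:
        return 0.0  # crashes and missing functions also score zero
    return 1.0

# Hypothetical task: add two integers.
tests = [((2, 3), 5), ((-1, 1), 0)]
good = "def solve(a, b):\n    return a + b\n"
bad = "def solve(a, b):\n    return a - b\n"
broken = "def solve(a, b) return a + b"

for name, source in (("good", good), ("bad", bad), ("broken", broken)):
    print(name, verifiable_reward(source, tests))
```

No human judgment is needed to produce the score, which is what makes coding and math convenient domains for RL.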
"AGI Complete": Many researchers view coding as a proxy for General Intelligence, as AI agents can use code as a universal language to execute real-world tasks and tool calls.

Topic 4: The Importance of "Evals" (Evaluations)

Setting the Roadmap: Evals are highly guarded assets because they define what "good" and "bad" performance looks like, acting as the target that models hill-climb toward during training.
Custom Enterprise Standards: Evals vary significantly between companies. A "good" output for Goldman Sachs may look totally different from one for JP Morgan, meaning general models must be specialized to meet these unique standards.
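The sketch below shows how the same model output can score differently under two organizations' evals. Everything here is hypothetical: `firm_a_grader` and `firm_b_grader` stand in for two house standards, and `toy_model` is a lookup table rather than a real model.

```python
def run_eval(model, dataset, grader):
    """Fraction of examples whose output the grader accepts."""
    passed = sum(1 for prompt, reference in dataset if grader(model(prompt), reference))
    return passed / len(dataset)

def firm_a_grader(output, reference):
    """Hypothetical strict standard: wording must match the reference exactly."""
    return output.strip() == reference.strip()

def firm_b_grader(output, reference):
    """Hypothetical lenient standard: ignore case and a trailing period."""
    def normalize(text):
        return text.strip().lower().rstrip(".")
    return normalize(output) == normalize(reference)

# Made-up eval set and canned model answers.
dataset = [
    ("Q1 revenue?", "Revenue rose 12%."),
    ("Rating?", "Buy"),
]
canned_answers = {"Q1 revenue?": "revenue rose 12%", "Rating?": "Buy"}

def toy_model(prompt):
    return canned_answers[prompt]

print("firm A score:", run_eval(toy_model, dataset, firm_a_grader))
print("firm B score:", run_eval(toy_model, dataset, firm_b_grader))
```

The grader is the definition of "good": swapping it changes the score, and therefore the target a model would be trained toward.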

Topic 5: Enterprise Model Specialization

Setting the Ceiling: While massive general AI models set a baseline, enterprises will set their operational "ceiling" by training specialized models on their own proprietary data.
High ROI of Post-Training: Training a specialized model via reinforcement learning uses only a fraction of the compute of pre-training (about 5% of a pre-training budget), making it highly cost-effective for enterprises to build their own optimized tools.
Real-World Successes: Applied Compute helped DoorDash automate highly complex menu digitization by directly optimizing the model to reduce error rates against internal style guides. Applied Compute also partnered with Cognition to build ultra-fast, sub-two-second models dedicated purely to catching coding bugs.

Topic 6: Emerging Architectures and Continual Learning

Learning in Production: Continual learning involves models updating themselves based on user telemetry. For example, Cursor captures data when users accept or revert a code suggestion and uses those implicit rewards to take training steps and improve the model over time.
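As a rough illustration of turning accept/revert telemetry into training steps, the sketch below applies a REINFORCE-style policy-gradient update to a tiny softmax policy. This is an assumption-laden toy, not Cursor's actual method: the three "suggestion variants," the +1/-1 reward mapping, and the learning rate are all hypothetical.

```python
import math

def softmax(logits):
    """Convert logits into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def policy_gradient_step(logits, action, reward, lr=0.5):
    """One REINFORCE update for a softmax policy over discrete actions.

    The gradient of log pi(action) with respect to logit i is
    (1 if i == action else 0) - pi(i).
    """
    probs = softmax(logits)
    return [
        logit + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]

# Hypothetical mapping from user telemetry to implicit rewards.
REWARD = {"accept": 1.0, "revert": -1.0}

# Policy over three hypothetical suggestion variants, initially uniform.
logits = [0.0, 0.0, 0.0]

# Made-up telemetry stream: (variant shown, what the user did).
telemetry = [(0, "accept"), (1, "revert"), (0, "accept"), (2, "revert")]
for action, event in telemetry:
    logits = policy_gradient_step(logits, action, REWARD[event])

probs = softmax(logits)
print([round(p, 3) for p in probs])
```

After a handful of events the policy already favors the variant users accepted, which is the sense in which sparse production feedback can drive learning.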
Sticking with Transformers: Despite debates about the power inefficiency of transformers versus new architectures like Mamba, the current lab consensus is to continue scaling the transformer architecture, relying on massive compute rather than pivoting to untested designs.

Topic 7: Future Industry Predictions

Bullish on Compute Hardware: Due to severe compute scarcity, hardware and chip providers like Nvidia will continue to dominate. However, to avoid paying Nvidia's roughly 75% gross margins, AI labs may begin designing their own chips in-house.
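A short worked calculation shows why a 75% margin is such a strong incentive. It reads the 75% figure as a gross margin, defined as (price - cost) / price, and uses a made-up unit cost; it also ignores the design, tooling, and software costs an in-house chip program would incur.

```python
def price_from_gross_margin(unit_cost, gross_margin):
    """Selling price implied by a gross margin, where
    gross_margin = (price - unit_cost) / price."""
    return unit_cost / (1.0 - gross_margin)

unit_cost = 10_000.0  # hypothetical cost to produce one accelerator, in dollars
price = price_from_gross_margin(unit_cost, 0.75)
markup = price / unit_cost

print(f"price paid by the buyer: ${price:,.0f}")
print(f"buyer pays {markup:.0f}x the production cost")
print(f"margin captured by the vendor per chip: ${price - unit_cost:,.0f}")
```

Under these assumptions a buyer pays four times the production cost, so even an in-house chip that is considerably less efficient could still come out ahead on cost per unit of compute.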
Bearish on the Human Data Market: As AI models get smarter, they can rely on synthetic data pipelines and automated unit testing. This will drastically shrink the market for human-labeled data companies, forcing them to pivot to harder domains like robotics or egocentric video data.