LLMs & Models
Real-time tracking of Large Language Model releases, benchmarks, and architecture updates.
Intelligence Leaderboard
Models ranked by MMLU score, used here as a proxy for reasoning capability.
| Rank | Model | Reasoning (MMLU) | Math (GSM8K) | Coding (HumanEval) | Cost (In/Out per 1M) |
|---|---|---|---|---|---|
| 1 | GPT-5 (Orion), OpenAI | 96.4% | 98.2% | 97.5% | $10.00/$30.00 |
| 2 | Gemini 3.0 Ultra, Google | 95.8% | 97.1% | 96% | $5.00/$15.00 |
| 3 | Claude 4 Opus, Anthropic | 95.5% | 96.8% | 98.1% | $15.00/$45.00 |
| 4 | Grok-4 (Supercluster), xAI | 94.9% | 95.5% | 94% | $2.00/$2.00 |
| 5 | Llama 4 1T, Meta (Open Source) | 93.5% | 92% | 91.5% | $0.00/$0.00 |
| 6 | OpenAI o1 (Final), OpenAI | 92.1% | 96% | 94% | $15.00/$60.00 |
| 7 | Mistral Huge, Mistral | 91% | 89% | 90% | $2.00/$6.00 |
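
The Cost column lists prices in USD per 1M input tokens and per 1M output tokens. As a rough sketch of how those figures translate into the price of a single request, the snippet below applies the usual per-token arithmetic. The function name `request_cost`, the `PRICES` dictionary, and the token counts are illustrative assumptions, not part of any provider's API, and actual billing may differ (for example through caching or tiered pricing).

```python
def request_cost(input_tokens: int, output_tokens: int,
                 price_in_per_m: float, price_out_per_m: float) -> float:
    """Return the USD cost of one request, given prices per 1M tokens."""
    total = input_tokens * price_in_per_m + output_tokens * price_out_per_m
    return total / 1_000_000


# Hypothetical lookup table; prices copied from the Cost column above
# as (input, output) in USD per 1M tokens.
PRICES = {
    "GPT-5 (Orion)": (10.00, 30.00),
    "Gemini 3.0 Ultra": (5.00, 15.00),
    "Mistral Huge": (2.00, 6.00),
}


def compare_costs(input_tokens: int, output_tokens: int) -> dict:
    """Cost of the same request on every model in PRICES."""
    return {
        name: request_cost(input_tokens, output_tokens, p_in, p_out)
        for name, (p_in, p_out) in PRICES.items()
    }


if __name__ == "__main__":
    # Example: a request with 2,000 input tokens and 500 output tokens.
    for name, cost in compare_costs(2_000, 500).items():
        print(f"{name}: ${cost:.4f}")
```

Because output tokens are priced higher than input tokens for most rows in the table, a long completion can dominate the bill even when the prompt is short.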
Latest Lab News
Real-time Ingestion
Model Release Videos
Visual Intelligence
Tesla Optimus Gen 2 Update
New capabilities of Tesla's humanoid robot, including delicate object manipulation.

Andrej Karpathy: LLM OS
Andrej Karpathy explains the concept of Large Language Models as an Operating System.

Llama 3.2: Open Source Multimodal
Meta releases Llama 3.2, bringing vision capabilities to edge devices and open source models.

Coding with OpenAI o1
Demonstration of o1's advanced coding and problem-solving abilities in real-time.

Building OpenAI o1
Deep dive into the reasoning capabilities of OpenAI's new o1 model series with the research team.

Introducing Claude 3.5 Sonnet
Anthropic's new model sets industry benchmarks for reasoning, coding, and nuance.

Apple Intelligence | WWDC 2024
Apple integrates generative AI across iPhone, iPad, and Mac with on-device privacy.

Google Project Astra: Real-time Multimodal AI
Google's answer to GPT-4o: A universal AI agent that can see and understand the world.

GPT-4o Launch: Omni Model Demo
OpenAI's first natively multimodal model, with real-time audio and visual reasoning capabilities.

NVIDIA GTC 2024 Keynote
Jensen Huang unveils the Blackwell platform, defining the next era of AI compute.

Figure 01 + OpenAI | Speech-to-speech
Humanoid robot Figure 01 demonstrating full conversation and task execution.

Google Gemini 1.5 Pro: Long Context
Google DeepMind showcases the 1M+ token context window of Gemini 1.5 Pro.

Introducing Sora — Text-to-Video
OpenAI's groundbreaking video generation model that simulates physical world dynamics.