
AI Explained Official Podcast
Covering the biggest news of the century - the arrival of smarter-than-human AI. From the author of Simple Bench, which reveals the remaining gap between LLM and human reasoning. Hype-free, and the British accent is a freebie bonus.
Episodes
20 episodes
Gemini 2.5 Pro - It’s a Smart Chatbot … (New Simple High Score)
Gemini gets a new record on Simple Bench, and several other benchmarks. I’ll go deep to explore its nuances, including how it deceptively reverse engineers answers, does better on certain coding benchmarks than others, may have a universal ‘con...
Season 2 • Episode 11 • 21:21
Did AI Just Get Commoditized? Gemini 2.5, New DeepSeek V3, & Microsoft vs OpenAI
Gemini 2.5 is out, on the same day as the new DeepSeek V3 (which should power DeepSeek R2). Do both models prove AI is being commoditized? Let’s find out, on this blockbuster day of AI releases. Plus exclusives from The Information, Simple indi...
Season 2 • Episode 10 • 13:47
Manus AI - The Calm Before the Hypestorm … (vs Deep Research + Grok 3)
Is Manus AI the memecoin of the AI world, or legit? I’ll compare it to OpenAI’s Deep Research, Operator, Grok 3 DeepSearch and more to find out. I’ll also let you in on some of the secrets of what makes a good hype campaign, the estimated costs...
Season 2 • Episode 9 • 12:58

GPT 4.5 - not so much wow
GPT 4.5 is here, and do you remember when AI lab CEOs like Sam Altman and Dario Amodei were betting everything on scaling up base models like this one? Well let’s find out what would have happened if the future of AI rested on models like GPT 4...
Season 2 • Episode 8 • 25:05
Claude 3.7 is More Significant than its Name Implies (ft DeepSeek R2 + GPT 4.5 coming soon)
Claude 3.7 is here, hot on the heels of Grok 3 and a host of other developments, but how good is it really? And what does it say about the next few months in AI? I’ve read the papers, played with the model for hours, and benched it on Simple...
Season 2 • Episode 7 • 27:39

AGI: (gets close), Humans: ‘Who Gets the Money?’
A 'frontier reasoning model' from just 1000 examples (s1). A $100B Musk bid for power. Gemini 2, RAND and a warning from Amodei. Here are 7-8 developments you may have missed but which I would argue help us understand how the next few years will pl...
Season 2 • Episode 6 • 22:17

Deep Research by OpenAI - The Ups and Downs vs DeepSeek R1 Search + Gemini Deep Research
12 hours ago Deep Research was unveiled, and I’ve tested it thoroughly, including vs DeepSeek R1 with search, Gemini Deep Research and even R1 in Perplexity. It’s a notable step forward, with one big caveat. I’ll go through all the benchmark...
Season 2 • Episode 5 • 18:32

o3-mini and the “AI War”
o3-mini is here, and yes, I’ve read the paper in full - 2 hours after release, and even the post-launch Reddit AMA. Some epic details like a FrontierMath score that made me double-take, a likely new Cursor favorite, bio risk expertise and a ...
Season 2 • Episode 4 • 15:21
Nothing Much Happens in AI, Then Everything Does All At Once
When it rains, it pours. OpenAI Operator tested and reviewed, with full paper analysis. Perplexity Assistant is useful. Then Stargate, is it all smoke and mirrors? Strong rumours of an o3+ model from Anthropic. Then a full breakdown of Deeps...
Season 2 • Episode 3 • 23:09
Altman Expects a ‘Fast Take-off’, ‘Super-Agent’ Debuting Soon and DeepSeek R1 Out
OpenAI looks set to debut their Operator system, and some leaks are out. At the same time DeepSeek R1 releases some numbers, and Sam Altman says he might have been wrong before, and now anticipates a 'fast take-off'. Plus two papers to gi...
Season 2 • Episode 2 • 13:11
OpenAI Backtracks on Superintelligence + Altman Brings His Timeline Forward
Sam Altman unexpectedly brings his timelines to AGI forward, while OpenAI backtrack on superintelligence. None of these changes were heralded, but they are significant. Plus the new year brings new assessments of the true capability of models t...
Season 2 • Episode 1 • 23:41

o3 - wow
o3 isn’t one of the biggest developments in AI for 2+ years because it beats a particular benchmark. It is so because it demonstrates a reusable technique through which almost any benchmark could fall, and at short notice. I’ll cover all the...
Season 1 • Episode 9 • 22:20
Never Browse Alone? - Gemini 2 Live and ChatGPT Vision
The ‘Gemini 2 Era’ begins … with screen-sharing? But really, it’s a great free tool, for curiosity satisfying rather than bleeding-edge intelligence. I give you the benchmarks, the highlights and of course, the latest from OpenAI Advanced Vo...
Season 1 • Episode 8 • 13:40

Sora is Out, But is it a Distraction?
After a 10-month wait, OpenAI have released Sora to paying users. With just a prompt it can generate videos of up to 20 seconds in lower resolutions, and 10 seconds at 1080p if you can fork out $200/month. I’ve tested it and read the system ...
Season 1 • Episode 7 • 15:34

o1 Pro Mode – Full Analysis (plus o1 paper highlights)
Oh boy. o1 pro mode out on the same night as o1 full. I read the 49-page paper, ran my own tests, spent my fuel allowance on Pro Mode and will give you all the highlights. Suffice to say the story is not as simple as it first appears. ...
Season 1 • Episode 6 • 16:43
AI Breaks Its Silence: OpenAI’s ‘Next 12 Days’, Genie 2, and a Word of Caution
Calmest before the storm? Whatever analogy you want to use, things had gotten quiet toward the end of 2024. But then tonight we got Genie 2, and a series of scheduled announcements from OpenAI. Sora is soon here, and o1, but I dive deeper int...
Season 1 • Episode 5 • 15:29

New Google Model Ranked ‘No. 1 LLM’, But There’s a Problem
A new and mysterious Gemini model appears at the top of the leaderboard, but is that the full story? I dig behind the headline to show you some anti-climactic results, give some context with leaks in the last 48 hours of diminishing returns to ...
Season 1 • Episode 4 • 15:19
Leak: ‘GPT-5 exhibits diminishing returns’, Sam Altman: ‘lol’
The last few days have seen two narratives emerge. One, derived from yesterday’s OpenAI leak in The Information, that GPT-5/Orion is a disappointment, and less of a leap than GPT-3 to GPT-4. The second comes from a series of 4 clips (shown in th...
Season 1 • Episode 3 • 15:44
ChatGPT with Search, Altman Answers Anything and Simple Bench Out
The Google destroyer, the Perplexity crusher? Or just hype? ChatGPT with Search is here, and simultaneously Altman and co did an AMA on Reddit, covering GPT-5, Sora, SearchGPT and a lot more. Plus, the biggest news of them all: Simple Bench is ...
Season 1 • Episode 2 • 15:20
The New Claude 3.5 Sonnet: Better, Yes, But Not Just in the Way You Might Think
A new state of the art LLM (at least for creative writing and basic reasoning) but what lies behind the numbers that were put out? Is it for real, and are AI agents about to grab your mouse and shake your cursor? Plus, results on my own...
Season 1 • Episode 1 • 22:34
