AI Explained Official Podcast
Covering the biggest news of the century - the arrival of smarter-than-human AI. From the author of Simple Bench, which reveals the remaining gap between LLM and human reasoning. Hype-free, and the British accent is a freebie bonus.
Deep Research by OpenAI - The Ups and Downs vs DeepSeek R1 Search + Gemini Deep Research
12 hours ago Deep Research was unveiled, and I’ve tested it thoroughly, including vs DeepSeek R1 with search, Gemini Deep Research and even R1 in Perplexity. It’s a notable step forward, with one big caveat. I’ll go through all the benchmark figures, my initial impression of the o3 model within, and much more.
Deep Research: https://openai.com/index/introducing-deep-research/
https://www.youtube.com/watch?v=YkCDVn3_wiw
GAIA Bench: https://openreview.net/forum?id=fibxvahvs3
https://openreview.net/pdf?id=fibxvahvs3
CodeELO: https://arxiv.org/pdf/2501.01257
CamelCamel: https://uk.camelcamelcamel.com/
DeepSeek R1 with search: https://chat.deepseek.com/
https://arxiv.org/pdf/2501.12948
HaluBench: https://arxiv.org/pdf/2407.08488
Chapters:
00:00 - Introduction
01:06 - Powered by o3, Humanity’s Last Exam, GAIA
03:55 - Simple Tests
06:00 - Good News vs DeepSeek R1 and Gemini Deep Research
09:32 - Bad News on Hallucinations
14:14 - What Can’t it Browse?
14:42 - For Shopping?
16:40 - Final thoughts