Anthropic ships Claude Opus 4 — tops SWE-bench, adds extended thinking
Posted 2026-05-23 · Review by 2026-07-23
Claude Opus 4 takes #1 on SWE-bench Verified. Extended thinking and parallel tool use make it the strongest coding agent model available.
Signal, not noise
The AI world, made legible.
The live edge — what's cutting edge right now.
Anthropic
Powered by Opus 4.8: SWE-bench Pro 69.2%, Verified 88.6%. Dynamic workflows with parallel subagents; fast mode 3x cheaper.
Anthropic
Desktop agent now with Claude for Small Business: 15 agentic workflows across finance, ops, sales, marketing, HR and CS. Opus 4.8 upgrades flow through.
Anthropic
SWE-bench Pro 69.2%, Verified 88.6%; dynamic workflows enable hundreds of parallel subagents. Anthropic filed a confidential S-1 at a ~$965B valuation.
Cursor
Enterprise multi-team management GA; new pricing at $32–$96/seat/mo. SpaceX plans a $60B acquisition post-IPO (expected close July).
Leads most published reasoning benchmarks and has the cheapest output among the majors; paired with Gemini Spark for long cloud tasks.
Granola
The bot-free favourite of VCs and consultants; captures audio locally on a Mac with no bot in the participant list.
The forces shaping AI, ranked by overall importance.
Is the $800B+ of AI infrastructure a rational, telecom-style build-out — or a self-referential bubble inflating on its own balance sheets?
Does AI eliminate knowledge work — especially entry-level — or expand it?
Everyone is buying AI; almost no one can prove it's working. Is the gap a measurement lag or a value mirage?
Posted 2026-04-26 · Review by 2026-09-26
Sora 2 is officially deprecated. The API shuts down 24 September 2026. OpenAI is refocusing on enterprise video tooling.
The full, searchable list of every major AI application.
Alibaba
Currently tops the Artificial Analysis leaderboard.
Gemini 3 Pro Image; leads the image-arena leaderboard, with the best multilingual text rendering, free in the Gemini app.
Tool combinations that work together — curated by profession and industry.
Bookkeeping and advisory practices
Automating the grunt work frees accountants for advisory, the higher-value seat.
Generalists and lean teams
A small team can punch far above its weight: pick one strong generalist model and go deep, rather than spreading across many.
Estate and letting agents
The real wins are faster, more personal follow-up and listing copy that isn't boilerplate.
What to learn, and in what order — courses, programmes and concepts mapped to the competency you need.
Curated by Eric McLean