Everyone talks about AGI, but nobody can really define it. Google DeepMind just dropped a cognitive framework to measure progress toward so‑called “artificial general intelligence”.
Here’s the interesting part: it’s not just another corporate slide deck full of buzzwords. It’s a reasonably solid grid for tracking, step by step, how we go from “nice chatbot” to “system that beats top humans on most useful tasks”.
And if you’re a founder, this is not just research candy. It’s a tool to understand where we actually are, what’s likely to show up in the next 3–5 years, and how to prepare so you can automate as much of your operations as possible before your competitors do.
In this article, we’ll:
- break down DeepMind’s framework without the academic fluff,
- see where current models really stand (GPT‑4, Gemini, Claude…),
- look at the limits and blind spots of the framework,
- and most importantly: turn it into a concrete strategy for your business.
---
1. The core problem: everyone says “AGI”, nobody measures it
Depending on who you ask:
- “AGI is already here”
- “AGI is 50 years away”
- “AGI is just a marketing term”
The real issue behind these hot takes is simple: no shared metric. If we don’t know what we’re measuring, anyone can say anything.
DeepMind tries to fix that with a simple idea:
Measure progress toward AGI like you’d measure a human’s cognitive abilities:
- their performance on tasks,
- the generality of what they can do,
- their autonomy in the real world.
This is a very “cognitive science” approach: treat AI as an agent that perceives, reasons, acts and learns — not just a token‑predicting machine.
---
2. DeepMind’s framework in plain English: performance × generality × autonomy
The paper (“Levels of AGI for Operationalizing Progress on the Path to AGI”) defines three key axes:
- Performance (depth) – how does the AI compare to humans?
- Generality – is the AI strong on one narrow task (e.g. chess) or on a wide spectrum of tasks (coding, contracts, strategy, learning new domains, etc.)?
- Autonomy – does the AI just answer queries, or can it act on its own: chain actions, call tools, trigger workflows?
Based on this, DeepMind defines levels of AGI‑ness.
The 6 performance levels
Simplified:
- Level 0 – No useful AI
- Level 1 – Emerging: equal to or somewhat better than an unskilled human
- Level 2 – Competent: at least the 50th percentile of skilled adults
- Level 3 – Expert: at least the 90th percentile
- Level 4 – Virtuoso: at least the 99th percentile
- Level 5 – Superhuman: outperforms 100% of humans
Then you cross that with generality: are you superhuman at one game, or expert across 50 types of real‑world tasks?
Finally, you add autonomy: is this just a model you query, or an agent that can chain actions, call tools, trigger workflows, and learn over time?
That combination gives you a way to say: “we’re at this level on the path to AGI”.
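The three axes can be sketched as a tiny data model. A minimal, illustrative sketch: the performance thresholds paraphrase the paper, but `SystemProfile` and `describe` are hypothetical names, not anything DeepMind ships.

```python
from dataclasses import dataclass
from enum import IntEnum

# The six performance levels from the paper, as an ordered enum.
class Performance(IntEnum):
    NO_AI = 0       # Level 0 - no useful AI
    EMERGING = 1    # >= unskilled human
    COMPETENT = 2   # >= 50th percentile of skilled adults
    EXPERT = 3      # >= 90th percentile
    VIRTUOSO = 4    # >= 99th percentile
    SUPERHUMAN = 5  # outperforms 100% of humans

@dataclass
class SystemProfile:
    performance: Performance
    general: bool     # broad spectrum of tasks vs. one narrow domain
    autonomous: bool  # chains actions / calls tools vs. answers queries

def describe(p: SystemProfile) -> str:
    """Collapse the three axes into a one-line label."""
    scope = "General" if p.general else "Narrow"
    mode = "agent" if p.autonomous else "tool"
    return f"{scope} {p.performance.name.title()} ({mode})"

# An AlphaGo-style system: superhuman, but narrow, and used as a tool.
print(describe(SystemProfile(Performance.SUPERHUMAN, general=False, autonomous=False)))
# -> Narrow Superhuman (tool)
```

The point of the exercise: "where are we on the path to AGI?" becomes a coordinate in this grid, not a yes/no argument.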
---
3. Where do current models actually sit?
DeepMind and others more or less converge on this picture:
- Models like GPT‑4, Claude, Gemini 2.0/3.1 are in the Emerging AGI bucket.
- But they are not competent or expert on most tasks, especially when:
  - the work needs long, multi‑step reasoning,
  - the context is long or messy,
  - reliability matters more than a plausible first draft.
Some numbers to anchor this:
- On abstract reasoning benchmarks like ARC‑AGI, top systems hover around 20–25% accuracy. Very far from expert humans.
- On complex software tasks, some reports show models can complete work that would take a human dev ~2 hours in roughly 50% of cases. Impressive — but not yet reliable or general.
On the other hand, in very narrow domains, we already see Expert/Virtuoso levels:
- AlphaGeometry 2: solves International Mathematical Olympiad geometry problems with performance close to medal‑level humans.
- AlphaEvolve: discovers improved algorithms in scientific/maths problems, beating state‑of‑the‑art on ~75–80% of a 50‑problem set.
So:
We already have superhuman AI in narrow pockets, but not yet general, autonomous intelligence.
For you as a founder, the takeaway is twofold:
- No, AGI is not here yet.
- Yes, we already have more than enough to automate 30–60% of many cognitive jobs.
---
4. What DeepMind doesn’t say loudly (but you should care about)
The framework is clean and well‑designed. But it has blind spots.
4.1. Benchmarks are still weak proxies for the real world
Most benchmarks:
- are done on static datasets,
- test tasks that are far from daily business reality,
- don’t measure continuous reliability.
A benchmark score won’t tell you that, in production, the model:
- hallucinates on a customer contract,
- forgets half the context in a support ticket,
- or writes borderline‑legal emails.
Your real benchmark is:
How much human time do I save at equal or better quality, on a specific process?
No academic framework measures that properly yet.
4.2. “Compensability” is misleading
DeepMind’s grid tends to aggregate capabilities: you can be amazing in one area and terrible in another, and still look “okay on average”.
Some researchers push a different idea: coherence.
A true AGI shouldn’t be brilliant at math and terrible at following basic instructions or planning simple tasks.
In practice:
- a model that’s brilliant at writing but awful at following requirements is dangerous.
4.3. The real blockers: memory, world models, uncertainty
Today’s models still have:
- short memory,
- fragile understanding of the real world,
- poor handling of uncertainty (they hallucinate instead of saying “I don’t know”).
Until those are solved, you have to:
- box the AI into well‑defined processes,
- connect it to structured knowledge bases,
- and monitor what it does.
---
5. What this framework changes if you’re actually building
You’re not DeepMind. Your job is not to “reach AGI”. Your job is to:
Cut costs, grow revenue, and free up brain‑time for what actually moves the needle.
DeepMind’s framework is useful for one key thing: projection.
5.1. The next 0–3 years: the age of “useful Emerging AGI”
Over the next few years, expect:
- models that are more reliable,
- better tooled up (agents, tools, APIs, long‑term memory),
- still far from general Expert‑level AGI.
For you, the question is not “will we have AGI?” but:
Which processes can I already push down to “Emerging/Competent” level with today’s AI?
Typical examples:
- Level 1–2 customer support
- Document preparation (proposals, contracts, reports)
- Prospecting and qualification
- Internal ops (SOPs, documentation, reporting)
All of this is doable today with “Emerging” models + solid engineering.
5.2. 3–7 years: toward “Competent AGI” on many tasks
If DeepMind is roughly right and we hit broad Competent‑level AGI before 2030, you’ll see:
- agents that can run end‑to‑end projects (e.g. launch a full marketing campaign),
- AI that genuinely learns your business over months,
- systems that cut 70–90% of time on some cognitive tasks.
The winners will be the companies that already:
- have their data structured,
- have clear, documented processes,
- and a pro‑automation culture.
The rest will still be arguing on X about whether it’s “real AGI” or not.
---
6. How to use DeepMind’s framework pragmatically
Instead of fantasizing about Level 5 Superhuman, use this framework as a process design tool.
Step 1 – Map tasks to human‑level requirements
List your key operations and ask a simple question for each:
- Does this task really need expert‑level performance?
- Or is competent enough?
- Or is intern‑level fine if it’s reviewed?
For example:
- Answering “where is my invoice?” tickets → intern‑level.
- Writing an SEO blog post → competent, with review.
- Negotiating a $500k contract → probably human expert only.
Step 2 – Match with current AI levels
Then compare those requirements with what today’s models deliver:
- if Emerging is enough → near‑term automation is realistic,
- if you need Competent → partial automation with human supervision,
- if you need Expert/Virtuoso → AI is a copilot, not a pilot.
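Steps 1 and 2 together are just a lookup from "required level" to "automation strategy". A toy sketch, where the level names and strategies are this article’s heuristics, not part of DeepMind’s framework:

```python
# Map the performance level a task requires (step 1) to an automation
# strategy (step 2). Illustrative thresholds, not DeepMind's.
def automation_strategy(required_level: str) -> str:
    strategies = {
        "intern": "automate now, spot-check a sample of outputs",
        "competent": "automate with a human reviewing every output",
        "expert": "AI as copilot: the human owns the task, the AI assists",
    }
    # Anything above expert (or unmapped) stays fully human.
    return strategies.get(required_level, "keep fully human")

tasks = {
    "invoice-status tickets": "intern",
    "SEO blog post": "competent",
    "$500k contract negotiation": "expert",
}
for task, level in tasks.items():
    print(f"{task}: {automation_strategy(level)}")
```

Running your whole process map through a table like this is a one-afternoon exercise, and it usually surfaces a handful of intern-level tasks nobody thought to automate.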
You stop dreaming about “full automation” and start doing smart automation.
Step 3 – Layer in autonomy
- Should the AI just propose (drafts, suggestions)?
- Or can it act (send emails, create tickets, push code)?
A safe default rollout:
- start with proposal + human validation,
- then, once you have quality metrics, gradually enable autonomy (e.g. AI fully handles simple cases, routes complex ones to humans).
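That graduated-autonomy pattern fits in a few lines. A sketch, assuming you have some confidence score per case (the `route` function and the 0.9 threshold are hypothetical, to be tuned against your own quality metrics):

```python
# Graduated autonomy: the AI acts alone only on cases its confidence score
# (hypothetical - e.g. from your own eval harness) marks as simple; everything
# else becomes a draft routed to a human.
def route(draft: str, confidence: float, threshold: float = 0.9) -> tuple[str, str]:
    if confidence >= threshold:
        return ("auto-send", draft)    # simple case: AI handles it fully
    return ("human-review", draft)     # complex case: goes to a person

action, _ = route("Your invoice is attached.", confidence=0.95)
print(action)  # -> auto-send
```

As your quality metrics come in, you lower the threshold (more autonomy) or raise it (more review) per process, instead of flipping one global "full automation" switch.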
---
7. Conclusion: AGI is a horizon, not an excuse to wait
DeepMind’s framework is genuinely useful:
- it clarifies performance levels,
- it highlights the importance of generality and autonomy,
- it gives a common language to track progress.
But if you’re a founder, the worst move right now is to sit back and “wait for real AGI”.
The teams that win will be the ones already:
- automating 30–60% of their operations,
- cutting costs,
- scaling without hiring 10 extra people.
If AGI shows up around 2030 like DeepMind suggests, it will just amplify a gap that started years earlier.
You can either be in the camp that gets disrupted by AGI, or in the camp that’s already building an AI‑powered operational machine.
At Deepthix, we’re firmly in the second camp.
Want to automate your operations with AI? Book a 15-min call to discuss.
