Tag: Stanford University

  • The 2026 AI Index Report by Stanford University

    About the paper

    The Stanford AI Index Report 2026 is a broad, global compilation of data and secondary analysis on the state of artificial intelligence across research, technical performance, responsible AI, the economy, science, medicine, education, policy and public opinion.

    It is based on multiple datasets and contributors, including sources such as Epoch AI, GitHub, Lightcast, LinkedIn, Quid, Zeki and McKinsey; it is not a single original survey, and sample sizes vary by section.

    The report’s geographic scope is global, though some chapters rely heavily on data from the U.S., China, Europe and selected other countries; methodology details are documented chapter by chapter rather than in a single unified methods statement.

    Length: 425 pages

    More information / download:
    https://hai.stanford.edu/ai-index/2026-ai-index-report

    Core Insights

    1. What is the central argument of the AI Index Report 2026?

    The central argument is that AI is scaling faster than the surrounding systems can adapt. The report frames 2025–26 as the period “after arrival”: AI is no longer an emerging technology sitting on the margins, but a mainstream force moving through work, education, science, medicine, infrastructure and policy. Generative AI reached roughly 53% population-level adoption within three years, organisational adoption reached 88%, and AI companies are scaling revenue and investment faster than previous technology waves.

    But the report repeatedly stresses that capability growth is outpacing governance, measurement and institutional readiness. Benchmarks are saturating, leading models are becoming harder to distinguish, frontier labs are disclosing less, and independent testing does not always confirm developer-reported performance. The result is not a simple story of progress or danger, but a more complex pattern: AI capabilities, adoption and investment are accelerating, while evaluation, regulation, education, labour-market adaptation and responsible AI practice are struggling to keep up.

    2. What does the report say about AI capability and model competition?

    The report argues strongly against the idea that AI capability is plateauing. Frontier systems continued to improve in 2025 across reasoning, coding, mathematics, multimodal understanding and agentic task execution. On SWE-bench Verified, performance rose from around 60% to near the human baseline in a single year. Some models now meet or exceed human baselines on PhD-level science questions, multimodal reasoning and competition mathematics.

    At the same time, the report highlights a “jagged frontier”. AI systems can produce astonishing results in some domains while remaining surprisingly weak in others. One striking example is that Gemini Deep Think achieved gold-medal performance at the International Mathematical Olympiad, while the top model could read analogue clocks correctly only about half the time. Similarly, AI agents improved dramatically on OSWorld, from around 12% to roughly 66% task success, but still fail about one in three attempts on structured computer-use tasks. Robots also remain far from general competence in the physical world, succeeding in only 12% of household tasks despite strong performance in controlled simulation environments.

    The competitive landscape is also changing. The U.S.–China model performance gap has effectively closed, with U.S. and Chinese models trading the lead multiple times since early 2025. As of March 2026, the top U.S. model led the top Chinese model by only 2.7%. At the same time, top frontier models from Anthropic, xAI, Google, OpenAI, Alibaba and DeepSeek are tightly clustered, making raw benchmark performance less useful as a differentiator. The report suggests that competition may increasingly shift towards cost, reliability, latency, usability and domain-specific performance.

    3. How are AI development, infrastructure and talent distributed globally?

    The report shows a field that is both concentrated and dispersing. Frontier model development remains heavily concentrated in industry and in a small number of countries. Industry produced more than 90% of notable AI models in 2025, while academia produced only two notable models. The United States still leads in notable model production, with 59 notable models in 2025 compared with China’s 35 and South Korea’s 8.

    However, China leads in several research and innovation indicators: publication volume, citations, patent output and industrial robot installations. The U.S. still retains advantages in top-tier model production, higher-impact patents and private investment, but China’s scale in research and patenting is now central to the global AI landscape. South Korea stands out for innovation density, leading the world in AI patents per capita.

    Infrastructure is even more concentrated. The United States hosts 5,427 data centres, more than ten times as many as any other country, and global AI compute capacity has grown by a factor of roughly 3.3 per year since 2022. Yet the hardware supply chain has a critical dependency: TSMC in Taiwan fabricates almost every leading AI chip. This makes AI sovereignty difficult for most countries, because the underlying compute, chips, data centres and talent are unevenly distributed.
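
    The compounding implied by that growth rate is easy to underestimate. As a back-of-the-envelope sketch (assuming the roughly 3.3x annual factor held steadily over three full years, which the report does not state explicitly):

```python
# Back-of-the-envelope check of the compounding implied by ~3.3x annual growth
# in global AI compute capacity. The three-year window is an illustrative
# assumption; the report only gives the annual growth factor since 2022.
ANNUAL_GROWTH = 3.3

def cumulative_growth(years: int, annual_factor: float = ANNUAL_GROWTH) -> float:
    """Total multiplication of capacity after `years` years of compounding."""
    return annual_factor ** years

# Three full years of compounding (e.g. 2022 -> 2025):
print(f"{cumulative_growth(3):.1f}x")  # roughly 35.9x
```

    Under that assumption, capacity would be roughly 36 times its 2022 level, which is why the report treats infrastructure as a first-order constraint rather than a background detail.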

    Talent patterns are also shifting. The U.S. remains home to more AI talent than any other country, but the number of AI researchers and developers moving to the U.S. has dropped 89% since 2017, including an 80% decline in the last year alone. Switzerland and Singapore lead in AI researchers and developers per capita, while the report notes that gender gaps in AI talent remain deeply entrenched, with no country approaching parity.
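
    Taken at face value, the two decline figures imply that most of the fall happened in the final year. A hedged sketch of how they compose, assuming both percentages describe the same underlying level of researcher and developer inflows to the U.S. (an interpretation of the report's phrasing, not something it states outright):

```python
# How the two talent-decline figures compose, assuming both describe the same
# quantity: the level of AI researcher/developer inflows to the U.S.
baseline_2017 = 1.0
level_now = baseline_2017 * (1 - 0.89)     # 89% below the 2017 level -> 0.11
level_prior_year = level_now / (1 - 0.80)  # undo the 80% one-year drop -> 0.55

decline_2017_to_prior_year = 1 - level_prior_year / baseline_2017
print(f"{decline_2017_to_prior_year:.0%}")  # about 45%
```

    On that reading, inflows had fallen roughly 45% between 2017 and the prior year, with the remainder of the 89% cumulative decline concentrated in the most recent year.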

    4. What economic, labour-market and environmental consequences does the report identify?

    Economically, the report depicts AI as both a major investment boom and an uneven productivity story. U.S. private AI investment reached $285.9 billion in 2025, more than 23 times the $12.4 billion invested privately in China, though the report notes that private investment alone may understate China’s total AI spending because of government guidance funds. The U.S. also led in entrepreneurial activity, with 1,953 newly funded AI companies in 2025.
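
    The "more than 23 times" multiple follows directly from the two investment figures. A quick arithmetic check (figures from the report; the rounding is mine):

```python
# Ratio of U.S. to Chinese private AI investment in 2025, per the report's figures.
us_private_bn = 285.9     # U.S. private AI investment, $bn
china_private_bn = 12.4   # Chinese private AI investment, $bn

ratio = us_private_bn / china_private_bn
print(f"{ratio:.1f}x")  # just over 23x
```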

    The productivity evidence is promising but not universal. The report cites productivity gains of 14% to 26% in customer support and software development, while finding weaker or even negative effects in tasks requiring more judgement. This is important because some of the clearest productivity gains appear in fields where entry-level employment is beginning to decline. In U.S. software development, developers aged 22 to 25 saw employment fall nearly 20% from its 2024 level, while headcount for older developers continued to grow.

    The environmental footprint is becoming much harder to ignore. Grok 4’s estimated training emissions reached 72,816 tons of CO₂ equivalent. AI data centre power capacity rose to 29.6 GW, roughly comparable to New York State’s peak electricity demand. The report also estimates that annual GPT-4o inference water use may exceed the drinking water needs of 1.2 million people. The implication is that AI’s economic value and social utility must increasingly be weighed against energy, water, infrastructure and supply-chain constraints.

    5. What does the report imply for governance, education, science, medicine and public trust?

    The report’s broader implication is that AI is moving into high-stakes domains before institutional systems are fully ready. Responsible AI is not keeping pace with capability: leading frontier developers commonly report capability benchmarks, but responsible AI benchmark reporting remains inconsistent. Documented AI incidents rose from 233 in 2024 to 362, and the report notes that improving one responsible AI dimension, such as safety, can sometimes degrade another, such as accuracy.

    In education, formal systems are lagging behind actual use. More than 80% of U.S. high school and college students now use AI for school-related tasks, but only half of middle and high schools have AI policies, and just 6% of teachers say those policies are clear. This suggests a growing gap between everyday AI use and the institutional guidance needed to use it well.

    In science and medicine, the report is cautiously optimistic but evidence-conscious. AI models for science can outperform human scientists on some benchmarks, such as ChemBench, but they remain weak in areas such as astrophysics replication and Earth observation. The report also notes that smaller, specialised models can outperform much larger ones in scientific domains, challenging the assumption that bigger is always better. In medicine, AI scribes and clinical note-generation tools saw substantial adoption, with some physicians reporting up to 83% less time spent writing notes and reduced burnout. But the broader evidence base remains thin: a review of more than 500 clinical AI studies found that nearly half relied on exam-style questions rather than real patient data, and only 5% used real clinical data.

    Policy is becoming more active but also more fragmented. AI sovereignty is emerging as a defining policy principle, especially as countries seek greater control over infrastructure, data, models, applications and talent. National AI strategies are expanding, particularly among developing economies, but actual capabilities remain uneven. The EU, U.S. and Asian countries are also moving in different policy directions, with the EU AI Act’s first prohibitions taking effect while the U.S. shifted towards deregulation.

    Finally, public opinion is divided. Experts and the public view AI’s future very differently: 73% of experts expect AI to have a positive impact on how people do their jobs, compared with only 23% of the public. Trust in institutions to regulate AI is fragmented, with the United States reporting the lowest level of trust in its own government to regulate AI among surveyed countries, at 31%. The report therefore closes on a central tension: AI is becoming more capable, more widely adopted and more economically valuable, but trust, governance and social readiness remain far less mature.