Environmental Impact Tracking — Sources & Methodology

Data Sources, Assumptions & Calculation Approach

Overview

This document describes the sources, methodology, and caveats for the per-model environmental impact factors used in the IQ Assistant's environmental impact tracking feature (src/functions/ai/environmental-factors.ts).

Glossary

| Abbreviation | Full Term |
| --- | --- |
| ADPe | Abiotic Depletion Potential for elements — depletion of non-renewable mineral/metal resources (kg Sb eq) |
| CFE | Carbon-Free Energy — percentage of electricity from carbon-free sources, measured hourly (Google metric) |
| CIF | Carbon Intensity Factor — grams of CO₂ equivalent emitted per watt-hour of electricity (gCO₂e/Wh) |
| CML-IA | CML Impact Assessment — characterization method for life cycle impact assessment (Leiden University) |
| CO₂e / CO₂eq | Carbon dioxide equivalent — standardized unit for greenhouse gas emissions |
| DEA | Data Envelopment Analysis — mathematical method for comparing multi-dimensional efficiency |
| FLOP | Floating-Point Operation — basic unit of computational work |
| GHG | Greenhouse Gas — gases that trap heat in the atmosphere (CO₂, CH₄, N₂O, etc.) |
| GPU | Graphics Processing Unit — hardware accelerator used for AI model inference |
| kT | Thousand tokens — unit for measuring LLM input/output volume (1 kT = 1,000 tokens) |
| LCA | Life Cycle Assessment — methodology for evaluating environmental impacts across a product's full lifecycle (ISO 14044) |
| MJ / kJ / J | Megajoule / kilojoule / joule — units of energy (1 MJ = 1,000 kJ = 1,000,000 J) |
| mL / µL | Milliliter / microliter — units of volume (1 mL = 1,000 µL) |
| MTok | Million tokens — unit for LLM pricing and throughput measurement |
| PE | Primary Energy — total energy extracted from natural resources to produce electricity (MJ) |
| PPA | Power Purchase Agreement — contract to buy renewable electricity directly from generators |
| PUE | Power Usage Effectiveness — ratio of total data center energy to IT equipment energy (1.0 = perfect) |
| REC | Renewable Energy Certificate — tradable certificate representing 1 MWh of renewable electricity generated |
| Sb eq | Antimony equivalent — reference unit for comparing mineral/metal resource depletion (CML-IA method) |
| TPU | Tensor Processing Unit — Google's custom AI accelerator hardware |
| Wh / mWh / µWh | Watt-hour / milliwatt-hour / microwatt-hour — units of energy (1 Wh = 1,000 mWh) |
| WUE | Water Usage Effectiveness — water consumed per unit of energy (mL/Wh, numerically equal to L/kWh) |

Primary References

  1. Jegham et al. 2025 — "How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference"
    • arXiv: 2505.09598 (v6, November 24, 2025)
    • Provider infrastructure multipliers (PUE, WUE, CIF) from Table 1
    • Per-query energy benchmarks from Table 4 (30+ models)
    • Validated within 19% of OpenAI CEO's disclosed 0.34 Wh/query for GPT‑4o
    • Includes reasoning model class tracking (o3, DeepSeek‑R1) and recent Claude/GPT families
    • Key finding: o3 and DeepSeek‑R1 consume >29 Wh per long prompt (~65× the most efficient models)
    • Uses cross-efficiency DEA (Data Envelopment Analysis) for multi-dimensional sustainability ranking
  2. Google 2025 — "Measuring the environmental impact of delivering AI at Google Scale"
    • Google Cloud Blog
    • Full Report PDF
    • arXiv: 2508.15734
    • First official per-query measurement: 0.24 Wh energy, 0.03 gCO₂e, 0.26 mL water (median Gemini text prompt)
    • 33× reduction in per-prompt energy over 12 months (May 2024–May 2025); 44× total emissions reduction
  3. Epoch AI 2025 — "How much energy does ChatGPT use?"
    • Article
    • Bottom-up FLOP-based estimates for GPT‑4o class models
  4. Couch 2026 — "Electricity use of AI coding agents"
    • Blog
    • Per-token rates derived from Epoch AI data: ~0.39 Wh per 1k input tokens, ~1.95 Wh per 1k output tokens
  5. EcoLogits — Open-source parametric estimation library (v0.9.2, January 2025)
    • Methodology
    • GitHub
    • PyPI
    • Formula: E_output = #T × (8.91e-5 × P_active + 1.43e-3) Wh
    • LCA framework compliant with ISO 14044
    • Source for Primary Energy (PE), Abiotic Depletion Potential (ADPe), and Water Consumption Factor (WCF) lifecycle factors via ADEME Base Empreinte®
    • v0.9.0 methodology update (Nov 2024): updated model repository, electricity mix listing, homepage
  6. Ritchie 2025 — "What's the carbon footprint of using ChatGPT or Gemini?"
    • Substack
  7. GreenPT — Sustainability documentation
    • Sustainability Page
    • Metric Blog Post
  8. Scaleway — Data center environmental reports
    • Environmental Leadership
    • DC5 PUE Dashboard
    • Environmental Footprint Calculator
    • Footprint Estimation Methodology
    • 2024 CSR Impact Report (PDF)
  9. RTE France — Annual electricity review 2024
    • Key Findings (2024)
    • 2025 First Trends (PDF)
    • France grid carbon intensity: 21.7 gCO₂eq/kWh (2024)
  10. Google 2025 Environmental Report
    • Report
    • Region Carbon Data
  11. Microsoft 2025 Environmental Sustainability Report
    • Report
  12. AWS Sustainability
    • Overview
  13. ADEME Base Empreinte® — French Agency for Ecological Transition
    • Base Empreinte®
    • ADEME Data Portal
    • Source for electricity lifecycle data: Primary Energy (PE) and Abiotic Depletion Potential (ADPe) per kWh by country/region
    • Used by EcoLogits for PE and ADPe factors
  14. Boavizta — Open-source methodology for embodied impacts of IT equipment
    • Methodology
    • Boavizta API
    • Combined with NVIDIA Product Carbon Footprint data for hardware manufacturing impacts
    • Used by EcoLogits for lifecycle impact allocation
  15. Muxup 2026 — "Per-query energy consumption of LLMs"
    • Article
    • Independent energy benchmarking of open-weight models (DeepSeek‑R1, GPT‑OSS‑120B) using InferenceMAX benchmark suite
    • Confirms proprietary models lack independent benchmarks; only provider self-reporting available
    • DeepSeek‑R1: 0.63–16.3 Wh/query depending on quantization and output length
  16. CML-IA — Characterization factors for life cycle impact assessment (Leiden University)
    • CML-IA Characterisation Factors
    • Abiotic Depletion in LCIA
    • The Abiotic Depletion Potential: Background, Updates, and Future (2016)
    • Abiotic depletion characterization factors for elements: copper = 1.4 × 10⁻³ kg Sb eq/kg, gold = 52.2 kg Sb eq/kg
    • Used for ADPe real-world comparison (copper mining equivalent)
  17. Fairphone 5 LCA 2024 — Life Cycle Assessment of the Fairphone 5 (Fraunhofer IZM, updated September 2024)
    • Fairphone Sustainability
    • FP5 LCA Report (PDF)
    • Smartphone ADPe: 1.25 × 10⁻³ kg Sb eq (whole device, 3-year use cycle)
    • Production phase accounts for ~100% of ADPe; integrated circuits and precious metals are dominant contributors
    • Gold and copper identified as top mineral depletion contributors in consumer electronics
    • See also: Fairphone 6 LCA (Dec 2025)
  18. Oviedo et al. 2025 — "Energy Use of AI Inference: Efficiency Pathways and Test-Time Compute" (Microsoft Research)
    • arXiv: 2509.20241 (September 24, 2025)
    • Microsoft Research
    • Bottom-up methodology estimating per-query energy of large-scale LLM systems based on token throughput estimation
    • Median energy per query: 0.34 Wh (IQR: 0.18–0.67 Wh) for frontier models (>200B params) on H100 nodes
    • Test-time compute / agentic workflows: 10–15× more energy-intensive than standard inference (up to 4.32 Wh)
    • Key insight: non-production estimates overstate energy use by 4–20× vs production deployments
  19. Caravaca et al. 2025 — "From Prompts to Power: Measuring the Energy Footprint of LLM Inference"
    • arXiv: 2511.05597 (November 5, 2025)
    • 32,500+ measurements across 21 GPU configurations and 155 architectures
    • Batch processing dramatically reduces energy: Llama 405B single prompt 21.7 Wh vs 0.6 Wh/prompt in batch of 100
    • Output tokens have ~11× greater energy impact than input tokens
    • Released Chrome browser extension for estimating energy for ChatGPT/Gemini/DeepSeek
  20. Niu et al. 2025 — "TokenPowerBench: Benchmarking the Power Consumption of LLM Inference"
    • arXiv: 2512.03024 (December 2, 2025)
    • 15+ open-source models, 1B–405B parameters
    • Super-linear energy scaling: LLaMA‑3 1B to 70B = 7.3× energy increase for 70× parameters
    • MoE models (Mixtral‑8x7B) consumed energy comparable to dense 8B models
    • TensorRT-LLM and vLLM reduce energy per token by 25–40% vs Transformers engine
  21. Wilhelm et al. 2025 — "Beyond Test-Time Compute Strategies: Advocating Energy-per-Token"
    • EuroMLSys '25, ACM (5th Workshop on Machine Learning and Systems, Rotterdam)
    • Chain-of-Thought prompting on Llama 1B: accuracy gains of +0% to +19% depending on task
    • Majority Voting energy overhead: +72% to +177% more energy
    • Proposes dynamic reasoning depth regulation to balance accuracy and energy
  22. Jin et al. 2025 — "The Energy Cost of Reasoning: Analyzing Energy Usage in LLMs with Test-time Compute" (Harvard)
    • arXiv: 2505.14733 (May 20, 2025; revised November 9, 2025)
    • Test-time compute surpasses traditional model scaling in accuracy/energy efficiency for complex reasoning tasks
    • Rising computational demands of reasoning require careful energy-cost consideration
  23. Li et al. 2023–2025 — "Making AI Less 'Thirsty': Uncovering and Addressing the Secret Water Footprint of AI Models"
    • arXiv: 2304.03271 (April 2023, updated through 2025)
    • Peer-reviewed: Communications of the ACM, 2025
    • Training GPT‑3 in Microsoft US data centers: 700,000 liters on-site, 5.4 million liters total
    • GPT‑3 inference: 500 mL bottle of water per 10–50 responses depending on location/time
    • Principled framework covering Scope 1 (on-site) and Scope 2 (off-site) water footprint
  24. De Vries 2025 — "Carbon and Water Footprints of Data Centers"
    • Patterns (Cell Press), 2025
    • AI systems: 32.6–79.7 million tonnes CO₂ in 2025 (comparable to New York City)
    • AI water footprint: 312.5–764.6 billion liters in 2025 (comparable to global annual bottled water consumption)
  25. EcoLogits (JOSS) — "EcoLogits: Evaluating the Environmental Impacts of Generative AI"
    • JOSS Paper (Journal of Open Source Software, 2025)
    • Authors: Samuel Rince et al. (GenAI Impact non-profit)
    • ISO 14044-compliant LCA approach
    • GWP now sourced from Our World in Data; PE and ADPe retained from ADEME Base Empreinte
    • v2025 update integrates ML.ENERGY Leaderboard v3.0 data for improved per-token energy calibration across 46 models
  26. Mistral 2025 — Official Environmental Report
    • Announcement (January 2025)
    • First comprehensive lifecycle analysis of an AI model (Mistral Large 2, 123B params)
    • Per 400-token query: 1.14 gCO₂e and 45 mL water
  27. Pronk et al. 2025 — "Benchmarking Energy Efficiency of Large Language Models Using vLLM"
    • arXiv: 2509.08867 (September 10, 2025)
    • Energy efficiency decreases close to linearly with model parameter size for same-architecture models
  28. Kumar et al. 2025–2026 — "OverThink: Slowdown Attacks on Reasoning LLMs"
    • arXiv: 2502.02542 (February 2025, revised February 2026)
    • Adversarial attacks can force reasoning models to generate massively inflated reasoning chains — up to 18× slowdown on FreshQA and 46× slowdown on SQuAD
    • Directly translates to proportional energy increases since energy scales linearly with token generation
  29. Ozcan et al. 2025 — "Quantifying the Energy Consumption and Carbon Emissions of LLM Inference via Simulations"
    • arXiv: 2507.11417 (July 15, 2025)
    • GPU power model simulation framework
    • Found renewable offset potential of up to 69.2% with carbon-aware scheduling
  30. van Oers et al. 2020 — "Abiotic resource depletion potentials (ADPs) for elements revisited"
    • Int J Life Cycle Assess, Springer
    • Updates ultimate reserve estimates and introduces time series for production data
    • Latest revision of CML-IA ADPe characterization factors
  31. Nature Scientific Reports 2024 — "Reconciling the contrasting narratives on the environmental impact of large language models"
    • Nature (2024)
    • Addresses uncertainties from hardware, geography, and individual worker behavior
    • Excluding Scope 3 helps avoid non-trivial uncertainty that could distort comparative eco-efficiency
  32. ecoinvent — Life cycle inventory database
    • ecoinvent Electricity
    • Version 3.12 (February 2026), updated electricity market mixes reflecting 2021–2022 data
    • 3,500+ datasets in 250+ geographies; large countries split into sub-regions
    • Key paper: Treyer & Bauer (2016), "Life cycle inventories of electricity generation and power supply in version 3 of the ecoinvent database"
  33. SCI for AI — Software Carbon Intensity for Artificial Intelligence (ISO/IEC 21031:2024)
    • Green Software Foundation — SCI for AI
    • Ratified December 17, 2025; extends the SCI specification (ISO/IEC 21031:2024) specifically for AI workloads
    • Provides standardized methodology for measuring carbon intensity of AI inference and training
    • Defines functional units, system boundaries, and reporting requirements for AI carbon accounting
  34. ML.ENERGY Leaderboard v3.0 — Standardized LLM energy benchmarks
    • ML.ENERGY Leaderboard
    • Version 3.0 (December 2025): 46 models across 1,858 hardware configurations
    • Provides per-token energy measurements under controlled conditions (standardized prompts, batch sizes, hardware)
    • Used by EcoLogits for calibrating energy-per-token estimates
  35. NVIDIA HGX H100 Product Carbon Footprint — GPU manufacturing emissions
    • HGX H100 PCF Summary (PDF)
    • ISO 14067-conformant, third-party reviewed (WSP) product carbon footprint
    • Cradle-to-gate emissions: 1,312 kg CO₂e per HGX H100 system (8× H100 SXM GPUs)
    • Materials/components account for 91% of emissions: HBM (42%), ICs (25%), thermal (18%)
    • Key input for embodied carbon allocation in LLM lifecycle assessments
  36. TechInsights (2026) — GPU manufacturing emissions growth projection
    • TechInsights Sustainability Insights
    • 2026 Inflection Point: Semiconductor Sustainability Predictions
    • Global AI GPU Carbon Emissions Forecast 2025–2030: manufacturing emissions to grow ~16× from 2024 to 2030 (CAGR 58.3%), reaching 19.2 million metric tons CO₂e
    • 2026 semiconductor manufacturing emissions projected to reach 186 million metric tons CO₂e (+9% YoY)
    • HBM stacking yield identified as structural sustainability risk; AI GPU production to account for 8.7% of all semiconductor emissions by 2030
    • Highlights tension between inference efficiency gains and explosive hardware scaling
  37. Coalition for Sustainable AI — International governance framework
    • Coalition for Sustainable AI
    • AI Action Summit (Wikipedia)
    • Launched at Paris AI Action Summit, 10–11 February 2025; 1,000+ participants from 100+ countries
    • Led by France, UNEP, and ITU; supported by 11 countries, 5 international organisations, and 37 tech companies (including EDF, IBM, NVIDIA, SAP)
    • 58 countries signed the "Statement on Inclusive and Sustainable AI for People and the Planet"
    • Signals regulatory direction toward mandatory AI environmental reporting
  38. EF 3.1 — Environmental Footprint characterization factors (JRC, 2025)
    • JRC Environmental Footprint
    • Updated characterization factors for 16 impact categories including climate change, water use, and mineral resource depletion
    • Successor to EF 3.0; used alongside CML-IA for ADPe characterization
    • Official EU Product Environmental Footprint method
  39. GPT-5 energy estimate — University of Rhode Island AI Lab (August 2025)
    • The Guardian / AI Commission
    • Tom's Hardware analysis
    • Researchers: Nidhal Jegham et al. (same group as reference #1)
    • GPT‑5 average: ~18.35 Wh per 1000-token query; up to 40 Wh for medium-length response
    • ~8.6× increase over GPT‑4 (2.12 Wh); reasoning mode can add 5–10× further overhead
    • Methodology: response time × estimated hardware power draw (Azure H100/H200), with PUE/WUE/CIF multipliers
    • Validates the importance of per-model energy factors rather than fixed averages

Infrastructure Multipliers

Source: Jegham et al. Table 1, supplemented with provider sustainability reports (Google, Microsoft, AWS, Scaleway) and independent verification.

| Provider | Infrastructure | PUE | WUE On-Site (mL/Wh) | WUE Off-Site (mL/Wh) | WUE Total (mL/Wh) | CIF (gCO₂/Wh) | Renewable |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Azure (OpenAI) | US East | 1.12 | 0.30 | 4.35 | 4.61 | 0.350 | ~60% (100% matched via RECs) |
| AWS (Anthropic, Perplexity) | US East (Virginia) | 1.14 | 0.18 | 5.11 | 5.27 | 0.287 | 100% matched via RECs |
| GCP (Google) | Various US, TPUs | 1.09 | — | — | ~1.08 ¹ | ~0.125 ² | 66% CFE (hourly) |
| Scaleway (GreenPT) | DC5, France | 1.25 | 0.067 | ~0.48 | ~0.55 | 0.065 ³ | 100% (wind/hydro, GO) |

PUE Cross-Validation

| Provider | Our Value | Provider-Reported (2024) | CCF (Cloud Carbon Footprint, open source) | Assessment |
| --- | --- | --- | --- | --- |
| Azure | 1.12 | 1.12 (design target), 1.16 (global avg) | 1.185 (fleet-wide) | Matches next-gen DC design PUE |
| AWS | 1.14 | 1.15 (2024 global) | ~1.135 | Slightly optimistic vs reported 1.15 |
| GCP | 1.09 | 1.09 (2024 fleet avg) | 1.1 | Accurate. Best: 1.07 (Oregon) |
| Scaleway DC5 | 1.25 | 1.25 (DC5 2024) | N/A | Updated from 1.15 (historical). Fleet avg is 1.37 |
| Industry avg | — | 1.56 (2024 survey) | — | Enterprise on-premise: 1.63 (IDC) |

WUE Cross-Validation

| Provider | Reported WUE (L/kWh) | Year | Notes |
| --- | --- | --- | --- |
| AWS | 0.15 | 2024 | 17% improvement from 2023; best-in-class among hyperscalers |
| Microsoft Azure | 0.30 | FY2024 | 39% improvement from 0.49 in 2021. New zero-water evaporation designs starting Aug 2024 |
| Google GCP | ~1.0 | 2024 | Annualized global on-site water efficiency. Total: ~22.7 billion liters in 2024 (+8% YoY) |
| Industry avg (hyperscale) | 0.45–0.48 | 2024 | Berkeley Lab 2024 US Data Center Energy Report projection |
Notes:
¹ Google WUE derived from official figures: 0.26 mL per median prompt ÷ 0.24 Wh = ~1.08 mL/Wh.
² Google carbon intensity is a blended estimate using 66% CFE × regional grid mix. Varies significantly by region: Iowa 87% CFE, South Carolina 31% CFE, Oregon 87% CFE.
³ GreenPT/Scaleway CO₂: Scaleway's Environmental Footprint Calculator publishes PAR-2 (DC5) carbon intensity as 0.065 kgCO₂e/kWh (65 gCO₂e/kWh), calculated using EMBER electricity mix data × DC5 PUE, following a location-based methodology per ADEME PCR guidelines (deliberately excluding their 100% renewable Guarantees of Origin).
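For the calculation step later in this document, these multipliers can be encoded per provider. A minimal sketch, with values copied from the table above; the names ProviderFactors and PROVIDER_FACTORS are illustrative, not necessarily the identifiers used in environmental-factors.ts:

```typescript
// Infrastructure-level multipliers per provider (values from the table above).
interface ProviderFactors {
  pue: number;          // Power Usage Effectiveness (dimensionless)
  waterMlPerWh: number; // WUE total, on-site + off-site (mL/Wh)
  co2eGPerWh: number;   // CIF (gCO₂e/Wh)
}

const PROVIDER_FACTORS: Record<string, ProviderFactors> = {
  azure:    { pue: 1.12, waterMlPerWh: 4.61, co2eGPerWh: 0.350 },
  aws:      { pue: 1.14, waterMlPerWh: 5.27, co2eGPerWh: 0.287 },
  gcp:      { pue: 1.09, waterMlPerWh: 1.08, co2eGPerWh: 0.125 },
  scaleway: { pue: 1.25, waterMlPerWh: 0.55, co2eGPerWh: 0.065 },
};
```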

Primary Energy & Abiotic Depletion Lifecycle Factors

Source: EcoLogits / ADEME Base Empreinte® / ecoinvent electricity lifecycle data.

These factors convert electricity consumption (Wh) into lifecycle impact metrics, accounting for the full supply chain of electricity generation including fuel extraction, processing, transport, and infrastructure.

| Region / Grid | Primary Energy (MJ/Wh) | ADPe (kg Sb eq/Wh) | Used By |
| --- | --- | --- | --- |
| US grid (Azure, AWS, GCP) | 0.0096884 | 9.855 × 10⁻¹¹ | OpenAI, Anthropic, Google, Perplexity |
| France grid (Scaleway) | 0.0093135 | 4.858 × 10⁻¹¹ | GreenPT |

Primary Energy (PE)

Measures the total energy extracted from natural resources (fossil fuels, nuclear, renewables) required to produce the electricity consumed. Includes extraction, refining, transport, and generation losses. A factor of ~0.0097 MJ/Wh means roughly 2.7× the direct electricity is consumed as primary energy from nature.

Abiotic Depletion Potential for Elements (ADPe)

Measures the depletion of non-renewable mineral and metal resources (lithium, copper, gold, rare earths) required for the electricity generation infrastructure. Expressed in kg antimony equivalent (kg Sb eq) per the CML-IA characterization method. France has lower ADPe than the US grid due to nuclear-dominated generation requiring less diverse mineral extraction.

Notes:
• PE and ADPe are infrastructure-level lifecycle factors — they depend on the electricity grid, not the model itself.
• These factors capture only the operational electricity lifecycle, not hardware manufacturing (embodied impacts).
• Values derived from the same ADEME/ecoinvent databases used by EcoLogits for LCA compliance (ISO 14044).
• The same uncertainty ranges (uncertaintyMin/uncertaintyMax) applied to energy, water, and CO₂ are also applied to PE and ADPe.
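As a quick worked example, here are the US-grid factors from the table above applied to Google's published 0.24 Wh median prompt:

```typescript
// Worked example: converting operational energy into lifecycle metrics.
const energyWh = 0.24;          // Google's median Gemini text prompt
const peMjPerWh = 0.0096884;    // US grid Primary Energy factor (MJ/Wh)
const adpKgSbPerWh = 9.855e-11; // US grid ADPe factor (kg Sb eq/Wh)

const primaryEnergyMj = energyWh * peMjPerWh;  // ≈ 0.0023 MJ ≈ 2.3 kJ
const adpKgSb = energyWh * adpKgSbPerWh;       // ≈ 2.4e-11 kg ≈ 24 ng Sb eq
```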

Per-Model Energy (Wh per 1,000 Tokens)

green-l-raw (GreenPT)
Source: GreenPT / EcoLogits
Input 0.075 Wh/1kT · Output 0.1 Wh/1kT
Estimated from Mistral Mini class (~8–22B parameters) using EcoLogits parametric formula, with ~30% reduction per GreenPT's claimed compression and quantization optimizations. GreenPT has not published per-token energy figures. Their primary environmental advantage comes from running on the French nuclear grid (21.7 gCO₂/kWh per RTE France, among the lowest in the world).
gpt-4.1 (OpenAI)
Source: Jegham et al. / Couch
Input 0.125 Wh/1kT · Output 0.4 Wh/1kT
Jegham et al. Table 4 reports 0.87 Wh (short), 3.16 Wh (medium), 4.83 Wh (long) per query. Cross-referenced with Couch derivation (~0.39 Wh/1kT input, ~1.95 Wh/1kT output for GPT‑4o class). GPT‑4.1 is architecture-optimized relative to GPT‑4o, so values are adjusted downward.
gpt-4.1-mini (OpenAI)
Source: Jegham et al.
Input 0.1 Wh/1kT · Output 0.3 Wh/1kT
Proportional to GPT‑4.1 based on Jegham ratios. Mini models consume ~55–65% of full model energy. Jegham: 0.45–2.12 Wh/query (mini) vs 0.87–4.83 Wh/query (full).
claude-opus-4-6 (Anthropic)
Source: Jegham et al.
Input 0.3 Wh/1kT · Output 0.75 Wh/1kT
No direct measurement available. Jegham benchmarked Claude 3.7 Sonnet at 0.95–5.67 Wh/query. Opus is a larger, more capable model — estimated at ~2× Sonnet energy consumption. Anthropic has not published per-token energy data.
claude-sonnet-4-6 (Anthropic)
Source: Jegham et al.
Input 0.15 Wh/1kT · Output 0.45 Wh/1kT
Based on Jegham Claude 3.7 Sonnet data (0.95–5.67 Wh/query), adjusted for architecture improvements in newer versions. Sonnet 4.6 replaces Sonnet 4.5 with identical API pricing ($3/$15 per MTok) and similar output speed (~57 vs ~63 tok/s), indicating the same compute tier. Per-task energy is likely lower due to 70% token efficiency gains (Anthropic), but per-token energy is assumed equivalent until independent benchmarks are published.
o3 (OpenAI)
Source: Jegham et al. / Epoch AI
Input 0.65 Wh/1kT · Output 3.5 Wh/1kT
Jegham Table 4 (v6): 7.03 Wh (short) to 39.2 Wh (long) per query, the highest long-prompt energy in the study. Reasoning models generate 2.5–10× hidden chain-of-thought tokens per Epoch AI, which are not visible in API usage but consume energy. The high output Wh/1kT reflects this reasoning overhead.
o4-mini (OpenAI)
Source: Jegham et al.
Input 0.225 Wh/1kT · Output 0.75 Wh/1kT
Not benchmarked in Jegham. Estimated from o3-mini data (0.67–3.53 Wh/query), adjusted for architecture improvements in the o4 generation.
gemini-3-pro-preview (Google)
Source: Google official measurement
Input 0.065 Wh/1kT · Output 0.2 Wh/1kT
Google's official measurement: 0.24 Wh per median Gemini text prompt. Most efficient large model due to custom TPU hardware and 33× efficiency improvement over 12 months (to May 2025). Google does not publish input/output token breakdown — the split is estimated based on typical input:output ratios.
sonar-pro (Perplexity)
Source: Estimated
Input 0.35 Wh/1kT · Output 0.75 Wh/1kT
No official energy data from Perplexity. Higher than pure chat models because each query involves web search + retrieval + generation pipeline. Estimated proportionally above Claude class models based on the additional compute required for real-time web retrieval.
sonar-deep-research (Perplexity)
Source: Estimated
Input 2.0 Wh/1kT · Output 5.0 Wh/1kT
No official data. Deep research involves multi-step iterative research loops with multiple LLM calls, web searches, and synthesis. Estimated at 5–7× sonar-pro energy based on the iterative multi-call nature of the deep research workflow.
text-embedding-3-large (OpenAI)
Source: EcoLogits
Input 0.015 Wh/1kT · Output 0.0 Wh/1kT
Encoder-only architecture with no output generation. Estimated at ~1/15th of GPT‑4.1 energy based on API pricing ratio ($0.13/MTok vs $2.00/MTok). EcoLogits estimate for a ~1–2B parameter model confirms this order of magnitude.
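The per-model values above feed a factor record per model. A sketch of one plausible shape, with field names mirroring the calculation formula at the end of this document; the actual shape in environmental-factors.ts may differ:

```typescript
// Per-model energy factors with uncertainty bounds (illustrative shape).
interface ModelEnergyFactors {
  inputWhPer1kTokens: number;  // Wh per 1,000 input tokens
  outputWhPer1kTokens: number; // Wh per 1,000 output tokens
  uncertaintyMin: number;      // multiplier, e.g. 0.5 → true value may be 50% of nominal
  uncertaintyMax: number;      // multiplier, e.g. 1.5 → true value may be 150% of nominal
  source: string;              // provenance note
}

// Example entry, using the gpt-4.1 values above.
const gpt41: ModelEnergyFactors = {
  inputWhPer1kTokens: 0.125,
  outputWhPer1kTokens: 0.4,
  uncertaintyMin: 0.5,
  uncertaintyMax: 1.5,
  source: "Jegham et al. Table 4 / Couch 2026",
};
```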

Cross-Validation of Energy Estimates

Multiple independent sources converge on a 0.2–0.4 Wh range for a standard text query to a frontier model, with reasoning models consuming 5–70× more depending on complexity.

Convergence Table: Standard Text Queries

| Source | Model / Scenario | Energy (Wh) | Method |
| --- | --- | --- | --- |
| Google 2025 (measured) | Gemini Apps text, median | 0.24 | Production measurement, full stack |
| Epoch AI 2025 (estimated) | GPT‑4o, 500 output tokens | ~0.3 | Bottom-up FLOP-based, H100 |
| Oviedo/Microsoft 2025 | Frontier >200B, H100 | 0.34 (median, IQR 0.18–0.67) | Token throughput estimation |
| Jegham et al. 2025 | GPT‑4o short prompt (100/300 tokens) | 0.42 | Infrastructure-aware API benchmarking |
| Jegham et al. 2025 | GPT‑4.1 short prompt | 0.92 | Same methodology |
| Caravaca et al. 2025 | Llama 3.1 405B (batched) | 0.354 | Direct GPU measurement, batch=100 |
| Caravaca et al. 2025 | Phi‑4 14B (batched) | 0.017 | Direct GPU measurement |

Convergence Table: Reasoning Models

| Source | Model / Scenario | Energy (Wh) | Overhead vs Standard |
| --- | --- | --- | --- |
| Jegham et al. | o3 short prompt | 7.03 | ~7.6× vs GPT‑4.1 |
| Jegham et al. | o3 long prompt | 39.2 | Highest measured |
| Jegham et al. | o4-mini (high) | 2.92 | ~6.9× vs GPT‑4.1 mini |
| Jegham et al. | Claude 3.7 Sonnet Extended Thinking | 3.49 | ~4.2× vs standard Sonnet |
| Jegham et al. | DeepSeek‑R1 short prompt | 23.8 | Most energy-intensive short prompt |
| Oviedo/Microsoft | Test-time compute / reasoning | 4.32 | ~13× baseline |
| Muxup 2026 | DeepSeek‑R1 (8k/1k, H200) | 3.32–3.74 | Hardware-dependent |
| Muxup 2026 | DeepSeek‑R1 (1k/8k, high output) | 15.0–16.3 | Output-length dependent |

Jegham et al. Full Model Benchmark (Short Prompt: 100 input / 300 output tokens)

| Model | Energy (Wh) | ± Error |
| --- | --- | --- |
| LLaMA‑3.2 1B | 0.070 | 0.011 |
| LLaMA‑3.2-vision 11B | 0.071 | 0.011 |
| GPT‑4.1 nano | 0.103 | 0.037 |
| LLaMA‑3.3 70B | 0.247 | 0.032 |
| GPT‑4.1 mini | 0.421 | 0.197 |
| GPT‑4o (Mar '25) | 0.421 | 0.127 |
| o1-mini | 0.631 | 0.205 |
| Claude‑3.7 Sonnet | 0.836 | 0.102 |
| o3-mini | 0.850 | 0.336 |
| GPT‑4.1 | 0.918 | 0.498 |
| GPT‑4 Turbo | 1.656 | 0.389 |
| o4-mini (high) | 2.916 | 1.605 |
| Claude‑3.7 Sonnet ET | 3.490 | 0.304 |
| o1 | 4.446 | 1.779 |
| GPT‑4.5 | 6.723 | 1.207 |
| o3 | 7.026 | 3.663 |
| DeepSeek‑R1 | 23.815 | 2.160 |
Note: Jegham v6 (November 2025) does NOT include Claude Opus 4, Claude Sonnet 4, or Gemini 3 Pro — these models were released after data collection. Our per-token estimates for them are derived from the closest benchmarked equivalents and architectural reasoning. As of February 2026, no published paper provides direct energy measurements for Claude Opus 4, Claude Sonnet 4, or Gemini 3 Pro.

Energy Breakdown by Component (Google 2025)

Google's technical paper (arXiv:2508.15734) provides the only published component-level energy breakdown for production AI inference:

| Component | Share |
| --- | --- |
| TPUs / GPUs | 58% |
| CPU and memory | 24% |
| Operational redundancy | 10% |
| Data center overhead (cooling, etc.) | 8% |

Output vs Input Token Energy Ratio

Multiple sources confirm that output tokens are significantly more energy-intensive than input tokens:

| Source | Output:Input Ratio | Method |
| --- | --- | --- |
| Caravaca et al. 2025 | ~11× | Direct GPU measurement |
| Couch 2026 (via pricing proxy) | ~5× | API pricing ratio |
| SemiAnalysis (cited by Couch) | ~15× (smaller scales) | Industry analysis |
| EcoLogits parametric model | Varies by batch size | Regression model |

The 5:1 ratio used as our baseline (from Couch/API pricing) is a conservative estimate — direct measurement suggests the true ratio may be higher.
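To make the derivation concrete, here is one way to back out per-1k-token rates from a per-query benchmark under an assumed output:input ratio. This mirrors the pricing-proxy approach attributed to Couch above; it is a sketch, not necessarily the exact procedure used to produce the per-model table:

```typescript
// Back out per-1k-token rates from a benchmarked query, assuming a fixed
// output:input energy ratio r. With i input and o output tokens:
//   E = (i/1000)·x + (o/1000)·(r·x)  →  x = E / ((i + r·o) / 1000)
function perTokenRates(queryWh: number, inputTokens: number, outputTokens: number, ratio = 5) {
  const inputWhPer1k = queryWh / ((inputTokens + ratio * outputTokens) / 1000);
  return { inputWhPer1k, outputWhPer1k: ratio * inputWhPer1k };
}

// Jegham's GPT‑4o short prompt: 0.42 Wh for 100 input / 300 output tokens.
perTokenRates(0.42, 100, 300); // ≈ 0.26 Wh/1kT input, ≈ 1.31 Wh/1kT output
```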

Real-World Comparison References

Used in the tooltip to provide human-understandable context for each metric.

Energy Comparisons

| Reference | Value (Wh) | Source | Validated |
| --- | --- | --- | --- |
| HD streaming video (1 sec) | 0.033 | IEA 2020, Carbon Brief 2020 | 0.12 kWh/hr ÷ 3600 = 0.033 Wh/s. IEA and Carbon Brief confirm ~0.12 kWh/hr for HD streaming including DC + network + device. |
| A Google search | 0.3 | Google official (2009), Google 2022 Environmental Report, Jegham et al. 2025 | Consistently cited since 2009. Google's 2022 report reaffirmed 0.3 Wh per search. Represents server-side only (excludes device energy). |
| Charging a smartphone | 19 | ~5,000 mAh × 3.8 V = 19 Wh | EnergySage confirms ~19 Wh for average phone battery capacity. Modern phones: 4,000–5,000 mAh at 3.7–3.85 V nominal = 14.8–19.3 Wh. |

Logic: < 120s → "≈ Xs streaming video"; < 50 searches → "≈ X Google searches"; else → "≈ X% phone charge"
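A minimal sketch of that selection logic in TypeScript (function name and rounding are illustrative):

```typescript
// Human-readable comparison for an energy value, per the thresholds above.
function energyComparison(energyWh: number): string {
  const streamingSec = energyWh / 0.033;  // HD streaming: 0.033 Wh per second
  if (streamingSec < 120) return `≈ ${streamingSec.toFixed(0)}s streaming video`;
  const searches = energyWh / 0.3;        // Google search: 0.3 Wh
  if (searches < 50) return `≈ ${searches.toFixed(0)} Google searches`;
  const phonePct = (energyWh / 19) * 100; // full phone charge: 19 Wh
  return `≈ ${phonePct.toFixed(0)}% phone charge`;
}
```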

Water Comparisons

| Reference | Value (mL) | Source | Validated |
| --- | --- | --- | --- |
| A single water drop | 0.05 | US Pharmacopeia (USP) standard: 20 drops/mL | Wikipedia "Drop (unit)" confirms USP definition. Actual drops vary (0.027–0.043 mL from medicine droppers); 0.05 mL is the accepted convention. |
| A teaspoon of water | 5 | Standard culinary measure | Standard definition: 1 tsp = 5 mL. No ambiguity. |

Logic: < 100 drops → "≈ X water drops"; else → "≈ X teaspoons"

CO₂ Comparisons

| Reference | Value (gCO₂) | Source | Validated |
| --- | --- | --- | --- |
| Petrol car per meter | 0.130 | EEA 2024 — European on-road fleet (~130 g/km est.) | EEA 2024: new car fleet average = 108 g CO₂/km. EU 2025 target: 93.6 g/km. On-road fleet (all ages) is higher due to older vehicles. We estimate ~130 g/km. |
| A Google search | 0.2 | Google official (2009), 2022 Environmental Report | Google self-reported: 0.2 gCO₂ per search query (server-side). Consistently cited since 2009 and reaffirmed in 2022. |

Note on car CO₂: The 130 g/km value is an estimate for the current European on-road fleet (all vehicle ages). EEA reported the 2024 new-car fleet average at 108 g CO₂/km, down from 160–170 g/km in 2015–2019 due to EV adoption and EU emissions standards (Regulation 2019/631). The total on-road fleet average is higher because older, less efficient vehicles remain in service (average European car age ~12 years). EU 2025 target: 93.6 g/km; 2030: 49.5 g/km; 2035: 0 g/km (zero tailpipe). The EPA US fleet average is ~249 g/km — if a US-only comparison were needed, the value should be 0.249 gCO₂/m.

Logic: < 100m → "≈ car driving Xm"; else → "≈ X Google searches"

Primary Energy Comparisons

| Reference | Value (kJ) | Source | Validated |
| --- | --- | --- | --- |
| A wooden match burned | 1 | Chemical energy of match head | Literature range: 1.05–2.14 kJ. 1 BTU = 1.055 kJ. Our value of 1 kJ is at the lower end but defensible as an order-of-magnitude approximation. |
| A food Calorie (kcal) | 4.184 | Thermodynamic definition | Exact: 1 food Calorie (kcal) = 4.184 kJ. Internationally standardized since 1948. No uncertainty. |
| Boiling a cup of water | 300 (0.3 MJ) | ~250 mL × 80°C rise × 4.184 J/g°C | Standard thermodynamics: 250 g × 80 K × 4.184 J/(g·K) = 83,680 J ≈ 84 kJ. With kettle efficiency (~85%): ~99 kJ. We use 300 kJ to represent the full primary energy cost including generation losses (~3×). |

Logic: < 50 matches → "≈ X matches burned"; < 100 Cal → "≈ X food Calories"; else → "≈ boiling X cups of water"

ADPe Comparisons

ADPe is expressed in kg Sb eq (antimony equivalent), an abstract characterization factor. To make it tangible, we convert to the equivalent mass of copper that would need to be mined to cause the same mineral depletion, using the CML-IA characterization factor for copper (1.4 × 10⁻³ kg Sb eq/kg Cu).

| Reference | ADPe | Source |
| --- | --- | --- |
| Copper mining (per kg) | 1.4 × 10⁻³ kg Sb eq | CML-IA 2016, Leiden University |
| Smartphone (Fairphone 5, per device) | 1.25 × 10⁻³ kg Sb eq | Fairphone 5 LCA 2024 (Fraunhofer IZM) |
| Gold mining (per kg) | ~52.2 kg Sb eq | CML-IA 2016 (reference only) |

Logic: Convert kg Sb eq → equivalent copper mass: copperKg = adpKgSb / 1.4e‑3. Display as mg or g of copper mined (mg minimum for readability).

Context: Manufacturing one smartphone (1.25 × 10⁻³ kg Sb eq, Fairphone 5 LCA) causes the same mineral depletion as generating roughly 12,700 kWh of US-grid electricity at the factor above (9.855 × 10⁻¹¹ kg Sb eq/Wh), i.e. several years of average European household electricity. A typical LLM query's ADPe equates to mining 0.001–0.05 mg of copper.
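The conversion from the Logic line above, sketched in TypeScript (names are illustrative):

```typescript
// Convert an ADPe value (kg Sb eq) to the equivalent mass of mined copper.
const COPPER_KG_SB_EQ_PER_KG = 1.4e-3; // CML-IA characterization factor

function adpeToCopperMg(adpKgSb: number): number {
  const copperKg = adpKgSb / COPPER_KG_SB_EQ_PER_KG;
  return copperKg * 1e6; // kg → mg
}

// Example: a 0.24 Wh query on the US grid (9.855e-11 kg Sb eq/Wh).
adpeToCopperMg(0.24 * 9.855e-11); // ≈ 0.017 mg of copper
```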

Additional Reference Values (not used in tooltip)

| Reference | Value | Source |
| --- | --- | --- |
| LED bulb for 1 second | 0.0028 Wh | 10 W LED / 3600 s |
| Sending an email | 0.005 Wh | IEA / Berners-Lee 2020 |
| A tablespoon of water | 15 mL | Standard measure |
| A sip of water | 37 mL | Average measured |
| Sending a text message | 0.014 gCO₂ | Berners-Lee 2020 |
| Sending an email | 2 gCO₂ | Berners-Lee 2020 |

Uncertainty Methodology

There is currently no single standardized uncertainty methodology for LLM energy estimation, though progress is being made. The SCI for AI specification (ratified December 2025 by the Green Software Foundation, extending the SCI standard, ISO/IEC 21031:2024, to AI workloads) provides standardized functional units, system boundaries, and reporting requirements for AI carbon accounting — a critical step toward comparable, reproducible environmental impact reporting. Different sources currently use different approaches:

| Source | Uncertainty Approach | Typical Range |
| --- | --- | --- |
| Jegham et al. 2025 | Confidence intervals from hardware config uncertainty | ±10–50% (model-dependent) |
| Oviedo/Microsoft 2025 | IQR reporting (median + interquartile range) | IQR: 0.18–0.67 Wh on 0.34 median |
| Ozcan et al. 2025 | Heuristic power-law (gamma = 0.7, not empirically fit) | Acknowledged uncertainty |
| Google 2025 | Median-based reporting to avoid outlier distortion | Production measurement, narrowest |
| EcoLogits | Parametric regression from LLM Perf Leaderboard | Model-parameter-dependent |

Our Uncertainty Ranges

Each model in our system defines uncertaintyMin and uncertaintyMax multipliers (e.g., 0.5–1.5 means the true value could be 50%–150% of the nominal estimate). These ranges are applied uniformly to all 5 metrics.

Rationale for ±30–50% default range:

  • Jegham et al. error bars range from about ±12% (Claude 3.7 Sonnet: 0.836 ± 0.102) to ±52% (o3: 7.03 ± 3.66), with GPT‑4o in between at ±31% (0.42 ± 0.13)
  • Oviedo/Microsoft IQR spans roughly ±50% of median (0.18–0.67 on 0.34)
  • Caravaca et al. found batch size alone causes 36× variation (Llama 405B: 21.7 Wh single vs 0.6 Wh batched), but production systems always batch
  • Nature Scientific Reports 2024 identifies hardware, geography, and utilization as the primary uncertainty sources
  • Non-production estimates overstate by 4–20× (Oviedo/Microsoft), suggesting our academic-derived estimates may be conservatively high

Models with more data points (GPT‑4o, Claude 3.7 Sonnet) have tighter ranges; models extrapolated from architectural reasoning (Claude Opus 4, Gemini 3 Pro) have wider ranges.
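Applied in code, the multipliers produce a display range around each nominal value. A minimal sketch, assuming the uniform-multiplier behavior described above:

```typescript
// Apply a model's uncertainty multipliers to a nominal metric value.
interface MetricRange { nominal: number; min: number; max: number }

function withUncertainty(nominal: number, uncertaintyMin: number, uncertaintyMax: number): MetricRange {
  return { nominal, min: nominal * uncertaintyMin, max: nominal * uncertaintyMax };
}

// Example: 0.42 Wh nominal with a 0.5–1.5 range → 0.21–0.63 Wh.
withUncertainty(0.42, 0.5, 1.5);
```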

Key Caveats & Limitations

  1. Limited official data. Only Google has published official per-query energy data (0.24 Wh for Gemini, August 2025). Microsoft Research (Oviedo et al.) provides the next most rigorous estimate (0.34 Wh median for frontier models). OpenAI's single data point (0.34 Wh for ChatGPT) came from a CEO statement without methodology. All other per-token figures are academic estimates with ~20–50% margin of error.
  2. Hidden reasoning tokens. Reasoning models (o3, o4-mini) generate hidden chain-of-thought tokens that dramatically increase actual energy consumption. Jegham et al. measured o3 at 7.03 Wh (short) to 39.2 Wh (long) — up to 70× more than efficient models. Oviedo/Microsoft found test-time compute causes a 13× energy increase. Kumar et al. (OverThink) showed adversarial attacks can inflate reasoning chains by 18–46×, proportionally increasing energy. The internal reasoning tokens are not visible via the API but consume compute. This makes per-visible-token metrics potentially misleading for these models.
  3. "100% renewable" claims. AWS and Microsoft use annual Renewable Energy Certificate (REC) matching — purchasing certificates equal to annual consumption from any location and time period. This is an accounting solution, not an engineering one: a REC from a solar farm in Arizona at noon can "offset" fossil-fueled consumption in New York at midnight. Peer-reviewed research found REC-based claims lead to "an inflated estimate of the effectiveness of mitigation efforts." Google's 66% CFE (carbon-free energy, measured hourly on the same regional grid) is more conservative and transparent. Greenpeace found AWS meeting only 12% of its renewable commitment physically in Virginia; grid actual renewable is <5%. Dominion Energy (Virginia) mix: ~33% nuclear, ~33% gas, ~25% coal, ~4–6% renewable.
  4. GreenPT transparency gap. GreenPT has not published per-token energy figures despite marketing as "green AI." Their primary environmental advantage comes from running on the French nuclear grid (21.7 gCO₂/kWh direct in 2024, ~27 gCO₂eq/kWh lifecycle in 2025 per RTE France — among the lowest in the world) and Scaleway's efficient data centers (PUE 1.25, adiabatic cooling), rather than from demonstrated model-level efficiency.
  5. Perplexity: no sustainability data. No environmental reports, no energy consumption figures, no climate commitments. Their infrastructure multipliers are assumed from AWS (their primary cloud provider). The additional energy cost of web search + retrieval in each query is estimated, not measured.
  6. Model version drift. Jegham et al. benchmarked Claude 3.7 Sonnet, o3, and GPT‑4.1, not Claude Opus 4.6 / Sonnet 4.6. Our per-token estimates for current models are derived from the closest benchmarked equivalents and architectural reasoning. As of February 2026, no published paper provides direct energy measurements for Claude Opus 4.6, Claude Sonnet 4.6, or Gemini 3 Pro. Sonnet 4.6 inherits Sonnet 4.5's energy factors based on identical API pricing ($3/$15 per MTok) and similar output throughput (~57 tok/s vs ~63 tok/s).
  7. Batch size and utilization. Energy per query varies dramatically with server utilization. Caravaca et al. found 36× reduction from single-prompt to batch-100 for Llama 405B (21.7 → 0.6 Wh). Jegham et al. assumes batch size 8; real production loads vary. Oviedo/Microsoft warns non-production estimates overstate energy by 4–20×.
  8. Water accounting. WUE figures include both Scope 1 (on-site cooling: evaporative towers, adiabatic systems) and Scope 2 (off-site: water consumed by electricity generation). Per IEA 2023, two-thirds of total data center water is indirect/off-site. Provider-reported WUE values: AWS 0.15 L/kWh (2024, best-in-class), Microsoft 0.30 L/kWh (FY2024), Google ~1.0 L/kWh (global on-site). Li et al. (CACM 2025) provide the most comprehensive framework for total water footprint estimation.
  9. Input/output split. The per-1k-token split between input and output is estimated for most models. Caravaca et al. found output tokens have ~11× greater energy impact than input tokens (direct measurement). Couch 2026 uses a 5:1 ratio based on API pricing. SemiAnalysis suggests ~15× at smaller scales. The true ratio varies by model architecture and is not published by any provider.
  10. PE and ADPe precision. Lifecycle factors are grid-level averages from ADEME Base Empreinte / ecoinvent databases, as confirmed by EcoLogits' JOSS paper and methodology documentation. They do not capture provider-specific optimizations (e.g., Google's on-site solar) or time-of-day variations in grid composition. EcoLogits confirms they use ADEME specifically due to "a lack of open, up-to-date alternatives" for PE and ADPe. The CML-IA characterization method (Leiden University) underpins all ADPe calculations; latest revision: van Oers et al. 2020.
  11. Jevons Paradox. (Jegham v6) As AI becomes more efficient and cheaper, total resource consumption may actually increase. De Vries (Patterns/Cell Press, 2025) projects AI systems produced 32.6–79.7 million tonnes CO₂ and consumed 312.5–764.6 billion liters of water in 2025 alone. Google's total emissions rose 11% to 11.5M tonnes CO₂ in 2024 (+51% from 2019), mostly Scope 3.
  12. France carbon intensity validated by Scaleway. Our CIF of 0.065 gCO₂/Wh (65 gCO₂/kWh) for Scaleway matches their own Environmental Footprint Calculator published value for PAR-2 (DC5). Scaleway uses a location-based methodology per ADEME PCR guidelines with EMBER electricity mix data, deliberately excluding their 100% renewable Guarantees of Origin. This figure is higher than RTE France 2024 grid intensity (21.7 gCO₂eq/kWh direct, 30.2 lifecycle) because ADEME's regulatory average for France (52 gCO₂e/kWh, multi-year) is structurally higher than single-year RTE figures, and the PUE multiplier (1.25) further increases effective carbon per useful kWh.
  13. Embodied emissions not included. Our calculations cover only operational energy (use phase). Hardware manufacturing (embodied) emissions are significant — per Boavizta and EcoLogits methodology, the manufacturing phase of server hardware contributes additional GWP, PE, and ADPe that are amortized over the equipment's useful life. Fairphone LCA data shows manufacturing accounts for ~100% of ADPe in consumer electronics; similar dominance applies to data center hardware. NVIDIA's HGX H100 Product Carbon Footprint reports 1,312 kg CO₂e cradle-to-gate per system. TechInsights (2026) projects GPU manufacturing emissions to grow ~16× from 2024 to 2030 due to AI chip demand, underscoring the growing importance of embodied carbon in total AI lifecycle assessments.
  14. No provider publishes per-token energy. Google provides per-query data (but not per-token breakdown). All per-token values are derived from per-query benchmarks divided by estimated token counts. Couch 2026 provides the most explicit derivation methodology.

Calculation Formula

For a given model with inputTokens input tokens and outputTokens output tokens:

energyWh = (inputTokens / 1000) × inputWhPer1kTokens + (outputTokens / 1000) × outputWhPer1kTokens
waterMl = energyWh × waterMlPerWh
co2eG = energyWh × co2eGPerWh
primaryEnergyMj = energyWh × primaryEnergyMjPerWh
adpKgSb = energyWh × adpKgSbPerWh

All multipliers (water, CO₂, PE, ADPe) are infrastructure-level factors applied uniformly to the total energy consumed. Water, CO₂, PE, and ADPe depend on the electricity grid and provider infrastructure, not on the model architecture.

Each metric also has min/max uncertainty estimates derived from the model's uncertaintyMin and uncertaintyMax multipliers (e.g., 0.5–1.5 means the true value could be 50%–150% of the nominal estimate).
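Putting the formula and the uncertainty step together, a minimal end-to-end sketch (the factor record shape is hypothetical; environmental-factors.ts may organize this differently):

```typescript
// End-to-end impact calculation for one message (illustrative).
interface ImpactFactors {
  inputWhPer1kTokens: number;
  outputWhPer1kTokens: number;
  waterMlPerWh: number;         // WUE total (infrastructure)
  co2eGPerWh: number;           // CIF (infrastructure)
  primaryEnergyMjPerWh: number; // PE factor (grid)
  adpKgSbPerWh: number;         // ADPe factor (grid)
  uncertaintyMin: number;
  uncertaintyMax: number;
}

function computeImpact(inputTokens: number, outputTokens: number, f: ImpactFactors) {
  const energyWh =
    (inputTokens / 1000) * f.inputWhPer1kTokens +
    (outputTokens / 1000) * f.outputWhPer1kTokens;

  // Infrastructure- and grid-level multipliers apply to total energy.
  const nominal = {
    energyWh,
    waterMl: energyWh * f.waterMlPerWh,
    co2eG: energyWh * f.co2eGPerWh,
    primaryEnergyMj: energyWh * f.primaryEnergyMjPerWh,
    adpKgSb: energyWh * f.adpKgSbPerWh,
  };

  // The same uncertainty multipliers apply uniformly to all five metrics.
  const scale = (m: number) => ({
    energyWh: nominal.energyWh * m,
    waterMl: nominal.waterMl * m,
    co2eG: nominal.co2eG * m,
    primaryEnergyMj: nominal.primaryEnergyMj * m,
    adpKgSb: nominal.adpKgSb * m,
  });

  return { nominal, min: scale(f.uncertaintyMin), max: scale(f.uncertaintyMax) };
}
```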

Displayed Metrics

The environmental impact tooltip shows 5 metrics per message:

| Metric | Label | Subtitle | Unit | Source |
| --- | --- | --- | --- | --- |
| Energy | Energy | Electricity consumption | Wh / mWh / µWh | Per-model energy factors |
| GHG Emissions | GHG Emissions | Effect on global warming | g / mg CO₂e | Energy × CIF |
| Water | Water | Water consumption | mL / µL | Energy × WUE |
| Primary Energy | Primary Energy | Use of natural energy resources | MJ / kJ / J | Energy × PE factor (ADEME) |
| Abiotic Resources | Abiotic Resources | Use of metals and minerals | ng / pg Sb eq | Energy × ADPe factor (ADEME) |

All 5 metrics include real-world comparisons: Energy (streaming video / Google searches / phone charge), GHG (car driving / Google searches), Water (water drops / teaspoons), Primary Energy (matches / food Calories / boiling water), and Abiotic Resources (copper mining equivalent).

The inline footer below each message shows Energy, GHG (CO₂), and Water for compactness (matching tooltip order).
