How We Track AI's Environmental Impact
A plain-language guide to the numbers in your tooltip
Why We Do This
Every AI message uses electricity. That electricity requires water for cooling, produces carbon emissions depending on the power grid, and draws on natural resources through the infrastructure that generates it.
These costs are real but invisible. A single AI response typically uses less energy than a few seconds of streaming video — small enough to seem trivial, but significant at scale. In 2025, AI systems worldwide produced an estimated 30–80 million tonnes of CO₂ and consumed 300–750 billion litres of water.
We believe you should be able to see the environmental cost of each message, just as you'd expect to see the price of something before you buy it. Not to make you feel guilty — but to help you make informed choices, like picking a lighter model when a heavier one isn't needed.
The Five Metrics
Below each AI response you'll see a compact footer showing three key metrics: energy, CO₂, and water. Tap to expand the tooltip for all five, each with a real-world comparison to make the numbers tangible.
- Energy: The electricity consumed to generate the response. This is the foundation — every other metric flows from it.
- Carbon (CO₂): The carbon dioxide released by the power grid to generate that electricity. Varies hugely by location — France's nuclear grid produces about one-fifth the CO₂ per watt-hour of the US average.
- Water: Water consumed for data center cooling (both on-site and in the power plants that generate the electricity). Includes the hidden "off-site" water that most providers don't report.
- Primary energy: The total energy extracted from nature — fossil fuels, nuclear fuel, wind, solar — to produce the electricity used. Roughly 2.7× the direct electricity, once you account for generation and transmission losses.
- Mineral resources: Depletion of non-renewable minerals and metals — copper, lithium, rare earths — used in the power generation infrastructure. Expressed in antimony equivalent (a standard measure for comparing mineral depletion).
Every metric includes a min–max uncertainty range showing how confident we are in the estimate. More on that below.
How We Calculate Them
The calculation has two parts: how much energy the model uses, and what that energy costs the environment based on where it runs.
Step 1: Energy
We know which model generated the response and how many tokens it processed (input) and generated (output). Generating output is significantly more energy-intensive than processing input — research shows roughly 5–11× more — so we account for them separately. Each model has a per-token energy factor derived from academic benchmarks.
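In rough form, Step 1 is just a weighted token count. The sketch below uses hypothetical per-token factors chosen only for illustration — real factors are derived per model from academic benchmarks:

```python
# Hypothetical per-token energy factors, in watt-hours per token.
# These are illustrative placeholders, not our actual benchmark-derived values.
ENERGY_PER_INPUT_TOKEN_WH = 0.0002
ENERGY_PER_OUTPUT_TOKEN_WH = 0.0015  # output tokens cost roughly 5-11x more than input

def response_energy_wh(input_tokens: int, output_tokens: int) -> float:
    """Estimate the electricity used to generate one response, in Wh."""
    return (input_tokens * ENERGY_PER_INPUT_TOKEN_WH
            + output_tokens * ENERGY_PER_OUTPUT_TOKEN_WH)

# A 500-token prompt with a 200-token answer:
energy = response_energy_wh(500, 200)
print(f"{energy:.2f} Wh")  # 500*0.0002 + 200*0.0015 = 0.40 Wh
```

Note how the output tokens dominate even for a short answer — which is why we track input and output separately rather than using a single per-token rate.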
Step 2: Infrastructure multipliers
We multiply the energy by factors specific to the data center where the model runs:
- Carbon intensity — how dirty or clean the local power grid is
- Water usage — how much water the cooling system consumes
- Primary energy & minerals — the upstream cost of generating that electricity
The same model produces very different environmental impacts depending on where it runs. A query to GreenPT (running on France's low-carbon nuclear grid) produces about one-fifth the CO₂ of the same query to a model on the US grid — even if both models use similar energy.
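The two steps combine as a straightforward multiplication of the energy estimate by region-specific factors. A minimal sketch, where the region names and factor values are illustrative assumptions rather than our actual data:

```python
# Illustrative infrastructure factors per region. Real values are specific
# to the data center and grid where the model runs; these are assumptions.
FACTORS = {
    "us_average": {
        "co2_g_per_wh": 0.35,        # grid carbon intensity (g CO2 per Wh)
        "water_ml_per_wh": 2.0,      # on-site + power-plant cooling water (mL per Wh)
        "primary_energy_ratio": 2.7, # primary energy per unit of delivered electricity
    },
    "france_nuclear": {
        "co2_g_per_wh": 0.07,        # low-carbon nuclear grid: ~one-fifth the US value
        "water_ml_per_wh": 2.0,
        "primary_energy_ratio": 2.7,
    },
}

def impacts(energy_wh: float, region: str) -> dict:
    """Convert an energy estimate (Step 1) into environmental impacts (Step 2)."""
    f = FACTORS[region]
    return {
        "energy_wh": energy_wh,
        "co2_g": energy_wh * f["co2_g_per_wh"],
        "water_ml": energy_wh * f["water_ml_per_wh"],
        "primary_energy_wh": energy_wh * f["primary_energy_ratio"],
    }

# The same 0.4 Wh query, two very different carbon footprints:
us = impacts(0.4, "us_average")
fr = impacts(0.4, "france_nuclear")
print(f"US: {us['co2_g']:.3f} g CO2, France: {fr['co2_g']:.3f} g CO2")
```

The energy figure is identical in both calls; only the grid factor changes — which is the whole point of Step 2.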
Where the Numbers Come From
We don't make these numbers up — and we don't just use one source. Our estimates are built from a combination of academic research, provider sustainability reports, and open-source tools, cross-validated against each other.
When multiple sources exist for the same number, they tend to agree. Independent estimates for a standard AI query consistently land in the 0.2–0.4 Wh range — giving us confidence the ballpark is right, even if individual model estimates carry meaningful uncertainty.
What We Don't Know
We believe in showing our work — including the parts we're less sure about. Here's what you should know:
Every estimate includes an uncertainty range. The min–max values in the tooltip aren't decoration. They typically span ±30–50% of the central estimate, reflecting real scientific uncertainty about how much energy each model actually consumes. Models with more published data (like GPT‑4o) have tighter ranges; models we've had to estimate from architectural reasoning (like Claude Opus 4) have wider ones.
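Because the impact factors are fixed multipliers, the uncertainty band can be carried through the same arithmetic as the central estimate. A sketch assuming a symmetric ±40% band (the band width and numbers are illustrative):

```python
# Propagating a symmetric uncertainty band through the impact calculation.
# The 0.4 Wh central estimate and the 40% spread are illustrative assumptions.

def with_range(central: float, spread: float = 0.40) -> tuple:
    """Return (min, central, max) for a symmetric uncertainty band."""
    return (central * (1 - spread), central, central * (1 + spread))

lo, mid, hi = with_range(0.4)  # energy estimate in Wh
# Multiplying by a fixed grid factor scales the whole band uniformly:
co2_lo, co2_mid, co2_hi = (x * 0.35 for x in (lo, mid, hi))  # g CO2 at 0.35 g/Wh
print(f"energy: {lo:.2f}-{hi:.2f} Wh, CO2: {co2_lo:.3f}-{co2_hi:.3f} g")
```

This is why a model with a wide energy range shows correspondingly wide ranges on every downstream metric in the tooltip.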
No AI company publishes per-token energy data. Google is the only provider to have released per-query energy measurements. Everyone else's numbers are derived from academic benchmarks and inference. We update our factors as better data becomes available.
"Reasoning" models are especially uncertain. Models like o3 generate hidden chains of thought that are invisible to us but consume real energy. The gap between our estimate and reality could be larger for these models.
"100% renewable" doesn't always mean what it sounds like. AWS and Microsoft claim 100% renewable energy through certificate purchasing — but these certificates can come from a solar farm on a different continent at a different time of day. Google's 66% "carbon-free energy" metric (measured hourly, on the same grid) is more honest. We use the actual grid carbon intensity, not the marketing claims.
We only count operational energy. Manufacturing the GPU hardware that runs these models has its own significant carbon and mineral footprint. An NVIDIA H100 system produces over 1,300 kg of CO₂ just to build. We don't include this in per-message calculations — it's a known gap we may address in future.
Our numbers may be conservatively high. Academic benchmarks tend to measure energy under controlled conditions, not in the highly optimised production environments that cloud providers run. Microsoft Research found that non-production estimates can overstate real energy use by 4–20×. If anything, our estimates likely err on the side of caution — which we think is the right direction to err in.
Want More Detail?
This page covers the essentials. If you're interested in the full picture — every data source, every assumption, cross-validation tables, and 39 academic references — see our detailed methodology.
If you spot an error or know of better data, we'd love to hear about it. The field of AI environmental measurement is young and fast-moving, and we're committed to updating our factors as the science improves.