On water and AI: towards sustainable practices

ai-bath.jpeg
Figure 1: This image was generated by the author using DALL-E. Knowingly (hypocritically?) using AI in order to get his point across.

We live in a world of finite resources and a finite capacity to deal with our propensity to pollute its atmosphere for the sake of our creature comforts. As AI practitioners it should, therefore, give us pause that the authors behind the initial estimate that Large Language Models (LLMs) consume 500 ml of water per 10-50 queries now believe 2 L is closer to the mark [1, 2]. There is, however, still an AI baby (full of only partially realised potential) in this particular bath water. So it is worth considering what we as AI practitioners (e.g., developers, architects, and leaders) can do to minimise the environmental impact of our solutions.

AI water consumption

What makes LLMs so "thirsty" in the first place? The 500 ml (now 2 L) estimate largely accounts for the water needed to generate the power the LLM consumes (scope-2) as well as the water needed to cool the computers the LLM runs on in the data centre (scope-1). A third source of AI water consumption covers the water consumed across the full supply chain (from raw-material extraction to delivery) for the fabrication of AI hardware (e.g., GPUs). This scope-3 water consumption is incredibly difficult to ascertain (though it may be considerable) and, as such, is not included in the 2 L figure. Both scope-1 and scope-2 water consumption are highly situation-dependent.

Scope-2 water consumption is largely dependent on the power source. Most water is consumed by thermoelectric power plants (which accounted for 73% of the electricity generated in the US in 2021 [3]), which use the water for cooling. Wind and solar energy have very low water consumption.

The figure for scope-1 water consumption also rests on some assumptions. Water circulates around the computer components in a closed loop; that is, no water is added or lost. Through heat exchange, the heat from the closed loop is transferred to a cooling unit, often a cooling tower (which expends water through evaporation). Cooling-tower systems often use potable water (and hence extract water from their surroundings that might otherwise be used for drinking) in order to avoid blockages and/or bacterial growth. Alternative cooling solutions that are less (or minimally) reliant on water do exist (e.g., solar cooling, geothermal cooling).

This all means that the amount of water consumed by LLMs is inextricably tied to regional and data-centre-specific factors. As the world shifts to renewable energy sources and alternative cooling systems, data-centre water consumption will presumably lessen. That is not to say, however, that there is no pressing issue to address in realising sustainable AI solutions.
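To make the scope-1/scope-2 split concrete, here is a back-of-envelope sketch in Python. Every number in it (energy per query, on-site water usage effectiveness, power usage effectiveness, and the water intensity of the grid) is an illustrative assumption, not a measured value; the point is only how the two scopes combine.

```python
# Back-of-envelope estimate of per-query water use, following the
# scope-1 + scope-2 decomposition described above.
# All constants below are illustrative assumptions, not measured values.

ENERGY_PER_QUERY_KWH = 0.003  # assumed energy per LLM query (3 Wh)
ONSITE_WUE = 0.5              # scope-1: litres evaporated per kWh of IT load (assumed)
PUE = 1.2                     # assumed power usage effectiveness of the data centre
OFFSITE_EWIF = 3.0            # scope-2: litres consumed per kWh of electricity generated (assumed)

def water_per_query(energy_kwh=ENERGY_PER_QUERY_KWH,
                    wue=ONSITE_WUE, pue=PUE, ewif=OFFSITE_EWIF):
    """Litres of water per query: on-site cooling plus off-site power generation."""
    scope1 = energy_kwh * wue         # evaporated by the cooling towers
    scope2 = energy_kwh * pue * ewif  # consumed generating the electricity
    return scope1 + scope2

litres = water_per_query()
print(f"≈ {litres * 1000:.1f} ml per query")  # ≈ 12.3 ml with the assumptions above
```

With these (assumed) inputs the result is an order of magnitude below the 2 L per 10-50 queries figure, which illustrates how sensitive the estimate is to the regional and data-centre-specific factors just discussed.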

The larger issue

Heat generation and, consequently, water consumption are part and parcel of "normal" data-centre computing and, as such, not exclusive to AI. However, the one thing that LLMs (and other deep-learning-based AI) add to the mix is a reliance on GPUs for both training and inference (i.e., usage). GPUs can be particularly power hungry (and thus require more scope-2 water through power generation, as well as more scope-1 water to remove the extra heat they generate). In fact, a recent study into the carbon footprint of the BLOOM LLM [4] estimates that during inference CPU compute accounts for about 2% of the total power consumption, while the GPU takes up a whopping 75.3% (the remainder being RAM at 22%).
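A toy calculation makes the GPU's dominance tangible. The component shares below are the ones quoted from the BLOOM study [4]; the total energy budget is a placeholder chosen purely for illustration.

```python
# Split an assumed total inference energy budget using the component
# shares reported for BLOOM inference [4]. TOTAL_KWH is a placeholder.
TOTAL_KWH = 100.0
SHARES = {"GPU": 0.753, "RAM": 0.22, "CPU": 0.02}  # reported shares (sum ≈ 0.99)

breakdown = {part: share * TOTAL_KWH for part, share in SHARES.items()}
for part, kwh in sorted(breakdown.items(), key=lambda kv: -kv[1]):
    print(f"{part}: {kwh:.1f} kWh")
```

Whatever the total, roughly three-quarters of the inference energy (and hence of the associated water) follows the GPU, which is why model and hardware choices matter so much.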

Even so, water consumption by LLMs is part of a larger problem. Computing inefficiency is wasteful (AI or not). Inefficient code generates more CPU load, which leads to more energy use and, in data centres, more water usage.

Compounding the problem is the fact that time spent refactoring an established (and working) code base to make it more computationally efficient is sometimes difficult to justify to stakeholders. How much time should you invest? Is there a financial business case? (Probably, but how do you quantify it?) Consequently (and admittedly anecdotally), large amounts of "quick-and-dirty" inefficient code remain in production and consume more power and water than necessary: "There is nothing so permanent as a temporary solution".

LLM enthusiasts might, legitimately, point out that idling PCs are also wasteful. While someone sits staring at a monitor waiting for inspiration, valuable computing resources burn away silently. To my knowledge, no one has yet examined the extent to which the power saved by humans executing tasks more efficiently with the help of an LLM offsets the power consumed by the LLM inference involved. Whatever the balance, it should not exempt us from the duty to ensure our AI solutions do not consume resources unnecessarily.

Sustainable practices

When pressed for comment on GPT-3's high water consumption, OpenAI's response was that they continue to invest in ways to make their models more efficient [1]. Indeed, alongside more sustainable data-centres (see above), initiatives to optimise current LLM architectures are likely to bear fruit, so long as future LLM architectures (or model sizes) do not introduce new inefficiencies. We are currently witnessing a nigh-on Cambrian explosion of LLMs. Some of these dwarf even ChatGPT in scale, while others are small enough to run on edge devices and/or lend themselves to CPU-based inference. Thus, when it comes to model selection, AI practitioners are certainly spoiled for choice.

Choosing the right model for the right job is not the only lever at the AI engineer's disposal, however. In many use-cases LLMs are, ultimately, only components in a broader (AI/IT) architecture. While it might be tempting to use LLMs in multiple places for quick and easy results, expending a bit more thought and, yes, effort on hybrid AI solutions might yield roughly equal (or sometimes even superior) results at much less cost to the environment. These approaches often have the added benefit of being more explainable than LLMs alone.

We would also do well to recognise that not every problem needs power-hungry AI to solve it [cf. 5, 6]. I am convinced that pushing technology for the sake of technology is ultimately self-defeating. Real solutions addressing actual (business/consumer/patient/etc.) problems will have the most staying power. As AI practitioners we may have one or two extra tools in our toolbox to help us build those solutions, but we should be mindful of over-reliance on those tools (and the resource burden they entail).

To sum up, as practitioners we can:

  • carefully choose our use-cases: Does it really need/warrant AI? Are there less compute-intensive methods we could use?
  • carefully evaluate each AI component in the solution: Is it necessary or merely a shortcut?
  • run and train our models in data-centres powered by renewables (insofar as we have the choice)
  • opt for the smallest model that can do the job
  • consider CPU-based inference if time constraints allow

References

1. Sellman, M., & Vaughan, A. (2024). “Thirsty” ChatGPT uses four times more water than previously thought. The Times. https://www.thetimes.com/uk/technology-uk/article/thirsty-chatgpt-uses-four-times-more-water-than-previously-thought-bc0pqswdr
2. Li, P., Yang, J., Islam, M. A., & Ren, S. (2023). Making AI less “thirsty”: Uncovering and addressing the secret water footprint of AI models. https://arxiv.org/abs/2304.03271
3. U.S. Energy Information Administration. (2023). U.S. electric power sector continues water efficiency gains. https://www.eia.gov/todayinenergy/detail.php?id=56820
4. Luccioni, A. S., Viguier, S., & Ligozat, A.-L. (2022). Estimating the carbon footprint of BLOOM, a 176B parameter language model. https://arxiv.org/abs/2211.02001
5. Redmann, A., & FitzPatrick, I. (2023). Conversational AI: LLMs are not all you need. Dixit: Tijdschrift over Toegepaste Taal- en Spraaktechnologie, 20, 5–8. https://notas.nl/dixit/dixit_2023_conversational_ai.pdf
6. FitzPatrick, I. (2022). Large language models: geen silver bullet. Dixit: Tijdschrift over Toegepaste Taal- en Spraaktechnologie, 19, 28–29. https://notas.nl/dixit/dixit_2022_tst_en_big-models.pdf