Let’s first estimate GPT-4’s energy consumption. According to unverified information leaks, GPT-4 was trained on about 25,000 Nvidia A100 GPUs for 90–100 days [2].
Let’s assume the GPUs were installed in Nvidia HGX servers, which can host 8 GPUs each, meaning 25,000 / 8 = 3,125 servers.
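The arithmetic above can be sketched in a few lines of Python. Note that all the input figures (25,000 A100 GPUs, 8 GPUs per HGX server, 90–100 training days) come from the unverified leaks cited above, not confirmed specifications:

```python
# Back-of-the-envelope estimate using the leaked (unverified) figures.
n_gpus = 25_000          # reported A100 count
gpus_per_server = 8      # Nvidia HGX hosts 8 GPUs per server

n_servers = n_gpus // gpus_per_server
print(f"Servers: {n_servers:,}")  # 3,125

# Total GPU-hours over the reported 90-100 day training window
for days in (90, 100):
    gpu_hours = n_gpus * days * 24
    print(f"{days} days -> {gpu_hours:,} GPU-hours")
```

Multiplying these GPU-hours by an assumed per-GPU (or per-server) power draw is what turns the count into an energy estimate in the steps that follow.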