11

Found out GPT-4 uses like 8,000 GPUs to train and my jaw dropped

I was reading a data center blog last night and saw that number. I remember when training a decent model on a single GPU felt like a big deal. How much do you think the hardware needed for a training run like that will shrink in the next 5 years?
2 comments

jordan_webb49
Holy crap, 8,000 GPUs? That's insane.
2
perry.jessica
Wait, is everyone just going to gloss over the fact that they probably skimped on the actual hardware to afford that many GPUs? @jordan_webb49, you think 8,000 GPUs is a flex, but what if they're all last-gen cards running at half speed because of bad cooling? I've seen companies brag about numbers like that before, and then you find out they're running 4-year-old cards they got for pennies on the dollar. Not to mention the power bill has to be astronomical. If they're not using some cheap energy source, the whole thing is a money pit. And honestly, who even needs 8,000 GPUs for a reasonable task? Sounds like a vanity project more than anything practical. I'd rather have 200 really good GPUs with proper support than a mountain of them that crash every other hour. So yeah, I'm not impressed by the number. Show me the results and the uptime, then we'll talk.
10