Because much of it came from scaling the chip area devoted to FP8. The AI community realized a few years back that FP8 training is viable, so the transistor budget allocated to FP8 units was scaled up accordingly. Overall, I think transistor counts only grow by ~50% per generation, so most of the gains come from reallocating the FP32/FP64 share, which was dominant 10 years ago, but that can only go so far: once low-precision units take up nearly all of the math area, that source of growth is exhausted.
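
A rough back-of-the-envelope sketch of why this runs out. All the specific numbers here (the 4x FP8-vs-FP32 transistor-cost ratio, the starting area fraction, the reallocation rate) are illustrative assumptions, not real die-area data; only the ~50% transistor growth figure comes from the comment above:

```python
# Toy model: relative FP8 FLOPS ~ total transistors
#   * fraction of math area devoted to FP8
#   * units-per-transistor advantage of FP8 over FP32.
# All constants below are illustrative assumptions, not real die data.

TRANSISTOR_GROWTH = 1.5  # ~50% more transistors per generation
FP8_DENSITY = 4.0        # assume an FP8 unit costs ~1/4 the transistors of FP32

def fp8_flops(transistors, fp8_fraction):
    """Relative FP8 throughput for a given transistor budget and
    fraction of math area spent on FP8 units."""
    return transistors * fp8_fraction * FP8_DENSITY

transistors = 1.0
fp8_fraction = 0.1  # start with FP8 as a small slice of the math area
prev = fp8_flops(transistors, fp8_fraction)

for gen in range(1, 6):
    transistors *= TRANSISTOR_GROWTH
    # Reallocate area toward FP8 each generation, capped at 100%.
    fp8_fraction = min(1.0, fp8_fraction * 2.5)
    cur = fp8_flops(transistors, fp8_fraction)
    print(f"gen {gen}: FP8 FLOPS x{cur / prev:.2f} vs previous gen")
    prev = cur
```

In this toy model the first couple of generations show ~3.75x per-generation FP8 gains because reallocation multiplies with transistor growth, but once fp8_fraction hits 1.0 the gains collapse to the bare ~1.5x from transistors alone. That's the "only so far" part.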