This is also the only place Nvidia are getting competitive pressure: from the likes of Groq (and likely, though less publicized, from Cerebras), with higher inference T/s and better concurrency utilization/batching [1]. So if this proves to be true, then the case for big-chip systems (on today's specs) will be harder to make.
[1] https://twitter.com/swyx/status/1760065636410274162?t=rpbcr8...