Nietzschean Ekko Enjoyer on Nostr
nprofile1qyt8wumn8ghj7un9d3shjtnyd968gmewwp6kytcqypl7087rhxy2angjxcg7tl4fxe6gcv96p7ganxudjj5fredxpky92zxdynj For LLMs, the real bottleneck is RAM bandwidth. We had a damaged card that could only run at 10% of its CPU power, and we noticed only a 20% drop in tokens per second.
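A rough way to see why that can happen is a roofline-style estimate: generating one token means reading roughly all the model weights from memory once, so the per-token time is about the slower of the memory read and the arithmetic. The sketch below is only an illustration. The function name and every number in it (model size, bandwidth, compute rate) are hypothetical, picked so the result lines up with the 10% / 20% figures above. They are not measurements from the card in question.

```python
def decode_tokens_per_second(model_bytes, bandwidth_bytes_per_s,
                             flops_per_token, compute_flops_per_s):
    """Roofline-style estimate of single-stream decode speed.

    Assumes each generated token reads all weights once and that the
    per-token time is the larger of the memory time and the compute time.
    """
    memory_time = model_bytes / bandwidth_bytes_per_s
    compute_time = flops_per_token / compute_flops_per_s
    return 1.0 / max(memory_time, compute_time)


# Hypothetical setup: 7e9 parameters stored in 2 bytes each,
# about 2 FLOPs per parameter per token, 1e12 bytes/s of memory bandwidth.
MODEL_BYTES = 7e9 * 2
FLOPS_PER_TOKEN = 7e9 * 2
BANDWIDTH = 1e12

# Hypothetical compute rates: a healthy card and the same card at 10%.
FULL_COMPUTE = 8e12
DAMAGED_COMPUTE = 0.10 * FULL_COMPUTE

healthy = decode_tokens_per_second(MODEL_BYTES, BANDWIDTH,
                                   FLOPS_PER_TOKEN, FULL_COMPUTE)
damaged = decode_tokens_per_second(MODEL_BYTES, BANDWIDTH,
                                   FLOPS_PER_TOKEN, DAMAGED_COMPUTE)

# Fraction of throughput lost when compute falls to 10%.
slowdown = 1.0 - damaged / healthy
print(f"healthy: {healthy:.1f} tok/s, damaged: {damaged:.1f} tok/s, "
      f"drop: {slowdown:.0%}")
```

With these made-up numbers the healthy card is limited by memory bandwidth with about 8x of compute headroom, so cutting compute to 10% only pushes it slightly past the memory limit and throughput falls by about 20%.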