Marks on Nostr: TEE overhead for inference is negligible. It's the fact that it takes multiple ...
TEE overhead for inference is negligible. It's the fact that it takes multiple top-of-the-line NVIDIA GPUs chained together to run a single large model. The models in Maple are full size, not quantized.
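
To put rough numbers on that, here is a back-of-the-envelope sketch. The 70B parameter count, 80 GB of memory per GPU, and the 16-bit vs 4-bit weight sizes are illustrative assumptions, not Maple's actual configuration, and the estimate counts weight memory only (no KV cache or activations), so real deployments need more.

```python
import math

def gpus_needed(params_billion: float, bytes_per_param: float, gpu_mem_gb: float) -> int:
    """Lower bound on GPU count from weight memory alone.

    Ignores KV cache, activations, and framework overhead, so the
    real number is higher. All inputs are hypothetical.
    """
    # 1e9 params * bytes per param / 1e9 bytes per GB = GB of weights
    weight_gb = params_billion * bytes_per_param
    return math.ceil(weight_gb / gpu_mem_gb)

# Hypothetical 70B-parameter model on hypothetical 80 GB GPUs
full_size = gpus_needed(70, 2.0, 80)   # 16-bit weights, not quantized
quantized = gpus_needed(70, 0.5, 80)   # 4-bit quantized weights

print(f"full size: {full_size} GPU(s), quantized: {quantized} GPU(s)")
```

Even under these generous assumptions the full-size weights do not fit on one card while the quantized ones do, which is the gap the note is pointing at.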
