2026-05-08 02:54:21 UTC
in reply to

Marks on Nostr: TEE overhead for inference is negligible. It's the fact that it takes multiple ...

TEE overhead for inference is negligible. It's the fact that it takes multiple top-of-the-line NVIDIA GPUs chained together to run a single large model. The models in Maple are full size, not quantized.
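
For a rough sense of why full-size weights push a model across several GPUs, here is a back-of-the-envelope sketch. The numbers are illustrative assumptions, not Maple's actual models or hardware: a hypothetical 70-billion-parameter model, 2 bytes per parameter for 16-bit weights versus 0.5 bytes for 4-bit quantized weights, and 80 GB of memory per GPU. It counts weights only, so real deployments need more headroom for the KV cache and activations.

```python
import math

def gpus_needed(params_billions: float, bytes_per_param: float, gpu_mem_gb: float) -> int:
    """Minimum GPU count to hold the weights alone (ignores KV cache and activations)."""
    # Billions of parameters times bytes per parameter gives gigabytes of weights.
    weights_gb = params_billions * bytes_per_param
    return math.ceil(weights_gb / gpu_mem_gb)

# Hypothetical 70B-parameter model on assumed 80 GB GPUs.
full_size = gpus_needed(70, 2.0, 80)   # 16-bit weights: 140 GB
quantized = gpus_needed(70, 0.5, 80)   # 4-bit weights: 35 GB
print(full_size, quantized)
```

Under these assumptions the full-size weights already need more than one GPU, while the quantized version fits on a single card. Larger models scale the GPU count up accordingly.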