2026-05-17 04:08:46 UTC

I could probably improve it further by keeping a closer eye on cache usage, using a buffer size larger than one sample, adding SIMD optimizations, and maybe adding some kind of GPU offloading node.
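To make the buffer size point concrete, here is a rough Python sketch. It reads ">1 sample buffer size" as processing blocks of more than one sample at a time, which is an assumption on my part. The names `gain` and `run_chain` are made up for illustration and are not from the actual project. It only shows how the number of per-node calls drops as the block grows, while the output stays the same.

```python
from typing import Callable, List, Tuple

# Hypothetical node type: takes a block of samples, returns a processed block.
Node = Callable[[List[float]], List[float]]


def gain(factor: float) -> Node:
    """Build a node that scales every sample in a block by `factor`."""
    def process(block: List[float]) -> List[float]:
        return [sample * factor for sample in block]
    return process


def run_chain(nodes: List[Node], samples: List[float],
              block_size: int) -> Tuple[List[float], int]:
    """Push `samples` through `nodes` in blocks of `block_size`.

    Returns the processed samples and the number of node calls made,
    so the per-call overhead at different block sizes can be compared.
    """
    output: List[float] = []
    calls = 0
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        for node in nodes:
            block = node(block)
            calls += 1
        output.extend(block)
    return output, calls


if __name__ == "__main__":
    chain = [gain(0.5), gain(2.0)]
    signal = [1.0] * 256
    _, calls_single = run_chain(chain, signal, block_size=1)
    _, calls_block = run_chain(chain, signal, block_size=64)
    print(calls_single, calls_block)
```

With one-sample blocks every node is called once per sample. With 64-sample blocks the same work takes far fewer calls, and those contiguous blocks are also what cache-friendly and SIMD code would want to operate on.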

Oh, and maybe lowering the worker thread wait time so that it's not spending 1+ ms just waiting around for the buffer to be filled.
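One way the wait could be shortened is to have the worker block on a condition variable instead of sleeping for a fixed interval, so it wakes as soon as a buffer arrives. This is only a sketch of that idea in Python, not the actual implementation. `BufferQueue`, `push`, `pop`, and `timeout_s` are hypothetical names.

```python
import threading
from collections import deque
from typing import Deque, List, Optional


class BufferQueue:
    """Hypothetical hand-off queue between a producer and a worker thread."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._items: Deque[List[float]] = deque()

    def push(self, buf: List[float]) -> None:
        """Add a filled buffer and wake one waiting worker immediately."""
        with self._cond:
            self._items.append(buf)
            self._cond.notify()

    def pop(self, timeout_s: float) -> Optional[List[float]]:
        """Wait up to `timeout_s` seconds for a buffer.

        Returns the buffer as soon as one is available, or None if the
        timeout expires first.
        """
        with self._cond:
            ready = self._cond.wait_for(lambda: len(self._items) > 0,
                                        timeout=timeout_s)
            if not ready:
                return None
            return self._items.popleft()


if __name__ == "__main__":
    queue = BufferQueue()
    # Producer fills a buffer shortly after the worker starts waiting.
    producer = threading.Timer(0.01, queue.push, args=([0.0] * 64,))
    producer.start()
    buf = queue.pop(timeout_s=1.0)  # returns once push() notifies
    producer.join()
    print(len(buf) if buf is not None else "timed out")
```

The timeout here is just an upper bound. The worker does not sit out the full interval when the buffer is already filled, which is the part that matters for the 1+ ms of idle time.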
