2026-05-09 07:47:07 UTC

LR on Nostr:

the way sxpp's streaming lexer (tokenizer) works, UTF-16 and UTF-32 input streams are already supported.

since all the characters that steer the lexer are well below 0x7f, and the lexer doesn't output strings, only token types and locations, you can just feed it each code unit clamped to 0xff, and the offsets and positions it reports are then implicitly correct, counted in code units of the original stream.
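a minimal sketch of the clamping idea, not sxpp's actual code: `clamp_units` and `toy_lex` are hypothetical names, and the paren/whitespace syntax of the toy lexer is an assumption made only for illustration. it shows that a byte-oriented lexer fed one clamped value per code unit reports positions that line up with the UTF-16 or UTF-32 stream.

```python
def clamp_units(data: bytes, unit_size: int, byteorder: str = "little"):
    """Yield one value in 0..0xff per code unit (hypothetical helper).

    unit_size is 2 for UTF-16 and 4 for UTF-32. Units above 0xff are
    clamped to 0xff, so they can never be mistaken for an ASCII delimiter.
    """
    for i in range(0, len(data) - unit_size + 1, unit_size):
        unit = int.from_bytes(data[i:i + unit_size], byteorder)
        yield min(unit, 0xFF)


def toy_lex(units):
    """Toy stand-in for a streaming lexer (assumed syntax, not sxpp's).

    Emits (token_type, start, end) only, never string contents.
    Positions are counted in input units.
    """
    units = list(units)
    tokens = []
    start = None  # start position of the symbol being scanned, if any
    for pos, u in enumerate(units):
        is_paren = u in (0x28, 0x29)             # '(' and ')'
        is_space = u in (0x20, 0x09, 0x0A, 0x0D)
        if (is_paren or is_space) and start is not None:
            tokens.append(("symbol", start, pos))
            start = None
        if is_paren:
            tokens.append(("open" if u == 0x28 else "close", pos, pos + 1))
        elif not is_space and start is None:
            start = pos
    if start is not None:
        tokens.append(("symbol", start, len(units)))
    return tokens


text = "(λ 😀x)"
tokens16 = toy_lex(clamp_units(text.encode("utf-16-le"), 2))
tokens32 = toy_lex(clamp_units(text.encode("utf-32-le"), 4))
print(tokens16)
print(tokens32)
```

note that the sketch clamps with `min` rather than masking with `& 0xff`: masking would alias a unit such as 0x0128 onto `(`, while clamping keeps every non-ASCII unit out of the delimiter range.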

#devlog #sxpp #lsp