Zig-budoux: Wrap that Japanese and Chinese text properly

Cloudef · February 25, 2024, 5:42pm

I ported BudouX to Zig (and C).
It lets you find “word” boundaries in Japanese and Chinese text, which is useful for word wrapping and perhaps even for indexing if you remove the particles.
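
To illustrate why such boundaries matter for wrapping, here is a minimal sketch in Python of greedy line filling over pre-segmented chunks. It does not call the library: the chunk list is hand-written for illustration, the name `wrap_chunks` is hypothetical, and width is assumed to be a simple character count rather than real rendered width.

```python
def wrap_chunks(chunks, max_width):
    """Greedily pack pre-segmented chunks into lines of at most max_width characters.

    A chunk is never split; a single chunk longer than max_width
    ends up alone on an overlong line.
    """
    lines = []
    current = ""
    for chunk in chunks:
        # Start a new line only if the current one is non-empty and would overflow.
        if current and len(current) + len(chunk) > max_width:
            lines.append(current)
            current = chunk
        else:
            current += chunk
    if current:
        lines.append(current)
    return lines


# Hand-written segmentation for illustration, not actual library output.
chunks = ["今日は", "良い", "天気です。"]
lines = wrap_chunks(chunks, 6)
print("\n".join(lines))
```

With a width of 6, the break falls between chunks instead of in the middle of 天気, which is what a purely character-based wrap could do.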

z1fire · February 25, 2024, 6:24pm

Very interesting. What would the use cases be for this? Maybe some analysis of word frequency?

Cloudef · February 25, 2024, 6:25pm

The main use case is proper “word wrapping”, which I already use it for on the frontend. On the backend, I’m testing it as a core component for building a search index for Japanese text.
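
As a rough idea of what the indexing side could look like, here is a small Python sketch that builds an inverted index from pre-segmented chunks after stripping a trailing particle. Everything in it is an assumption for illustration: the particle list, the helper names `index_term` and `build_index`, and the hand-written chunks. It is not how the backend mentioned above works, and naive suffix stripping like this will mangle words that merely end in one of these characters.

```python
# Hypothetical, deliberately tiny particle list; a real system would need more care.
PARTICLES = ("は", "が", "を", "に", "で", "と", "も", "の", "へ")
PUNCTUATION = "。、！？"


def index_term(chunk):
    """Drop trailing punctuation and at most one trailing particle from a chunk."""
    term = chunk.rstrip(PUNCTUATION)
    for particle in PARTICLES:
        # Keep chunks that consist only of the particle itself.
        if len(term) > len(particle) and term.endswith(particle):
            return term[: -len(particle)]
    return term


def build_index(docs):
    """Map each term to the set of document ids whose chunks contain it."""
    index = {}
    for doc_id, chunks in docs.items():
        for chunk in chunks:
            term = index_term(chunk)
            if term:
                index.setdefault(term, set()).add(doc_id)
    return index


# Hand-written segmentations for illustration, not actual library output.
docs = {
    1: ["今日は", "天気が", "良い。"],
    2: ["天気を", "調べる。"],
}
index = build_index(docs)
print(sorted(index["天気"]))
```

Both documents end up under the same term 天気 even though the original chunks carried different particles.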

Cloudef · February 25, 2024, 6:29pm

Examples from my frontend:

Without budoux (standard browser word wrap)

With budoux

z1fire · February 25, 2024, 9:52pm

I see it, very nice!