utf8proc_charwidth returns 1 for é character #170

markand · 2020-01-27T09:31:58Z

Hi,

I'm sorry if I misunderstood the purpose of utf8proc_charwidth; I thought it would return the number of bytes required for a given codepoint.

For example, the letter é (codepoint 233, U+00E9) is two bytes long in UTF-8, but utf8proc_charwidth(233) always returns 1.

Did I miss something, or is there no function to get the number of bytes a codepoint requires?


PallHaraldsson · 2020-01-27T11:11:45Z

I think you have this in mind:

julia> ncodeunits("é")
2

I think the other one has to do with the width on the screen, if I recall correctly, to support Chinese.

stevengj · 2020-01-28T11:42:00Z

charwidth is an approximate on-screen width in fixed-width fonts, not the number of code units.
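
To make the distinction concrete, here is a minimal sketch in C of the byte count the original question was after. The helper name utf8_code_units is hypothetical and not part of utf8proc; it just applies the standard UTF-8 encoding ranges. As far as I know, utf8proc_encode_char also returns the number of bytes it writes, which may be enough if you are encoding anyway.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a utf8proc function): number of UTF-8 code
 * units (bytes) needed to encode the code point cp.
 * Returns 0 if cp is not a valid Unicode scalar value.
 */
static int utf8_code_units(int32_t cp)
{
    if (cp < 0)
        return 0;               /* negative values are invalid */
    if (cp < 0x80)
        return 1;               /* U+0000..U+007F: ASCII */
    if (cp < 0x800)
        return 2;               /* U+0080..U+07FF, includes é (U+00E9) */
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;               /* surrogates cannot be encoded in UTF-8 */
    if (cp < 0x10000)
        return 3;               /* rest of the Basic Multilingual Plane */
    if (cp <= 0x10FFFF)
        return 4;               /* supplementary planes */
    return 0;                   /* beyond the Unicode range */
}
```

With this, utf8_code_units(233) gives 2 for é, while its on-screen width is still one column. A wide character such as 中 (U+4E2D) goes the other way round: three bytes, but usually two columns.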

stevengj closed this as completed Jan 28, 2020

markand mentioned this issue Jan 28, 2020

add new utf8proc_codepoint_units function #172

Open
