unicode

Provides various Unicode helper functions.

Currently only supports UTF-8, but that might change in the future.

Functions

int utf8decode(const char *text, long *unicode_value)

Try to read a UTF-8 character from a C-style string.

Invalid UTF-8 sequences are reported as being zero bytes long.

Parameters:
  • text – The text string to read a Unicode character from.

  • unicode_value – If provided, the actual Unicode value will be stored here.

Returns:

The number of bytes used to represent the UTF-8 code, or zero on error.
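The contract described above (byte length on success, zero on error, optional output pointer) can be illustrated with a minimal sketch. Note that `utf8decode_sketch` is a hypothetical stand-in written for this example with the same signature as `utf8decode`; it is not the library's implementation, and the real function may treat edge cases (overlong forms, surrogates, empty strings) differently. `count_code_points` is likewise an illustrative helper, not part of the API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in with the documented signature; NOT the library code.
 * Returns the sequence length in bytes, or 0 if the sequence is invalid. */
static int utf8decode_sketch(const char *text, long *unicode_value)
{
    /* Smallest value that legitimately needs a sequence of each length. */
    static const long min_value[] = {0, 0, 0x80, 0x800, 0x10000};
    const unsigned char *s = (const unsigned char *)text;
    long value;
    int len, i;

    if (s == NULL) return 0;

    if (s[0] < 0x80)                { value = s[0];        len = 1; }
    else if ((s[0] & 0xE0) == 0xC0) { value = s[0] & 0x1F; len = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { value = s[0] & 0x0F; len = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { value = s[0] & 0x07; len = 4; }
    else return 0; /* stray continuation byte or invalid lead byte */

    for (i = 1; i < len; i++) {
        /* A NUL byte fails this check too, so we never read past the end. */
        if ((s[i] & 0xC0) != 0x80) return 0;
        value = (value << 6) | (s[i] & 0x3F);
    }

    /* Assumption: overlong forms, surrogates and values above U+10FFFF
     * are rejected, as strict UTF-8 requires. */
    if (value < min_value[len]) return 0;
    if (value >= 0xD800 && value <= 0xDFFF) return 0;
    if (value > 0x10FFFF) return 0;

    if (unicode_value != NULL) *unicode_value = value; /* optional output */
    return len;
}

/* Illustrative helper: count the code points in a string by stepping
 * through it with the returned byte length. Returns -1 on invalid input. */
static long count_code_points(const char *text)
{
    long count = 0;

    assert(text != NULL);
    while (*text != '\0') {
        int len = utf8decode_sketch(text, NULL); /* value not needed here */
        if (len == 0) return -1;                 /* zero signals an error */
        text += len;
        count++;
    }
    return count;
}
```

Because the return value doubles as the step size, a caller has to check for zero before advancing; otherwise a loop like the one above would never terminate on malformed input.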

int utf8encode(long value, char *output)

Write a Unicode character value to a string in UTF-8 format.

Values that cannot be encoded as UTF-8 are reported as being zero bytes long.

NOTE: A NUL terminator byte is automatically appended to the output.

Parameters:
  • value – The Unicode character to be encoded as a UTF-8 sequence.

  • output – If provided, the character will be written in UTF-8 format here.

Returns:

The number of bytes used to represent the UTF-8 code, or zero on error.
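The encoding side can be sketched in the same way. `utf8encode_sketch` is a hypothetical stand-in with the same signature as `utf8encode`, not the library's implementation. It assumes that the returned length excludes the terminator byte (so `output` needs room for up to 5 bytes), that passing NULL for `output` only reports the length, and that surrogates and values above U+10FFFF count as errors; the real function may differ on any of these points. `append_code_point` is an illustrative helper, not part of the API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in with the documented signature; NOT the library code.
 * Returns the encoded length in bytes (terminator excluded), or 0 on error. */
static int utf8encode_sketch(long value, char *output)
{
    unsigned char buf[4];
    int len, i;

    /* Assumption: out-of-range values and surrogates are errors. */
    if (value < 0 || value > 0x10FFFF) return 0;
    if (value >= 0xD800 && value <= 0xDFFF) return 0;

    if (value < 0x80) {
        buf[0] = (unsigned char)value;
        len = 1;
    } else if (value < 0x800) {
        buf[0] = (unsigned char)(0xC0 | (value >> 6));
        buf[1] = (unsigned char)(0x80 | (value & 0x3F));
        len = 2;
    } else if (value < 0x10000) {
        buf[0] = (unsigned char)(0xE0 | (value >> 12));
        buf[1] = (unsigned char)(0x80 | ((value >> 6) & 0x3F));
        buf[2] = (unsigned char)(0x80 | (value & 0x3F));
        len = 3;
    } else {
        buf[0] = (unsigned char)(0xF0 | (value >> 18));
        buf[1] = (unsigned char)(0x80 | ((value >> 12) & 0x3F));
        buf[2] = (unsigned char)(0x80 | ((value >> 6) & 0x3F));
        buf[3] = (unsigned char)(0x80 | (value & 0x3F));
        len = 4;
    }

    if (output != NULL) {
        for (i = 0; i < len; i++) output[i] = (char)buf[i];
        output[len] = '\0'; /* terminator is written after the sequence */
    }
    return len;
}

/* Illustrative helper: append one code point to a NUL-terminated buffer.
 * Returns 1 on success, 0 if the value is invalid or does not fit. */
static int append_code_point(char *buffer, size_t capacity, size_t *used,
                             long value)
{
    int len;

    assert(buffer != NULL && used != NULL);
    len = utf8encode_sketch(value, NULL); /* query the length only */
    if (len == 0 || *used + (size_t)len + 1 > capacity) return 0;
    utf8encode_sketch(value, buffer + *used);
    *used += (size_t)len;
    return 1;
}
```

Querying the length first with a NULL output lets the caller check the remaining capacity, including the extra terminator byte, before anything is written.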