C - Print UTF-8 string in ICU
I discovered ICU's ustdio.h and thought it would be fun to test. It didn't take long to see that something wasn't quite right.
Python 3 supports UTF-8 in string literals, so a statement like
print("90°")
is valid.
ICU (in its C API) provides u_printf() and u_printf_u(), the latter of which is designed for whatever UChar is on the system's implementation, at least UTF-16.
In an effort to test it, I tried printing out a special character, the degree symbol.
u_printf("90%c\n", 0xb0);
printed 90�, as did the following:
u_printf(u8"90%c\n", 0xb0);
u_printf("90°\n");
u_printf(u8"90°\n");
u_printf_u(u"90%c\n", 0x00b0);
However, declaring the character in a UTF-16 string literal got the desired result.
u_printf_u(u"90°\n");
$ ./a.out
90°
I could stick with this, but I want UTF-8 compliance; it seems like the superior system. Why aren't UTF-8 string literals in C11 compatible with ICU's u_printf()?
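One piece of the puzzle that can be checked without ICU: in UTF-8 the degree sign U+00B0 is not the single byte 0xb0 but the two bytes 0xC2 0xB0, so a lone 0xb0 passed through %c is not a valid UTF-8 sequence. If the default codepage is UTF-8 (an assumption about the setup here), that would plausibly explain the replacement character. The helper below, utf8_encode(), is a hypothetical name for illustration and is not part of ICU; it is only a minimal sketch of the encoding rule.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an ICU function): encode one Unicode code
   point as UTF-8. `out` must have room for 4 bytes. Returns the number
   of bytes written, or 0 if `cp` is a surrogate or out of range. */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {                        /* ASCII: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                       /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                     /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                       /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                   /* 11110xxx + 3 continuation bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Calling utf8_encode(0xB0, buf) writes the bytes 0xC2 0xB0 and returns 2, which is what a UTF-8 terminal expects to receive for the degree sign.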
I was able to navigate the issue by creating a string literal with Unicode characters in it, then passing it as a char * argument to printf().
The following code prints the line josé 90°\n 4 times.
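/* Needs <stdio.h>, <stdlib.h>, <string.h>, <unicode/uclean.h>,
   <unicode/ustdio.h> and <unicode/ustring.h>. */
char *s = u8"josé 90°";

/* 1: write the UTF-8 bytes one at a time */
for (size_t i = 0; i < strlen(s); ++i)
    putchar(s[i]);
putchar('\n');

/* 2 and 3: pass the UTF-8 string as a char * argument */
printf("%s\n", s);
u_printf("%s\n", s);

/* 4: convert to UTF-16 and print with the UChar variant */
UErrorCode error = U_ZERO_ERROR;
u_init(&error);
UChar *s16 = malloc(256 * sizeof(UChar));
u_strFromUTF8(s16, 256, NULL, s, strlen(s), &error);
u_printf_u(u"%S\n", s16);   /* %S takes a UChar * string; %s expects char * */
free(s16);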
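The byte-by-byte putchar() loop works because strlen() counts bytes rather than characters, and the terminal reassembles the multi-byte sequences itself. To make the distinction concrete, here is a small sketch that counts code points instead; utf8_code_points() is a hypothetical helper, not an ICU or libc function, and it assumes the input is well-formed UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper: count the code points in a NUL-terminated,
   well-formed UTF-8 string. Continuation bytes have the bit pattern
   10xxxxxx, so every byte that is NOT of that form starts a new
   code point. No validation is performed. */
size_t utf8_code_points(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

For the string above, and assuming the é is stored as the single precomposed code point U+00E9, strlen(s) is 10 while utf8_code_points(s) is 8, since é and ° take two bytes each.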
The buffer s16 can be used with u_strToUTF8() to go back, so it stays compatible with UTF-8 functions. The internal things in ICU appear to prefer UTF-16 (I guess it's easier to parse), so you'll need to convert to UTF-16 first, then convert back to UTF-8 before returning to the caller.
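To show roughly what the UTF-8 to UTF-16 step involves, here is a hand-rolled stand-in. It is not ICU's implementation and not a replacement for u_strFromUTF8(); utf8_to_utf16() is a hypothetical name, uint16_t stands in for UChar, and the sketch assumes the input is well-formed UTF-8 (malformed input is not detected).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the conversion step (NOT ICU code).
   Converts well-formed, NUL-terminated UTF-8 into UTF-16 code units,
   NUL-terminating the result. Returns the number of units written,
   excluding the terminator, or (size_t)-1 if `dst` is too small.
   Assumption: the input is valid UTF-8; nothing is validated. */
size_t utf8_to_utf16(const char *src, uint16_t *dst, size_t cap)
{
    const unsigned char *p = (const unsigned char *)src;
    size_t n = 0;

    if (cap == 0)
        return (size_t)-1;

    while (*p != 0) {
        uint32_t cp;

        if (*p < 0x80) {                       /* 1-byte sequence */
            cp = p[0];
            p += 1;
        } else if ((*p & 0xE0) == 0xC0) {      /* 2-byte sequence */
            cp = ((uint32_t)(p[0] & 0x1F) << 6) | (uint32_t)(p[1] & 0x3F);
            p += 2;
        } else if ((*p & 0xF0) == 0xE0) {      /* 3-byte sequence */
            cp = ((uint32_t)(p[0] & 0x0F) << 12)
               | ((uint32_t)(p[1] & 0x3F) << 6)
               | (uint32_t)(p[2] & 0x3F);
            p += 3;
        } else {                               /* assumed 4-byte sequence */
            cp = ((uint32_t)(p[0] & 0x07) << 18)
               | ((uint32_t)(p[1] & 0x3F) << 12)
               | ((uint32_t)(p[2] & 0x3F) << 6)
               | (uint32_t)(p[3] & 0x3F);
            p += 4;
        }

        if (cp >= 0x10000) {                   /* needs a surrogate pair */
            if (n + 2 >= cap)
                return (size_t)-1;
            cp -= 0x10000;
            dst[n++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {                               /* fits in one code unit */
            if (n + 1 >= cap)
                return (size_t)-1;
            dst[n++] = (uint16_t)cp;
        }
    }
    dst[n] = 0;
    return n;
}
```

In real code the ICU functions are the better choice, since they validate the input and report problems through UErrorCode; the sketch is only meant to show why the 10 UTF-8 bytes of the example string become 8 UTF-16 code units.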
c string encoding utf-8 icu