Monday 15 July 2013

encoding - Why perl cannot show all types of UTF8 characters -




I am struggling with a Perl script I made that is supposed to handle IPA characters (International Phonetic Alphabet). I worked with UTF-8 encoding for the Perl file and for STDIN/STDOUT, as follows:

    #!/usr/local/bin/perl
    use utf8;
    binmode(STDOUT, ":utf8");            # treat as if it is UTF-8
    binmode(STDIN, ":encoding(utf8)");   # actually check that it is UTF-8

However, when I run this little test:

    my %ipachar = (
        "69" => "i", "65" => "e", "25b" => "ɛ", "" => "ɛ̃", "" => "œ̃",
        "153" => "œ", "259" => "ə", "f8" => "ø", "79" => "y", "75" => "u",
        "6f" => "o", "254" => "ɔ", "" => "ɔ̃", "e3" => "ɑ̃", "251" => "ɑ",
        "61" => "a", "6a" => "j", "265" => "ɥ", "77" => "w", "6e" => "n",
        "272" => "ɲ", "14b" => "ŋ", "261" => "ɡ", "6b" => "k", "6d" => "m",
        "62" => "b", "70" => "p", "76" => "v", "66" => "f", "64" => "d",
        "74" => "t", "292" => "ʒ", "283" => "ʃ", "7a" => "z", "73" => "s",
        "281" => "ʁ", "6c" => "l", "" => "h", "294" => "ʔ", "2e" => ".",
        "280" => "ʀ", "1dd" => "ǝ", "72" => "r", "3b5" => "ε", "67" => "g",
        "25c" => "ɜ", "2d0" => "ː", "2c8" => "ˈ", "2b0" => "ʰ", "26a" => "ɪ"
    );

    # Print every key with its character, in sorted key order
    foreach my $k ( sort keys(%ipachar) ) {
        print "\n[$k] /$ipachar{$k}/";
    }

Not all characters are printed properly. This is weird, since characters such as "ä", "ø" or "ε" appear properly, but I cannot manage to get other specific characters working, e.g. "ʃ", "ɜ", ...

If anyone can help, I would appreciate it!

Thanks for reading,

Simon

Are you looking at the output of your programme on a console or in an editor?

Even if your programme is generating the right character codes for the symbols you want, you have to be using a font that supports those symbols to display the text; otherwise the display won't make sense.
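One way to separate a font problem from an encoding problem is to print the code points and the UTF-8 bytes of each character instead of relying on how the glyph is drawn. The following is a minimal sketch, not part of the original script; the helper name `describe` is made up for this example, and it assumes the source file is saved as UTF-8.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                  # string literals in this file are UTF-8
use Encode qw(encode);

binmode(STDOUT, ":encoding(UTF-8)");

# Hypothetical helper: list the code points of a string and the
# bytes it becomes once encoded as UTF-8.
sub describe {
    my ($str) = @_;
    my $points = join " ", map { sprintf("U+%04X", ord($_)) } split(//, $str);
    my $bytes  = join " ", map { sprintf("%02x", ord($_)) }
                 split(//, encode("UTF-8", $str));
    return "$points -> $bytes";
}

foreach my $char ("ø", "ʃ", "ɜ") {
    print "$char: ", describe($char), "\n";
}
```

If the code points and bytes come out as expected (for instance U+0283 and `ca 83` for "ʃ") but the glyph still looks wrong, the font or the terminal is the more likely culprit.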

It can be useful to open a text file using a browser, as web browsers have to accommodate pretty much every official encoding, and should be able to render the contents of the file correctly.
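To try that, the output can be written to a file through an explicit UTF-8 layer and the file then opened in a browser. This is only a sketch under assumptions: the helper `write_utf8_file` and the file name `ipa_test.txt` are invented for illustration, and the browser may still need to be told that the file is UTF-8.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                  # string literals in this file are UTF-8

# Hypothetical helper: write the given lines to $path as UTF-8 text.
sub write_utf8_file {
    my ($path, @lines) = @_;
    open(my $fh, ">:encoding(UTF-8)", $path)
        or die "Cannot open $path: $!";
    print {$fh} "$_\n" for @lines;
    close($fh) or die "Cannot close $path: $!";
    return;
}

# ipa_test.txt is a placeholder file name chosen for this example.
write_utf8_file("ipa_test.txt", "[283] /ʃ/", "[25c] /ɜ/");
```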

A quick search found this list of fonts that support IPA symbols. If you use one of them you should be able to see your output properly.

I highly recommend GNU Unifont, which has the best coverage of the Unicode character set of any font I know. It's a sans-serif font.

Update

It worries me that your definition of the %ipachar hash has multiple keys set to the null or empty string "". It is a valid hash key, but the nature of hashes means you can have only one element with that key. Officially, the value of the hash element $ipachar{''} is undefined in this situation. In practice it is set to the last value in the list that has the same key, in this case $ipachar{''} = 'h'.
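The effect of the repeated empty key can be checked with a cut-down version of the hash. The sketch below uses only four of the pairs from the question, and the variable names `@pairs`, `$pair_count` and `$key_count` are chosen for this example; it shows the in-practice behaviour described above rather than anything guaranteed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                  # string literals in this file are UTF-8

binmode(STDOUT, ":encoding(UTF-8)");

# Three of these four pairs share the empty-string key,
# as several pairs do in the question's %ipachar.
my @pairs   = ( "" => "ɛ̃", "283" => "ʃ", "" => "ɔ̃", "" => "h" );
my %ipachar = @pairs;

my $pair_count = @pairs / 2;              # pairs written in the list
my $key_count  = scalar(keys %ipachar);   # elements actually stored

print "pairs in list: $pair_count, keys in hash: $key_count\n";
print "value stored under the empty key: /$ipachar{''}/\n";
```

Comparing the number of pairs with the number of keys is a quick way to spot entries that were silently overwritten; giving each character its own distinct key avoids the problem.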

perl encoding utf-8 ipa
