You're viewing a comment by Peteris Krumins and its responses.

September 29, 2009, 10:42

Roman, I haven't figured that out yet. I looked at this:

>>> import sys
>>> sys.getdefaultencoding()
'ascii'

Seems like the default encoding is 'ascii', but when I do:

>>> u = u'\u5554'
>>> print u
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u5554' in position 0: ordinal not in range(256)

The error says Python tried to encode the string as latin-1, not ascii, and hit a character that latin-1 cannot represent.

But this works:

>>> print u.encode('utf-8')
啔
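
To make the difference concrete, here is a minimal sketch that checks whether a character fits into a given encoding without printing anything. The helper name can_encode is made up for illustration; it is not part of Python.

```python
# -*- coding: utf-8 -*-
# Hypothetical helper (not a standard library function): reports whether
# `text` can be represented in `encoding`.
def can_encode(text, encoding):
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

ch = u'\u5554'
print(can_encode(ch, 'ascii'))    # False: outside the 0..127 range
print(can_encode(ch, 'latin-1'))  # False: outside the 0..255 range
print(can_encode(ch, 'utf-8'))    # True: UTF-8 covers every code point
```

So the explicit u.encode('utf-8') works because I pick an encoding that can hold the character, instead of letting print pick one for me.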

Update:

>>> import locale
>>> locale.getpreferredencoding()
'ISO-8859-1'

ISO-8859-1 is Latin-1, so that is where the 'latin-1' in the error message comes from.

Another update:

$ echo $LANG
en_US
$ LANG=en_US.UTF-8
$ python
>>> import locale
>>> locale.getpreferredencoding()
'UTF-8'
>>> u = u'\u5555'
>>> print u
啕

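That explains it: when printing to a terminal, print encodes unicode strings with the encoding taken from the locale (LANG), not with sys.getdefaultencoding().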
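
For anyone debugging the same thing, a small sketch that shows the three encodings side by side. The function name encoding_report is made up for illustration, and the values it prints depend entirely on your environment, so I am not listing expected output.

```python
import locale
import sys

# Hypothetical helper: collects the encodings that are easy to confuse.
def encoding_report():
    return {
        # Used for implicit str <-> unicode conversions.
        'default': sys.getdefaultencoding(),
        # Derived from the locale settings such as LANG.
        'preferred': locale.getpreferredencoding(),
        # What print uses for unicode; may be None when output is piped.
        'stdout': getattr(sys.stdout, 'encoding', None),
    }

for name, value in sorted(encoding_report().items()):
    print('%s: %s' % (name, value))
```

If 'stdout' cannot represent the characters you want to print, either fix LANG as above or encode explicitly before printing.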
