Discussion:
Chinese language support of Python?
Martin v. Loewis
2002-07-07 08:08:17 UTC
Post by Boudewijn Rempt
I don't think that's going to work (caveat: I use PyQt which has different
conventions). If you absolutely want to have Chinese characters in your
source files*, you can do something like the following**:
root.title(unicode('???', 'utf-8'))

The problem is that this won't work in IDLE.

Post by Boudewijn Rempt
* Actually I still think it would be great to be able to have sourcefiles
in utf-8, not limited to unicode strings.

This can only happen after PEP 263 is adopted; otherwise, it will be
difficult to find out which bytes denote letters. Even then, it will
be difficult to find out when two identifiers are equal - __dict__
dictionaries would need to allow Unicode strings as keys.
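
To illustrate why deciding whether two identifiers are equal is not trivial once non-ASCII letters are allowed, here is a small sketch in modern Python 3, not the Python 2.2 discussed in this thread. It assumes Unicode normalization (NFKC) is the rule used to compare names, which is what later Python 3 releases apply to identifiers; the helper name same_identifier is made up for this example.

```python
import unicodedata

def same_identifier(a, b):
    """Compare two names after NFKC normalization (hypothetical helper)."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)

composed = "caf\u00e9"      # 'e with acute' stored as a single code point
decomposed = "cafe\u0301"   # plain 'e' followed by a combining acute accent

# The two spellings look identical on screen but are different strings.
print(composed == decomposed)
# After normalization they compare equal.
print(same_identifier(composed, decomposed))
```

Without such a rule, a dictionary keyed by raw strings would treat the two spellings as different names.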

Notice that this is only a step towards what ChinesePython is doing,

http://www.python.org/doc/NonEnglish.html#chinese

which changes all the keywords to allow you to type Chinese-based
keywords instead of the traditional English-based keywords.

Post by Boudewijn Rempt
That this would make my source code unreadable for a lot of other people, tant
pis, I still would like the power. Just as I want the power to do a quick
sys.setAppDefaultEncoding('utf-8') to make sure this application sees all
its strings as encoded in utf-8.

It could not guarantee this. If you read a byte string from some
external source, it might well not be UTF-8, and Python would have no way to
find out.
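
A minimal sketch of that point, written for modern Python 3 rather than the Python 2.2 of this thread: a byte string carries no record of its encoding, so all a program can do is try a decoder and see whether it fails. The helper try_decode is a name invented for this illustration.

```python
def try_decode(raw, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None

# Bytes as an external GB2312 source would deliver them.
gb_bytes = "\u4e2d\u6587".encode("gb2312")

as_utf8 = try_decode(gb_bytes, "utf-8")      # fails: not valid UTF-8
as_gb = try_decode(gb_bytes, "gb2312")       # recovers the original text
as_latin1 = try_decode(gb_bytes, "latin-1")  # "succeeds", but yields the wrong text

print(as_utf8 is None, as_gb == "\u4e2d\u6587", as_latin1 == as_gb)
```

The latin-1 case is the troublesome one: decoding raises no error, so nothing signals that the guess was wrong.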

Regards,
Martin
David LeBlanc
2002-07-07 06:49:28 UTC
This may not be of much help, but Tk, the library behind Tkinter, is quite
popular for displaying Asian languages and there was a substantial body of
work done. Some changes were made to standard Tk to accommodate Asian
language display. You might find it necessary to build a custom version of
Tk for Tkinter to link with and possibly also modify some of the Tkinter
wrapper. You should be able to get more information about Asian language
support in Tk by asking on the comp.lang.tcl newsgroup.

You can also find some good information on Tk at http://www.tcl.tk/.
You can also find a pointer to a "traditional Chinese" page from
http://tcl.sourceforge.net/faqs/tcl/

Then, of course, there's Chinese Python at
http://chinesepython.cosoft.org.cn/cgi-bin/cgb/home.html (in Chinese; it also
has a SourceForge page at http://sourceforge.net/projects/chinesepython/ in
English).

There are other links about Tkinter and Chinese at Yahoo - I just entered
tkinter and chinese (without the word "and").

Note: while Tcl is a nice language, I do not suggest you abandon Python for
Tcl, especially if more than basic MS-Windows support is important to you. I
personally think Python might have a speed edge too.

David LeBlanc
Seattle, WA USA
Boudewijn Rempt
2002-07-07 06:22:10 UTC
Post by Leon Wang
Hi, I got the Chinese displayed correctly in the window title without
changing the default encoding in site.py by:
root.title(u'\u4e2d\u6587')
But I still cannot put Chinese directly as a string in the source. I cannot
live with so many \u... escapes for a whole Chinese sentence/paragraph; it's
impossible to read and edit them :(
However, I can print a Chinese string (a normal string, without the u prefix
and \u codes) in the console with the command-line python.exe. How can I let
Tkinter accept that?

I don't think that's going to work (caveat: I use PyQt which has different
conventions). If you absolutely want to have Chinese characters in your
source files*, you can do something like the following**:

root.title(unicode('???', 'utf-8'))

Note that you _will_ have to construct a unicode object, not an ordinary
string, if you want the system to understand what you mean: ordinary strings
are just containers for bytes, one character per byte.

You can find out which encodings are available by inspecting the
python/lib/encodings directory (or, python\lib\encodings): you can use
any encoding instead of the 'utf-8'. Of course, the string must then
be in the right encoding, too.
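
As a rough sketch of both points in modern Python 3 (where `raw.decode(enc)` plays the role of `unicode(raw, enc)` from this post): the codec modules can be listed programmatically instead of browsing the directory, and a byte string only becomes text once it is decoded with the encoding it was actually written in. Using pkgutil to list the modules is my choice for the example, not something the post prescribes.

```python
import encodings
import pkgutil

# Names of the codec modules shipped in the standard library's encodings package.
available = sorted(m.name for m in pkgutil.iter_modules(encodings.__path__))
print(len(available), "codec modules, for example:", available[:5])

raw = b"\xe4\xb8\xad\xe6\x96\x87"   # UTF-8 bytes for U+4E2D U+6587
text = raw.decode("utf-8")          # Python 3 spelling of unicode(raw, 'utf-8')

# The byte string has six elements, the decoded text has two characters.
print(len(raw), len(text))
```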

There are some errors in my handling of this topic in my book, but it might
still be useful to you:

http://www.opendocspublishing.com/pyqt/index.lxp?lxpwrap=c2029%2ehtm

errata:

http://www.valdyas.org/python/book.html

The paper version has nice pictures that are quite useful in this chapter.

* Actually I still think it would be great to be able to have sourcefiles
in utf-8, not limited to unicode strings. I want to type:

def ??():
    pass

That this would make my source code unreadable for a lot of other people, tant
pis, I still would like the power. Just as I want the power to do a quick
sys.setAppDefaultEncoding('utf-8') to make sure this application sees all
its strings as encoded in utf-8.

** Note that this posting is encoded in utf-8. If you see gibberish instead
of a friendly greeting, then either the message is mangled, or your
newsreader can't handle the encoding, or you don't have the fonts to show
Chinese.
--
Boudewijn Rempt | http://www.valdyas.org
Leon Wang
2002-07-07 12:26:08 UTC
I installed ChineseCodecs1.2.0 into lib/encodings; it converts GB2312
(simplified Chinese) to Unicode, and I can use this:

root.title(unicode('中文',"eucgb2312_cn"))

Great! I can put whole raw Chinese strings in my source now!
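
For readers on a current Python, the same idea can be sketched as below. This is an assumption-laden illustration, not the ChineseCodecs setup described here: it uses the standard library's "gb2312" codec in place of ChineseCodecs' "eucgb2312_cn", and a stand-in FakeWindow class (invented for this example) instead of a real Tk root so that it runs without a display.

```python
def set_title(window, raw, encoding="gb2312"):
    """Decode raw bytes and pass the resulting text to window.title()."""
    window.title(raw.decode(encoding))

class FakeWindow:
    """Stand-in for a Tk root; it only records the title it is given."""
    def __init__(self):
        self.shown = None

    def title(self, text):
        self.shown = text

window = FakeWindow()
set_title(window, b"\xd6\xd0\xce\xc4")   # GB2312 bytes for the two-character title
print(window.shown == "\u4e2d\u6587")
```

With a real Tkinter root, the same set_title call would hand Tk a decoded text string rather than raw bytes.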
Before that, I also tried ChinesePython, a Chinese translation of
Python 2.1, which even allows this:

root.title('中文') # put Chinese directly in a normal string! The best!!

Unfortunately, it translates all prompts and error messages into BIG5
(Traditional Chinese), which I cannot view in my GB Windows
environment, and no GB version is available yet. I had to uninstall
it and adopt the first solution.

A bigger pity: the "Python GUI" utility, IDLE, cannot handle Chinese
strings in a source file (bit 7 seems to be stripped), neither in Python 2.2.1
from python.org nor in the ChinesePython version above. If I open my source
with IDLE and save it back, all Chinese strings are changed, which
means I cannot use it even for editing. How, then, can I debug the script
in a GUI?

Thanks!
Leon Wang
Leon Wang
2002-07-06 14:19:45 UTC
How can I enable Chinese language support in Python? In IDLE, I cannot even
save the source file if it contains any characters above ASCII code 128. I
want to set the window title in Chinese, but bit 7 is masked by
Tkinter:

from Tkinter import *
from Tkconstants import *

def filenew():
    print 'filenew'

def fileopen():
    print 'fileopen'

def fileexit():
    print 'fileexit'

def helpabout():
    print 'helpabout'

root = Tk()
menu = Menu(root)
root.config(menu=menu)
root.title('中文') # this is Chinese

filemenu = Menu(menu)
menu.add_cascade(label="File", menu=filemenu)
filemenu.add_command(label="New", command=filenew)
filemenu.add_command(label="Open...", command=fileopen)
filemenu.add_separator()
filemenu.add_command(label="Exit", command=fileexit)

helpmenu = Menu(menu)
menu.add_cascade(label="Help", menu=helpmenu)
helpmenu.add_command(label="About...", command=helpabout)

#frame=Frame(root)
#frame.master.title('ETUS')
#frame.pack()
mainloop()
Leon Wang
2002-07-07 22:25:53 UTC
I found the best option: PythonWin, in the win32 extension module,
which includes a source editor and debugger!
To summarize the Python installation for Chinese:
1) Python package from python.org
2) Win32all module from
http://starship.python.net/crew/mhammond/win32/
3) ChineseCodecs module from
ftp://freebsd.sinica.edu.tw/pub/ycheng/python/ChineseCodecs1.2.0.tar.gz

I think these are the best solution so far. Use this to display
root.title(unicode('中文',"eucgb2312_cn"))
root.title('中文')
Thanks for all of your help!!
Leon Wang
Martin v. Loewis
2002-07-06 20:52:26 UTC
root.title('中文') # this is Chinese
[...]
If you use _real_ unicode -- for instance
root.title(u'\u028A\u0288') # no chinese, because of lacking fonts, but IPA
then everything works fine -- at least, with my window manager, on my OS.
It's more likely that the OP meant

root.title(u'\u4e2d\u6587')

Regards,
Martin
Martin v. Loewis
2002-07-07 08:01:05 UTC
Post by Leon Wang
But I still cannot put Chinese directly as a string in the source. I cannot
live with so many \u... escapes for a whole Chinese sentence/paragraph; it's
impossible to read and edit them :(

This is a known problem, and it will be addressed with PEP 263
(http://www.python.org/peps/pep-0263.html).

Meanwhile, you have the following options:

- Don't use IDLE to edit Python source code (but, say, notepad), and
only put Chinese text into string literals.
- Set the default encoding in site.py to the encoding you want to use.
- Apply patch
http://sourceforge.net/tracker/index.php?func=detail&aid=508973&group_id=9579&atid=309579

which allows you to declare the source encoding for IDLE.

In any case, you cannot use Chinese in Unicode literals. Instead,
you should always use

unicode("chinese string", "chinese encoding")

For portability, and if your editors support it, I recommend using
UTF-8 as the "chinese encoding".
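
For readers on a Python where PEP 263 has been adopted, the following sketch shows the effect of a source encoding declaration. It is written for modern Python 3 and is only an illustration: the module is held in memory as bytes rather than saved to a file, and the names source and namespace are chosen for this example.

```python
# A tiny module as it would be stored on disk: GB2312 bytes with a
# PEP 263 coding declaration on the first line.
source = (
    b"# -*- coding: gb2312 -*-\n"
    + "title = '\u4e2d\u6587'\n".encode("gb2312")
)

namespace = {}
# compile() reads the declaration and decodes the string literal accordingly.
exec(compile(source, "<gb2312-demo>", "exec"), namespace)

print(namespace["title"] == "\u4e2d\u6587")
```

With the declaration in place the literal can be typed directly in the editor's encoding, which is the readability the original poster was asking for.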

Regards,
Martin
Leon Wang
2002-07-07 05:54:30 UTC
Hi, I got the Chinese displayed correctly in the window title without
changing the default encoding in site.py by:

root.title(u'\u4e2d\u6587')

But I still cannot put Chinese directly as a string in the source. I cannot
live with so many \u... escapes for a whole Chinese sentence/paragraph; it's
impossible to read and edit them :(
However, I can print a Chinese string (a normal string, without the u prefix
and \u codes) in the console with the command-line python.exe. How can I let
Tkinter accept that?
Wenshan Du
2002-07-08 04:10:11 UTC
hi,
I think this problem is simple.
Try MBCSP 1.0
http://www.dohao.org/python/mbcsp
or visit http://www.dohao.org/python
glace
2002-07-08 09:08:34 UTC
There will not be a GB version of ChinesePython. GB support is
invoked by the '-g' flag when you start the chinesepython
interpreter.

{'-b':"for BIG5(default)", '-g':"for GB"}.

For source files stored with different encodings, "#--GBK--" and
"#--BIG5--" magic comment strings should be placed in the sources to
indicate the respective encodings used.

Try again ^_^
glace
Boudewijn Rempt
2002-07-06 19:25:28 UTC
Post by Leon Wang
How can I enable Chinese language support in Python? In IDLE, I cannot even
save the source file if it contains any characters above ASCII code 128. I
want to set the window title in Chinese, but bit 7 is masked by
Tkinter:
root.title('中文') # this is Chinese

Well, this isn't Chinese -- at least not when it arrived at my
server, but probably not even when it originated with you, because
I see your headers advertise ISO-8859-1 as the encoding. It's a plain
string that contains an assortment of ampersands, hash marks, semicolons
and numbers in an order that hasn't much meaning.

If you use _real_ unicode -- for instance
root.title(u'\u028A\u0288') # no chinese, because of lacking fonts, but IPA
then everything works fine -- at least, with my window manager, on my OS.
--
Boudewijn Rempt | http://www.valdyas.org