Hi Robin and all
w2k+sp4, Py2.5.2, wxPy2.8.8.0.pre20080608
I think there are a few bugs in STC, in the UTF8 and Raw text methods.
In a subclass of the STC (wx.stc.StyledTextCtrl):
self.ClearAll()
self.AddText('éléphant')
r = self.GetText()
print type(r), len(r), repr(r)
r = self.GetTextUTF8()
print type(r), len(r), repr(r)
r = self.GetTextRaw()
print type(r), len(r), r, repr(r)
prints:
<type 'unicode'> 8 u'\xe9l\xe9phant'
<type 'str'> 9 '\xc3\xa9l\xc3\xa9phan'
<type 'str'> 9 éléphan '\xc3\xa9l\xc3\xa9phan'
A UTF-8 encoded 'é' takes 2 bytes, so the len() of the <str> results should be 10, not 9.
It seems that it is the last byte, not the last character, that is missing.
self.ClearAll()
self.AddText('élé')
r = self.GetText()
print type(r), len(r), repr(r)
r = self.GetTextUTF8()
print type(r), len(r), repr(r)
r = self.GetTextRaw()
print type(r), len(r), r, repr(r)
prints:
<type 'unicode'> 3 u'\xe9l\xe9'
<type 'str'> 4 '\xc3\xa9l\xc3'
<type 'str'> 4 élà '\xc3\xa9l\xc3'
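Both truncated results match what you get by encoding the text to UTF-8 and dropping exactly one trailing byte. Here is a minimal cross-check in plain Python, without wx. The variable names and the `[:-1]` slice are only my way of simulating the symptom; I am not claiming this is what STC does internally.

```python
# -*- coding: utf-8 -*-
# Plain-Python cross-check of the byte counts; no wx involved.
# The [:-1] slice only simulates the observed symptom (one byte missing).

for text in (u'\xe9l\xe9phant', u'\xe9l\xe9'):   # u'éléphant', u'élé'
    encoded = text.encode('utf-8')     # each 'é' becomes the 2 bytes C3 A9
    truncated = encoded[:-1]           # drop exactly one trailing byte
    print(len(text))                   # 8, then 3 characters
    print(len(encoded))                # 10, then 5 bytes expected
    print(len(truncated))              # 9, then 4 bytes, as reported above

# In the second case the cut falls inside the final 'é', leaving a lone
# lead byte C3, so the result is no longer valid UTF-8.
try:
    u'\xe9l\xe9'.encode('utf-8')[:-1].decode('utf-8')
    still_valid = True
except UnicodeDecodeError:
    still_valid = False
print(still_valid)                     # False
```

So in the 'élé' case the returned str cannot even be decoded back to unicode.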
The failing methods are GetText***() and GetCurLine***(). The other
UTF8 and Raw text methods work fine, e.g.:
r = self.GetSelectedText()
print type(r), len(r), repr(r)
r = self.GetSelectedTextUTF8()
print type(r), len(r), repr(r)
r = self.GetSelectedTextRaw()
print type(r), len(r), r, repr(r)
prints:
<type 'unicode'> 8 u'\xe9l\xe9phant'
<type 'str'> 10 '\xc3\xa9l\xc3\xa9phant'
<type 'str'> 10 éléphant '\xc3\xa9l\xc3\xa9phant'
I did not test the previous releases.
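Until this is fixed, a possible workaround is to encode the result of GetText() yourself, since that method returned the complete unicode text in the tests above. A minimal sketch: `get_text_utf8` and `FakeSTC` are hypothetical names of mine (the fake class only stands in for a real STC so the snippet runs without wx), and it assumes a unicode build where GetText() returns unicode.

```python
# -*- coding: utf-8 -*-
def get_text_utf8(stc):
    """Return the full control text as UTF-8 bytes (hypothetical helper).

    Relies only on GetText(), which returned the complete unicode
    string in the tests above.
    """
    return stc.GetText().encode('utf-8')


class FakeSTC(object):
    """Stand-in for a real STC, only so this runs without wx."""
    def __init__(self, text):
        self._text = text

    def GetText(self):
        return self._text


r = get_text_utf8(FakeSTC(u'\xe9l\xe9phant'))
print(len(r))    # 10
```

With a real control you would pass the STC instance itself instead of FakeSTC.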
Jean-Michel Fauth, Switzerland