All Irssi bugs should go here. Please select the proper version when reporting a bug.
Please see the Irssi core http://www.irssi.org/ChangeLog for recent updates.
FS#545 - Latin1-characters in channel name break with recode
Attached to Project:
Irssi core bugs
Opened by Petteri Aimonen (jpa) - Saturday, 24 November 2007, 19:39 GMT+2
Last edited by Emanuele Giaquinta (ayin) - Tuesday, 04 November 2008, 19:48 GMT+2
Details
To reproduce, type /eval /join #test\366, which is a way to join channel '#testö' encoded with ISO-8859-1. The channel appears as two windows. One window shows received messages, but you can only send messages in the other window.
The bug only occurs when recode is enabled.
Closed by Emanuele Giaquinta (ayin)
Tuesday, 04 November 2008, 19:48 GMT+2
Reason for closing: Fixed
Additional comments about closing: fixed in r4867
ayin: tried, with svn checked out this morning. I see no change in behaviour whatsoever, channels still break as described above.
Another thing - with iso8859-15 it doesn't work. Haven't checked the source, but this sounds like some clumsy workaround. There are other charsets than just iso8859-1 and utf-8.
The code currently performs a conversion, for both incoming/outgoing messages, on the full message.
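To illustrate what converting the full message means, here is a minimal Python sketch. It is not Irssi's actual code; `recode_line` and the sample message are made up for this example. The point is that the channel name and the message text go through the same conversion.

```python
def recode_line(line: bytes, from_charset: str, to_charset: str) -> bytes:
    """Convert an entire IRC line, channel name and text alike (hypothetical helper)."""
    text = line.decode(from_charset, errors="replace")
    return text.encode(to_charset, errors="replace")


# "PRIVMSG #testö :päivää" as a UTF-8 terminal would produce it.
outgoing = "PRIVMSG #test\u00f6 :p\u00e4iv\u00e4\u00e4".encode("utf-8")

# Outgoing direction: terminal charset -> server charset.
on_wire = recode_line(outgoing, "utf-8", "iso-8859-1")

# Incoming direction: server charset -> terminal charset.
back = recode_line(on_wire, "iso-8859-1", "utf-8")

print(on_wire)
print(back == outgoing)
```

Because the name is converted together with the text, a channel name entered as raw bytes in another encoding no longer matches after the conversion.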
I'll try to reproduce the problems.
It would be nice, though, to give a warning or to have a workaround for users accustomed to using /eval join (and who have added channels to the config file with /eval channel add), instead of just breaking.
client 1 (term_charset ISO-8859-15)
/join #test\366
client 2 (term_charset UTF-8)
/recode add foo ISO-8859-15
/join #test\303\266
It works fine, and creates no additional windows.
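The two escaped names above are the same channel name in two encodings. A short Python check, using the same octal escapes (the variable names are only for illustration):

```python
# Channel name as typed by each client, written with the same octal escapes.
latin9_name = b"#test\366"      # client 1: one byte, 0xF6
utf8_name = b"#test\303\266"    # client 2: two bytes, 0xC3 0xB6

# Each byte string decodes to the same text in its own charset.
name_1 = latin9_name.decode("iso-8859-15")
name_2 = utf8_name.decode("utf-8")
print(name_1 == name_2)

# Recoding client 2's name from UTF-8 to ISO-8859-15 yields client 1's bytes,
# so both clients refer to the same channel on the server.
recoded = name_2.encode("iso-8859-15")
print(recoded == latin9_name)
```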
I wonder, though, how it should work if I have both UTF-8 and Latin-1 named channels on the same server. Not as important for me, but during the transitional period to UTF-8 I think this is going to be quite common.
- associating an encoding to the channel
- converting the line selectively so that the channel name is converted using the associated encoding and the rest of the line using a server/channel preference for text encoding.
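The selective conversion in the two points above could look roughly like the following Python sketch. It is only an illustration under simplifying assumptions: it handles a bare `PRIVMSG <target> :<text>` line without a prefix, and `decode_privmsg` is a hypothetical name, not an Irssi function.

```python
def decode_privmsg(line: bytes, name_charset: str, text_charset: str):
    """Decode the target with one charset and the message text with another."""
    command, target, text = line.split(b" ", 2)
    if command != b"PRIVMSG" or not text.startswith(b":"):
        raise ValueError("not a simple PRIVMSG line")
    return target.decode(name_charset), text[1:].decode(text_charset)


# Channel name in ISO-8859-1, message text in UTF-8, within one line.
line = (
    b"PRIVMSG "
    + "#test\u00f6".encode("iso-8859-1")
    + b" :"
    + "p\u00e4iv\u00e4\u00e4".encode("utf-8")
)

channel, message = decode_privmsg(line, "iso-8859-1", "utf-8")
print(channel, message)
```

Splitting on the raw bytes before decoding is what makes the two encodings separable, and it is also the extra per-message work mentioned below.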
Other than the work involved, this adds quite a bit of overhead to message processing and I'm not sure it is worth it. All the IRC clients that I know support only a server encoding preference with which the whole line is converted.
Note that you can also specify a recode preference for a channel, but it does not play well if the channel includes 8-bit bytes because the name would be stored in your encoding. An alternative could be to allow /recode to accept octal escapes so you could do /recode add #test\366 ISO-8859-15. This of course would mean that the whole line is converted with the channel name encoding, but I think the approximation 'channel name encoding == text encoding in the channel' should work in most cases.
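A rough Python sketch of the escape idea, to make it concrete. Everything here is hypothetical: `unescape`, `recode_table` and `charset_for` are invented names, and the accepted escape syntax (`\NNN` octal, `\xHH` hex) is an assumption rather than anything /recode supports today.

```python
import re

ESCAPE = re.compile(rb"\\x([0-9A-Fa-f]{2})|\\([0-3][0-7]{2})")


def unescape(arg: str) -> bytes:
    """Turn \\xHH and \\NNN escapes in an ASCII argument into raw bytes."""
    def repl(match):
        if match.group(1) is not None:
            return bytes([int(match.group(1), 16)])
        return bytes([int(match.group(2), 8)])
    return ESCAPE.sub(repl, arg.encode("ascii"))


# Recode preferences keyed by the raw channel-name bytes seen on the wire.
recode_table = {unescape(r"#test\366"): "iso-8859-15"}


def charset_for(target: bytes, default: str = "utf-8") -> str:
    """Pick the charset used to convert the whole line for this target."""
    return recode_table.get(target, default)


print(charset_for(b"#test\xf6"))
print(charset_for(b"#other"))
```

Keying the table by bytes avoids storing the name in the terminal encoding, which is the problem described above.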
What do you think?
Currently there is also trouble when recode settings are changed on the fly. Would it be possible to store the actual byte-string used for channel name when transmitted to server somewhere along with the displayed channel name, and use it for all server communications (and compare to it when receiving messages)? This way, recoding would be done only on join, and it might be possible to also have something like /join -charset UTF-8 #testö or /join UTF-8/#testö.
If all channel name recoding is done in one place, a prefix like UTF-8/ would allow using it for any command (/channel add, /part, etc.). The displayed name could have the associated prefix if it was given on /join, thus solving the problem of having both UTF-8/#testö and ISO8859-15/#testö open at the same time.
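As a sketch of this proposal, the following Python keeps the exact bytes sent to the server next to the displayed name and parses a charset prefix in one place. The `Channel` type, `parse_channel` and `find_channel` are hypothetical names for illustration; the prefix syntax is the one suggested above, and only names starting with `#` are handled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Channel:
    display: str  # name shown to the user, prefix included if one was given
    wire: bytes   # exact bytes used for all server communication


def parse_channel(arg: str, default_charset: str) -> Channel:
    """Parse '#name' or 'CHARSET/#name'; recoding happens only here."""
    charset, sep, name = arg.partition("/")
    if sep and name.startswith("#"):
        return Channel(display=arg, wire=name.encode(charset))
    return Channel(display=arg, wire=arg.encode(default_charset))


def find_channel(channels: List[Channel], target: bytes) -> Optional[Channel]:
    """Match an incoming target against the stored bytes, with no recoding."""
    for channel in channels:
        if channel.wire == target:
            return channel
    return None


joined = [
    parse_channel("UTF-8/#test\u00f6", "utf-8"),
    parse_channel("ISO8859-15/#test\u00f6", "utf-8"),
]
print(find_channel(joined, b"#test\xf6").display)
print(find_channel(joined, b"#test\xc3\xb6").display)
```

Since incoming targets are compared against the stored bytes, changing recode settings on the fly would not affect which window a message lands in.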