author Giuseppe Bilotta <giuseppe.bilotta@gmail.com> 2007-02-20 23:02:35 +0000
committer Giuseppe Bilotta <giuseppe.bilotta@gmail.com> 2007-02-20 23:02:35 +0000
commit 397b61df257f72a8ce90792985f76497ba735da4
tree 7b8321eab08498376d537178ebe7ed57dfc23713 /ChangeLog
parent 1572836f8c2888742b4f65da7dc6f66735f94bc1
Use ASCII KCODE to prevent problems such as missing characters or matching failures when clients send messages in something other than UTF-8
Diffstat (limited to 'ChangeLog')
-rw-r--r-- ChangeLog 10
1 files changed, 10 insertions, 0 deletions
diff --git a/ChangeLog b/ChangeLog
index 358aab5f..403e8c41 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -6,6 +6,16 @@
<yaohan.chen@gmail.com>. People take turns to continue a chain of
words by saying words that begin with the final letter(s) of the
previous word.
+ * IRC messages are not UTF-8: Most of the string processing across
+ rbot is done against IRC messages, which do not have a well-defined
+ encoding. Although many clients are now using UTF-8, there is no
+ guarantee that an arbitrary string received from IRC will be UTF-8
+ encoded. We have to force ASCII (byte-wise, charset-agnostic) matching
+ because otherwise some strings cause problems. For example, the byte
+ sequence "\340\350\354\362\371" (the vowels aeiou, each with a grave
+ accent) causes the string to be considered only up to the "\354"
+ (i with grave accent), so either the rest of the message is ignored
+ or the matching fails.
2007-02-18 Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
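
The encoding problem described in the entry above can be illustrated with a small sketch. It is written in Python purely for illustration and is not rbot code (rbot is Ruby and uses KCODE). The helper name is_valid_utf8 and the sample pattern are hypothetical. The sketch assumes the quoted bytes are ISO-8859-1 (Latin-1) text, which matches the entry's description of grave-accented vowels, and shows that the sequence is not valid UTF-8 while a byte-wise match still sees every byte.

```python
import re

# The byte sequence quoted in the ChangeLog entry (octal escapes).
# Assumption: it is Latin-1 text, one byte per accented vowel.
raw = b"\340\350\354\362\371"


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A byte-wise (charset-agnostic) pattern: it inspects raw bytes and makes
# no assumption about the message encoding. Here it looks for the last
# two bytes of the sequence at the end of the message.
tail = re.compile(rb"\xf2\xf9$")

print(is_valid_utf8(raw))            # the bytes are not well-formed UTF-8
print(tail.search(raw) is not None)  # byte-wise matching reaches the end
print(raw.decode("latin-1"))         # the same bytes read as Latin-1
```

Matching on bytes sidesteps the question of which encoding a client used, which is the same trade-off the entry describes: no character-level awareness, but no truncated or failed matches on non-UTF-8 input.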