juniversalchardet is a Java port of universalchardet, the encoding detector library originally developed at Netscape and currently used by Mozilla.
Usage:
1. Construct an instance of org.mozilla.universalchardet.UniversalDetector.
2. Feed some data (typically several thousand bytes) to the detector by calling UniversalDetector.handleData().
3. Notify the detector of the end of data by calling UniversalDetector.dataEnd().
4. Get the detected encoding name by calling UniversalDetector.getDetectedCharset().
5. Don't forget to call UniversalDetector.reset() before you reuse the detector instance.
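The five steps above can be sketched in Java as follows. This is a minimal illustration, not the library's reference usage: it assumes the juniversalchardet jar is on the classpath and that the constructor accepts a null listener. The class name DetectEncodingSketch, the detect() helper, and the 4096-byte chunk size are hypothetical choices made for this example.

```java
import org.mozilla.universalchardet.UniversalDetector;

// Hypothetical helper class; only UniversalDetector comes from the library.
final class DetectEncodingSketch {

    // Runs steps 2 to 5 on an existing detector and returns the detected
    // encoding name, or null if the detector could not decide.
    static String detect(UniversalDetector detector, byte[] data) {
        final int chunk = 4096; // assumed chunk size, not mandated by the library

        // Step 2: feed the data in chunks until the detector is confident.
        for (int off = 0; off < data.length && !detector.isDone(); off += chunk) {
            detector.handleData(data, off, Math.min(chunk, data.length - off));
        }

        // Step 3: signal that no more data will follow.
        detector.dataEnd();

        // Step 4: read the result before resetting, since reset() clears it.
        String name = detector.getDetectedCharset();

        // Step 5: make the instance reusable for the next input.
        detector.reset();
        return name;
    }

    public static void main(String[] args) {
        // Step 1: construct the detector (null means no listener callback).
        UniversalDetector detector = new UniversalDetector(null);

        // Sample input: a UTF-8 byte-order mark followed by ASCII text.
        byte[] sample = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'e', 'l', 'l', 'o'};

        String name = detect(detector, sample);
        System.out.println(name != null ? name : "(no encoding detected)");
    }
}
```

Because detect() resets the detector before returning, the same instance can be passed to it again for the next input. Callers should be prepared for a null result when the input is too short or ambiguous.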
juniversalchardet can detect the following encodings:
Chinese:
· ISO-2022-CN
· BIG5
· EUC-TW
· GB18030
· HZ-GB-2312
Cyrillic:
· ISO-8859-5
· KOI8-R
· WINDOWS-1251
· MACCYRILLIC
· IBM866
· IBM855
Greek:
· ISO-8859-7
· WINDOWS-1253
Hebrew:
· ISO-8859-8
· WINDOWS-1255
Japanese:
· ISO-2022-JP
· SHIFT_JIS
· EUC-JP
Korean:
· ISO-2022-KR
· EUC-KR
Unicode:
· UTF-8
· UTF-16BE / UTF-16LE
· UTF-32BE / UTF-32LE / X-ISO-10646-UCS-4-3412 / X-ISO-10646-UCS-4-2143
Others:
· WINDOWS-1252
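The detector reports its result as a name string such as those listed above. Whether a given name can be used to decode text depends on the charsets installed in the Java runtime. The sketch below uses only the standard java.nio.charset API to check a name before decoding and to fall back otherwise. The class DecodeWithDetectedCharset and its decode() helper are hypothetical names for this example and are not part of juniversalchardet.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; uses only the standard library.
final class DecodeWithDetectedCharset {

    // Decodes data with the detected charset name when the runtime supports
    // it, and with the caller-supplied fallback otherwise.
    static String decode(byte[] data, String detectedName, Charset fallback) {
        Charset charset = fallback;
        try {
            // detectedName is null when detection failed.
            if (detectedName != null && Charset.isSupported(detectedName)) {
                charset = Charset.forName(detectedName);
            }
        } catch (IllegalCharsetNameException e) {
            // Syntactically invalid name: keep the fallback.
        }
        return new String(data, charset);
    }

    public static void main(String[] args) {
        byte[] data = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        // "UTF-8" stands in for a value returned by getDetectedCharset().
        System.out.println(decode(data, "UTF-8", StandardCharsets.ISO_8859_1));
    }
}
```

Passing the fallback explicitly keeps the choice visible at the call site instead of silently relying on the platform default encoding.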