private static final
byte[] UTF_BOM =
new byte[]{(
byte) 0xEF,(
byte) 0xBB,(
byte) 0xBF};
/**
* 判断并移除UTF-8的BOM头
*/
public static InputStream utf8filte(InputStream
in) {
try {
PushbackInputStream pis =
new PushbackInputStream(
in,3);
byte[] header =
new byte[3];
pis.read(header,0,3);
if(header[0] != UTF_BOM[0] || header[1] != UTF_BOM[1] || header[2] != UTF_BOM[2]) {
pis.unread(header,0,3);
}
return pis;
}
catch (IOException e) {
throw Lang.wrapThrow(e);
}
}
.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }
在用记事本之类的程序将文本文件保存为UTF-8格式时,记事本会在文件头前面加上几个不可见的字符(EF BB BF),就是所谓的BOM(Byte Order Mark)。JDK1.5之前的Reader都不能处理BOM,解析这种格式的xml文件时,会抛出异常:Content is not allowed in prolog. 据说JDK1.6已经解决了这个bug。(参考http://www.uuzone.com/blog/mao/98921.htm )