Analyzing and Fixing XML Parsing Errors Caused by a BOM
Published: 2020-12-16 06:15:53  Category: Encyclopedia  Source: compiled from the web
Error message: org.dom4j.DocumentException: Error on line 1 of document: Content is not allowed in prolog. Nested exception: Content is not allowed in prolog.
XML encoding error: comparing the two files in Beyond Compare 4, the XML file that fails to parse is on the left and a working XML file is on the right.
Solutions:
1. …
2. For an XML string received from a web service, use the following method to strip anything that precedes the XML declaration:
/**
 * Checks whether the XML string has an illegal prefix and removes it.
 * @param xmlStr the XML string to check
 * @return the string starting at "<?xml", or "" if no XML declaration is found
 */
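public String checkXMLStr(String xmlStr) {
    if (xmlStr == null) {
        return "";
    }
    // Locate the XML declaration; anything before it (such as a BOM) is illegal
    int index = xmlStr.indexOf("<?xml");
    if (index > 0) {
        // Drop the illegal prefix
        xmlStr = xmlStr.substring(index);
    } else if (index == -1) {
        // No XML declaration at all: treat the input as invalid
        xmlStr = "";
    }
    return xmlStr;
}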
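The error can be reproduced without dom4j. The sketch below is an illustration, not the original author's code: it assumes the JDK's built-in JAXP parser, and the class name PrologDemo and the helpers parses and stripPrefix are hypothetical. It feeds the parser a string that starts with the BOM character U+FEFF, first as-is and then after dropping everything before the XML declaration, which is the same idea as checkXMLStr.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class PrologDemo {

    // Returns true if the JDK's built-in XML parser accepts the string.
    public static boolean parses(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            // Any parser or configuration failure counts as "does not parse".
            return false;
        }
    }

    // Same idea as checkXMLStr: keep only the text from "<?xml" onward.
    public static String stripPrefix(String xml) {
        int index = xml.indexOf("<?xml");
        return index >= 0 ? xml.substring(index) : "";
    }

    public static void main(String[] args) {
        String withBom = "\uFEFF<?xml version=\"1.0\" encoding=\"UTF-8\"?><root/>";
        System.out.println("with BOM parses:    " + parses(withBom));
        System.out.println("stripped parses:    " + parses(stripPrefix(withBom)));
    }
}
```

When the XML arrives as a String, the parser reads characters rather than bytes, so it typically has no chance to detect and skip the BOM itself; the prefix therefore has to be removed before parsing.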
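3. For robustness, check for a BOM when reading the file and, if one is present, drop it when building the string. The methods are as follows:
/**
 * Checks whether a byte array starts with a BOM.
 * A UTF-8 file may begin with the 3-byte header "EF BB BF" (the BOM, Byte Order Mark);
 * the header is optional, so not every UTF-8 file has one.
 * @param bytes the raw file content
 * @return true if the array starts with EF BB BF
 */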
private static boolean CheckBOM(byte[] bytes) {
    // At least 3 bytes are needed; mask with 0xff because Java bytes are signed
    return bytes != null
            && bytes.length >= 3
            && 0xef == (bytes[0] & 0xff)
            && 0xbb == (bytes[1] & 0xff)
            && 0xbf == (bytes[2] & 0xff);
}
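The `& 0xff` mask in CheckBOM matters because Java's byte type is signed. The following sketch (class and method names are hypothetical, chosen only for illustration) shows what happens with and without the mask, and that an array of exactly three bytes should also count as having a BOM.

```java
class SignedByteDemo {

    // Compares a byte with an unsigned value in the range 0..255, as CheckBOM does.
    public static boolean matches(byte b, int unsignedValue) {
        return (b & 0xff) == unsignedValue;
    }

    // Checks for the UTF-8 BOM; an array of exactly three bytes qualifies too.
    public static boolean hasUtf8Bom(byte[] bytes) {
        return bytes != null
                && bytes.length >= 3
                && matches(bytes[0], 0xef)
                && matches(bytes[1], 0xbb)
                && matches(bytes[2], 0xbf);
    }

    public static void main(String[] args) {
        byte first = (byte) 0xef;
        System.out.println(first);                 // -17: the byte is sign-extended
        System.out.println(first == 0xef);         // false: -17 is compared with 239
        System.out.println(matches(first, 0xef));  // true: the mask restores 239
        System.out.println(hasUtf8Bom(new byte[] {(byte) 0xef, (byte) 0xbb, (byte) 0xbf}));
    }
}
```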
/**
 * Reads a file into a string decoded as UTF-8.
 * @param filePath path of the file to read
 * @return the file content without a leading BOM, or "" if the file cannot be read
 */
public String getXMLFileText(String filePath) {
    byte[] bt = fileToByteArray(filePath);
    if (bt == null) {
        // The file is missing, is a directory, or could not be read
        return "";
    }
    // If the content starts with a BOM, skip its 3 bytes when decoding
    int offset = CheckBOM(bt) ? 3 : 0;
    try {
        return new String(bt, offset, bt.length - offset, "utf-8");
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
        return "";
    }
}
// Reads a file into a byte[] array; returns null if the file cannot be read.
// Needs java.io.File, FileInputStream, BufferedInputStream, ByteArrayOutputStream, IOException.
public byte[] fileToByteArray(String filePath) {
    // Normalize Windows path separators (literal replace, not a regex)
    filePath = filePath.replace("\\", "/");
    File file = new File(filePath);
    if (!file.exists() || file.isDirectory()) {
        return null;
    }
    // try-with-resources closes the streams even if reading fails
    try (BufferedInputStream in = new BufferedInputStream(new FileInputStream(file));
         ByteArrayOutputStream out = new ByteArrayOutputStream()) {
        byte[] temp = new byte[1024 * 1024]; // read up to 1 MB at a time
        int size;
        while ((size = in.read(temp)) != -1) {
            // Write only the bytes actually read in this pass
            out.write(temp, 0, size);
        }
        return out.toByteArray();
    } catch (IOException e) {
        e.printStackTrace();
        return null;
    }
}
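On Java 7 and later, the read-and-strip steps above can be written more compactly with java.nio.file. This is a sketch under that assumption, not the original author's method; BomFileDemo, decode, and readWithoutBom are hypothetical names. The main method writes a temporary file that starts with EF BB BF and reads it back.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class BomFileDemo {

    // Decodes bytes as UTF-8, skipping a leading EF BB BF if present.
    public static String decode(byte[] bytes) {
        boolean hasBom = bytes.length >= 3
                && (bytes[0] & 0xff) == 0xef
                && (bytes[1] & 0xff) == 0xbb
                && (bytes[2] & 0xff) == 0xbf;
        int offset = hasBom ? 3 : 0;
        return new String(bytes, offset, bytes.length - offset, StandardCharsets.UTF_8);
    }

    // Reads the whole file and decodes it without the BOM.
    public static String readWithoutBom(Path path) {
        try {
            return decode(Files.readAllBytes(path));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("bom-demo", ".xml");
        try {
            byte[] body = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root/>"
                    .getBytes(StandardCharsets.UTF_8);
            // Prepend the three BOM bytes to the XML content.
            byte[] withBom = new byte[body.length + 3];
            withBom[0] = (byte) 0xef;
            withBom[1] = (byte) 0xbb;
            withBom[2] = (byte) 0xbf;
            System.arraycopy(body, 0, withBom, 3, body.length);
            Files.write(tmp, withBom);

            String text = readWithoutBom(tmp);
            System.out.println(text.startsWith("<?xml"));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Files.readAllBytes loads the whole file into memory, which matches what fileToByteArray does; for very large files a streaming approach would be a better fit.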
Problem analysis: see the reference link: http://blog.sina.com.cn/s/blog_6d5d8b580100txon.html (Editor: 李大同)