Loading XML data into a Hive table: org.apache.hadoop.hive.ql.metadata.
Published: 2020-12-16 23:18:24 | Category: Encyclopedia | Source: compiled from the web
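I am trying to load XML data into Hive, but I get an error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row.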
The XML file I am using is:

    <?xml version="1.0" encoding="UTF-8"?>
    <catalog>
    <book>
    <id>11</id>
    <genre>Computer</genre>
    <price>44</price>
    </book>
    <book>
    <id>44</id>
    <genre>Fantasy</genre>
    <price>5</price>
    </book>
    </catalog>

The Hive queries I am using are:

1)
    CREATE TABLE xmltable(xmldata string) STORED AS TEXTFILE;
    LOAD DATA LOCAL INPATH '/home/user/xmlfile.xml' OVERWRITE INTO TABLE xmltable;

2)
    CREATE VIEW xmlview (id, genre, price) AS
    SELECT xpath(xmldata, '/catalog[1]/book[1]/id'),
           xpath(xmldata, '/catalog[1]/book[1]/genre'),
           xpath(xmldata, '/catalog[1]/book[1]/price')
    FROM xmltable;

3)
    CREATE TABLE xmlfinal AS SELECT * FROM xmlview;

4)
    SELECT * FROM xmlfinal WHERE id = '11';

Everything works up to the second query, but when I run the third query it fails with the following error:

    java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"xmldata":"<?xml version="1.0" encoding="UTF-8"?>"}
        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:159)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapred.Child.main(Child.java:262)
    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"xmldata":"<?xml version="1.0" encoding="UTF-8"?>"}
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:675)
        at org.apache.hadoop.hive.ql.exec
    FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

So what is going wrong? The XML file itself is well-formed.

Thanks.

Solution
Cause of the error:
1) Case 1 (your situation): the XML content is fed into Hive line by line.

Input XML:

    <?xml version="1.0" encoding="UTF-8"?>
    <catalog>
    <book>
    <id>11</id>
    <genre>Computer</genre>
    <price>44</price>
    </book>
    <book>
    <id>44</id>
    <genre>Fantasy</genre>
    <price>5</price>
    </book>
    </catalog>

Check in Hive:

    select count(*) from xmltable;  -- returns 13 rows: each line of the file becomes its own row in column xmldata

Cause: a TEXTFILE table treats every line as a separate row, so the XML is read as 13 disconnected pieces instead of one document. No single row contains the complete XML, and a row such as the <?xml ... ?> declaration is not valid XML on its own, which is the row reported in the error.

2) Case 2: the XML content should be loaded as a single string, so that the XPath UDFs work.

input.xml:

    <?xml version="1.0" encoding="UTF-8"?><catalog><book><id>11</id><genre>Computer</genre><price>44</price></book><book><id>44</id><genre>Fantasy</genre><price>5</price></book></catalog>

Check in Hive:

    select count(*) from xmltable;  -- returns 1 row: the XML is read as one complete document

That is:

    xmldata = <?xml version="1.0" encoding="UTF-8"?><catalog><book> ...... </catalog>

Then apply your XPath UDF like this:

    select xpath(xmldata, 'xpath_expression_string') from xmltable;

(Editor: 李大同)
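The answer above requires the XML to reach Hive as a single line, but it does not say how to produce such a file. The following is a minimal Python sketch of one way to do it before running LOAD DATA, not part of the original answer. The helper names (`is_well_formed`, `flatten_xml`, `book_ids`, `flatten_file`) are made up for illustration, and the sketch assumes that whitespace between tags is insignificant and that no element's text spans several lines, which holds for the sample file.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, one tag per line.
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<catalog>
<book>
<id>11</id>
<genre>Computer</genre>
<price>44</price>
</book>
<book>
<id>44</id>
<genre>Fantasy</genre>
<price>5</price>
</book>
</catalog>
"""


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a complete XML document on its own."""
    try:
        ET.fromstring(text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


def flatten_xml(text: str) -> str:
    """Join all lines into one, trimming whitespace around each line.

    Assumption: whitespace between tags carries no meaning.
    """
    return "".join(line.strip() for line in text.splitlines())


def book_ids(flat_xml: str) -> list:
    """Extract the text of every /catalog/book/id element."""
    root = ET.fromstring(flat_xml.encode("utf-8"))
    return [elem.text for elem in root.findall("./book/id")]


def flatten_file(src_path: str, dst_path: str) -> None:
    """Write a single-line copy of src_path to dst_path (hypothetical paths)."""
    with open(src_path, encoding="utf-8") as src:
        flat = flatten_xml(src.read())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(flat + "\n")


if __name__ == "__main__":
    lines = SAMPLE_XML.splitlines()
    # This mirrors what Hive sees in case 1: one row per line.
    print("rows seen by a TEXTFILE table:", len(lines))
    print("declaration line parses alone:", is_well_formed(lines[0]))

    flat = flatten_xml(SAMPLE_XML)
    # This mirrors case 2: the whole document in one row.
    print("flattened document parses:", is_well_formed(flat))
    print("book ids:", book_ids(flat))
```

To prepare a real file, something like `flatten_file("/home/user/xmlfile.xml", "/home/user/xmlfile_flat.xml")` could be run first, and the flattened file loaded with the same LOAD DATA statement as in the question. The output path here is an assumption; any location readable by the Hive client would do.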