Loading XML data into a Hive table: org.apache.hadoop.hive.ql.metadata.

Published: 2020-12-16 23:18:24 | Category: Encyclopedia | Source: compiled from the web

I am trying to load XML data into Hive, but I get an error:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"xmldata":""}

The XML file I am using is:

<?xml version="1.0" encoding="UTF-8"?>
<catalog>
<book>
  <id>11</id>
  <genre>Computer</genre>
  <price>44</price>
</book>
<book>
  <id>44</id>
  <genre>Fantasy</genre>
  <price>5</price>
</book>
</catalog>

The Hive queries I used are:

1) CREATE TABLE xmltable(xmldata string) STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/home/user/xmlfile.xml' OVERWRITE INTO TABLE xmltable;

2) CREATE VIEW xmlview (id, genre, price)
AS SELECT
xpath(xmldata, '/catalog[1]/book[1]/id'),
xpath(xmldata, '/catalog[1]/book[1]/genre'),
xpath(xmldata, '/catalog[1]/book[1]/price')
FROM xmltable;

3) CREATE TABLE xmlfinal AS SELECT * FROM xmlview;

4) SELECT * FROM xmlfinal WHERE id = '11';

Everything works up to the second query, but the third query fails with the following error:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"xmldata":"<?xml version="1.0" encoding="UTF-8"?>"}
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:159)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error    while processing row {"xmldata":"<?xml version="1.0" encoding="UTF-8"?>"}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:675)
    at org.apache.hadoop.hive.ql.exec

FAILED: Execution Error,return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

So what is going wrong? The XML file I am using is well-formed.

Thanks,
Sri

Solution

Cause of the error:

1) Case 1 (your situation): the XML content is fed into Hive line by line.

Input XML:

<?xml version="1.0" encoding="UTF-8"?>
<catalog>
<book>
  <id>11</id>
  <genre>Computer</genre>
  <price>44</price>
</book>
<book>
  <id>44</id>
  <genre>Fantasy</genre>
  <price>5</price>
</book>
</catalog>

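Check in Hive:

select count(*) from xmltable;  -- returns 13: each line of the file is its own row in column xmldata

Why this fails:

The XML is read as 13 separate fragments instead of one document, so a row such as the bare XML declaration is not valid XML on its own.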
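
To see why a line-by-line load breaks the XPath UDFs, here is a minimal Python sketch (not Hive itself) that splits the sample file into rows the way a line-based TEXTFILE load does and tries to parse a row as XML. The names SAMPLE_XML and parses_as_xml are illustrative, and Python's standard xml.etree.ElementTree parser only stands in for whatever parser Hive uses, so exact error messages will differ.

```python
import xml.etree.ElementTree as ET

# The sample file from the question, one physical line per element.
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<catalog>
<book>
  <id>11</id>
  <genre>Computer</genre>
  <price>44</price>
</book>
<book>
  <id>44</id>
  <genre>Fantasy</genre>
  <price>5</price>
</book>
</catalog>"""


def parses_as_xml(text: str) -> bool:
    """Return True if `text` is a well-formed XML document on its own."""
    try:
        ET.fromstring(text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


# A line-based load yields one row per physical line.
rows = SAMPLE_XML.splitlines()

print(len(rows))                  # how many rows the load produces
print(parses_as_xml(rows[0]))     # the XML declaration alone
print(parses_as_xml(SAMPLE_XML))  # the complete document
```

The first row, the bare XML declaration, does not parse on its own, which matches the row reported in the stack trace above, while the complete document parses without error.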

2) Case 2: the XML content must be supplied as a single string for the XPath UDFs to work.
Reference syntax: all functions follow the form xpath_*(xml_string, xpath_expression_string).

In input.xml:

<?xml version="1.0" encoding="UTF-8"?><catalog><book><id>11</id><genre>Computer</genre><price>44</price></book><book><id>44</id><genre>Fantasy</genre><price>5</price></book></catalog>
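
One way to get from the multi-line file to the single-line form above is to strip line breaks and indentation before loading. The following is a minimal Python sketch; flatten_xml is a hypothetical helper, and it assumes the whitespace between tags carries no meaning, as is the case in this sample.

```python
def flatten_xml(text: str) -> str:
    """Join all lines of an XML document into a single line.

    Leading and trailing whitespace on each line is dropped, so this is only
    safe when that whitespace is insignificant (no multi-line text nodes).
    """
    return "".join(line.strip() for line in text.splitlines())


# The multi-line sample from the question.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<catalog>
<book>
  <id>11</id>
  <genre>Computer</genre>
  <price>44</price>
</book>
<book>
  <id>44</id>
  <genre>Fantasy</genre>
  <price>5</price>
</book>
</catalog>"""

single_line = flatten_xml(sample)
print(single_line)
```

Writing single_line to a file and loading that file with LOAD DATA should then give one row per document. If a file holds several documents, each one would need its own line.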

Check in Hive:

select count(*) from xmltable;  -- returns 1: the XML is read as one complete document

That is:

xmldata   = <?xml version="1.0" encoding="UTF-8"?><catalog><book> ...... </catalog>

Then apply your XPath UDF like this:

select xpath(xmldata, 'xpath_expression_string') from xmltable;
