How to create an R data frame from an XML file
Published: 2020-12-16 07:59:04 | Category: Encyclopedia | Source: compiled from the web
I have an XML document file. Part of the file looks like this:

    <attr>
      <attrlabl>COUNTY</attrlabl>
      <attrdef>County abbreviation</attrdef>
      <attrtype>Text</attrtype>
      <attwidth>1</attwidth>
      <atnumdec>0</atnumdec>
      <attrdomv>
        <edom>
          <edomv>C</edomv>
          <edomvd>Clackamas County</edomvd>
          <edomvds/>
        </edom>
        <edom>
          <edomv>M</edomv>
          <edomvd>Multnomah County</edomvd>
          <edomvds/>
        </edom>
        <edom>
          <edomv>W</edomv>
          <edomvd>Washington County</edomvd>
          <edomvds/>
        </edom>
      </attrdomv>
    </attr>

From this XML file, I want to create an R data frame with the columns attrlabl, attrdef, attrtype, and attrdomv. Note that the attrdomv column should contain all the levels of the categorical variable. The data frame should look like this:

    attrlabl  attrdef              attrtype  attrdomv
    COUNTY    County abbreviation  Text      C Clackamas County; M Multnomah County; W Washington County

I have some incomplete code, shown below:

    doc <- xmlParse("taxlots.shp.xml")
    dataDictionary <- xmlToDataFrame(getNodeSet(doc, "//attrlabl"))

Can you complete my R code? I appreciate any help!
Assuming this is the correct taxlots.shp.xml file:
    <attr>
      <attrlabl>COUNTY</attrlabl>
      <attrdef>County abbreviation</attrdef>
      <attrtype>Text</attrtype>
      <attwidth>1</attwidth>
      <atnumdec>0</atnumdec>
      <attrdomv>
        <edom>
          <edomv>C</edomv>
          <edomvd>Clackamas County</edomvd>
          <edomvds/>
        </edom>
        <edom>
          <edomv>M</edomv>
          <edomvd>Multnomah County</edomvd>
          <edomvds/>
        </edom>
        <edom>
          <edomv>W</edomv>
          <edomvd>Washington County</edomvd>
          <edomvds/>
        </edom>
      </attrdomv>
    </attr>

You are almost there:

    library(XML)

    doc <- xmlParse("taxlots.shp.xml")
    # One row per <attr> node, keeping only the columns of interest
    xmlToDataFrame(nodes = getNodeSet(doc, "//attr"))[c("attrlabl", "attrdef", "attrtype", "attrdomv")]

      attrlabl             attrdef attrtype                                            attrdomv
    1   COUNTY County abbreviation     Text CClackamas CountyMMultnomah CountyWWashington County

But the last field is not in the format you want, because the text of all the nested <edom> children is concatenated without separators. Getting the desired format takes a few extra steps:

    # One row per <edom> node
    step1 <- xmlToDataFrame(nodes = getNodeSet(doc, "//attrdomv/edom"))
    step1

      edomv            edomvd edomvds
    1     C  Clackamas County
    2     M  Multnomah County
    3     W Washington County

    # Join code and description with a space, then join the pairs with "; "
    step2 <- paste(paste(step1$edomv, step1$edomvd, sep = " "), collapse = "; ")
    step2

    [1] "C Clackamas County; M Multnomah County; W Washington County"

    # Combine the first three columns with the formatted attrdomv string
    cbind(xmlToDataFrame(nodes = getNodeSet(doc, "//attr"))[c("attrlabl", "attrdef", "attrtype")], attrdomv = step2)

      attrlabl             attrdef attrtype                                                    attrdomv
    1   COUNTY County abbreviation     Text C Clackamas County; M Multnomah County; W Washington County

(Editor: 李大同)
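The steps in the answer collect every <edom> node in the document into one string, which works when the file holds a single <attr> element. A real metadata file probably contains several <attr> elements, so the following is a minimal sketch of one way to build the attrdomv string separately for each <attr>. It is not part of the original answer. The inline sample XML, the <attrs> wrapper element, the second ZONE attribute, and the helper names child_text and domain_text are all assumptions made for illustration.

```r
library(XML)

# Hypothetical sample: the <attrs> wrapper and the ZONE entry are invented
# for illustration and do not come from the original file.
xml_text <- '
<attrs>
  <attr>
    <attrlabl>COUNTY</attrlabl>
    <attrdef>County abbreviation</attrdef>
    <attrtype>Text</attrtype>
    <attrdomv>
      <edom><edomv>C</edomv><edomvd>Clackamas County</edomvd><edomvds/></edom>
      <edom><edomv>M</edomv><edomvd>Multnomah County</edomvd><edomvds/></edom>
      <edom><edomv>W</edomv><edomvd>Washington County</edomvd><edomvds/></edom>
    </attrdomv>
  </attr>
  <attr>
    <attrlabl>ZONE</attrlabl>
    <attrdef>Zoning code</attrdef>
    <attrtype>Text</attrtype>
  </attr>
</attrs>'

doc <- xmlParse(xml_text, asText = TRUE)
attr_nodes <- getNodeSet(doc, "//attr")

# Text of the first child element with the given name, or NA if it is absent.
child_text <- function(node, name) {
  value <- xpathSApply(node, paste0("./", name), xmlValue)
  if (length(value) == 0) NA_character_ else value[[1]]
}

# Collapse the <edom> entries under one <attr> into "code description; ...".
# Returns NA when the <attr> has no enumerated domain.
domain_text <- function(node) {
  edoms <- getNodeSet(node, "./attrdomv/edom")
  if (length(edoms) == 0) return(NA_character_)
  pairs <- vapply(
    edoms,
    function(e) paste(child_text(e, "edomv"), child_text(e, "edomvd")),
    character(1)
  )
  paste(pairs, collapse = "; ")
}

# Build one data frame row per <attr> node and stack the rows.
dataDictionary <- do.call(rbind, lapply(attr_nodes, function(node) {
  data.frame(
    attrlabl = child_text(node, "attrlabl"),
    attrdef  = child_text(node, "attrdef"),
    attrtype = child_text(node, "attrtype"),
    attrdomv = domain_text(node),
    stringsAsFactors = FALSE
  )
}))

print(dataDictionary)
```

To run this against the real file, replace the xmlParse call with xmlParse("taxlots.shp.xml"). The relative XPath expressions (starting with "./") are what keep each lookup scoped to the current <attr> node; an expression starting with "//" would search the whole document again.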