Splitting a large XML file with XSLT 2.0

Published: 2020-12-16 23:07:27


I have this source XML file:

<DATA>
   <DATASET>
      <KE action="create">
         <A>USVa</A>
         <B>USVb</B>
         <C>USV10</C>
      </KE>
      <KE>
         ....
      </KE>
   </DATASET>
</DATA>

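The element "KE" occurs about 30,000 times. I want to create a new XML file for every 5,000 KE elements. With 30,000 KE elements, the result must be 6 separate XML files, each with the same structure as the source XML.

How can I do this with XSLT 2.0? I am using saxonhe9-5-1-3j. Many thanks …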

Solution

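Use the XSLT 2.0 instruction xsl:for-each-group, grouping the KE elements by their position modulo the group size. Then use the xsl:result-document element to write each group to its own output document.

My sample XSLT code creates a new result document for each group of 3 KE elements. Change this number to 5000 for your input XML.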

Stylesheet

Edit 1: simplified the stylesheet, thanks to @Martin Honnen. Edit 2: edited again following a suggestion from @michael.hor257k.

<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes"/>

<xsl:template match="/DATA">
  <xsl:apply-templates/>
</xsl:template>

<xsl:template match="DATASET">
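  <!-- Start a new group at every 3rd KE sibling (positions 1, 4, 7, ...); replace 3 with 5000 for the real input -->
  <xsl:for-each-group select="KE" group-starting-with="KE[(position() - 1) mod 3 = 0]">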
     <xsl:variable name="file" select="concat('ke',position(),'.xml')"/>
     <xsl:result-document href="{$file}">
        <DATA>
           <DATASET>
              <xsl:copy-of select="current-group()"/>
           </DATASET>
        </DATA>
     </xsl:result-document>
  </xsl:for-each-group>
</xsl:template>

</xsl:stylesheet>
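
As a variation on the stylesheet above, the group size can be made a stylesheet parameter instead of a hard-coded number. The sketch below is not part of the original answer. It assumes a parameter named chunk-size (a hypothetical name chosen for illustration) and swaps group-starting-with for group-adjacent with integer division, which should give each run of chunk-size consecutive KE elements the same grouping key. The output file names follow the same ke1.xml, ke2.xml pattern.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

<xsl:output method="xml" indent="yes"/>

<!-- Hypothetical parameter name: number of KE elements per output file -->
<xsl:param name="chunk-size" as="xs:integer" select="5000"/>

<xsl:template match="/DATA">
  <xsl:apply-templates select="DATASET"/>
</xsl:template>

<xsl:template match="DATASET">
  <!-- Within group-adjacent, position() is the position of each KE in the selected
       sequence, so the first chunk-size elements get key 0, the next get key 1, etc. -->
  <xsl:for-each-group select="KE" group-adjacent="(position() - 1) idiv $chunk-size">
    <!-- Inside the loop, position() is the number of the current group -->
    <xsl:result-document href="{concat('ke', position(), '.xml')}">
      <DATA>
        <DATASET>
          <xsl:copy-of select="current-group()"/>
        </DATASET>
      </DATA>
    </xsl:result-document>
  </xsl:for-each-group>
</xsl:template>

</xsl:stylesheet>
```

With this variant, the default of 5000 would apply to the real input, and a smaller value such as 3 could be supplied as a stylesheet parameter when testing on a small sample.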

You get the following output (for convenience, I have numbered the KE elements; the stylesheet does not depend on the n attribute).

Output: ke1.xml

<?xml version="1.0" encoding="UTF-8"?>
<DATA>
 <DATASET>
  <KE n="1" action="create">
     <A>USVa</A>
     <B>USVb</B>
     <C>USV10</C>
  </KE>
  <KE n="2" action="create">
     <A>USVa</A>
     <B>USVb</B>
     <C>USV10</C>
  </KE>
  <KE n="3" action="create">
     <A>USVa</A>
     <B>USVb</B>
     <C>USV10</C>
  </KE>
 </DATASET>
</DATA>

Output: ke2.xml

<?xml version="1.0" encoding="UTF-8"?>
<DATA>
 <DATASET>
  <KE n="4" action="create">
     <A>USVa</A>
     <B>USVb</B>
     <C>USV10</C>
  </KE>
  <KE n="5" action="create">
     <A>USVa</A>
     <B>USVb</B>
     <C>USV10</C>
  </KE>
  <KE n="6" action="create">
     <A>USVa</A>
     <B>USVb</B>
     <C>USV10</C>
  </KE>
 </DATASET>
</DATA>

The other output documents look the same.

(Editor: 李大同)
