
Getting an HTML Tag's innerHTML with a Regular Expression

Published: 2020-12-14 01:14:20  Category: Encyclopedia  Source: compiled from the web

Building on the approach described at http://www.imkevinyang.com/2010/07/javajs%e5%a6%82%e4%bd%95%e4%bd%bf%e7%94%a8%e6%ad%a3%e5%88%99%e8%a1%a8%e8%be%be%e5%bc%8f%e5%8c%b9%e9%85%8d%e5%b5%8c%e5%a5%97html%e6%a0%87%e7%ad%be.html, I adapted the regex to my own needs so that it also captures everything inside the matched tag.

Test data:

<div style="background-color:gray;" id="footer">
<a id="gotop" href="#" onclick="MGJS.goTop();return false;">Top</a>
<a id="powered" href="http://wordpress.org/">WordPress</a>
<div id="copyright">
Copyright &copy; 2009 简单生活 —— Kevin Yang的博客 </div>
<div id="themeinfo">
Theme by <a href="http://www.neoease.com/">mg12</a>.
Valid <a href="http://validator.w3.org/check?uri=referer">XHTML 1.1</a>
and <a href="http://jigsaw.w3.org/css-validator/">CSS 3</a>
</div>
<div/>
<p/>
</div>

The modified regex:

<(?<HtmlTag>[\w]+)[^>]*\s[iI][dD]=(?<Quote>["']?)footer(?(Quote)\k<Quote>)[^>]*?(/>|>(?<innerHtml>((?<Nested><\k<HtmlTag>[^>]*>)|</\k<HtmlTag>>(?<-Nested>)|[\s\S]*?)*)</\k<HtmlTag>>)
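
The (?<Nested>...) and (?<-Nested>) constructs in the pattern above are .NET balancing groups. The first pushes a capture onto a named stack; the second pops one, and fails if the stack is empty. As a minimal sketch of the same mechanism on a simpler problem (balanced parentheses rather than HTML tags), the following may help. The names `BalancingDemo` and `IsBalanced` are made up for illustration and do not come from the original article.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalancingDemo
{
    // (?<Open>\()     pushes a capture onto the "Open" stack for each '('
    // (?<-Open>\))    pops one capture for each ')'; fails if the stack is empty
    // (?(Open)(?!))   at the end, fails if any '(' is still unclosed
    private static readonly Regex Balanced = new Regex(
        @"^(?:[^()]|(?<Open>\()|(?<-Open>\)))*(?(Open)(?!))$");

    // Illustrative helper: true when every '(' has a matching ')'.
    public static bool IsBalanced(string text)
    {
        return Balanced.IsMatch(text);
    }

    public static void Main()
    {
        Console.WriteLine(IsBalanced("(a(b)c)")); // True
        Console.WriteLine(IsBalanced("(a(b)c"));  // False: one '(' left on the stack
        Console.WriteLine(IsBalanced("a)b"));     // False: ')' with an empty stack
    }
}
```

Note that the article's pattern has no final (?(Nested)(?!)) check, so unlike this sketch it does not strictly enforce that every nested opening tag was closed.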

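Two main changes:

1. Added the named group innerHtml, which captures everything inside the tag.

2. In the last alternative of the nested part, .*? was changed to [\s\S]*? so that the match can span multiple lines.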
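
The second change matters because, in .NET, the dot does not match a newline unless RegexOptions.Singleline is set, whereas the class [\s\S] matches any character. A small sketch of the difference, using a made-up `<p>` snippet and illustrative method names (none of which appear in the original article):

```csharp
using System;
using System.Text.RegularExpressions;

public static class MultilineDemo
{
    // '.' does not match '\n' unless RegexOptions.Singleline is set.
    public static bool MatchesWithDot(string html)
    {
        return Regex.IsMatch(html, @"<p>.*?</p>");
    }

    // [\s\S] matches any character, including line breaks.
    public static bool MatchesWithClass(string html)
    {
        return Regex.IsMatch(html, @"<p>[\s\S]*?</p>");
    }

    // Alternative: keep '.' and switch on Singleline instead.
    public static bool MatchesWithSingleline(string html)
    {
        return Regex.IsMatch(html, @"<p>.*?</p>", RegexOptions.Singleline);
    }

    public static void Main()
    {
        string html = "<p>line one\nline two</p>";
        Console.WriteLine(MatchesWithDot(html));        // False
        Console.WriteLine(MatchesWithClass(html));      // True
        Console.WriteLine(MatchesWithSingleline(html)); // True
    }
}
```

Using [\s\S] keeps the pattern independent of the options it is compiled with, which fits the calling code's use of RegexOptions.None.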


C# calling code:

            // Requires: using System.Text.RegularExpressions; and using System.Windows.Forms; (for MessageBox)
            // Regex match
            RegexOptions options = RegexOptions.None;
            Regex regex = new Regex(@"<(?<HtmlTag>[\w]+)[^>]*\s[iI][dD]=(?<Quote>[""']?)footer(?(Quote)\k<Quote>)[^>]*?(/>|>(?<innerHtml>((?<Nested><\k<HtmlTag>[^>]*>)|</\k<HtmlTag>>(?<-Nested>)|[\s\S]*?)*)</\k<HtmlTag>>)",options);
            string input = @"<div style=""background-color:gray;"" id=""footer"">
    <a id=""gotop"" href=""#"" onclick=""MGJS.goTop();return false;"">Top</a>
    <a id=""powered"" href=""http://wordpress.org/"">WordPress</a>
    <div id=""copyright"">
        Copyright &copy; 2009 简单生活 —— Kevin Yang的博客    </div>
    <div id=""themeinfo"">
        Theme by <a href=""http://www.neoease.com/"">mg12</a>.
 Valid <a href=""http://validator.w3.org/check?uri=referer"">XHTML 1.1</a>
        and <a href=""http://jigsaw.w3.org/css-validator/"">CSS 3</a>
    </div>
<div/>
<p/>
 </div> ";

            // Check for match
            bool isMatch = regex.IsMatch(input);
            if (isMatch)
            {
                // TODO: Do something with result
                MessageBox.Show(input,"IsMatch");
            }

            // Get match
            Match match = regex.Match(input);

            // Get matches
            MatchCollection matches = regex.Matches(input);
            for (int i = 0; i != matches.Count; ++i)
            {
                // TODO: Do something with result
                MessageBox.Show(matches[i].Value,"Match");
            }

            // Numbered groups
            for (int i = 0; i != match.Groups.Count; ++i)
            {
                Group group = match.Groups[i];

                // TODO: Do something with result
                MessageBox.Show(group.Value,"Group: " + i);
            }

            // Named groups
            string groupA = match.Groups["HtmlTag"].Value;
            string groupB = match.Groups["innerHtml"].Value;

            // TODO: Do something with result
            MessageBox.Show(groupA,"Group: HtmlTag");
            MessageBox.Show(groupB,"Group: innerHtml");
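
The calling code above hard-codes the id footer inside the pattern. A hedged sketch of how the same pattern might be wrapped in a reusable console helper follows. `InnerHtmlExtractor` and `GetInnerHtmlById` are hypothetical names that do not appear in the original article, and the use of `Regex.Escape` to treat the id literally is an addition, not something the original does.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InnerHtmlExtractor
{
    // Hypothetical helper: builds the article's pattern for an arbitrary id
    // and returns the innerHtml capture, or null if there is none.
    public static string GetInnerHtmlById(string html, string id)
    {
        string pattern =
            @"<(?<HtmlTag>[\w]+)[^>]*\s[iI][dD]=(?<Quote>[""']?)" +
            Regex.Escape(id) + // treat the id as literal text
            @"(?(Quote)\k<Quote>)[^>]*?(/>|>(?<innerHtml>((?<Nested><\k<HtmlTag>[^>]*>)" +
            @"|</\k<HtmlTag>>(?<-Nested>)|[\s\S]*?)*)</\k<HtmlTag>>)";

        Match match = Regex.Match(html, pattern);
        Group inner = match.Groups["innerHtml"];

        // The group is unset when nothing matched or the tag was self-closing.
        return inner.Success ? inner.Value : null;
    }

    public static void Main()
    {
        string html = "<div id=\"footer\"><span>hi</span></div>";
        Console.WriteLine(GetInnerHtmlById(html, "footer"));
    }
}
```

Checking `Group.Success` before reading `Value` separates "no match" from "matched, but empty content", which the original code does not distinguish.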




(Editor: 李大同)
