Getting an HTML Tag's innerHTML with a Regular Expression
Building on http://www.imkevinyang.com/2010/07/javajs%e5%a6%82%e4%bd%95%e4%bd%bf%e7%94%a8%e6%ad%a3%e5%88%99%e8%a1%a8%e8%be%be%e5%bc%8f%e5%8c%b9%e9%85%8d%e5%b5%8c%e5%a5%97html%e6%a0%87%e7%ad%be.html, I adapted the regex to my own needs so that it captures everything inside the matched tag.

Test data:

    <div style="background-color:gray;" id="footer">

Modified regex:

    <(?<HtmlTag>[\w]+)[^>]*\s[iI][dD]=(?<Quote>["']?)footer(?(Quote)\k<Quote>)[^>]*?(/>|>(?<innerHtml>((?<Nested><\k<HtmlTag>[^>]*>)|</\k<HtmlTag>>(?<-Nested>)|[\s\S]*?)*)</\k<HtmlTag>>)

Two main changes:

1. Added a named group, innerHtml, that captures all of the content inside the tag.
2. In the last alternative, just before the closing </\k<HtmlTag>>, the .*? was changed to [\s\S]*? so that the content can span multiple lines (by default . does not match a newline).

C# calling code:

    // Requires: using System.Text.RegularExpressions; using System.Windows.Forms;

    // Regex match
    RegexOptions options = RegexOptions.None;
    Regex regex = new Regex(@"<(?<HtmlTag>[\w]+)[^>]*\s[iI][dD]=(?<Quote>[""']?)footer(?(Quote)\k<Quote>)[^>]*?(/>|>(?<innerHtml>((?<Nested><\k<HtmlTag>[^>]*>)|</\k<HtmlTag>>(?<-Nested>)|[\s\S]*?)*)</\k<HtmlTag>>)", options);

    // Sample input: a footer <div> containing nested <div> elements
    string input = @"<div style=""background-color:gray;"" id=""footer"">
    <a id=""gotop"" href=""#"" onclick=""MGJS.goTop();return false;"">Top</a>
    <a id=""powered"" href=""http://wordpress.org/"">WordPress</a>
    <div id=""copyright"">
    Copyright © 2009 简单生活 —— Kevin Yang的博客
    </div>
    <div id=""themeinfo"">
    Theme by <a href=""http://www.neoease.com/"">mg12</a>.
    Valid <a href=""http://validator.w3.org/check?uri=referer"">XHTML 1.1</a> and <a href=""http://jigsaw.w3.org/css-validator/"">CSS 3</a>
    </div>
    <div/>
    <p/>
    </div>
    ";

    // Check for match
    bool isMatch = regex.IsMatch(input);
    if (isMatch)
    {
        // TODO: Do something with result
        MessageBox.Show(input, "IsMatch");
    }

    // Get the first match
    Match match = regex.Match(input);

    // Get all matches
    MatchCollection matches = regex.Matches(input);
    for (int i = 0; i != matches.Count; ++i)
    {
        // TODO: Do something with result
        MessageBox.Show(matches[i].Value, "Match");
    }

    // Numbered groups of the first match
    for (int i = 0; i != match.Groups.Count; ++i)
    {
        Group group = match.Groups[i];
        // TODO: Do something with result
        MessageBox.Show(group.Value, "Group: " + i);
    }

    // Named groups: the tag name and the captured inner content
    string groupA = match.Groups["HtmlTag"].Value;
    string groupB = match.Groups["innerHtml"].Value;
    // TODO: Do something with result
    MessageBox.Show(groupA, "Group: HtmlTag");
    MessageBox.Show(groupB, "Group: innerHtml");
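As a minimal sketch of how the pattern might be wrapped for reuse, the console example below builds the same regex for an arbitrary id and returns the innerHtml capture. The class InnerHtmlDemo and the method TryGetInnerHtmlById are hypothetical names introduced only for illustration, and the two-second match timeout is an added assumption rather than part of the original code. The sample input contains line breaks, so it also exercises the [\s\S]*? change.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InnerHtmlDemo
{
    // Hypothetical helper (not from the original post): builds the same
    // pattern for a given id and returns the content of the innerHtml group.
    public static bool TryGetInnerHtmlById(string html, string id, out string innerHtml)
    {
        string pattern =
            @"<(?<HtmlTag>[\w]+)[^>]*\s[iI][dD]=(?<Quote>[""']?)" +
            Regex.Escape(id) +                       // match the id literally
            @"(?(Quote)\k<Quote>)[^>]*?(/>|>(?<innerHtml>(" +
            @"(?<Nested><\k<HtmlTag>[^>]*>)" +       // nested opening tag: push onto Nested
            @"|</\k<HtmlTag>>(?<-Nested>)" +         // nested closing tag: pop from Nested
            @"|[\s\S]*?" +                           // any other text, including newlines
            @")*)</\k<HtmlTag>>)";

        // Assumption: a timeout is added because this pattern can backtrack
        // heavily on some inputs; Match throws RegexMatchTimeoutException
        // if the limit is exceeded.
        var regex = new Regex(pattern, RegexOptions.None, TimeSpan.FromSeconds(2));

        Match match = regex.Match(html);
        innerHtml = match.Success ? match.Groups["innerHtml"].Value : string.Empty;
        return match.Success;
    }

    public static void Main()
    {
        // Small multi-line input with one nested <div>
        string html = "<div id=\"footer\">\n<div>a</div>\nb</div>";

        if (TryGetInnerHtmlById(html, "footer", out string inner))
        {
            Console.WriteLine(inner);
        }
        else
        {
            Console.WriteLine("No element with that id was matched.");
        }
    }
}
```

The helper returns false when nothing matches and an empty string for a self-closing tag, since the innerHtml group does not take part in that branch. For anything beyond small, well-formed fragments, a real HTML parser is likely to be more robust than this regex.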
(Editor: 李大同)