php – Highlighting matches in the subject string from preg_match_all()
I'm trying to highlight a subject string using the $matches array returned by preg_match_all(). Let me start with an example:
    preg_match_all("/(.)/", "abc", $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);

This returns:

    Array
    (
        [0] => Array
            (
                [0] => Array ( [0] => a [1] => 0 )
                [1] => Array ( [0] => a [1] => 0 )
            )
        [1] => Array
            (
                [0] => Array ( [0] => b [1] => 1 )
                [1] => Array ( [0] => b [1] => 1 )
            )
        [2] => Array
            (
                [0] => Array ( [0] => c [1] => 2 )
                [1] => Array ( [0] => c [1] => 2 )
            )
    )

What I want to do here is highlight the overall consumed data and each backreference. The output should look like this:

    <span class="match0">
        <span class="match1">a</span>
    </span>
    <span class="match0">
        <span class="match1">b</span>
    </span>
    <span class="match0">
        <span class="match1">c</span>
    </span>

Another example:

    preg_match_all("/(abc)/", "abc", $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);

should return:

    <span class="match0"><span class="match1">abc</span></span>

I hope this is clear. Thanks in advance, and please ask if anything is unclear.

Note: it must not break HTML. Both the regex and the input string are unknown to the code and completely dynamic, so the search string can be HTML, and the matched data may or may not contain HTML-like text.

Solution
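This seems to behave correctly for all the examples I've thrown at it so far. Note that I've separated the abstract highlighting part from the HTML-mangling part, so that it can be reused in other situations: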
    <?php
    /**
     * Runs a regex against a string and returns a version of that string with matches highlighted.
     * The outermost match is marked with [0]...[/0], the first sub-group with [1]...[/1], etc.
     *
     * @param string $regex Regular expression ready to be passed to preg_match_all
     * @param string $input
     * @return string
     */
    function highlight_regex_matches($regex, $input)
    {
        $matches = array();
        preg_match_all($regex, $input, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);

        // Arrange matches into groups based on their starting and ending offsets
        $matches_by_position = array();
        foreach ($matches as $sub_matches) {
            foreach ($sub_matches as $match_group => $match_data) {
                $start_position = $match_data[1];
                if ($start_position < 0) {
                    // An offset of -1 means this group did not take part in the match
                    continue;
                }
                $end_position = $start_position + strlen($match_data[0]);
                $matches_by_position[$start_position]['START'][] = $match_group;
                $matches_by_position[$end_position]['END'][] = $match_group;
            }
        }

        // Now proceed through that array, annotating the original string.
        // Note that we have to pass through BACKWARDS, or we break the offset information.
        $output = $input;
        krsort($matches_by_position);
        foreach ($matches_by_position as $position => $groups) {
            $insertion = '';

            // First, assemble any ENDING groups, nested highest-group first
            if (!empty($groups['END'])) {
                krsort($groups['END']);
                foreach ($groups['END'] as $ending_group) {
                    $insertion .= "[/$ending_group]";
                }
            }

            // Then, any STARTING groups, nested lowest-group first
            if (!empty($groups['START'])) {
                ksort($groups['START']);
                foreach ($groups['START'] as $starting_group) {
                    $insertion .= "[$starting_group]";
                }
            }

            // Insert into output (length 0 means nothing is overwritten)
            $output = substr_replace($output, $insertion, $position, 0);
        }

        return $output;
    }

    /**
     * Given a regex and a string containing unescaped HTML, returns a blob of HTML
     * with the original string escaped and matches highlighted using <span> tags.
     *
     * @param string $regex Regular expression ready to be passed to preg_match_all
     * @param string $raw_html
     * @return string HTML ready to display :)
     */
    function highlight_regex_as_html($regex, $raw_html)
    {
        // Add the (deliberately non-HTML) highlight tokens
        $highlighted = highlight_regex_matches($regex, $raw_html);

        // Escape the HTML from the input
        $highlighted = htmlspecialchars($highlighted);

        // Substitute the match tokens with the desired HTML
        $highlighted = preg_replace('#\[([0-9]+)\]#', '<span class="match$1">', $highlighted);
        $highlighted = preg_replace('#\[/([0-9]+)\]#', '</span>', $highlighted);

        return $highlighted;
    }

Note: as hakra pointed out to me in chat, if a sub-group in the regex can occur multiple times within one overall match (e.g. '/a(b|c)+/'), preg_match_all only tells you about the last of those matches. So highlight_regex_matches('/a(b|c)+/', 'abc') returns '[0]ab[1]c[/1][/0]', not '[0]a[1]b[/1][1]c[/1][/0]' as you might expect or want. All matching groups outside the repeated one still work correctly, though, so highlight_regex_matches('/a((b|c)+)/', 'abc') gives '[0]a[1]b[2]c[/2][/1][/0]', which is still a good indication of how the regex matched.

(Editor: 李大同)
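The limitation with repeated groups can be checked directly against preg_match_all, without the highlighting functions. The following is only a minimal sketch: the patterns '/a(b|c)+/' and '/a((b|c)+)/' and the variable names ($repeated, $wrapped, and so on) are chosen here for illustration.

```php
<?php
// Same flags as in the question: per-match grouping plus byte offsets.
$flags = PREG_OFFSET_CAPTURE | PREG_SET_ORDER;

// Group 1 is repeated: it matches "b" and then "c" within one overall match,
// but only the final iteration ("c" at offset 2) is reported.
preg_match_all('/a(b|c)+/', 'abc', $repeated, $flags);
$whole = $repeated[0][0]; // overall match: text and offset
$last  = $repeated[0][1]; // group 1: last iteration only

// Wrapping the repetition in an outer group recovers the full repeated span,
// while the inner group still holds only its last iteration.
preg_match_all('/a((b|c)+)/', 'abc', $wrapped, $flags);
$span  = $wrapped[0][1]; // outer group: the whole repeated run
$inner = $wrapped[0][2]; // inner group: last iteration only

printf("whole: '%s' at %d\n", $whole[0], $whole[1]);
printf("repeated group: '%s' at %d\n", $last[0], $last[1]);
printf("outer group: '%s' at %d\n", $span[0], $span[1]);
printf("inner group: '%s' at %d\n", $inner[0], $inner[1]);
```

Because the earlier iterations are never reported, a highlighter built on these offsets cannot mark them. Wrapping the quantified group in an extra pair of parentheses at least lets the whole repeated run be highlighted.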