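Problem description
So I need to strip the span tags with class tip. That means <span class="tip"> and the corresponding </span>, along with everything in between...
I suspect this calls for a regular expression, but I really can't stand those. Heh...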
<?php
$string = 'April 15,2003';
$pattern = '/(\w+) (\d+),(\d+)/i';
$replacement = '${1}1,$3';
echo preg_replace($pattern,$replacement,$string);
?>
gives no error... but
<?php
$str = preg_replace('<span class="tip">.+</span>',"",'<span class="RSS-title"></span><span class="RSS-link">linkylink</span><span class="RSS-id"></span><span class="RSS-content"></span><span class=\"RSS-newpost\"></span>');
echo $str;
?>
gives me this error:
Warning: preg_replace() [function.preg-replace]: Unknown modifier '.' in <A FILE> on line 4
Previously the error pointed at the ); on the second line, but now...
Solution
This is the "proper" way to do it (adapted from this answer).
Input:
<?php
$str = '<div>lol wut <span class="tip">remove!</span><span>don\'t remove!</span></div>';
?>
Code:
<?php
// Recursively remove every <span class="tip"> element below $parent.
function recurse($doc,$parent) {
    if (!$parent->hasChildNodes())
        return;
    for ($i = 0; $i < $parent->childNodes->length; ) {
        $elm = $parent->childNodes->item($i);
        // getAttribute() returns "" when the attribute is missing, so spans without a class are safe
        if ($elm instanceof DOMElement && $elm->nodeName == "span" && $elm->getAttribute("class") == "tip") {
            // Do not advance $i: the next sibling has moved into this slot
            $parent->removeChild($elm);
            continue;
        }
        recurse($doc,$elm);
        $i++;
    }
}
// Load in the DOM (remembering that XML requires one root node)
$doc = new DOMDocument();
$doc->loadXML("<document>" . $str . "</document>");
// Iterate the DOM
recurse($doc,$doc->documentElement);
// Output the result (the children of the <document> wrapper)
foreach ($doc->childNodes->item(0)->childNodes as $node) {
    echo $doc->saveXML($node);
}
?>
Output:
<div>lol wut <span>don't remove!</span></div>
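The same DOM-based removal can probably be written more compactly with an XPath query. The following is only a sketch under a few assumptions: `removeTipSpans` is a hypothetical helper name, the input is well-formed XML once wrapped in a single root element, and the class attribute is exactly `tip` (a value such as `tip big` would not match this query).

```php
<?php
// Hypothetical helper: removes every <span class="tip"> element and its contents.
function removeTipSpans(string $fragment): string
{
    $doc = new DOMDocument();
    // XML needs a single root node, so wrap the fragment
    $doc->loadXML('<document>' . $fragment . '</document>');

    $xpath = new DOMXPath($doc);
    // Copy the matches into an array before changing the tree
    $matches = iterator_to_array($xpath->query('//span[@class="tip"]'));
    foreach ($matches as $node) {
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
        }
    }

    // Serialize the children of the wrapper, not the wrapper itself
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveXML($child);
    }
    return $out;
}

echo removeTipSpans('<div>lol wut <span class="tip">remove!</span><span>don\'t remove!</span></div>');
```

For real-world HTML that is not well-formed XML, `loadHTML()` may be a better fit than `loadXML()`, at the cost of the extra html and body wrappers it adds.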
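A simple regular expression such as:
<span class="tip">.+</span>
will not work reliably. If another span is opened and closed inside the tip span, the match no longer lines up with the tip span's own closing tag: a lazy .+? stops at the inner span's </span>, while the greedy .+ shown here runs on to the last </span> on the line. A DOM-based tool (such as the one linked in the comments) will give a much more reliable answer.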
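To make the nesting problem concrete, here is a small sketch. The sample markup and the variable names are made up for illustration, and `#` is used as the pattern delimiter so that the `/` in `</span>` needs no escaping.

```php
<?php
// Illustrative input: a tip span that contains another span
$html = '<p><span class="tip">a <span>b</span> c</span> keep <span>d</span></p>';

// Lazy quantifier: stops at the first </span>, which belongs to the inner span
$lazy = preg_replace('#<span class="tip">.+?</span>#', '', $html);

// Greedy quantifier: runs on to the last </span> on the line
$greedy = preg_replace('#<span class="tip">.+</span>#', '', $html);

echo $lazy, "\n";   // <p> c</span> keep <span>d</span></p>
echo $greedy, "\n"; // <p></p>
```

Neither result is what was wanted: the lazy version leaves a dangling ` c</span>` behind, and the greedy version also swallows the text and the span that should have been kept.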
As I said in my comment below, a pattern passed to PHP's regex functions needs delimiters.
<?php
// "#" is used as the delimiter so that the "/" in </span> needs no escaping
$str = preg_replace('#<span class="tip">.+</span>#',"",'<span class="rss-title"></span><span class="rss-link">linkylink</span><span class="rss-id"></span><span class="rss-content"></span><span class=\"rss-newpost\"></span>');
echo $str;
?>
might be somewhat more successful. Have a look at the documentation page for the function in question.
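As a rough illustration of where the "Unknown modifier '.'" warning comes from: without explicit delimiters, PCRE takes the leading `<` as a bracket-style opening delimiter and the first `>` as its closing partner, so everything after it (`.+</span>`) is read as modifiers. The sketch below assumes a made-up `$subject` string; the `@` operator is there only to keep the demonstration quiet.

```php
<?php
$subject = 'x <span class="tip">hint</span> y';

// No delimiters: the pattern fails to compile and preg_replace() returns null
$bad = @preg_replace('<span class="tip">.+</span>', '', $subject);
var_dump($bad); // NULL

// With "#" delimiters the pattern compiles and the span is removed
$good = preg_replace('#<span class="tip">.+</span>#', '', $subject);
echo $good, "\n"; // x  y
```

Any non-alphanumeric, non-backslash, non-whitespace character can serve as the delimiter; `#` or `~` is convenient here because the pattern itself contains `/`.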
Now, without a regex and without heavyweight XML parsing:
<?php
$html = ' ... <span class="tip"> hello <span id="x"> man </span> </span> ... ';
$tag = '<span class="tip">';
$tag_close = '</span>';
$tag_family = '<span';
$tag_len = strlen($tag);
$p1 = -1;
$p2 = 0;
while ( ($p2!==false) && (($p1=strpos($html,$tag,$p1+1))!==false) ) {
    // the tag is found, now we search for its corresponding closing tag
    $level = 1;
    $p2 = $p1;
    $continue = true;
    while ($continue) {
        $p2 = strpos($html,$tag_close,$p2+1);
        if ($p2===false) {
            // error in the html contents, the analysis cannot continue
            echo "ERROR in html contents";
            $continue = false;
            $p2 = false; // will stop the outer loop
        } else {
            // one more closing tag seen; compare against the spans opened so far
            $level = $level -1;
            $x = substr($html,$p1+$tag_len,$p2-$p1-$tag_len);
            $n = substr_count($x,$tag_family);
            if ($level+$n<=0) $continue = false;
        }
    }
    if ($p2!==false) {
        // delete the pair of tags, the farthest first
        // (only the two tags are removed; the content between them is kept)
        $html = substr_replace($html,'',$p2,strlen($tag_close));
        $html = substr_replace($html,'',$p1,$tag_len);
    }
}
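The loop above deletes only the opening and closing tags. Since the question also asks for the enclosed content to go, here is a sketch of a variant that cuts out the whole element with the same kind of depth counting. `stripTipSpans` is a hypothetical helper name, and the code assumes the opening tag is written exactly as `<span class="tip">` and that no other tag name starts with `span`.

```php
<?php
// Hypothetical helper: removes each <span class="tip"> element together with its contents.
function stripTipSpans(string $html): string
{
    $open = '<span class="tip">';
    $close = '</span>';
    $offset = 0;
    while (($start = strpos($html, $open, $offset)) !== false) {
        $depth = 1;                      // we are inside the tip span
        $pos = $start + strlen($open);
        while ($depth > 0) {
            $nextOpen = strpos($html, '<span', $pos);
            $nextClose = strpos($html, $close, $pos);
            if ($nextClose === false) {
                return $html;            // unbalanced markup: leave the rest untouched
            }
            if ($nextOpen !== false && $nextOpen < $nextClose) {
                $depth++;                // a nested span opens first
                $pos = $nextOpen + strlen('<span');
            } else {
                $depth--;                // a span closes first
                $pos = $nextClose + strlen($close);
            }
        }
        // Cut everything from the opening tag through its matching closing tag
        $html = substr($html, 0, $start) . substr($html, $pos);
        $offset = $start;
    }
    return $html;
}

echo stripTipSpans(' ... <span class="tip"> hello <span id="x"> man </span> </span> ... ');
```

This is still plain string scanning, so attribute order, extra whitespace or different quoting in the opening tag would defeat it; the DOM-based approach remains the more robust choice.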