function translate($params) {
    $xmldata = '<?xml version="1.0" encoding="UTF-8" ?><root>' . html_entity_decode($params['data']) . '</root>';
    $lang = ucfirst(strtolower($params['lang']));
    if (simplexml_load_string($xmldata) === FALSE) {
        return $params['data'];
    } else {
        $langxmlobj = new SimpleXMLElement($xmldata);
        if ($langxmlobj->$lang) {
            return ($langxmlobj->$lang);
        } else {
            return $params['data'];
        }
    }
}
This works for the following string:
$params['data'] = '<English>Hello</English><french>Bonjour</french>';
$params['lang'] = 'English';
print translate($params);
It outputs:
Hello
But when the string contains any other markup:
$params['data'] = '<English><h1>Hello</h1></English><french><h1>Bonjour</h1></french>';
$params['lang'] = 'English';
it outputs nothing.
I would like it to output:
<h1>Hello</h1> or any other tag within the <LanguageQuotes>
I'm pulling my hair out; any ideas?
VERSION2:
It does not work when the string looks like this:
$data = '<french><li><span class="pull-right">25 GB</span>Espace disque</french><English><li><span class="pull-right">25 GB</span>disk Space</English>
<french><li><span class="pull-right">YES</span>PHP 5, MysqL 5</french><English><li><span class="pull-right">YES</span>PHP 5, MysqL 5</English>
<french><li><span class="pull-right">100</span>Bases de données</french><English><li><span class="pull-right">100</span>Databases</English>
<french><li><span class="pull-right">∞</span>E-Mails</french><English><li><span class="pull-right">∞</span>E-mails</English>';
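Solution:
Your question has two parts.
Loading the data into XML
The main problem here is that the input is not a valid XML fragment; it is an HTML fragment mixed with some custom tags (for example, the <li> elements are never closed). Fortunately, DOMDocument can load (and repair) HTML. By default it does not load the data as UTF-8, so you need to add a meta tag that specifies the encoding.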
$data = '<french><li><span class="pull-right">25 GB</span>Espace disque</french><English><li><span class="pull-right">25 GB</span>disk Space</English>
<french><li><span class="pull-right">YES</span>PHP 5, MysqL 5</french><English><li><span class="pull-right">YES</span>PHP 5, MysqL 5</English>
<french><li><span class="pull-right">100</span>Bases de données</french><English><li><span class="pull-right">100</span>Databases</English>
<french><li><span class="pull-right">∞</span>E-Mails</french><English><li><span class="pull-right">∞</span>E-mails</English>';
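// Wrap the fragment so the HTML parser reads it as UTF-8
$html_data =
'<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head>
<body>'.$data.'</body>';
// Collect parser warnings (unknown tags, unclosed elements) instead of printing them
libxml_use_internal_errors(TRUE);
$dom = new DOMDocument();
$dom->loadHTML($html_data);
$dom->formatOutput = TRUE;
echo $dom->saveXML();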
Output:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<body>
<french>
<li><span class="pull-right">25 GB</span>Espace disque</li>
</french>
<english>
<li><span class="pull-right">25 GB</span>disk Space</li>
</english>
<french>
<li><span class="pull-right">YES</span>PHP 5, MysqL 5</li>
</french>
<english>
<li><span class="pull-right">YES</span>PHP 5, MysqL 5</li>
</english>
...
</body>
</html>
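As you can see, it keeps the language-name elements but converts all element names to lowercase. It also always adds the html and body elements if they are missing, but that is not a problem.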
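Because of that lowercasing, a lookup that applies ucfirst() to the language name, as translate() in the question does, would no longer match. A minimal sketch of normalizing the name before querying; languageTagName() is a hypothetical helper introduced only for illustration:

```php
<?php
// Hypothetical helper (not part of the question): map a requested language
// to the lowercase element name that DOMDocument::loadHTML() produces.
function languageTagName(string $lang): string
{
    return strtolower(trim($lang));
}

libxml_use_internal_errors(true); // unknown tags such as <English> raise parser warnings
$dom = new DOMDocument();
$dom->loadHTML('<body><English>Hello</English><french>Bonjour</french></body>');

$xpath = new DOMXPath($dom);
$name  = languageTagName('English');
// $name is a fixed, known-safe value here; validate it before interpolating user input.
$nodes = $xpath->evaluate('/html/body/*[name() = "' . $name . '"]');

echo $name, ': ', $nodes->length, " element(s)\n";
```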
Getting the data from XML
One possibility is to fetch the body element and import it into SimpleXML:
$xpath = new DOMXPath($dom);
// Import the <body> element so its children can be read as SimpleXML properties
$root = simplexml_import_dom($xpath->evaluate('/html/body')->item(0));
var_dump($root);
Output:
object(SimpleXMLElement)#4 (2) {
["french"]=>
array(4) {
[0]=>
object(SimpleXMLElement)#3 (1) {
["li"]=>
object(SimpleXMLElement)#12 (1) {
["span"]=>
string(5) "25 GB"
}
}
...
}
["english"]=>
array(4) {
[0]=>
object(SimpleXMLElement)#5 (1) {
["li"]=>
object(SimpleXMLElement)#12 (1) {
["span"]=>
string(5) "25 GB"
}
}
...
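From the imported SimpleXML object, the markup of one language could then be rebuilt by serializing each child element. This is only a sketch with a shortened input string; note that children() returns element nodes only, so text placed directly inside a language element would be skipped:

```php
<?php
libxml_use_internal_errors(true); // suppress warnings about the custom tags
$dom = new DOMDocument();
$dom->loadHTML('<body><french><h1>Bonjour</h1></french><English><h1>Hello</h1></English></body>');

// Import <body>; the language elements become properties with lowercase names.
$root = simplexml_import_dom($dom->getElementsByTagName('body')->item(0));

$fragment = '';
foreach ($root->english as $section) {        // every <english> element
    foreach ($section->children() as $child) { // element children only
        $fragment .= $child->asXML();          // serialize the child including its tags
    }
}
echo $fragment, "\n";
```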
Or fetch the nodes directly and save them as an HTML fragment:
$xpath = new DOMXPath($dom);
$string = '';
foreach ($xpath->evaluate('/html/body/*[name() = "english"]/*') as $node) {
    $string .= $dom->saveHTML($node);
}
echo $string;
Output:
<li>
<span class="pull-right">25 GB</span>disk Space</li><li>
<span class="pull-right">YES</span>PHP 5, MysqL 5</li><li>
<span class="pull-right">100</span>Databases</li><li>
<span class="pull-right">∞</span>E-mails</li>
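Putting both parts together, translate() from the question could be rewritten roughly as follows. This is a sketch, not a drop-in replacement: the name check and the fallback to the raw input are assumptions, and it selects node() instead of * so that plain text inside a language element is returned as well.

```php
<?php
// Sketch of translate() on top of DOMDocument; same $params keys as in the question.
function translate(array $params): string
{
    $data = $params['data'];
    // loadHTML() lowercases element names, so normalize the requested language.
    $lang = strtolower($params['lang']);
    if (!preg_match('/^[a-z][a-z0-9]*$/', $lang)) {
        return $data; // assumption: refuse anything that is not a simple element name
    }

    // The meta tag makes the HTML parser treat the input as UTF-8.
    $html = '<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head>'
          . '<body>' . $data . '</body>';

    $previous = libxml_use_internal_errors(true); // collect warnings about custom tags
    $dom = new DOMDocument();
    $loaded = $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return $data;
    }

    $xpath = new DOMXPath($dom);
    $result = '';
    foreach ($xpath->evaluate('/html/body/*[name() = "' . $lang . '"]/node()') as $node) {
        $result .= $dom->saveHTML($node);
    }

    // Fall back to the raw input when no element matches the language.
    return $result !== '' ? $result : $data;
}

echo translate([
    'data' => '<English><h1>Hello</h1></English><french><h1>Bonjour</h1></french>',
    'lang' => 'English',
]), "\n";
```

Whitespace in the returned fragment may differ from the input, because the HTML serializer decides where to place line breaks.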