I am trying to load and parse a response from the Google Weather API (with Chinese-language output).
// This code fails with the following error
$xml = simplexml_load_file('http://www.google.com/ig/api?weather=11791&hl=zh-CN');
( ! ) Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 1: parser error : Input is not proper UTF-8, indicate encoding ! Bytes: 0xB6 0xE0 0xD4 0xC6 in C:\htdocs\weather.PHP on line 11
Why does loading this response fail?
How do I encode/decode the response so that SimpleXML loads it properly?
<?php
$googleData = file_get_contents('http://www.google.com/ig/api?weather=11102&hl=zh-CN');
$xml = simplexml_load_string($googleData);
( ! ) Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 1: parser error : Input is not proper UTF-8, indicate encoding ! Bytes: 0xB6 0xE0 0xD4 0xC6 in C:\htdocs\test4.PHP on line 3
Call Stack
   Time    Memory  Function                               Location
1  0.0020  314264  {main}( )                              ..\test4.PHP:0
2  0.1535  317520  simplexml_load_string( string(1364) )  ..\test4.PHP:3
( ! ) Warning: simplexml_load_string() [function.simplexml-load-string]: t_system data="SI"/>
( ! ) Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in C:\htdocs\test4.PHP on line 3
Call Stack
   Time    Memory  Function                               Location
1  0.0020  314264  {main}( )                              ..\test4.PHP:0
2  0.1535  317520  simplexml_load_string( string(1364) )  ..\test4.PHP:3
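Solution:
The problem here is that SimpleXML does not look at the HTTP headers to determine the character encoding used in the document. It simply assumes the document is UTF-8, even though Google's server does advertise the encoding as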
Content-Type: text/xml; charset=GB2312
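To make the warning concrete: the four bytes it reports, 0xB6 0xE0 0xD4 0xC6, are not a valid UTF-8 sequence, but read as GB2312 they spell 多云 ("cloudy"). A minimal sketch, assuming the iconv extension is available with GB2312 support; the variable names are illustrative only:

```php
<?php
// The bytes quoted in the parser warning.
$bytes = "\xB6\xE0\xD4\xC6";

// preg_match() with the /u modifier returns false when the subject is not valid UTF-8.
$isUtf8 = preg_match('//u', $bytes) === 1;

// Reinterpret the same bytes as GB2312 and convert them to UTF-8.
$decoded = iconv('GB2312', 'UTF-8', $bytes);

var_dump($isUtf8);   // bool(false)
echo $decoded, "\n"; // 多云
```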
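You can write a function that looks at that header using the little-known magic variable $http_response_header (PHP fills it in automatically after an HTTP request made with file_get_contents()) and converts the response accordingly. Something like this: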
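function sxe($url)
{
    $xml = file_get_contents($url);

    // $http_response_header is populated by the file_get_contents() call above
    foreach ($http_response_header as $header)
    {
        if (preg_match('#^Content-Type: text/xml; charset=(.*)#i', $header, $m))
        {
            switch (strtolower($m[1]))
            {
                case 'utf-8':
                    // already UTF-8, nothing to convert
                    break;

                case 'iso-8859-1':
                    // note: utf8_encode() is deprecated as of PHP 8.2
                    $xml = utf8_encode($xml);
                    break;

                default:
                    // any other charset (e.g. GB2312): let iconv convert it
                    $xml = iconv($m[1], 'utf-8', $xml);
            }
            break;
        }
    }

    return simplexml_load_string($xml);
}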
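If you want to try the conversion step without hitting the network, the same idea can be sketched as a function that takes the body and the header lines as arguments. This is only an illustration: xml_to_utf8() is a hypothetical name, the sample headers and XML are made up, and the looser regex (any Content-Type, optionally quoted charset) is my own variation on sxe() above.

```php
<?php
// Hypothetical helper: same idea as sxe(), but the body and headers are passed in.
function xml_to_utf8(string $body, array $headers): string
{
    foreach ($headers as $header) {
        // Capture the charset token from a Content-Type header, quoted or not.
        if (preg_match('#^Content-Type:.*?charset="?([\w.:-]+)#i', $header, $m)) {
            if (strcasecmp($m[1], 'utf-8') === 0) {
                return $body; // already UTF-8
            }
            $converted = iconv($m[1], 'UTF-8', $body);
            // Fall back to the raw body if iconv cannot convert it.
            return $converted === false ? $body : $converted;
        }
    }
    return $body; // no charset advertised: leave the body untouched
}

// Sample data standing in for a real response (made up for this example).
$headers = ['HTTP/1.1 200 OK', 'Content-Type: text/xml; charset=GB2312'];
$body    = "<?xml version=\"1.0\"?><condition data=\"\xB6\xE0\xD4\xC6\"/>";

$xml = simplexml_load_string(xml_to_utf8($body, $headers));
echo $xml['data'], "\n"; // 多云
```

Keeping the header parsing separate from the download also makes it easy to feed in headers from another HTTP client later on.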