For the life of me, I cannot figure out the entity handling in the
XML::Twig documentation.
    use HTML::Tidy;
    my $tidy = HTML::Tidy->new({
        'indent'          => 1,
        'break-before-br' => 1,
        'output-xhtml'    => 0,
        'output-xml'      => 1,
        'char-encoding'   => 'raw',
    });
    my $str = "foo&nbsp;bar";
    my $xml = $tidy->clean("<xml>$str</xml>");
produces:
    <html>
    <head>
    <meta content="tidyp for Linux (v1.02), see www.w3.org" name="generator" />
    <title></title>
    </head>
    <body>foo&nbsp;bar</body>
    </html>
XML::Twig (understandably) barfs on the &nbsp;. I want to do some transformations, so I run the result through XML::Twig:
    my $twig = XML::Twig->new( twig_handlers => {... handlers ...} );
    $twig->parse($xml);
It barfs at the $twig->parse line on the &nbsp;, but I can't figure out how to add the &nbsp; entity programmatically. I tried something like this:
    my $entity = XML::Twig::Entity->new("nbsp", "&#160;");
    $twig->entity_list->add($entity);
    $twig->parse($xml);
...but no joy.
Please help =)
Solution
    use strict;
    use XML::Twig;

    # Declare nbsp in the DOCTYPE internal subset so the parser can resolve &nbsp;
    my $doctype = '<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html [<!ENTITY nbsp "&#160;">]>';
    my $xml = '<html><head><meta content="tidyp for Linux (v1.02), see www.w3.org" name="generator" /><title></title></head><body>foo&nbsp;bar</body></html>';
    my $xTwig = XML::Twig->new();
    # safe_parse returns false instead of dying; the error is left in $@
    $xTwig->safe_parse($doctype . $xml) or die "Failure to parse XML : $@";
    print $xTwig->sprint();
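If more than one named entity turns up in the tidied output, the DOCTYPE prefix from the answer could be generated rather than typed by hand. The following is only a sketch using core Perl: `doctype_with_entities` is a hypothetical helper, not part of XML::Twig, and the `copy` entity is added purely for illustration. It assumes the document being prefixed does not already start with its own XML declaration or DOCTYPE.

```perl
use strict;
use warnings;

# Hypothetical helper (not an XML::Twig function): build an XML declaration
# plus a DOCTYPE whose internal subset declares each named entity as a
# numeric character reference.
sub doctype_with_entities {
    my ($root, %entities) = @_;
    # One <!ENTITY ...> declaration per name, sorted for stable output
    my $decls = join '',
        map { sprintf '<!ENTITY %s "&#%d;">', $_, $entities{$_} }
        sort keys %entities;
    return sprintf '<?xml version="1.0" encoding="utf-8"?><!DOCTYPE %s [%s]>',
        $root, $decls;
}

# nbsp is U+00A0 (160); copy is U+00A9 (169), included only as an example
my $prefix = doctype_with_entities('html', nbsp => 160, copy => 169);
print "$prefix\n";
```

The resulting string would then take the place of `$doctype` in the answer, for example `$xTwig->safe_parse($prefix . $xml)`.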