XPathFactoryImpl cannot find the root node when the XML document declares a namespace

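Problem description

I am new to XML and the Saxon API. I am using the Saxon 10.3 HE jar to extract data from an XML file. I want to extract the country value from the currently active country_information node, using date functions. Sample input XML: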

<person xmlns="urn:my.poctest.com">
    <country_information>
        <country>FRA</country>
        <end_date>9999-12-31</end_date>
        <start_date>2009-12-01</start_date>
    </country_information>
    <country_information>
        <country>FRA</country>
        <end_date>9999-12-31</end_date>
        <start_date>2009-12-01</start_date>
    </country_information>
</person>

Code

import java.io.IOException;
import java.io.StringReader;
import java.util.Iterator;
import java.util.Map;

import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFactoryConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

import net.sf.saxon.xpath.XPathFactoryImpl;

public class SaxonPoc {

    public static void main(String[] args) throws SAXException,IOException,ParserConfigurationException,XPathExpressionException,XPathFactoryConfigurationException {
        String xml = " <person xmlns=\"urn:my.poctest.com\">\r\n"
                + "       <country_information>\r\n"
                + "          <country>FRA</country>\r\n"
                + "          <end_date>9999-12-31</end_date>\r\n"
                + "          <start_date>2020-02-24</start_date>\r\n"
                + "       </country_information>\r\n" 
                + "       <country_information>\r\n"
                + "          <country>USA</country>\r\n"
                + "          <end_date>2020-02-23</end_date>\r\n"
                + "          <start_date>2009-12-01</start_date>\r\n"
                + "       </country_information>             \r\n" 
                + "       </person>";
        Document doc = SaxonPoc.getDocument(xml,false);
        NodeList matches = (NodeList) SaxonPoc.getXpathExpression("//person",null).evaluate(doc,XPathConstants.NODESET);
        if (matches != null) {
            Element node = (Element) matches.item(0);
            XPath xPath1 = SaxonPoc.getXpath(null);
            String xPathStatement = "/person/country_information[xs:date(start_date) le current-date() and  xs:date(end_date) ge current-date()]/country";
            NodeList childNodes = (NodeList) xPath1.evaluate(xPathStatement,node,XPathConstants.NODESET);
            if (childNodes.getLength() > 0) {
                String nodeName = childNodes.item(0).getFirstChild().getNodeName();
                System.out.println("Node :" + nodeName);
                String value = childNodes.item(0).getTextContent();
                System.out.println("Country Name :" + value);
            }

        }
        System.out.println("Finished");

    }

    public static Document getDocument(String xml,boolean isNamespaceAware)
            throws SAXException,ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(isNamespaceAware);
        DocumentBuilder builder = factory.newDocumentBuilder();
        InputSource is = new InputSource(new StringReader(xml));
        return builder.parse(is);
    }

    public static XPath getXpath(Map<String,String> namespaceMappings) throws XPathFactoryConfigurationException {
        XPathFactory xpathFactory = new XPathFactoryImpl();
        XPath xpath = xpathFactory.newXPath();
        NamespaceContext nsc = new NamespaceContext() {

            @Override
            public String getNamespaceURI(String prefix) {
                return (null != namespaceMappings) ? namespaceMappings.get(prefix) : null;
            }

            @Override
            public String getPrefix(String namespaceURI) {
                return null;
            }

            @Override
            public Iterator getPrefixes(String namespaceURI) {
                return null;
            }

        };
        xpath.setNamespaceContext(nsc);

        return xpath;
    }

    public static XPathExpression getXpathExpression(String xpathExpr,Map<String,String> namespaceMappings)
            throws XPathExpressionException,XPathFactoryConfigurationException {
        XPath xpath = getXpath(namespaceMappings);
        return xpath.compile(xpathExpr);
    }

}

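I get a NullPointerException because the root node person cannot be found in the XML document. If I remove xmlns="urn:my.poctest.com", the root path is found, but a later step fails with javax.xml.xpath.XPathExpressionException: net.sf.saxon.trans.XPathException: Namespace prefix 'xs' has not been declared. If I remove both the namespace from the XML document and the NamespaceContext implementation, it works fine. In practice, though, I don't want to remove either of them.

Can someone point out what I am doing wrong? Thanks in advance!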

Solution

You might like to know that the latest version of Saxon includes the option of doing

// xpath is the javax.xml.xpath.XPath instance created by Saxon's XPathFactoryImpl
((net.sf.saxon.xpath.XPathEvaluator) xpath).getStaticContext()
    .setUnprefixedElementMatchingPolicy(
       UnprefixedElementMatchingPolicy.ANY_NAMESPACE);

This causes an unprefixed element name in the XPath expression to match on local name alone, regardless of namespace.

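This was introduced primarily for HTML, where there is total confusion about whether the elements in an HTML DOM are in a namespace or not; but it is useful more generally in cases where you really don't care about namespaces and just wish they weren't there to make your life a misery.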
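
To make the namespace behaviour concrete, here is a minimal sketch that uses only the JDK's built-in XPath 1.0 engine rather than Saxon, so it illustrates the matching rules but not xs:date or current-date(). The class name NamespaceXPathSketch, the prefix p, and the helper methods are hypothetical names chosen for illustration. It parses the document with namespace awareness switched on and compares three queries: an unprefixed path, a path whose prefix is bound through a NamespaceContext, and a path that matches on local-name() only, which is roughly the effect the ANY_NAMESPACE policy gives you.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NamespaceXPathSketch {

    static final String NS = "urn:my.poctest.com";

    // Reduced version of the sample document from the question.
    static final String XML =
          "<person xmlns=\"urn:my.poctest.com\">"
        + "<country_information><country>FRA</country></country_information>"
        + "<country_information><country>USA</country></country_information>"
        + "</person>";

    // Parse with namespace awareness on, so elements keep their namespace URI.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Binds the illustrative prefix "p" to the document's namespace.
    static NamespaceContext context() {
        return new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "p".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String namespaceURI) {
                return NS.equals(namespaceURI) ? "p" : null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                String prefix = getPrefix(namespaceURI);
                return prefix == null
                        ? Collections.<String>emptyIterator()
                        : Collections.singletonList(prefix).iterator();
            }
        };
    }

    // Counts the nodes selected by an XPath 1.0 expression.
    static int count(Document doc, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(context());
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        // Unprefixed names mean "no namespace" in XPath 1.0: prints 0
        System.out.println(count(doc, "/person/country_information/country"));
        // Prefix bound to urn:my.poctest.com: prints 2
        System.out.println(count(doc, "/p:person/p:country_information/p:country"));
        // Local-name match, ignoring the namespace: prints 2
        System.out.println(count(doc,
                "/*[local-name()='person']/*[local-name()='country_information']/*[local-name()='country']"));
    }
}
```

If you keep your own NamespaceContext with Saxon, it would presumably also have to return http://www.w3.org/2001/XMLSchema for the prefix xs, since the implementation in the question returns null for every prefix when no mappings are passed in, which is consistent with the "xs has not been declared" error.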
