Reading and writing UTF-8 documents with Xerces-C++

Problem description

I can't switch from UTF-16 to UTF-8 when reading and saving XML documents with Xerces-C++ (version 3.1.1).

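I am trying to read a UTF-8 XML string through a TCHAR (wchar) buffer, but it always fails to load and I don't know why. I suspect the problem is in how I handle the TCHAR data (the code below is a summary of what I do):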

    reader->impl = DOMImplementationRegistry::getDOMImplementation(XMLString::transcode("LS"));
    reader->parser = ((DOMImplementationLS*)reader->impl)->createLSParser(DOMImplementationLS::MODE_SYNCHRONOUS,0);

    if (reader->parser->getDomConfig()->canSetParameter(XMLUni::fgDOMValidate,false))
            reader->parser->getDomConfig()->setParameter(XMLUni::fgDOMValidate,false);
    if(reader->parser->getDomConfig()->canSetParameter(XMLUni::fgDOMWRTWhitespaceInElementContent,false))
            reader->parser->getDomConfig()->setParameter(XMLUni::fgDOMWRTWhitespaceInElementContent,false);

    //create input dominputsource from the XML string
    DOMLSInput* dis = NULL;
    XMLCh * xmlch_string =  transcode(xml_string);
    unsigned int str_len  = XMLString::stringLen(xmlch_string);
    XMLByte * bytes_of_string = (XMLByte *) xmlch_string; 
    dis = new Wrapper4InputSource(
                new MemBufInputSource(bytes_of_string,4*str_len,TEXT("buffer_for_reading_xml")));

    try {
            reader->document = reader->parser->parse(dis); // document = XERCES_CPP_NAMESPACE::DOMDocument*
    }catch (...) {
            cout << "Exception parsing the document. \n";
            return false;
    }

After parsing the DOMLSInput the document is empty; I can't see anything in it.
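
A likely reason the parse comes back empty: `transcode` yields `XMLCh` data, which is UTF-16, so the buffer handed to `MemBufInputSource` holds UTF-16 bytes rather than UTF-8, and its size in bytes would be `str_len * sizeof(XMLCh)` rather than `4*str_len`. If the document is meant to be UTF-8, one option is to give the parser real UTF-8 bytes. The sketch below does that conversion with only the standard library. `utf16ToUtf8` is a hypothetical helper name, not part of Xerces, and it assumes the text is available as a `std::u16string`. Xerces 3.x also ships a `TranscodeToStr` class that can serve the same purpose.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Xerces API): convert UTF-16 code units to UTF-8 bytes.
// Unpaired surrogates are replaced with U+FFFD.
std::string utf16ToUtf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate if present.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD; // low surrogate without a preceding high surrogate
        }

        // Encode the code point as 1 to 4 UTF-8 bytes.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The resulting string's `data()` (cast to `const XMLByte*`) and `size()` could then be passed to `MemBufInputSource` as the buffer and its byte count. The point of the sketch is that the length argument counts bytes of the encoding actually stored in the buffer, not characters.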

When saving the XML document I always get UTF-16, and I don't know how to change that. This is what I'm doing:

    XERCES_CPP_NAMESPACE::DOMDocument*  doc;
    TCHAR *         str             = NULL;
    bool            result          = false;
    XMLWriter *     writer          = NULL;
    size_t          len             = 0;
    XMLCh *         xmlch_string    = NULL;

    writer->impl = DOMImplementationRegistry::getDOMImplementation(XMLString::transcode("LS"));
    writer->m_Document = impl->createDocument();

    writer->setSystem(m_System);
    doc = writer->m_Document;
    result = writer->write(doc,roots);

    ////////////////////////////////////
    // One of the write functions of writer->write:

    DOMElement*         rootNode;
    const TCHAR *   checkout_id;
    rootNode = doc->createElement(ROOT_TAG);
    try{doc->appendChild(rootNode);}
    catch(...){return false;}

    rootNode->setAttribute(XML_ID_ATTRIBUTE,transcode(m_System->system_id()));

    checkout_id = getSystemCheckoutId();
    rootNode->setAttribute(XML_CHECKOUT_ID_ATTRIBUTE,transcode(checkout_id));
    //add xml version to the root
    rootNode->setAttribute(XML_FORMAT_VERSION_ATTRIBUTE,g_FORMAT_VERSION);
    ////////////////////////////////////
    try
    {
        // get a serializer, an instance of DOMLSSerializer
        /* DOMLSSerializer is used here in place of the older DOMWriter class;
        the two differ in how the pointer is initialised and how the output is produced */
        XMLCh tempStr[256];
        XMLString::transcode("LS",tempStr,99);
        DOMImplementation *impl = DOMImplementationRegistry::getDOMImplementation(tempStr);
        DOMLSSerializer *theSerializer = ((DOMImplementationLS*)impl)->createLSSerializer();
        

        // set feature if the serializer supports the feature/mode
        if (theSerializer->getDomConfig()->canSetParameter(XMLUni::fgDOMWRTSplitCdataSections,false))
            theSerializer->getDomConfig()->setParameter(XMLUni::fgDOMWRTSplitCdataSections,false);

        if (theSerializer->getDomConfig()->canSetParameter(XMLUni::fgDOMWRTFormatPrettyPrint,true))
            theSerializer->getDomConfig()->setParameter(XMLUni::fgDOMWRTFormatPrettyPrint,true);

        // write to an XMLCh string
        xmlch_string = theSerializer->writeToString(doc);
