PDFBox: ArrayIndexOutOfBoundsException from TrueTypeFont.getUnicodeCmap

Problem description

I am using PDFBox to render a PDF file to images, but the TrueTypeFont.getUnicodeCmap method throws an ArrayIndexOutOfBoundsException. The cmapTable is empty, so cmapTable.getCmaps()[0] is out of bounds. Here is the call stack:

java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmap(TrueTypeFont.java:566)
at org.apache.pdfbox.pdmodel.font.PDCIDFontType2.<init>(PDCIDFontType2.java:183)
at org.apache.pdfbox.pdmodel.font.PDCIDFontType2.<init>(PDCIDFontType2.java:70)
at org.apache.pdfbox.pdmodel.font.PDFontFactory.createDescendantFont(PDFontFactory.java:125)
at org.apache.pdfbox.pdmodel.font.PDType0Font.<init>(PDType0Font.java:128)
at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:83)
at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:123)
at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:60)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:815)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:472)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:446)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:149)
at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:189)
at org.apache.pdfbox.rendering.PDFRenderer.renderPage(PDFRenderer.java:208)
at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:139)
at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:80)

This is the getUnicodeCmap method in PDFBox:

public CmapSubtable getUnicodeCmap(boolean isStrict) throws IOException
{
    CmapTable cmapTable = getCmap();
    if (cmapTable == null)
    {
        if (isStrict)
        {
            throw new IOException("The TrueType font does not contain a 'cmap' table");
        }
        else
        {
            return null;
        }
    }

    CmapSubtable cmap = cmapTable.getSubtable(CmapTable.PLATFORM_UNICODE, CmapTable.ENCODING_UNICODE_2_0_FULL);
    if (cmap == null)
    {
        cmap = cmapTable.getSubtable(CmapTable.PLATFORM_UNICODE, CmapTable.ENCODING_UNICODE_2_0_BMP);
    }
    if (cmap == null)
    {
        cmap = cmapTable.getSubtable(CmapTable.PLATFORM_WINDOWS, CmapTable.ENCODING_WIN_UNICODE_BMP);
    }
    if (cmap == null)
    {
        // Microsoft's "Recommendations for OpenType Fonts" says that "Symbol" encoding
        // actually means "Unicode, non-standard character set"
        cmap = cmapTable.getSubtable(CmapTable.PLATFORM_WINDOWS, CmapTable.ENCODING_WIN_SYMBOL);
    }
    if (cmap == null)
    {
        if (isStrict)
        {
            throw new IOException("The TrueType font does not contain a Unicode cmap");
        }
        else
        {
            // fallback to the first cmap (may not be Unicode, so may produce poor results)
            cmap = cmapTable.getCmaps()[0];
        }
    }
    return cmap;
}
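
The exception is thrown in the non-strict fallback at the end of the method, where the code indexes [0] into the array returned by getCmaps() without checking its length. To illustrate just that failure mode, here is a minimal sketch using plain Java arrays only. It does not use any PDFBox types; the class name and both helper methods are hypothetical stand-ins, and the guarded version is one possible defensive pattern, not the library's actual fix.

```java
class CmapFallbackSketch {

    // Hypothetical helper mirroring the unguarded fallback `getCmaps()[0]`:
    // throws ArrayIndexOutOfBoundsException when the array is empty.
    static <T> T firstUnchecked(T[] items) {
        return items[0];
    }

    // Hypothetical guarded variant: returns null when there is nothing to fall back to,
    // leaving it to the caller to handle a missing cmap.
    static <T> T firstOrNull(T[] items) {
        return (items == null || items.length == 0) ? null : items[0];
    }

    public static void main(String[] args) {
        // Stands in for an empty result from getCmaps().
        String[] noSubtables = new String[0];

        try {
            firstUnchecked(noSubtables);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unguarded access failed: " + e.getMessage());
        }

        System.out.println("guarded access returned: " + firstOrNull(noSubtables));
    }
}
```

With the guarded variant the caller receives null instead of an exception, so it would still have to decide what to do with a font that has no usable cmap subtable.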

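I found that the file contains a font with a custom encoding, and the PDFBox code has a helpful comment: "fallback to the first cmap (may not be Unicode, so may produce poor results)". So I suspect the custom font encoding is what causes this problem. Is that right?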

pdf fonts encoding
