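Apache POI - Removing external workbook references

Problem description

Our system receives Excel files. After downloading them, I need to remove every reference to external workbooks for security reasons.

I wrote the following function to remove the external links and to clear the contents of any cell that references an external workbook: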

public static byte[]  cleanFile(byte[] excelByteArrayworkbook) throws IOException {
ByteArrayOutputStream bos = new ByteArrayOutputStream();

try (InputStream targetStream = new ByteArrayInputStream(excelByteArrayworkbook)){
    try (XSSFWorkbook workbook = new XSSFWorkbook(targetStream)) {

        XSSFEvaluationWorkbook evalWorkbook = XSSFEvaluationWorkbook.create((XSSFWorkbook) workbook);

        Iterator<Sheet> sheetIterator = workbook.iterator();
        int i = 0;
        while (sheetIterator.hasNext()) {
            Sheet sheet = sheetIterator.next();
            EvaluationSheet evalSheet = evalWorkbook.getSheet(i);
            i++;
            for (Row row : sheet) {
                for (Cell cell : row) {
                    
                    if (cell.getCellType() == Cell.CELL_TYPE_FORMULA) {
                        EvaluationCell evaluationCell = evalSheet.getCell(cell.getRowIndex(),cell.getColumnIndex());
                        try {
                            Ptg[] formulaTokens = evalWorkbook.getFormulaTokens(evaluationCell);

                            for (Ptg formulaToken : formulaTokens) {
                                int externalSheetIndex = -1;
                                if (formulaToken instanceof Ref3DPtg) {
                                    Ref3DPtg refToken = (Ref3DPtg) formulaToken;
                                    externalSheetIndex = refToken.getExternSheetIndex();
                                } else if (formulaToken instanceof Area3DPtg) {
                                    Area3DPtg refToken = (Area3DPtg) formulaToken;
                                    externalSheetIndex = refToken.getExternSheetIndex();
                                } else if (formulaToken instanceof Ref3DPxg) {
                                    Ref3DPxg refToken = (Ref3DPxg) formulaToken;
                                    externalSheetIndex = refToken.getExternalWorkbookNumber();
                                } else if (formulaToken instanceof Area3DPxg) {
                                    Area3DPxg refToken = (Area3DPxg) formulaToken;
                                    externalSheetIndex = refToken.getExternalWorkbookNumber();
                                }

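                                // Clear the formula if this token points at an external workbook,
                                // or if the formula text still contains an external-book marker.
                                if (externalSheetIndex >= 0 || cell.getCellFormula().contains("[")) {
                                    cell.setCellFormula(null);
                                    // The cell no longer holds a formula, so stop scanning its tokens;
                                    // calling getCellFormula() on it again would throw.
                                    break;
                                }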
                            }
                        } catch (Exception e) {
                            cell.setCellFormula(null);
                        }

                    }

                }
            }
        }

        List<ExternalLinksTable> links = workbook.getExternalLinksTable();
        links.forEach(link -> {
            if(link.getCTExternalLink().isSetDdeLink())
                link.getCTExternalLink().unsetDdeLink();
            
            if(link.getCTExternalLink().isSetExtLst())
                link.getCTExternalLink().unsetExtLst();
            
            if(link.getCTExternalLink().isSetoleLink())
                link.getCTExternalLink().unsetoleLink();
            
            if(link.getCTExternalLink().isSetExternalBook())
                link.getCTExternalLink().unsetExternalBook();
            
            link.getCTExternalLink().setNil();
        });
         
        // ByteArrayOutputStream.close() is a no-op, so no explicit close is needed.
        workbook.write(bos);
    }
}

return bos.toByteArray();

}

However, the file I get back is invalid: Excel shows an error message when opening it.

I am using Apache POI 3.13. I could not find an example showing how to remove external links.
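When Excel rejects a generated file like this, listing the parts inside the .xlsx package can help narrow down what still points at an external workbook. The sketch below uses only java.util.zip and relies on the usual OOXML layout, in which external link parts are typically stored under xl/externalLinks/. The class name ExternalLinkParts and the method find are illustrative and not part of Apache POI.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical diagnostic helper; not part of Apache POI.
public class ExternalLinkParts {

    /**
     * Returns the names of all zip entries under xl/externalLinks/
     * (assumed to be where external link parts live in the package).
     */
    public static List<String> find(byte[] xlsxBytes) throws IOException {
        List<String> parts = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(xlsxBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().startsWith("xl/externalLinks/")) {
                    parts.add(entry.getName());
                }
            }
        }
        return parts;
    }
}
```

Calling ExternalLinkParts.find(...) on the bytes returned by cleanFile shows whether any external link parts are still present in the output. It only lists part names and does not inspect formulas or defined names.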

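Solution

I found the answer. The code was not removing the references held in the Name Manager (the workbook's defined names).

I added the following to the function: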

        List<XSSFName> nameUsingExternalBook = new ArrayList<XSSFName>();
        for (int j = 0; j < workbook.getNumberOfNames(); j++) {
            XSSFName name= workbook.getNameAt(j);
            if(name.getRefersToFormula() != null && name.getRefersToFormula().contains("[")) {
                nameUsingExternalBook.add(name);
            }
            
        }
        
        nameUsingExternalBook.forEach(name->{
            workbook.removeName(name.getNameName());
        });
        

With this change, the new Excel file is well-formed.
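One caveat about the contains("[") test used for both cell formulas and defined names: square brackets also appear in structured table references such as Table1[Amount] and inside string literals, so the test can flag formulas that have no external reference at all. A narrower check might look like the sketch below. It is a regex heuristic rather than a formula parser, the class and method names are hypothetical, and it can still misjudge unusual sheet names (for example, names that contain quote characters).

```java
import java.util.regex.Pattern;

// Hypothetical helper; a heuristic, not a formula parser and not part of Apache POI.
public class ExternalRefHeuristic {

    // Double-quoted string literals; a quote inside a literal is written as "".
    private static final Pattern STRING_LITERAL =
            Pattern.compile("\"(?:[^\"]|\"\")*\"");

    // A bracketed workbook marker followed by a sheet name and '!':
    //   unquoted: [1]Sheet1!A1 or [Book.xlsx]Sheet1!A1
    //   quoted:   '[1]My Sheet'!A1 (optionally with a path before the bracket)
    private static final Pattern EXTERNAL_REF = Pattern.compile(
            "\\[[^\\[\\]]+\\][\\p{L}\\p{N}_.:]*!"
            + "|'[^']*\\[[^\\[\\]]+\\][^']*'!");

    /** Returns true if the formula text appears to reference another workbook. */
    public static boolean looksExternal(String formula) {
        if (formula == null) {
            return false;
        }
        // Blank out string literals so brackets inside them are ignored.
        String withoutLiterals = STRING_LITERAL.matcher(formula).replaceAll("\"\"");
        return EXTERNAL_REF.matcher(withoutLiterals).find();
    }
}
```

If this were used, the condition name.getRefersToFormula().contains("[") could be replaced by ExternalRefHeuristic.looksExternal(name.getRefersToFormula()). For a security-driven cleanup, the broader contains("[") test may still be the safer default, since it errs on the side of removing too much.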