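Problem description
We need to write Excel files to MarkLogic and read them back, but reading an Excel file from MarkLogic throws an exception.
We want to pass the retrieved file to XSSFWorkbook.java from Apache POI.
I tried the following code to write the Excel file to MarkLogic and then read it back: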
DatabaseClient client = databaseClientService.getContentClient();
String contains = new String(Files.readAllBytes(Paths.get("src/test/resources/TestExcelEntity.xlsx")));
BytesHandle bytesHandle = new BytesHandle();
bytesHandle.setMimetype("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
bytesHandle.setFormat(Format.BINARY);
bytesHandle.set(contains.getBytes());
BinaryDocumentManager manager = client.newBinaryDocumentManager();
manager.writeAs("/test/binaryDoc.xlsx",bytesHandle);
FileHandle fileHandle = new FileHandle();
fileHandle.setMimetype("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
fileHandle.setFormat(Format.BINARY);
File file = manager.read("/test/binaryDoc.xlsx",fileHandle).get();
XSSFWorkbook workbook = new XSSFWorkbook(file)
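I can see the downloaded file in the temp location, but when I open it, Excel reports "The file is corrupt and cannot be opened". The same error appears when the file is downloaded from QConsole.
Because "/test/binaryDoc.xlsx" was not downloaded/read correctly, XSSFWorkbook.java fails with the following exception: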
org.apache.poi.openxml4j.exceptions.InvalidOperationException: Can't open the specified file input stream from file: 'C:\Users\SHIVLI~1\AppData\Local\Temp\tmp9485717536946276215.vnd.openxmlformats-officedocument.spreadsheetml.sheet'
at org.apache.poi.openxml4j.opc.ZipPackage.openZipEntrySourceStream(ZipPackage.java:162)
at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:149)
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:277)
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:186)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:325)
at com.ucbos.appdata.MLSample.test(MLSample.java:55)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.springframework.test.context.junit4.statements.RunBeforeTestExecutionCallbacks.evaluate(RunBeforeTestExecutionCallbacks.java:74)
at org.springframework.test.context.junit4.statements.RunAfterTestExecutionCallbacks.evaluate(RunAfterTestExecutionCallbacks.java:84)
at org.springframework.test.context.junit4.statements.RunBeforeTestMethodCallbacks.evaluate(RunBeforeTestMethodCallbacks.java:75)
at org.springframework.test.context.junit4.statements.RunAfterTestMethodCallbacks.evaluate(RunAfterTestMethodCallbacks.java:86)
at org.springframework.test.context.junit4.statements.SpringRepeat.evaluate(SpringRepeat.java:84)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:251)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:97)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61)
at org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:70)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:190)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:220)
at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:53)
Caused by: java.io.FileNotFoundException: C:\Users\SHIVLI~1\AppData\Local\Temp\tmp9485717536946276215.vnd.openxmlformats-officedocument.spreadsheetml.sheet (The system cannot find the file specified)
at java.base/java.io.FileInputStream.open0(Native Method)
at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
at org.apache.poi.openxml4j.opc.ZipPackage.openZipEntrySourceStream(ZipPackage.java:159)
... 35 more
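Update: I tried reading the document as a byte[] with BytesHandle and then writing it to the file system, but I still get the same error, "The file is corrupt and cannot be opened".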
BytesHandle readHandle = new BytesHandle();
readHandle.setMimetype("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
readHandle.setFormat(Format.BINARY);
readHandle.set(BYTES_BINARY);
byte[] file = manager.read("/test/binaryDoc.xlsx",readHandle).get();
File outputFile = new File("outputFile.xlsx");
OutputStream os = new FileOutputStream(outputFile);
os.write(file);
os.close();
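Can anyone help me solve this?
Solution
From the description, the problem seems to be that retrieving the document and writing it to the file system does not work, since the file is reported as corrupt. I am not a Java developer, but it looks like you are handling the document as if it were a regular text document rather than a binary one. For binary documents, it seems you need to either stream the binary or use com.marklogic.client.io.BytesHandle.
Reading Content From A Binary Document shows several examples. The following one looks closest to what you are trying to do: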
byte[] buf = docMgr.read(docID,new BytesHandle()).get();
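I would also suggest not passing the document to XSSFWorkbook.java until you have verified that a valid file is saved to the temp location, to simplify troubleshooting.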
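One cheap way to do that verification, sketched with the JDK only: an .xlsx file is a ZIP package, so a valid one should begin with the ZIP local-file-header signature (the bytes 'P', 'K', 0x03, 0x04). The class and method names below are made up for illustration. Passing this check does not prove the workbook is valid, but failing it shows the bytes were damaged before POI ever saw them.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of MarkLogic or Apache POI.
class XlsxSanityCheck {

    // True if the data starts with the ZIP local-file-header signature "PK\u0003\u0004".
    static boolean startsWithZipSignature(byte[] data) {
        return data.length >= 4
                && data[0] == 0x50   // 'P'
                && data[1] == 0x4B   // 'K'
                && data[2] == 0x03
                && data[3] == 0x04;
    }

    // Reads only the first four bytes of the file and checks the signature.
    static boolean looksLikeZip(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return startsWithZipSignature(in.readNBytes(4)); // readNBytes(int) needs Java 11+
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java XlsxSanityCheck <path-to-downloaded-file>
        Path file = Paths.get(args[0]);
        System.out.println(file + " starts with ZIP signature: " + looksLikeZip(file));
    }
}
```

Running it against the temp file (or against outputFile.xlsx from the update) tells you whether the damage is already present in the stored bytes, independent of XSSFWorkbook.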
If you only need to read/write xlsx files, use the class below.
It represents the input stream as bytes instead of reading the binary file as a String.
InputStreamHandle handle = new InputStreamHandle();
handle.set(docStream);
docMgr.write(uri,handle);
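To illustrate why the detour through a String matters, here is a minimal JDK-only sketch (the class name and the sample bytes are hypothetical). It mimics the new String(bytes) / getBytes() pair from the question, but with an explicit UTF-8 charset so the result does not depend on the platform default charset that the question's code relies on.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; it does not talk to MarkLogic at all.
class StringRoundTripDemo {

    // Mimics the question's write path: raw bytes -> String -> bytes again.
    // Byte sequences that are not valid in the charset are replaced during
    // decoding, so the original values cannot be recovered afterwards.
    static byte[] viaString(byte[] original) {
        String decoded = new String(original, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // ZIP signature followed by three bytes that are not valid UTF-8.
        byte[] original = {0x50, 0x4B, 0x03, 0x04, (byte) 0xFF, (byte) 0xFE, (byte) 0x80};
        byte[] roundTripped = viaString(original);

        System.out.println("original length:   " + original.length);
        System.out.println("round-trip length: " + roundTripped.length);
        System.out.println("identical:         " + Arrays.equals(original, roundTripped));
    }
}
```

Compressed xlsx content is full of such byte values, which would explain a document that is already corrupt inside MarkLogic. Passing the byte[] from Files.readAllBytes(...) straight to the handle, or using a stream as above, avoids the decode step entirely.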
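Before going any further, assert that the written data, the control flow, and the conditions are valid.
Verification options:
- Use a Java binary package (a common tool in test frameworks) to assert that the written input is not lost: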
> Task :fc-financial-asset:TypedWriteReadStreamTest.main()
Document /dmsdk/FXD.xlsx write completed.
Assert /dmsdk/FXD.xlsx Input Stream and File BYTE –
InputStream /dmsdk/FXD.xlsx bytes:
11614
Calculate /dmsdk/FXD.xlsx byte array:
11614
Read /dmsdk/FXD.xlsx file bytes:
11614
- Rename tmp*****.spreadsheetml.sheet to tmp*****.spreadsheetml.xlsx; you should then be able to open a valid Excel file.
- Save the document from QConsole, or verify it there.
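The log above compares byte counts only, and equal counts do not guarantee equal content. A small sketch of a stricter check, assuming you have both the bytes you wrote and the bytes you read back available as files or arrays (class and method names are hypothetical, JDK only):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper for comparing what was written with what was read back.
class ByteIntegrityCheck {

    // Hex-encoded SHA-256 digest, handy for logging alongside the byte count.
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    // Byte-for-byte comparison; stricter than comparing lengths.
    static boolean sameContent(byte[] written, byte[] readBack) {
        return Arrays.equals(written, readBack);
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ByteIntegrityCheck <source-file> <file-read-back>
        byte[] written = Files.readAllBytes(Paths.get(args[0]));
        byte[] readBack = Files.readAllBytes(Paths.get(args[1]));
        System.out.println("source:    " + written.length + " bytes, sha256=" + sha256Hex(written));
        System.out.println("read back: " + readBack.length + " bytes, sha256=" + sha256Hex(readBack));
        System.out.println("identical: " + sameContent(written, readBack));
    }
}
```

If the two digests differ while the source file opens fine in Excel, the corruption happened on the write or read path rather than in POI.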