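Problem description
I am trying to upgrade my code to the iText7 library. I previously used iTextSharp, but iText7 appears to be a completely new API. When I try to read a PDF document, I get a "PDF header not found" exception. Here is my code: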
byte[] bytes = System.Convert.FromBase64String(UploadedFileByes);
MemoryStream memory = new MemoryStream(bytes);
BinaryReader BRreader = new BinaryReader(memory);
StringBuilder text = new StringBuilder();
iText.Kernel.Pdf.PdfReader iTextReader = new iText.Kernel.Pdf.PdfReader(memory);
iText.Kernel.Pdf.PdfDocument pdfDoc = new iText.Kernel.Pdf.PdfDocument(new iText.Kernel.Pdf.PdfReader(memory));
int numberofpages = pdfDoc.GetNumberOfPages();
for (int page = 1; page <= numberofpages; page++) {
    iText.Kernel.Pdf.Canvas.Parser.Listener.ITextExtractionStrategy strategy = new iText.Kernel.Pdf.Canvas.Parser.Listener.SimpleTextExtractionStrategy();
    string currentText = iText.Kernel.Pdf.Canvas.Parser.PdfTextExtractor.GetTextFromPage(pdfDoc.GetPage(page), strategy);
    currentText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(
        Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(currentText)));
    text.Append(currentText);
}
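What am I doing wrong?
Solution
I found the solution. I passed the PdfReader I had already defined to PdfDocument instead of creating a new one. Here is the code; I hope it helps others.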
byte[] bytes = System.Convert.FromBase64String(UploadedFileByes);
MemoryStream memory = new MemoryStream(bytes);
BinaryReader BRreader = new BinaryReader(memory);
StringBuilder text = new StringBuilder();
iText.Kernel.Pdf.PdfReader iTextReader = new iText.Kernel.Pdf.PdfReader(memory);
iText.Kernel.Pdf.PdfDocument pdfDoc = new iText.Kernel.Pdf.PdfDocument(iTextReader);
int numberofpages = pdfDoc.GetNumberOfPages();
for (int page = 1; page <= numberofpages; page++) {
    iText.Kernel.Pdf.Canvas.Parser.Listener.ITextExtractionStrategy strategy = new iText.Kernel.Pdf.Canvas.Parser.Listener.SimpleTextExtractionStrategy();
    string currentText = iText.Kernel.Pdf.Canvas.Parser.PdfTextExtractor.GetTextFromPage(pdfDoc.GetPage(page), strategy);
    currentText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(
        Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(currentText)));
    text.Append(currentText);
}
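A likely reason the version in the question failed (an assumption on my part; the post itself does not confirm it) is that the first PdfReader had already read the MemoryStream to its end, so the second reader created on the same stream found no bytes and therefore no %PDF header. The sketch below uses only System.IO, not iText, to show that a drained stream yields nothing on a second read until its Position is reset. StreamPositionDemo and ReadAll are hypothetical names made up for this illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamPositionDemo
{
    // Hypothetical helper: drains the stream from its current position
    // and returns how many bytes were read.
    public static int ReadAll(Stream stream)
    {
        var buffer = new byte[4096];
        int total = 0;
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            total += read;
        }
        return total;
    }

    public static void Main()
    {
        // Stand-in bytes for illustration, not a real PDF.
        byte[] bytes = Encoding.ASCII.GetBytes("%PDF-1.7 placeholder");
        var memory = new MemoryStream(bytes);

        int first = ReadAll(memory);   // consumes everything
        int second = ReadAll(memory);  // position is already at the end

        memory.Position = 0;           // rewind to the start
        int third = ReadAll(memory);   // the bytes are readable again

        Console.WriteLine(first == bytes.Length);  // True
        Console.WriteLine(second == 0);            // True
        Console.WriteLine(third == bytes.Length);  // True
    }
}
```

If two readers over the same bytes were really needed, rewinding the stream or creating a second MemoryStream over bytes might be worth trying; reusing the single reader, as in the fix above, sidesteps the question entirely.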