Problem description
I am using GemBox.Email and GemBox.Document to convert emails to PDF.
This is my code:
static void Main()
{
    MailMessage message = MailMessage.Load("input.eml");
    DocumentModel document = new DocumentModel();
    if (!string.IsNullOrEmpty(message.BodyHtml))
        document.Content.LoadText(message.BodyHtml, LoadOptions.HtmlDefault);
    else
        document.Content.LoadText(message.BodyText, LoadOptions.TxtDefault);
    document.Save("output.pdf");
}
This code works for EML files, but not for MSG files (both MailMessage.BodyHtml and MailMessage.BodyText are empty).
How can I make it work for MSG files as well?
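Solution
The problem occurs with specific MSG files that do not have HTML content inside the RTF body; instead, they have a raw RTF body.
The MailMessage class currently does not expose an API for the RTF body (only for the plain text and HTML bodies). Nevertheless, you can retrieve it as an Attachment named "Body.rtf".
As an FYI, another problem is that the images in the email's HTML body are not inlined, so they would be lost when exporting to PDF.
Anyway, try using the following: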
using System.Linq;
using System.Text.RegularExpressions;
using GemBox.Document;
using GemBox.Email;
using GemBox.Email.Mime;

static void Main()
{
    // Load an email (or retrieve it with POP or IMAP).
    MailMessage message = MailMessage.Load("input.msg");
    // Create a new document.
    DocumentModel document = new DocumentModel();
    // Import the email's body to the document.
    LoadBody(message, document);
    // Save the document as PDF.
    document.Save("output.pdf");
}
static void LoadBody(MailMessage message, DocumentModel document)
{
    if (!string.IsNullOrEmpty(message.BodyHtml))
    {
        var htmlOptions = LoadOptions.HtmlDefault;
        // Replace attached CID images with inlined DATA URLs.
        var htmlBody = ReplaceEmbeddedImages(message.BodyHtml, message.Attachments);
        // Load HTML body to the document.
        document.Content.End.LoadText(htmlBody, htmlOptions);
    }
    else if (message.Attachments.Any(a => a.FileName == "Body.rtf"))
    {
        var rtfAttachment = message.Attachments.First(a => a.FileName == "Body.rtf");
        var rtfOptions = LoadOptions.RtfDefault;
        // Get RTF body from the attachment.
        var rtfBody = rtfOptions.Encoding.GetString(rtfAttachment.Data.ToArray());
        // Load RTF body to the document.
        document.Content.End.LoadText(rtfBody, rtfOptions);
    }
    else
    {
        // Load TXT body to the document.
        document.Content.End.LoadText(message.BodyText, LoadOptions.TxtDefault);
    }
}
static string ReplaceEmbeddedImages(string htmlBody, AttachmentCollection attachments)
{
    var srcPattern =
        "(?<=<img.+?src=[\"'])" +
        "(.+?)" +
        "(?=[\"'].*?>)";
    // Iterate through the "src" attributes from HTML images in reverse order,
    // so that earlier match indexes stay valid after each replacement.
    foreach (var match in Regex.Matches(htmlBody, srcPattern, RegexOptions.IgnoreCase).Cast<Match>().Reverse())
    {
        var imageId = match.Value.Replace("cid:", "");
        Attachment attachment = attachments.FirstOrDefault(a => a.ContentId == imageId);
        if (attachment != null)
        {
            // Create inlined image data. E.g. "..."
            ContentEntity entity = attachment.MimeEntity;
            var embeddedImage = entity.Charset.GetString(entity.Content);
            var embeddedSrc = $"data:{entity.ContentType};{entity.TransferEncoding},{embeddedImage}";
            // Replace the "src" attribute with the inlined image.
            htmlBody = $"{htmlBody.Substring(0, match.Index)}{embeddedSrc}{htmlBody.Substring(match.Index + match.Length)}";
        }
    }
    return htmlBody;
}
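To make the CID-to-data-URL idea behind ReplaceEmbeddedImages easier to see in isolation, here is a minimal sketch that uses only the .NET base class library and no GemBox types. The CidInliner class, the InlineCidImages method, and the dictionary of images keyed by content ID are hypothetical names introduced for illustration; the sketch also assumes the raw image bytes are available and re-encodes them as base64, which is a simplification of what the code above does with the MIME entity.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of GemBox: rewrites src="cid:<id>" references
// into base64 data URLs using an in-memory map of known images.
public static class CidInliner
{
    public static string InlineCidImages(
        string html,
        IReadOnlyDictionary<string, (string MimeType, byte[] Data)> images)
    {
        // Captures the opening quote (group 1) and the content ID (group 2);
        // \1 requires the same kind of quote to close the attribute value.
        const string pattern = "src=([\"'])cid:(.+?)\\1";

        return Regex.Replace(html, pattern, m =>
        {
            var quote = m.Groups[1].Value;
            var id = m.Groups[2].Value;

            // Leave the reference untouched when no image with that ID is known.
            if (!images.TryGetValue(id, out var image))
                return m.Value;

            var base64 = Convert.ToBase64String(image.Data);
            return $"src={quote}data:{image.MimeType};base64,{base64}{quote}";
        }, RegexOptions.IgnoreCase);
    }
}
```

For example, with an image registered under the ID "logo@1" as image/png with the bytes 1, 2, 3, the input <img src="cid:logo@1"> becomes <img src="data:image/png;base64,AQID">, while a reference to an unknown ID is returned unchanged. Using Regex.Replace with a match evaluator avoids the manual index bookkeeping that the reverse iteration above takes care of.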
For more information (for example, how to add the email's headers and attachments to the output PDF), see the Convert Email to PDF example.