How do I use MsgExtractor in a for loop?

Problem description

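I am trying to use the module "extract_msg" to write a function that extracts each message of the conversation contained in a single email and appends them to a .csv file. (The Python module I am using comes from https://github.com/TeamMsgExtractor/msg-extractor)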

This Python library extracts the whole thread as one email body. Does anyone have experience solving this? Many thanks.

import extract_msg
import re

def extract_msgs(msgfile, extract_path):
    """
    msgfile: path to the .msg file
    extract_path: the directory where attachments from the email are saved
    """
    ext_ret = 0
    file_name_list = {}
    msg = extract_msg.Message(msgfile)
    subject = msg.subject
    msg_date = msg.date
    cc = msg.cc

    # Collect the addresses inside <...>; fall back to the raw "To" field if there are none.
    receiver = re.findall(r'<(.*?)>', msg.to or '')
    if receiver:
        receiver = ','.join(receiver)
    else:
        receiver = msg.to
    try:
        sender = re.findall(r'<(.*?)>', msg.sender)[0]
    except (IndexError, TypeError):
        # No <address> part, or the sender field is missing.
        sender = msg.sender
    body = msg.body

    msg_attachment = msg.attachments
    if msg_attachment:
        for attachment in msg_attachment:
            attachment.save(customPath=extract_path)
            file_name_list[attachment.longFilename] = attachment.longFilename
        ext_ret = 1
    return body

I call the function like this:

msgfile="example.msg"
extract_path='example_directory'
extract_msgs(msgfile,extract_path)

The body I get back looks like this:

Dear XXX,XXXXXX

Regards,XXXX

From: XXX <[email protected]>
Date: date2
To: XXXX <[email protected]>
Subject: sub2

Email 2

From: XXX <[email protected]>
Date: date3
To: XXXX <[email protected]>
Subject: sub3

Email 3

How can I extract the information in this form instead:

Date   Sender   Receiver   Subject   Body
date1  sender1  receiver1  sub1      body1
...

Solution

No confirmed solution to this problem has been found yet.
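
As a starting point rather than a confirmed fix, one possible approach is to split the returned body on the quoted header blocks and write one row per message. The sketch below assumes every quoted message begins with four consecutive lines in the order From:, Date:, To:, Subject:, exactly as in the sample body above; real mail clients use other layouts and localized labels, so the pattern would need adjusting. The names HEADER_BLOCK, FIELDS, split_thread, append_rows and msg_files_to_csv are hypothetical helpers, not part of extract_msg.

```python
import csv
import os
import re

# Assumption: a quoted header block is four consecutive lines in this order.
HEADER_BLOCK = re.compile(
    r"^From:[ \t]*(?P<sender>.*)\r?\n"
    r"Date:[ \t]*(?P<date>.*)\r?\n"
    r"To:[ \t]*(?P<receiver>.*)\r?\n"
    r"Subject:[ \t]*(?P<subject>.*)$",
    re.MULTILINE,
)
FIELDS = ["Date", "Sender", "Receiver", "Subject", "Body"]


def split_thread(body, first_headers):
    """Split one thread body into row dicts, in the order they appear.

    first_headers holds Date/Sender/Receiver/Subject for the top message,
    which has no quoted header block of its own inside the body.
    """
    rows = []
    headers = dict(first_headers)
    start = 0
    for match in HEADER_BLOCK.finditer(body):
        # Text before this header block belongs to the previous message.
        rows.append({**headers, "Body": body[start:match.start()].strip()})
        headers = {
            "Date": match.group("date").strip(),
            "Sender": match.group("sender").strip(),
            "Receiver": match.group("receiver").strip(),
            "Subject": match.group("subject").strip(),
        }
        start = match.end()
    # Whatever follows the last header block is the oldest message.
    rows.append({**headers, "Body": body[start:].strip()})
    return rows


def append_rows(rows, csv_path):
    """Append rows to csv_path; write the header line only for a new or empty file."""
    new_file = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if new_file:
            writer.writeheader()
        writer.writerows(rows)


def msg_files_to_csv(msg_paths, csv_path):
    """Process several .msg files in a for loop (needs the extract_msg package)."""
    import extract_msg  # imported here so the helpers above work without it

    for path in msg_paths:
        msg = extract_msg.Message(path)
        first = {
            "Date": msg.date,
            "Sender": msg.sender,
            "Receiver": msg.to,
            "Subject": msg.subject,
        }
        append_rows(split_thread(msg.body or "", first), csv_path)
        msg.close()
```

Calling msg_files_to_csv(["example.msg"], "threads.csv") would then append one row per message found in the thread. The sender and receiver values of quoted messages are kept as they appear in the body; the address-only extraction from extract_msgs could be applied to them afterwards if needed.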
