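Problem description
I am trying to extract a zip file attachment from an email. My current code can fetch the zip file, but from there I am lost on how to open the zip file, supply its password, and convert the .xlsx file inside it into a pandas DataFrame.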
import json
import os
import boto3
import email
import utils.mail_operations as mail


def lambda_handler(event, context):
    region = os.environ["AWS_REGION"]
    msg_id = event["messageId"]
    email_content = mail.download_email(msg_id, region)
    attachment_data = mail.parse_mail_attachments(email_content)
    return {
        'statusCode': 200,
        'body': 'Success'
    }
Below is the mail_operations.py file, which contains the operations for downloading the email and getting the zip file.
import email
import boto3
import logging
import os
from botocore.exceptions import ClientError

# Module-level logger; the original snippet used `logger` without defining it
logger = logging.getLogger(__name__)


def download_email(message_id, region_name):
    workmail_message_flow = boto3.client('workmailmessageflow', region_name=region_name)
    response = None
    try:
        response = workmail_message_flow.get_raw_message_content(messageId=message_id)
    except ClientError as e:
        if e.response['Error']['Code'] == 'ResourceNotFoundException':
            logger.error(
                f"Message {message_id} does not exist. "
                "Messages in transit are no longer accessible after 1 day. "
                "See: https://docs.aws.amazon.com/workmail/latest/adminguide/lambda-content.html "
                "for more details."
            )
        # Re-raise so the caller never continues with response == None
        raise
    email_content = response['messageContent'].read()
    return email_content


def parse_mail_attachments(email_content):
    """Parse raw email bytes and return a dictionary mapping attachment
    filenames to their data in bytes.

    Args:
        email_content (bytes): email content in bytes
    Returns:
        attachment_data: dictionary with the filename as key and the data as bytes.
    """
    attachment_data = {}
    # Get the actual message
    msg = email.message_from_bytes(email_content)
    # Walk through the message
    for part in msg.walk():
        # Multipart containers hold other parts, not attachment data
        if part.get_content_maintype() == 'multipart':
            continue
        # Parts without a Content-Disposition header are not attachments
        if part.get('Content-disposition') is None:
            continue
        # Get the filename of the attachment
        filename = part.get_filename()
        attachment_data[filename] = part.get_payload(decode=True)
    return attachment_data
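Possible approach
Since parse_mail_attachments already returns the zip file as bytes, one way to continue is to open those bytes in memory with the standard-library zipfile module, pass the password when reading each member, and hand the resulting .xlsx bytes to pandas. The sketch below is only an illustration under several assumptions: the helper names (extract_zip_members, xlsx_bytes_to_dataframe, dataframes_from_attachments) are hypothetical and not part of the original code, the archive is assumed to use the legacy ZipCrypto encryption (the only kind the standard zipfile module can decrypt; AES-encrypted archives would need a third-party package such as pyzipper), and pandas plus openpyxl are assumed to be bundled with the Lambda function, for example in a layer.

```python
import io
import zipfile


def extract_zip_members(zip_bytes, password, suffix=".xlsx"):
    """Return {member_name: bytes} for zip members whose name ends in `suffix`.

    Hypothetical helper. `password` is a str; zipfile expects bytes,
    so it is encoded before use.
    """
    members = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(suffix):
                # pwd is ignored for members that are not encrypted
                members[name] = archive.read(name, pwd=password.encode("utf-8"))
    return members


def xlsx_bytes_to_dataframe(xlsx_bytes):
    """Load workbook bytes into a DataFrame (first sheet by default)."""
    # Imported lazily so the zip helper works without pandas installed.
    # Assumption: pandas and openpyxl are available in the Lambda package.
    import pandas as pd
    return pd.read_excel(io.BytesIO(xlsx_bytes), engine="openpyxl")


def dataframes_from_attachments(attachment_data, password):
    """Convert every .xlsx inside every .zip attachment to a DataFrame."""
    frames = {}
    for filename, data in attachment_data.items():
        # Skip attachments without a filename and anything that is not a zip
        if filename and filename.lower().endswith(".zip"):
            for member, xlsx_bytes in extract_zip_members(data, password).items():
                frames[member] = xlsx_bytes_to_dataframe(xlsx_bytes)
    return frames
```

In lambda_handler, this could be called right after parse_mail_attachments, for example as dataframes_from_attachments(attachment_data, password), where the password might come from an environment variable or a secrets store rather than being hard-coded. Working entirely in memory avoids writing to the Lambda file system, though a very large archive may call for extracting to /tmp instead.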