How to convert an entry from a .csv file back into an image

Question

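I'm trying to save image data to a .csv file using PIL's Image, but I'm stuck on converting it back into an image. It keeps giving me the error "AttributeError: 'str' object has no attribute '__array_interface__'".

I suppose this means the entry I pull out of the .csv file is a string and needs to be converted to an array?

The code that converts the image to a NumPy array and writes it to the file looks like this: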

from PIL import Image
import numpy as np
img = np.array(Image.open(fig_file))

file_name = 'data.csv'
row_contents = [labels,img]

from csv import writer
def append_list_as_row(file_name,list_of_elem):
    # Open file in append mode
    with open(file_name,'a+',newline='') as write_obj:
        # Create a writer object from csv module
        csv_writer = writer(write_obj)
        # Add contents of list as last row in the csv file
        csv_writer.writerow(list_of_elem)

append_list_as_row(file_name,row_contents)

The problematic part (converting it back into an image) looks like this:

import pandas as pd
df1 = pd.read_csv(file_name)
fig_array = df1.loc[1,"img"]
img = Image.fromarray(fig_array,'RGB')
img.save('test.png')

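The Image.fromarray line causes the error. Maybe I shouldn't be using pandas to look up the entry? Any ideas for a fix? I tried .to_numpy() and it didn't work.

Thank you very much!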

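Answer

First, if at all possible, don't do this. It's too expensive. Just create a table (DataFrame) that records the label associated with each file, which you can query later. For example: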

| file_id | file_path | label     |
|---------|-----------|-----------|
| 1       | a.jpg     | fine-arts |
| 2       | b.png     | manga     |
| 3       | c.jpg     | whatever  |
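
A minimal sketch of such a lookup table, using only the standard library csv module. The helper names `write_table` and `paths_with_label` and the file name `labels.csv` are hypothetical, chosen just for illustration. The images stay on disk and only their paths and labels go into the csv:

```python
import csv
import io

# Lookup table: one row per image file; the pixel data itself stays on disk.
rows = [
    {"file_id": 1, "file_path": "a.jpg", "label": "fine-arts"},
    {"file_id": 2, "file_path": "b.png", "label": "manga"},
]

def write_table(f, rows):
    """Write the lookup table with a header row (hypothetical helper)."""
    w = csv.DictWriter(f, fieldnames=["file_id", "file_path", "label"])
    w.writeheader()
    w.writerows(rows)

def paths_with_label(f, label):
    """Return the paths of all files carrying the given label."""
    return [row["file_path"] for row in csv.DictReader(f) if row["label"] == label]

buf = io.StringIO()  # stands in for open("labels.csv", "w", newline="")
write_table(buf, rows)
buf.seek(0)
manga_paths = paths_with_label(buf, "manga")
print(manga_paths)
```

An image is then loaded only when it is actually needed, by opening the path found in the table.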

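If you really must encode images as strings, base64 encoding is the common way to do it. For example, Jupyter Notebook uses base64 to embed images in HTML files so that users can easily share the resulting images.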
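
As a rough sketch of that idea: one option (my assumption, not something the question's code does) is to base64-encode the bytes of the image file itself rather than the decoded pixel array. The result is plain ASCII, so it fits in a csv field, and because the PNG or JPEG file already records width, height and mode, no separate shape needs to be stored. The helper names below are hypothetical.

```python
import base64

def file_bytes_to_text(raw: bytes) -> str:
    """Encode raw file bytes as an ASCII base64 string."""
    return base64.b64encode(raw).decode("ascii")

def text_to_file_bytes(text: str) -> bytes:
    """Decode a base64 string back into the original bytes."""
    return base64.b64decode(text)

# The 8-byte PNG signature stands in for the contents of a real image file,
# which you would normally get from open(path, "rb").read().
raw = b"\x89PNG\r\n\x1a\n"
text = file_bytes_to_text(raw)
restored = text_to_file_bytes(text)
print(text)
```

With Pillow, the restored bytes of a real file can be opened again via `Image.open(io.BytesIO(restored))`.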

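Second, saving (label, data) pairs as a csv file is still not advisable because of the column-width limits of spreadsheet software. If you can't take advantage of the .csv format, why use it? So in this case it's better to build the lookup table mentioned above and avoid an unnecessary, expensive conversion.

If you still want to do it anyway, here is some sample code. The small image comes from the Debian homepage. You can verify that the data is restored correctly.

Code: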

import numpy as np
from PIL import Image
import base64
import csv

# https://www.debian.org/Pics/openlogo-50.png
img_path = "/mnt/ramdisk/debian-openlogo-50.png"

img = np.array(Image.open(img_path))
img_encoded = base64.b64encode(img).decode("ascii")
label = "fine-arts"

# Simulate multiple records
data = [
    [label,img_encoded],[label,img_encoded]
]

# save
with open("/mnt/ramdisk/sav.csv","w+",newline="") as f:
    w = csv.writer(f)
    w.writerows(data)

# load
data_loaded = []
with open("/mnt/ramdisk/sav.csv",newline="") as f:
    r = csv.reader(f)
    for row in r:
        data_loaded.append(row)

# check data are unchanged after save/load
assert len(data) == len(data_loaded)
for i in range(len(data)):
    for j in range(2):
        assert data[i][j] == data_loaded[i][j]

# decode the image (still need shape info)
r = base64.b64decode(data_loaded[0][1].encode("ascii"))
img_decoded = np.frombuffer(r,dtype=np.uint8).reshape((61,50,4))

# check image is restored correctly
import matplotlib.pyplot as plt
plt.imshow(img_decoded)
plt.show()

However, with a larger image, such as the Mona Lisa, the csv reader complains:

_csv.Error: field larger than field limit (131072)

And you still need the image shape to restore the dimensions. So in practice you need a third column that stores the image shape.
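
Both points can be sketched together. This is only an illustration under a few assumptions: the row layout (label, shape, dtype, data), the "x"-separated shape string, and the helper names `encode_row` and `decode_row` are my own choices, and a synthetic array stands in for a real image. The csv module's `csv.field_size_limit` is raised so the large field can be read back.

```python
import base64
import csv
import io

import numpy as np

# Raise the per-field limit (131072 characters by default) so that a large
# base64 string can be read back. 2**31 - 1 is an arbitrary large value.
csv.field_size_limit(2**31 - 1)

def encode_row(label, img):
    """Return [label, shape, dtype, base64 pixels] for one image array."""
    shape = "x".join(str(n) for n in img.shape)
    data = base64.b64encode(img.tobytes()).decode("ascii")
    return [label, shape, str(img.dtype), data]

def decode_row(row):
    """Rebuild (label, array) from a row produced by encode_row."""
    label, shape, dtype, data = row
    dims = tuple(int(n) for n in shape.split("x"))
    pixels = np.frombuffer(base64.b64decode(data), dtype=np.dtype(dtype))
    return label, pixels.reshape(dims)

# Synthetic 256x256 RGB image standing in for np.array(Image.open(path)).
# Its base64 form is 262144 characters, more than the default field limit.
img = (np.arange(256 * 256 * 3) % 256).astype(np.uint8).reshape(256, 256, 3)

buf = io.StringIO()  # stands in for a file opened with newline=""
csv.writer(buf).writerow(encode_row("fine-arts", img))
buf.seek(0)
label, img_restored = decode_row(next(csv.reader(buf)))
print(label, img_restored.shape, np.array_equal(img, img_restored))
```

Storing the dtype alongside the shape is an extra precaution; for ordinary 8-bit images it will always be uint8.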
