Splitting txt data into columns in Python

Question

I have a .txt dataset in the following format:

01/01/2018 ['cat','bear','ant']
01/02/2018 ['horse','wolf','elephant']
01/03/2018 ['parrot','bird','fish']

I want to use Python to turn it into two columns, like this:

  'Date'       'Animal'

01/01/2018       cat
01/01/2018       bear
01/01/2018       ant
01/02/2018       horse
01/02/2018       wolf
01/02/2018       elephant
01/03/2018       parrot
01/03/2018       bird
01/03/2018       fish

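(The actual txt file is much longer; it is simplified here to make it easier to follow.) I am not sure how to proceed: should I use read_csv, or open (but then it would be read as an object)? Should I set a delimiter? I have tried several things, but nothing has worked.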

Thanks in advance.

Solution

Use pandas to build the table:

import ast

import pandas as pd

dates = []
animals = []

# Read the file lines
with open('file.txt', 'r') as f:
    lines = f.readlines()

for line in lines:
    line = line.strip()
    # Skip blank lines (e.g. an empty line at the end of the file)
    if not line:
        continue
    # Split the date from the animals list at the first space
    date_string, animals_string = line.split(' ', maxsplit=1)
    # Safely evaluate the animals list literal
    animals_list = ast.literal_eval(animals_string)
    # Repeat the date once for each animal on that line
    dates.extend([date_string] * len(animals_list))
    # Append the animals
    animals.extend(animals_list)

# Create a dataframe from the dates and animals
df = pd.DataFrame({'Date': dates, 'Animal': animals})

# Print the dataframe
print(df)

Output:

         Date    Animal
0  01/01/2018       cat
1  01/01/2018      bear
2  01/01/2018       ant
3  01/02/2018     horse
4  01/02/2018      wolf
5  01/02/2018  elephant
6  01/03/2018    parrot
7  01/03/2018      bird
8  01/03/2018      fish
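
The parsing step in the answer above could also be written as a small generator that yields one (date, animal) pair at a time, so there are no parallel lists to keep in sync. This is only a minimal sketch: the name iter_rows and the in-memory sample list are illustrative assumptions, not part of the original answer, and it assumes every non-blank line has the form "date, space, Python list literal".

```python
import ast


def iter_rows(lines):
    """Yield (date, animal) pairs from lines like "01/01/2018 ['cat','bear']"."""
    for line in lines:
        line = line.strip()
        if not line:
            # Skip blank lines, e.g. an empty line at the end of the file
            continue
        # Split only at the first space, so spaces inside the list are kept
        date_string, animals_string = line.split(' ', maxsplit=1)
        # literal_eval parses Python literals only; it does not run arbitrary code
        for animal in ast.literal_eval(animals_string):
            yield date_string, animal


# Hypothetical sample standing in for the lines of file.txt
sample = [
    "01/01/2018 ['cat','bear','ant']\n",
    "\n",
    "01/02/2018 ['horse','wolf','elephant']\n",
]
rows = list(iter_rows(sample))
print(rows[:2])
```

An open file object can be passed in place of sample, since iterating over it yields lines. The resulting list of tuples can then go to pd.DataFrame(rows, columns=['Date', 'Animal']) to get the same two-column table.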