Split a DataFrame column using a second column as the delimiter

Problem description

I want to split one column into two, using the value of a second column in the same row as the split delimiter.

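I get the error TypeError: 'Series' objects are mutable, thus they cannot be hashed. That makes sense, since str.split receives a whole Series rather than a single value, but I'm not sure how to isolate the second column's value for each individual row.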

Sample data:

    title_location                    delimiter
0   Doctor - ABC - Los Angeles,CA    - ABC -
1   Lawyer - ABC - Atlanta,GA        - ABC -
2   Athlete - XYZ - Jacksonville,FL  - XYZ -

Code

bigdata[['title','location']] = bigdata['title_location'].str.split(bigdata['delimiter'],expand=True)

Desired output

    title_location                    delimiter    title    location
0   Doctor - ABC - Los Angeles,CA    - ABC -      Doctor   Los Angeles,CA
1   Lawyer - ABC - Atlanta,GA        - ABC -      Lawyer   Atlanta,GA
2   Athlete - XYZ - Jacksonville,FL  - XYZ -      Athlete  Jacksonville,FL

Solution

Let's try zip to pair each string with its row's delimiter, then join the result back onto the original frame

df = df.join(pd.DataFrame([x.split(y) for x,y in zip(df.title_location,df.delimiter)],index=df.index,columns=['Title','Location']))
df
Out[200]: 
                     title_location delimiter     Title           Location
0    Doctor - ABC - Los Angeles,CA   - ABC -   Doctor     Los Angeles,CA
1        Lawyer - ABC - Atlanta,GA   - ABC -   Lawyer         Atlanta,GA
2  Athlete - XYZ - Jacksonville,FL   - XYZ -  Athlete    Jacksonville,FL
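For anyone who wants to run the zip approach end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and assumes the delimiter values include the surrounding spaces (" - ABC - "), which the printed table does not make clear. The maxsplit argument of 1 is my addition, so a row never produces more than two parts; the names parts, split_df and result are just illustrative.

```python
import pandas as pd

# Sample data copied from the question. Assumption: the delimiter strings
# include the spaces around them, e.g. " - ABC - ".
df = pd.DataFrame({
    "title_location": [
        "Doctor - ABC - Los Angeles,CA",
        "Lawyer - ABC - Atlanta,GA",
        "Athlete - XYZ - Jacksonville,FL",
    ],
    "delimiter": [" - ABC - ", " - ABC - ", " - XYZ - "],
})

# Pair each string with the delimiter from the same row and split it
# with plain Python str.split. maxsplit=1 caps the result at two parts.
parts = [
    text.split(sep, 1)
    for text, sep in zip(df["title_location"], df["delimiter"])
]

# Build a two-column frame on the original index and join it back.
split_df = pd.DataFrame(parts, index=df.index, columns=["Title", "Location"])
result = df.join(split_df)
print(result)
```

Passing index=df.index matters when the frame does not have a default 0..n-1 index, because join aligns rows on the index.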

Try apply, which passes one row at a time to the function so each row's own delimiter is available

bigdata[['title','location']]=bigdata.apply(func=lambda row: row['title_location'].split(row['delimiter']),axis=1,result_type="expand")
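If the delimiter column is stored without the surrounding spaces (for example "- ABC -"), the split pieces keep stray whitespace. Below is a rough sketch of the apply approach that strips it. The split_row helper is hypothetical, not part of pandas, and the unpadded delimiter values are an assumption about the data.

```python
import pandas as pd

# Sample data from the question. Assumption: delimiters are stored
# WITHOUT the surrounding spaces.
bigdata = pd.DataFrame({
    "title_location": [
        "Doctor - ABC - Los Angeles,CA",
        "Lawyer - ABC - Atlanta,GA",
        "Athlete - XYZ - Jacksonville,FL",
    ],
    "delimiter": ["- ABC -", "- ABC -", "- XYZ -"],
})


def split_row(row):
    """Hypothetical helper: split one row on its own delimiter.

    Raises ValueError if the delimiter is not found in the string,
    because the unpacking below expects exactly two parts.
    """
    title, location = row["title_location"].split(row["delimiter"], 1)
    # Remove the whitespace left behind by the unpadded delimiter.
    return [title.strip(), location.strip()]


# axis=1 calls split_row once per row; result_type="expand" turns the
# returned two-item list into two columns.
bigdata[["title", "location"]] = bigdata.apply(
    split_row, axis=1, result_type="expand"
)
print(bigdata)
```

apply with axis=1 runs a Python function per row, so on large frames the zip-based list comprehension is usually the faster of the two.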