How can I split column names and look up their dictionary definitions with WordNet?

Problem description

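I am trying to get dictionary definitions for the column names below, but the lookup only works when a name is a single word. How can I make it work for names that contain several words?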

Code

import pandas as pd
from nltk.corpus import wordnet

columns = ['Sector','Community Name','Date','Community Center Point']

tmp = []

for x in columns:
    # Look up the whole column name as a single WordNet entry
    syns = wordnet.synsets(x)
    tmp.append(syns[0].definition() if len(syns) > 0 else '')

Output

pd.DataFrame(tmp).T

                        0    1                               2      3
a plane figure bounded by       the specified day of the month  

Columns 1 and 3 are empty because 'Community Name' and 'Community Center Point' contain more than one word.

Desired output

                          0                                             1                                         2                                                                   3
Sector: [a plane figure...]   Community: [definition],Name: [definition]    Date: [the specified day of the month]  Community: [definition],Center: [definition],Point: [definition]

Solution

from nltk.corpus import wordnet

columns = ['Sector','Community Name','Date','Community Center Point']

col_defs = []
for item in columns:
    tmp = []
    # Look up each word of the column name separately
    for word in item.split():
        syns = wordnet.synsets(word)
        # Fall back to an empty definition so join() never receives None
        tmp.append(word + ': ' + (syns[0].definition() if syns else ''))
    col_defs.append(','.join(tmp))

for x in col_defs:
    print(x)

Output:

Sector: a plane figure bounded by two radii and the included arc of a circle
Community: a group of people living in a particular local area,Name: a language unit by which a person or thing is known
Date: the specified day of the month
Community: a group of people living in a particular local area,Center: an area that is approximately central within some larger region,Point: a geometric element that has position but no extension
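
If you want the bracketed `Word: [definition]` layout from the question, and want the loop to survive words that WordNet does not know, something along these lines might work. This is only a sketch: `define_columns`, `wordnet_lookup`, and the `lookup` and `missing` parameters are names made up for illustration and are not part of NLTK. The only NLTK calls used are `wordnet.synsets()` and `Synset.definition()`, and the WordNet corpus is assumed to be downloaded already.

```python
from typing import Callable, Iterable, List, Optional


def wordnet_lookup(word: str) -> Optional[str]:
    """Return the definition of the first WordNet synset for `word`, or None."""
    # Imported lazily so define_columns() can be used with another lookup
    # function even when NLTK is not installed.
    from nltk.corpus import wordnet

    syns = wordnet.synsets(word)
    return syns[0].definition() if syns else None


def define_columns(
    columns: Iterable[str],
    lookup: Callable[[str], Optional[str]] = wordnet_lookup,
    missing: str = "",
) -> List[str]:
    """Build one 'Word: [definition],Word: [definition]' string per column name."""
    col_defs = []
    for name in columns:
        parts = []
        for word in name.split():
            definition = lookup(word)
            # Keep the word even when nothing is found, so join() never sees None.
            if definition is None:
                definition = missing
            parts.append(f"{word}: [{definition}]")
        col_defs.append(",".join(parts))
    return col_defs


# Example call (needs the WordNet corpus):
# columns = ['Sector', 'Community Name', 'Date', 'Community Center Point']
# for line in define_columns(columns):
#     print(line)
```

Passing the lookup as a parameter is a design choice that lets you swap in a plain dictionary (for example `my_dict.get`) for testing or for domain-specific terms. Wrapping the result as `pd.DataFrame([define_columns(columns)])` should give the one-row, four-column layout shown in the question.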