How do I convert a Google Cloud Natural Language entity sentiment response to JSON / dict in Python?

Question

I am trying to use the Google Cloud Natural Language API to analyze entity sentiment.

from google.cloud import language_v1
import os

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/json'

client = language_v1.LanguageServiceClient()
text_content = 'Grapes are good. Bananas are bad.'

# Available types: PLAIN_TEXT, HTML
type_ = language_v1.Document.Type.PLAIN_TEXT

# Optional. If not specified, the language is automatically detected.
# For list of supported languages:
# https://cloud.google.com/natural-language/docs/languages
document = language_v1.Document(content=text_content, type_=type_)

# Available values: NONE, UTF8, UTF16, UTF32
encoding_type = language_v1.EncodingType.UTF8
response = client.analyze_entity_sentiment(request={'document': document, 'encoding_type': encoding_type})

Then I print the entities and their attributes from the response:

for entity in response.entities:
    print('=' * 20)
    print(type(entity))
    print(entity)

====================
<class 'google.cloud.language_v1.types.language_service.Entity'>
name: "Grapes"
type_: OTHER
salience: 0.8335162997245789
mentions {
  text {
    content: "Grapes"
  }
  type_: COMMON
  sentiment {
    magnitude: 0.8999999761581421
    score: 0.8999999761581421
  }
}
sentiment {
  magnitude: 0.8999999761581421
  score: 0.8999999761581421
}

====================
<class 'google.cloud.language_v1.types.language_service.Entity'>
name: "Bananas"
type_: OTHER
salience: 0.16648370027542114
mentions {
  text {
    content: "Bananas"
    begin_offset: 17
  }
  type_: COMMON
  sentiment {
    magnitude: 0.8999999761581421
    score: -0.8999999761581421
  }
}
sentiment {
  magnitude: 0.8999999761581421
  score: -0.8999999761581421
}

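Now I want to store the whole response in JSON or dict format so that I can store it in a database table or process it further. I tried following "converting Google Cloud NLP API entity sentiment output to JSON" and "How can I JSON serialize an object from google's natural language API? (No __dict__ attribute)", but neither worked.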

If I use

from google.protobuf.json_format import MessageToDict, MessageToJson
result_dict = MessageToDict(response)
result_json = MessageToJson(response)

I get an error:

>>> result_dict = MessageToDict(response)
Traceback (most recent call last):
  File "/Users/pmehta/Anaconda-3/anaconda3/envs/nlp_36/lib/python3.6/site-packages/proto/message.py", line 555, in __getattr__
    pb_type = self._Meta.fields[key].pb_type
KeyError: 'DESCRIPTOR'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/pmehta/Anaconda-3/anaconda3/envs/nlp_36/lib/python3.6/site-packages/google/protobuf/json_format.py", line 175, in MessageToDict
    return printer._MessageToJsonObject(message)
  File "/Users/pmehta/Anaconda-3/anaconda3/envs/nlp_36/lib/python3.6/site-packages/google/protobuf/json_format.py", line 209, in _MessageToJsonObject
    message_descriptor = message.DESCRIPTOR
  File "/Users/pmehta/Anaconda-3/anaconda3/envs/nlp_36/lib/python3.6/site-packages/proto/message.py", line 560, in __getattr__
    raise AttributeError(str(ex))
AttributeError: 'DESCRIPTOR'

How can I parse this response so that it converts correctly to JSON or a dict?

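Solution

As part of the google-cloud-language 2.0.0 migration, response messages are provided by proto-plus, which wraps the raw protobuf messages. ParseDict and MessageToDict are methods provided by protobuf, and because proto-plus wraps the raw messages, these protobuf methods can no longer be used on them directly.

Replace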

from google.protobuf.json_format import MessageToDict, MessageToJson
result_dict = MessageToDict(response)
result_json = MessageToJson(response)

with

import json
result_json = response.__class__.to_json(response)
result_dict = json.loads(result_json)
result_dict
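
Since the goal in the question is to store the result in a database table, here is a minimal sketch of flattening such a dict into rows. The sample dict, the `entity_rows` helper, and the `entity_sentiment` table are hypothetical; the keys mirror the fields printed in the question (`name`, `salience`, `sentiment`), and the real `result_dict` may contain more fields than are shown here.

```python
import json
import sqlite3

# Hypothetical sample shaped like the entities printed in the question.
# In practice this would be the result_dict produced from the API response.
result_dict = {
    "entities": [
        {"name": "Grapes", "salience": 0.83,
         "sentiment": {"magnitude": 0.9, "score": 0.9}},
        {"name": "Bananas", "salience": 0.17,
         "sentiment": {"magnitude": 0.9, "score": -0.9}},
    ]
}


def entity_rows(result):
    """Flatten each entity into a tuple; keep the full entity as a JSON string."""
    rows = []
    for entity in result.get("entities", []):
        sentiment = entity.get("sentiment", {})
        rows.append((
            entity.get("name"),
            entity.get("salience"),
            sentiment.get("score"),
            sentiment.get("magnitude"),
            json.dumps(entity),
        ))
    return rows


# An in-memory SQLite database stands in for the real target database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entity_sentiment "
    "(name TEXT, salience REAL, score REAL, magnitude REAL, raw_json TEXT)"
)
conn.executemany(
    "INSERT INTO entity_sentiment VALUES (?, ?, ?, ?, ?)",
    entity_rows(result_dict),
)
conn.commit()

for row in conn.execute(
    "SELECT name, score FROM entity_sentiment ORDER BY score DESC"
):
    print(row)
```

Keeping the full entity in a `raw_json` column is one possible design choice: the flat columns cover common queries, while nested data such as mentions stays available without a second table.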

tl;dr: The accepted solution is not a drop-in replacement. To restore the original behavior, you need to do the following:

from google.protobuf.json_format import MessageToDict
result_dict = MessageToDict(response.__class__.pb(response))

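Having gone through this myself, I want to point out that the behavior changes considerably when moving from MessageToDict to to_json. For MessageToDict, the parameters including_default_value_fields and use_integers_for_enums default to False; for to_json, they now default to True.

Read more here: https://github.com/googleapis/proto-plus-python/blob/5c14cbaf21e3864a247e0183480903e7640e5460/proto/message.py#L372

The official implementation of to_json is quoted here: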

def to_json(cls, instance, *, use_integers_for_enums=True) -> str:
    """Given a message instance, serialize it to json

    Args:
        instance: An instance of this message type, or something
            compatible (accepted by the type's constructor).
        use_integers_for_enums (Optional(bool)): An option that determines whether enum
            values should be represented by strings (False) or integers (True).
            Default is True.

    Returns:
        str: The json string representation of the protocol buffer.
    """
    return MessageToJson(
        cls.pb(instance),
        use_integers_for_enums=use_integers_for_enums,
        including_default_value_fields=True,
    )
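
Going by the signature quoted above, the enum representation can be chosen at call time. The following is a rough sketch of a small wrapper; `response_to_dict` is a hypothetical helper name, not part of proto-plus, and it assumes only that the message class exposes `to_json(instance, *, use_integers_for_enums=...)` as quoted. Default-valued fields are still included, since the quoted implementation always passes `including_default_value_fields=True`.

```python
import json


def response_to_dict(response, use_integers_for_enums=False):
    """Convert a proto-plus message to a plain dict via its class's to_json.

    Hypothetical helper. The default of False is chosen here so that enums
    come back as names, matching the old MessageToDict default.
    """
    message_cls = type(response)
    json_text = message_cls.to_json(
        response, use_integers_for_enums=use_integers_for_enums
    )
    return json.loads(json_text)
```

With the response from the question, `response_to_dict(response)` would then return a dict in which enum fields appear as strings rather than integers.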
