NLP neural network: maximum recursion depth exceeded / kernel death when concatenating tensors

Problem description

Update

I managed to solve the problem by creating a list containing both the features (reviews) and the labels (overall ratings), then using map/apply (if using a pandas DataFrame) to convert them into tensors. From there, I used TensorFlow's from_tensor_slices method to get the features/labels ready for training.

Original question

I am currently working on an NLP project to help me learn Python/TensorFlow. My program takes reviews, encodes them, converts them into tensors and tensor datasets, and feeds them into a neural network. The problem I run into is "RecursionError: maximum recursion depth exceeded while calling a Python object", which is caused by concatenating the tensors into a single tensor dataset.

The recursion error appears whenever I try to access elements from the dataset, either by iterating over the object or by training the network.

What I have tried:

If I reduce the total number of processed reviews from the original ~9,000 down to 1,500, everything works fine.

If I use

```python
import sys
sys.setrecursionlimit(10000)
```

then the Jupyter kernel dies instead of giving me the recursion error.

Relevant code (I think):

```python
#encode the text
encoded_reviews=[]
for j in trimmed_review:
    encoded_reviews.append(encoder.encode(j))

#creating tensorflow datasets for training
def labeler(review,rating):
    return review,rating

#pairing the labels (good/bad game) with the encoded reviews
encoded_review_rating_list=[]
for i,j in enumerate(encoded_reviews):
    encoded_review_dataset = tf.data.Dataset.from_tensors(tf.cast(j,dtype='int64'))
    encoded_review_rating_list.append(encoded_review_dataset.map(lambda x: labeler(x,ratings[i])))

#Combine the list of review:score sets into a single tensor dataset.
encoded_review_ratings = encoded_review_rating_list[0]
#test_var_tensor=tf.constant()
for single_dataset in encoded_review_rating_list[1:]:
    encoded_review_ratings=encoded_review_ratings.concatenate(single_dataset)

#Shuffle the datasets to avoid any biases.
buffer_size = len(encoded_reviews)
all_labeled_data = encoded_review_ratings.shuffle(
    buffer_size,reshuffle_each_iteration=False)

##Split the encoded words into training and test datasets; take_size is the amount of data that goes into the training set
training_ratio=0.6
take_size= round(len(encoded_reviews)*training_ratio)
batch_size=30

#Organizing our training and validation data; the padded shapes are set to the longest review (as specified by the None keywords)
train_data = all_labeled_data.take(take_size)
train_data = train_data.padded_batch(batch_size,padded_shapes=((None,),(1,)))

test_data = all_labeled_data.skip(take_size)
test_data = test_data.padded_batch(batch_size,padded_shapes=((None,),(1,)))
```
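Concatenating datasets one at a time like this builds a long chain of nested dataset objects, and iterating the result walks that chain recursively, which is why the error only shows up past a certain number of reviews. One way to keep variable-length reviews without the concatenate chain is a single `tf.data.Dataset.from_generator` dataset. A minimal sketch with hypothetical toy data standing in for the real `encoded_reviews`/`ratings` lists (note `output_signature` requires a recent TF 2.x; older versions use `output_types`/`output_shapes` instead):

```python
import tensorflow as tf

# Toy stand-ins for the real encoded_reviews / ratings lists (hypothetical data).
encoded_reviews = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
ratings = [1, 0, 1]

def gen():
    # Yield one (review, label) pair at a time; reviews stay variable-length.
    for review, rating in zip(encoded_reviews, ratings):
        yield review, [rating]

dataset = tf.data.Dataset.from_generator(
    gen,
    output_signature=(
        tf.TensorSpec(shape=(None,), dtype=tf.int64),
        tf.TensorSpec(shape=(1,), dtype=tf.int64),
    ),
)

# padded_batch pads each batch to its longest review, as in the original code.
batched = dataset.padded_batch(2, padded_shapes=((None,), (1,)))
```

Because there is only one dataset object, iteration never recurses through thousands of nested inputs, regardless of how many reviews the generator yields.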

Code that raises the error when accessing tensors in the dataset:

```python
next_feature,next_label = next(iter(test_data))

print (next_feature,next_label)
```



```
---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
<ipython-input-8-e941c005ed79> in <module>
----> 1 next_feature,next_label = next(iter(test_data))
      2 
      3 print (next_feature,next_label)

~\anaconda3\envs\tf-gpu\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py in __iter__(self)
    416     if (context.executing_eagerly()
    417         or ops.get_default_graph()._building_function):  # pylint: disable=protected-access
--> 418       return iterator_ops.OwnedIterator(self)
    419     else:
    420       raise RuntimeError("__iter__() is only supported inside of tf.function "

~\anaconda3\envs\tf-gpu\lib\site-packages\tensorflow_core\python\data\ops\iterator_ops.py in __init__(self,dataset,components,element_spec)
    592           context.context().device_spec.device_type != "CPU"):
    593         with ops.device("/cpu:0"):
--> 594           self._create_iterator(dataset)
    595       else:
    596         self._create_iterator(dataset)

~\anaconda3\envs\tf-gpu\lib\site-packages\tensorflow_core\python\data\ops\iterator_ops.py in _create_iterator(self,dataset)
    598   def _create_iterator(self,dataset):
    599     # pylint: disable=protected-access
--> 600     dataset = dataset._apply_options()
    601 
    602     # Store dataset reference to ensure that dataset is alive when this iterator

~\anaconda3\envs\tf-gpu\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py in _apply_options(self)
    356 
    357     dataset = self
--> 358     options = self.options()
    359     if options.experimental_threading is not None:
    360       t_options = options.experimental_threading

~\anaconda3\envs\tf-gpu\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py in options(self)
    347     options = Options()
    348     for input_dataset in self._inputs():
--> 349       input_options = input_dataset.options()
    350       if input_options is not None:
    351         options = options.merge(input_options)

... last 1 frames repeated,from the frame below ...

~\anaconda3\envs\tf-gpu\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py in options(self)
    347     options = Options()
    348     for input_dataset in self._inputs():
--> 349       input_options = input_dataset.options()
    350       if input_options is not None:
    351         options = options.merge(input_options)

RecursionError: maximum recursion depth exceeded while calling a Python object
```

Solution

Providing the solution here in the answer section for the benefit of the community. Thanks to @Accommodator for the update.

I managed to solve the problem by creating a list containing both the features (reviews) and the labels (overall ratings), then using map/apply (if using a pandas DataFrame) to convert them into tensors. From there, I used TensorFlow's from_tensor_slices method to get the features/labels ready for training.
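A minimal sketch of the approach described above, with hypothetical toy data in place of the real reviews and ratings: pad the encoded reviews into one rectangular array, then let a single from_tensor_slices call pair every feature with its label.

```python
import tensorflow as tf

# Hypothetical toy data standing in for the real reviews / overall ratings.
encoded_reviews = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
ratings = [1, 0, 1]

# Pad the encoded reviews to equal length so they form one rectangular array.
padded = tf.keras.preprocessing.sequence.pad_sequences(
    encoded_reviews, padding="post")

# A single from_tensor_slices call pairs every feature with its label;
# no per-review datasets and no concatenate chain, so no deep recursion.
dataset = tf.data.Dataset.from_tensor_slices((padded, ratings))
```

Since the whole collection becomes one dataset in a single call, iterating or training over it touches no nested inputs, no matter how many reviews there are.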
