TF Universal Sentence Encoder runs out of memory

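Problem description

I am using TensorFlow's Universal Sentence Encoder (https://tfhub.dev/google/universal-sentence-encoder/4) to train a model that computes the similarity between texts. When I reduce the number of text files in the large_data directory, the program runs fine. With all of the text files in place, however, the program crashes. Here is my code: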

import pandas as pd
import tensorflow_hub as hub
import numpy as np

module_url = 'https://tfhub.dev/google/universal-sentence-encoder/4' # Pre-trained model URL
model = hub.load(module_url)

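# text_files is a list of file names without the .txt extension; it is built elsewhere in the script (not shown)
documents = [open('large_data/' + f + '.txt').read() for f in text_files]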

message_embeddings = model(documents)
corr = np.inner(message_embeddings,message_embeddings)

df = pd.DataFrame(data=corr,index=documents,columns=documents)
df.to_csv('matrix.csv')

The large_data folder contains about 20K text files, and they are large, about 5K words each.

Here are some of the logs:

2021-05-21 13:11:51.172674: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices,tf_xla_enable_xla_devices not set
2021-05-21 13:11:51.172973: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX
To enable them in other operations,rebuild TensorFlow with the appropriate compiler flags.
2021-05-21 13:11:51.175407: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2021-05-21 13:11:54.710993: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-05-21 13:11:55.186770: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] cpu Frequency: 2593520000 Hz
sys:1: DtypeWarning: Columns (105,106) have mixed types.Specify dtype option on import or set low_memory=False.
2021-05-21 13:14:41.195210: W tensorflow/core/common_runtime/bfc_allocator.cc:433] Allocator (mklcpu) ran out of memory trying to allocate 16.76GiB (rounded to 17995920128)requested by op StatefulPartitionedCall/StatefulPartitionedCall/text_preprocessor/add_bigrams/concat
Current allocation summary follows.
2021-05-21 13:14:41.195296: I tensorflow/core/common_runtime/bfc_allocator.cc:972] BFCAllocator dump for mklcpu
2021-05-21 13:14:41.195316: I tensorflow/core/common_runtime/bfc_allocator.cc:979] Bin (256):   Total Chunks: 0,Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-05-21 13:14:41.195330: I tensorflow/core/common_runtime/bfc_allocator.cc:979] Bin (512):   Total Chunks: 0,Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-05-21 13:14:41.195380: I tensorflow/core/common_runtime/bfc_allocator.cc:979] Bin (1024):  Total Chunks: 0,Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-05-21 13:14:41.195401: I tensorflow/core/common_runtime/bfc_allocator.cc:979] Bin (2048):  Total Chunks: 0,Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.

.....

2021-05-21 13:14:41.195846: I tensorflow/core/common_runtime/bfc_allocator.cc:1001]   Size: 893.18MiB | Requested Size: 4.2KiB | in_use: 0 | bin_num: 20,prev:   Size: 32.55MiB | Requested Size: 32.55MiB | in_use: 1 | bin_num: -1
2021-05-21 13:14:41.195868: I tensorflow/core/common_runtime/bfc_allocator.cc:1001]   Size: 2.00GiB | Requested Size: 1.60GiB | in_use: 0 | bin_num: 20
2021-05-21 13:14:41.195891: I tensorflow/core/common_runtime/bfc_allocator.cc:1001]   Size: 2.08GiB | Requested Size: 0B | in_use: 0 | bin_num: 20,prev:   Size: 16.76GiB | Requested Size: 16.76GiB | in_use: 1 | bin_num: -1
2021-05-21 13:14:41.195913: I tensorflow/core/common_runtime/bfc_allocator.cc:1001]   Size: 4.00GiB | Requested Size: 2.41GiB | in_use: 0 | bin_num: 20
2021-05-21 13:14:41.195937: I tensorflow/core/common_runtime/bfc_allocator.cc:1001]   Size: 15.24GiB | Requested Size: 0B | in_use: 0 | bin_num: 20,prev:   Size: 16.76GiB | Requested Size: 16.76GiB | in_use: 1 | bin_num: -1

Finally, the error:

2021-05-21 13:14:41.200222: W tensorflow/core/common_runtime/bfc_allocator.cc:441] *****************************___*****************************___________________________________*_**
2021-05-21 13:14:41.200275: W tensorflow/core/framework/op_kernel.cc:1763] OP_REQUIRES Failed at concat_op.cc:158 : Resource exhausted: OOM when allocating tensor with shape[10000,74983] and type string on /job:localhost/replica:0/task:0/device:cpu:0 by allocator mklcpu
Traceback (most recent call last):
  File "/root/petar/main.py",line 15,in <module>
    message_embeddings = model(documents)
  File "/root/miniconda3/envs/pipeline/lib/python3.9/site-packages/tensorflow/python/saved_model/load.py",line 668,in _call_attribute
    return instance.__call__(*args,**kwargs)
  File "/root/miniconda3/envs/pipeline/lib/python3.9/site-packages/tensorflow/python/eager/def_function.py",line 828,in __call__
    result = self._call(*args,**kwds)
  File "/root/miniconda3/envs/pipeline/lib/python3.9/site-packages/tensorflow/python/eager/def_function.py",line 894,in _call
    return self._concrete_stateful_fn._call_flat(
  File "/root/miniconda3/envs/pipeline/lib/python3.9/site-packages/tensorflow/python/eager/function.py",line 1918,in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/root/miniconda3/envs/pipeline/lib/python3.9/site-packages/tensorflow/python/eager/function.py",line 555,in call
    outputs = execute.execute(
  File "/root/miniconda3/envs/pipeline/lib/python3.9/site-packages/tensorflow/python/eager/execute.py",line 59,in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle,device_name,op_name,
tensorflow.python.framework.errors_impl.ResourceExhaustedError:  OOM when allocating tensor with shape[10000,74983] and type string on /job:localhost/replica:0/task:0/device:cpu:0 by allocator mklcpu
     [[{{node StatefulPartitionedCall/StatefulPartitionedCall/text_preprocessor/add_bigrams/concat}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
 [Op:__inference_restored_function_body_5285]
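
The numbers in the log can be checked with a little arithmetic. The sketch below assumes that each element of the string tensor takes 24 bytes and that the allocator rounds requests up to a multiple of 256 bytes. Both values are inferred from the log figures, not taken from the TensorFlow documentation, and padded_bytes is a hypothetical helper introduced only for illustration.

```python
# Figures copied from the log above.
batch_size = 10000            # first dimension of the failing tensor
max_length = 74983            # second dimension of the failing tensor
reported_bytes = 17995920128  # "rounded to" value in the allocator warning

# Assumption: each string element occupies 24 bytes in the tensor buffer.
# This is inferred from the log figures, not from TensorFlow documentation.
bytes_per_element = 24

raw_bytes = batch_size * max_length * bytes_per_element

# Assumption: the allocator rounds each request up to a multiple of 256 bytes.
rounded_bytes = -(-raw_bytes // 256) * 256

print(raw_bytes, rounded_bytes, round(rounded_bytes / 2**30, 2))


def padded_bytes(n_docs, n_columns, elem_bytes=bytes_per_element):
    """Estimated size in bytes of a padded [n_docs, n_columns] string tensor."""
    return n_docs * n_columns * elem_bytes
```

Under those assumptions the size of this tensor is the number of documents passed in one call multiplied by the second dimension, which probably tracks the longest document in the batch. For comparison, padded_bytes(16, 74983) comes to about 27.5 MiB, against 16.76 GiB for 10000 documents.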

Solution

No working solution to this problem has been found yet.

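
The failing allocation has one row per document passed to the model in a single call, so one thing worth trying is to embed the documents in small batches and stack the results. This is a minimal sketch under that assumption, not a verified fix. The names embed_in_batches and fake_embed and the batch size are illustrative choices that do not appear in the original post, and fake_embed only stands in for the real model so that the example runs without downloading it.

```python
import numpy as np


def embed_in_batches(embed_fn, texts, batch_size=16):
    """Embed texts a few at a time and stack the rows into one array.

    embed_fn is any callable mapping a list of strings to a 2-D array-like
    (assumed here to be convertible with np.asarray).
    """
    chunks = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        chunks.append(np.asarray(embed_fn(batch)))
    if not chunks:
        return np.empty((0, 0), dtype=np.float32)
    return np.vstack(chunks)


def fake_embed(batch):
    # Stand-in for model(batch): returns one 4-dimensional row per text.
    return np.array(
        [[len(t), t.count(" "), 1.0, 0.0] for t in batch], dtype=np.float32
    )


docs = ["first document", "second", "third one here", "fourth", "fifth doc"]
embeddings = embed_in_batches(fake_embed, docs, batch_size=2)
similarity = np.inner(embeddings, embeddings)
print(embeddings.shape, similarity.shape)
```

With the real encoder the call would presumably be embed_in_batches(model, documents, batch_size=16), relying on NumPy being able to convert the returned eager tensor. Note also that a 20K x 20K float32 similarity matrix is itself about 1.6 GB, and that using the full document texts as the DataFrame index and columns makes the CSV far larger than it needs to be; file names are likely a better label.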
