Problem description
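I am getting this error: "AssertionError: Found no field that needed padding; we are surprised you got this error, please open an issue on github".
I don't know why it occurs.
My config file is below.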
"""
{
"dataset_reader": {
"type": "tbmse_drop","answer_field_generators": {
"arithmetic_answer": {
"type": "arithmetic_answer_generator","special_numbers": [
100,1
]
},"count_answer": {
"type": "count_answer_generator"
},"passage_span_answer": {
"type": "span_answer_generator","text_type": "passage"
},"question_span_answer": {
"type": "span_answer_generator","text_type": "question"
},"tagged_answer": {
"type": "tagged_answer_generator","ignore_question": false,"labels": {
"I": 1,"O": 0
}
}
},"answer_generator_names_per_type": {
"date": [
"arithmetic_answer","passage_span_answer","question_span_answer","tagged_answer"
],"multiple_span": [
"tagged_answer"
],"number": [
"arithmetic_answer","count_answer","single_span": [
"tagged_answer","question_span_answer"
]
},"is_training": true,"old_reader_behavior": true,"pickle": {
"action": "load","file_name": "all_heads_IO_roberta-large","path": "../pickle/drop"
},"tokenizer": {
"type": "huggingface_transformers","pretrained_model": "roberta-large"
}
},"model": {
"type": "multi_head","dataset_name": "drop","head_predictor": {
"activations": [
"relu","linear"
],"dropout": [
0.1,0
],"hidden_dims": [
1024,5
],"input_dim": 2048,"num_layers": 2
},"heads": {
"arithmetic": {
"type": "arithmetic_head","output_layer": {
"activations": [
"relu","linear"
],"dropout": [
0.1,0
],"hidden_dims": [
1024,3
],"num_layers": 2
},"special_embedding_dim": 1024,1
],"training_style": "soft_em"
},"count": {
"type": "count_head","max_count": 10,11
],"input_dim": 1024,"num_layers": 2
}
},"multi_span": {
"type": "multi_span_head","decoding_style": "at_least_one","O": 0
},2
],"prediction_method": "viterbi","passage_span": {
"type": "passage_span_head","end_output_layer": {
"activations": "linear","hidden_dims": 1,"num_layers": 1
},"start_output_layer": {
"activations": "linear","question_span": {
"type": "question_span_head","end_output_layer": {
"activations": [
"relu",1
],"training_style": "soft_em"
}
},"passage_summary_vector_module": {
"activations": "linear","num_layers": 1
},"pretrained_model": "roberta-large","question_summary_vector_module": {
"activations": "linear","num_layers": 1
}
},"train_data_path": "drop_data/drop_dataset_train.json","validation_data_path": "drop_data/drop_dataset_dev.json","trainer": {
"num_epochs": 15,"optimizer": {
"type": "adam","lr": 5e-06
},"patience": 10,"validation_metric": "+f1"
},"data_loader": {
"batch_sampler": {
"type": "bucket","batch_size": 1
}
},"distributed": {
"cuda_devices": [
0,1
]
},"validation_dataset_reader": {
"type": "tbmse_drop","is_training": false,"pretrained_model": "roberta-large"
}
}
}
"""
I tried allennlp 2.0.1 and 2.0.2, and the same error occurs with both versions.
Solution
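Is it possible that your dataset reader does not produce any instances at all? Or that it produces instances with no fields in them (that is, completely empty instances)? Either case triggers this error.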
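One way to check both possibilities is to count the instances and see how many of them carry no fields. The sketch below is only an illustration: summarize_instances is a hypothetical helper, and it uses stand-in objects so that it runs without AllenNLP installed. It assumes each instance exposes its fields as a mapping in a .fields attribute, as AllenNLP's Instance does.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable


def summarize_instances(instances: Iterable[Any]) -> Dict[str, int]:
    """Count all instances and those that have no fields at all."""
    total = 0
    empty = 0
    for instance in instances:
        total += 1
        # Assumption: the instance keeps its fields in a `.fields` mapping.
        fields = getattr(instance, "fields", None)
        if not fields:
            empty += 1
    return {"total": total, "empty": empty}


# Stand-in objects for illustration only. With a real reader you would pass
# the iterable returned by reader.read(path) instead.
# "question_passage_tokens" is a made-up field name.
demo = [
    SimpleNamespace(fields={"question_passage_tokens": [1, 2, 3]}),
    SimpleNamespace(fields={}),
]
summary = summarize_instances(demo)
print(summary)
```

If total is 0, the reader yields nothing (for example because of a wrong data path or a cache/pickle that loaded empty). If empty is greater than 0, some instances are being created without any fields.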
If neither of those is the case, try setting the sorting_keys of the batch_sampler to the longest field in your instances.
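As a rough sketch of that last suggestion, the snippet below adds sorting_keys to the batch_sampler part of a config held as a Python dict. The helper with_sorting_keys is hypothetical, and "question_passage_tokens" is a placeholder field name, not something taken from the config above; substitute the name of the longest text field that your reader actually puts in each instance.

```python
import copy
import json
from typing import Any, Dict, List


def with_sorting_keys(config: Dict[str, Any], keys: List[str]) -> Dict[str, Any]:
    """Return a copy of the config with sorting_keys set on the batch sampler."""
    updated = copy.deepcopy(config)
    updated["data_loader"]["batch_sampler"]["sorting_keys"] = list(keys)
    return updated


# Only the data_loader part of the config from the question is reproduced here.
base_config = {
    "data_loader": {
        "batch_sampler": {"type": "bucket", "batch_size": 1}
    }
}

# Placeholder field name (an assumption); use the field your reader produces.
patched = with_sorting_keys(base_config, ["question_passage_tokens"])
print(json.dumps(patched["data_loader"], indent=2))
```

The printed fragment can be copied back into the data_loader section of the config file. Naming the field explicitly means the bucket sampler does not have to guess which field needs padding, which is the step that raises the assertion.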