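How do I generate batches of data from a list of nested tuples?

Problem description

I have implemented a custom Keras DataGenerator that produces positive and negative examples from pairs of the nested form (files, test), where files is itself a tuple of file ids.

Data example: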

 [((0,1,2),0),((3,4,5),((12,),1),((0,7),1)]

Batch example:

{'files': (0,1,2),'test': 0},label=1

Positive examples have label 1 and negative examples have label 0.

I have the following function for generating the data:

 def data_generation(self,pairs):
    """Generate batches of samples for training"""
    batch = np.zeros((self.batch_size,3)) # I KNow THE PROBLEM CAN BE HERE

    # Adjust label based on task
    if self.classification:
        neg_label = 0
    else:
        neg_label = -1

    # This creates a generator
    while True:
        for idx,(file_id,test_id) in enumerate(random.sample(pairs,self.n_positive)):
            batch[idx,:] = (file_id,test_id,1)

        # Increment idx by 1
        idx += 1

        # Add negative examples until reach batch size
        while idx < self.batch_size:

            # random selection
            random_test = random.randrange(self.nr_tests)

            # Check to make sure this is not a positive example
            if (file_id,random_test) not in self.pairs_set:
                # Add to batch and increment index
                batch[idx,:] = (file_id,random_test,neg_label)
                idx += 1

        np.random.shuffle(batch)
        yield {'file': batch[:,0],'test': batch[:,1]},batch[:,2]

Traceback:
    File "/Users/DataGenerator.py",line 83,in data_generation
        batch[idx,:] = (file_id,test_id,1)
    ValueError: setting an array element with a sequence.
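
The error can be reproduced outside the generator. The following is a minimal sketch, assuming only NumPy: the names `batch` and `row` are illustrative, and the row mirrors the (tuple, int, int) shape that the function above tries to store in one row of a float array.

```python
import numpy as np

# Same layout as in data_generation: one float row of width 3 per example.
batch = np.zeros((2, 3))

# A row shaped like (file_id, test_id, label), where file_id is a tuple.
row = ((0, 1, 2), 0, 1)

try:
    # The first slot would have to hold a whole tuple, which a float
    # array cannot do, so NumPy rejects the assignment.
    batch[0, :] = row
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

Each cell of a numeric NumPy array holds one scalar, so a tuple of file ids has to be spread across several columns or kept in a separate array.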

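I know the batch array is the problem, because each row has the form (tuple, int, int) and the tuples vary in length. Should I mask or pad the tuples? How would I make that work?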
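
One way the padding idea could look is sketched below. It is only a sketch under stated assumptions, not a confirmed fix: `PAD`, `pad_files` and `make_batch` are hypothetical names, the padding value -1 is assumed not to collide with any real file id, and negatives are drawn by pairing a randomly chosen known file tuple with a random test, which differs slightly from the sampling in the function above. The file ids, test ids and labels go into three separate integer arrays instead of one float array.

```python
import random
import numpy as np

PAD = -1  # assumed padding value; must not collide with a real file id


def pad_files(file_ids, max_len, pad_value=PAD):
    """Right-pad a tuple of file ids to a fixed length."""
    if len(file_ids) > max_len:
        raise ValueError("tuple longer than max_len")
    return list(file_ids) + [pad_value] * (max_len - len(file_ids))


def make_batch(pairs, pairs_set, nr_tests, batch_size, n_positive, neg_label=0):
    """Build one batch with separate arrays for files, tests and labels."""
    max_len = max(len(file_ids) for file_ids, _ in pairs)
    files = np.full((batch_size, max_len), PAD, dtype=np.int64)
    tests = np.zeros(batch_size, dtype=np.int64)
    labels = np.zeros(batch_size, dtype=np.int64)

    # Positive examples come straight from the known pairs.
    for idx, (file_ids, test_id) in enumerate(random.sample(pairs, n_positive)):
        files[idx] = pad_files(file_ids, max_len)
        tests[idx] = test_id
        labels[idx] = 1

    # Negative examples: a known file tuple with a test it is not paired with.
    idx = n_positive
    while idx < batch_size:
        file_ids, _ = random.choice(pairs)
        random_test = random.randrange(nr_tests)
        if (file_ids, random_test) not in pairs_set:
            files[idx] = pad_files(file_ids, max_len)
            tests[idx] = random_test
            labels[idx] = neg_label
            idx += 1

    # Shuffle all three arrays with the same permutation.
    order = np.random.permutation(batch_size)
    return {'file': files[order], 'test': tests[order]}, labels[order]


pairs = [((0, 1, 2), 0), ((12,), 1), ((0, 7), 1)]
inputs, labels = make_batch(pairs, set(pairs), nr_tests=5,
                            batch_size=6, n_positive=2)
print(inputs['file'].shape, inputs['test'].shape, labels.shape)
```

With this layout the 'file' input has a fixed width of `max_len`, so the model would need to ignore the padded positions. If the file ids feed a Keras `Embedding` layer with `mask_zero=True`, the padding value would have to be 0 and the real ids shifted up by one, since -1 is not a valid embedding index.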


data-generation keras python