Merging 3 deep networks and training them end-to-end

Problem description

I am working with deep learning concepts but I am a beginner. I am trying to build a feature-fusion model using 3 deep neural networks: the idea is to take the features from all three models, perform classification with a final single sigmoid layer, and then get the result. Here is the code I ran:

Code

from keras.layers import Input,Dense
from keras.models import Model
from sklearn.model_selection import train_test_split
import numpy
# random seed for reproducibility
numpy.random.seed(2)
# load pima indians diabetes dataset, past 5 years of medical history
dataset = numpy.loadtxt('https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv',delimiter=",")
# split into input (X) and output (Y) variables, splitting csv data
X = dataset[:,0:8]
Y = dataset[:,8]
x_train,x_validation,y_train,y_validation = train_test_split(X,Y,test_size=0.20,random_state=5)
#create the input layer
input_layer = Input(shape=(8,))
A2 = Dense(8,activation='relu')(input_layer)
A3 = Dense(30,activation='relu')(A2)
B2 = Dense(40,activation='relu')(A2)
B3 = Dense(30,activation='relu')(B2)
C2 = Dense(50,activation='relu')(B2)
C3 = Dense(5,activation='relu')(C2)
merged = Model(inputs=[input_layer],outputs=[A3,B3,C3])
final_model = Dense(1,activation='sigmoid')(merged)
final_model.compile(loss="binary_crossentropy",optimizer="adam",metrics=['accuracy'])
# call the function to fit to the data (training the network)
final_model.fit(x_train,y_train,epochs=2000,batch_size=50,validation_data=(x_validation,y_validation))
# evaluate the model
scores = final_model.evaluate(x_validation,y_validation)
print("\n%s: %.2f%%" % (final_model.metrics_names[1],scores[1] * 100))

This is the error I am facing:

if x.shape.ndims is None:

AttributeError: 'Functional' object has no attribute 'shape'

Please help me solve this, or if anyone knows what code I should use instead, let me know. I am willing to change the code, but not the concept. Thank you.


Update

Following @M.Innat's answer, we tried the approach below. The idea is that we first build the 3 models, then build the final/combined model by connecting those models to a single classifier. But I am facing a discrepancy: when I train each model individually, they give about 90%, but when I combine them the result barely reaches 60 or 70%.

Code for model 1:

model = Sequential()
# input layer requires input_dim param
model.add(Dense(10,input_dim=8,activation='relu'))
model.add(Dense(50,activation='relu'))
model.add(Dense(5,activation='relu'))
# sigmoid instead of relu for final probability between 0 and 1
model.add(Dense(1,activation='sigmoid'))

# compile the model, adam gradient descent (optimized)
model.compile(loss="binary_crossentropy",optimizer="adam",metrics=['accuracy'])

# call the function to fit to the data (training the network)
model.fit(x_train,y_train,epochs=1000,validation_data=(x_validation,y_validation))

# evaluate the model
scores = model.evaluate(X,Y)
print("\n%s: %.2f%%" % (model.metrics_names[1],scores[1] * 100))
model.save('diabetes_risk_nn.h5')

Model 1 accuracy = 94.14%, and similarly for the other two models.

Model 2 accuracy = 93.62%; model 3 accuracy = 92.71%.

Next, we merged the models as @M.Innat suggested, using models 1, 2, and 3 above. But the score is nowhere near 90%. Final combined model:

# Define Model A 
input_layer = Input(shape=(8,))
A2 = Dense(10,activation='relu')(input_layer)
A3 = Dense(50,activation='relu')(A2)
A4 = Dense(50,activation='relu')(A3)
A5 = Dense(50,activation='relu')(A4)
A6 = Dense(50,activation='relu')(A5)
A7 = Dense(50,activation='relu')(A6)
A8 = Dense(5,activation='relu')(A7)
model_a = Model(inputs=input_layer,outputs=A8,name="ModelA")

# Define Model B 
input_layer = Input(shape=(8,))
B2 = Dense(10,activation='relu')(input_layer)
B3 = Dense(50,activation='relu')(B2)
B4 = Dense(40,activation='relu')(B3)
B5 = Dense(60,activation='relu')(B4)
B6 = Dense(30,activation='relu')(B5)
B7 = Dense(50,activation='relu')(B6)
B8 = Dense(50,activation='relu')(B7)
B9 = Dense(5,activation='relu')(B8)
model_b = Model(inputs=input_layer,outputs=B9,name="ModelB")

# Define Model C
input_layer = Input(shape=(8,))
C2 = Dense(10,activation='relu')(input_layer)
C3 = Dense(50,activation='relu')(C2)
C4 = Dense(40,activation='relu')(C3)
C5 = Dense(40,activation='relu')(C4)
C6 = Dense(70,activation='relu')(C5)
C7 = Dense(50,activation='relu')(C6)
C8 = Dense(50,activation='relu')(C7)
C9 = Dense(60,activation='relu')(C8)
C10 = Dense(50,activation='relu')(C9)
C11 = Dense(5,activation='relu')(C10)
model_c = Model(inputs=input_layer,outputs=C11,name="ModelC")
all_three_models = [model_a,model_b,model_c]
all_three_models_input = Input(shape=all_three_models[0].input_shape[1:])

Then combine the three:

models_output = [model(all_three_models_input) for model in all_three_models]
Concat           = tf.keras.layers.concatenate(models_output,name="Concatenate")
final_out     = Dense(1,activation='sigmoid')(Concat)
final_model   = Model(inputs=all_three_models_input,outputs=final_out,name='Ensemble')
#tf.keras.utils.plot_model(final_model,expand_nested=True)
final_model.compile(loss="binary_crossentropy",optimizer="adam",metrics=['accuracy'])

# call the function to fit to the data (training the network)
final_model.fit(x_train,y_train,epochs=1000,batch_size=50,validation_data=(x_validation,y_validation))

# evaluate the model

scores = final_model.evaluate(x_validation,y_validation)
print("\n%s: %.2f%%" % (final_model.metrics_names[1],scores[1] * 100))
final_model.save('diabetes_risk_nn.h5')

But unlike the individual models, which each give about 90%, this combined final model gives an accuracy of only around 70%.
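One plausible contributor to this gap (a hypothesis, not something stated in the thread) is that the combined model retrains all three branches from scratch instead of reusing the weights that already reached ~90%. A minimal sketch of building the ensemble from already-trained sub-models with their weights frozen, so that only the final sigmoid classifier is trained; the `make_branch` helper and layer sizes are placeholders standing in for the real trained `model_a`/`model_b`/`model_c`:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Hypothetical stand-ins for the three trained models; in the real workflow
# these would be model_a/model_b/model_c after training (or loaded with
# tf.keras.models.load_model from the saved .h5 files).
def make_branch(name):
    inp = Input(shape=(8,))
    x = Dense(10, activation='relu')(inp)
    x = Dense(5, activation='relu')(x)
    return Model(inp, x, name=name)

model_a, model_b, model_c = [make_branch(n) for n in ("ModelA", "ModelB", "ModelC")]

# Freeze the sub-models so their already-trained weights stay intact;
# only the final sigmoid classifier is trained in the combined model.
for m in (model_a, model_b, model_c):
    m.trainable = False

ensemble_input = Input(shape=(8,))
features = [m(ensemble_input) for m in (model_a, model_b, model_c)]
merged = tf.keras.layers.concatenate(features)
out = Dense(1, activation='sigmoid')(merged)
ensemble = Model(ensemble_input, out, name="FrozenEnsemble")
ensemble.compile(loss="binary_crossentropy", optimizer="adam", metrics=['accuracy'])
```

With the branches frozen, fitting the ensemble only learns the last Dense layer, so the individual models' learned features are preserved.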

Solution

I guess the output layer should be Dense(1, activation='sigmoid'). So try something like this:

# ...
merged = tf.keras.layers.concatenate([A3,B3,C3])
out = Dense(1,activation='sigmoid')(merged)
model = Model(input_layer,out)

model.fit(x_train,y_train,...)

According to your code, there is only one model (not three). From the output you are trying to produce, I think you are looking for something like this:

Dataset

import tensorflow as tf 
from tensorflow.keras.layers import Input,Dense
from tensorflow.keras.models import Model
from sklearn.model_selection import train_test_split
import numpy

# random seed for reproducibility
numpy.random.seed(2)
# load pima indians diabetes dataset, past 5 years of medical history
dataset = numpy.loadtxt('https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv',delimiter=",")

# split into input (X) and output (Y) variables, splitting csv data
X = dataset[:,0:8]
Y = dataset[:,8]

x_train,x_validation,y_train,y_validation = train_test_split(X,Y,test_size=0.20,random_state=5)

Model

#create the input layer
input_layer = Input(shape=(8,))

A2 = Dense(8,activation='relu')(input_layer)
A3 = Dense(30,activation='relu')(A2)

B2 = Dense(40,activation='relu')(input_layer)
B3 = Dense(30,activation='relu')(B2)

C2 = Dense(50,activation='relu')(input_layer)
C3 = Dense(5,activation='relu')(C2)


merged = tf.keras.layers.concatenate([A3,B3,C3])
final_out = Dense(1,activation='sigmoid')(merged)

final_model = Model(inputs=[input_layer],outputs=final_out)
tf.keras.utils.plot_model(final_model)


Training

final_model.compile(loss="binary_crossentropy",optimizer="adam",metrics=['accuracy'])

# call the function to fit to the data (training the network)
final_model.fit(x_train,epochs=5,batch_size=50,validation_data=(x_validation,y_validation))

# evaluate the model
scores = final_model.evaluate(x_validation,y_validation)
print("\n%s: %.2f%%" % (final_model.metrics_names[1],scores[1] * 100))
Epoch 1/5
13/13 [==============================] - 1s 15ms/step - loss: 0.7084 - accuracy: 0.6803 - val_loss: 0.6771 - val_accuracy: 0.6883
Epoch 2/5
13/13 [==============================] - 0s 5ms/step - loss: 0.6491 - accuracy: 0.6600 - val_loss: 0.5985 - val_accuracy: 0.6623
Epoch 3/5
13/13 [==============================] - 0s 5ms/step - loss: 0.6161 - accuracy: 0.6813 - val_loss: 0.6805 - val_accuracy: 0.6883
Epoch 4/5
13/13 [==============================] - 0s 5ms/step - loss: 0.6335 - accuracy: 0.7003 - val_loss: 0.6115 - val_accuracy: 0.6623
Epoch 5/5
13/13 [==============================] - 0s 5ms/step - loss: 0.5684 - accuracy: 0.7285 - val_loss: 0.6150 - val_accuracy: 0.6883
5/5 [==============================] - 0s 2ms/step - loss: 0.6150 - accuracy: 0.6883

accuracy: 68.83%

Update

Based on your comment:

Let me explain what I want to do: first I created 3 DNN models separately, then I tried to combine those models to get the features of all of them, after which I want to use all the extracted features for classification and then evaluate the accuracy. That is what I really want to develop.

  • Create 3 models separately - OK, 3 models
  • Combine them to get a feature - OK, a feature extractor
  • Classify - OK, average the models' output feature maps and pass them to a classifier - in other words, an ensemble.

Let's do that. First, build the three models separately.

# Define Model A 
input_layer = Input(shape=(8,))
A2 = Dense(8,activation='relu')(input_layer)
A3 = Dense(30,activation='relu')(A2)
C3 = Dense(5,activation='relu')(A3)
model_a = Model(inputs=input_layer,outputs=C3,name="ModelA")

# Define Model B 
input_layer = Input(shape=(8,))
A2 = Dense(8,activation='relu')(input_layer)
A3 = Dense(30,activation='relu')(A2)
C3 = Dense(5,activation='relu')(A3)
model_b = Model(inputs=input_layer,outputs=C3,name="ModelB")

# Define Model C
input_layer = Input(shape=(8,))
A2 = Dense(8,activation='relu')(input_layer)
A3 = Dense(30,activation='relu')(A2)
C3 = Dense(5,activation='relu')(A3)
model_c = Model(inputs=input_layer,outputs=C3,name="ModelC")

I used the same number of parameters for each; change that yourself if you like. Anyway, these three models each act as a feature extractor (not a classifier). Next, we combine their outputs by averaging them and pass the result to a classifier.

all_three_models = [model_a,model_b,model_c]
all_three_models_input = Input(shape=all_three_models[0].input_shape[1:])


models_output = [model(all_three_models_input) for model in all_three_models]
Avg           = tf.keras.layers.average(models_output,name="Average")
final_out     = Dense(1,activation='sigmoid')(Avg)
final_model   = Model(inputs=all_three_models_input,outputs=final_out,name='Ensemble')
tf.keras.utils.plot_model(final_model,expand_nested=True)


Now you can train the model and evaluate it on the test set. Hope this helps.
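The training and evaluation step described above can be sketched as follows. This is a self-contained example with synthetic stand-in data and a scaled-down averaging ensemble (the layer sizes and epoch count here are placeholders); in practice you would use the real `final_model` and the `x_train`/`y_train` splits from earlier:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Synthetic stand-in data shaped like the diabetes dataset (8 features,
# binary label); substitute the real train/validation splits from above.
rng = np.random.default_rng(0)
x_train = rng.random((64, 8)).astype("float32")
y_train = rng.integers(0, 2, 64).astype("float32")
x_validation = rng.random((16, 8)).astype("float32")
y_validation = rng.integers(0, 2, 16).astype("float32")

# A scaled-down averaging ensemble standing in for final_model above.
inp = Input(shape=(8,))
branches = [Dense(5, activation='relu')(inp) for _ in range(3)]
avg = tf.keras.layers.average(branches)
out = Dense(1, activation='sigmoid')(avg)
final_model = Model(inp, out, name="Ensemble")

# Train on the training split and evaluate on the held-out validation split.
final_model.compile(loss="binary_crossentropy", optimizer="adam", metrics=['accuracy'])
final_model.fit(x_train, y_train, epochs=2, batch_size=16,
                validation_data=(x_validation, y_validation), verbose=0)
loss, acc = final_model.evaluate(x_validation, y_validation, verbose=0)
```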


More info.

(1). You can add seeds.

from tensorflow.keras.models import Model
from sklearn.model_selection import train_test_split
import tensorflow as tf 
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense,Dropout
import os,numpy

# random seed for reproducibility
numpy.random.seed(101)
tf.random.set_seed(101)
os.environ['TF_CUDNN_DETERMINISTIC'] = '1'

dataset = .. your data 

# split into input (X) and output (Y) variables, splitting csv data
X = dataset[:,0:8]
Y = dataset[:,8]
x_train,x_validation,y_train,y_validation = train_test_split(X,Y,test_size=0.20,random_state=101)

(2). Try the SGD optimizer. Also, use a ModelCheckpoint callback to save the weights with the highest validation accuracy.

final_model.compile(loss="binary_crossentropy",optimizer="sgd",metrics=['accuracy'])

model_save = tf.keras.callbacks.ModelCheckpoint(
                'merge_best.h5',monitor="val_accuracy",verbose=0,save_best_only=True,save_weights_only=True,mode="max",save_freq="epoch"
            )

# call the function to fit to the data (training the network)
final_model.fit(x_train,y_train,epochs=1000,batch_size=256,callbacks=[model_save],validation_data=(x_validation,y_validation))

Evaluate on the test set.

# evaluate the model
final_model.load_weights('merge_best.h5')
scores = final_model.evaluate(x_validation,y_validation)
print("\n%s: %.2f%%" % (final_model.metrics_names[1],scores[1] * 100))
5/5 [==============================] - 0s 4ms/step - loss: 0.6543 - accuracy: 0.7662

accuracy: 76.62%