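Problem description
I am trying to build a multimodal emotion classifier. To do this, I created separate CNN-based models for audio and for video. Here are the CNNs I implemented: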
    # Imports assumed to come from tensorflow.keras (the original post omitted them)
    from tensorflow.keras import Model
    from tensorflow.keras.layers import Conv1D, Conv2D, Dense, Dropout, Flatten, Input

    ## Model for audio
    input_audio_ = Input(shape=(40, 1))
    # Conv1D, not Conv2D: the input is (steps, channels), and the error below names a conv1d layer
    output_audio_ = Conv1D(64, 5, activation='relu')(input_audio_)
    output_audio_ = Dropout(0.2)(output_audio_)
    output_audio_ = Flatten()(output_audio_)
    output_audio_ = Dense(8, activation='softmax')(output_audio_)
    model_aud = Model(inputs=[input_audio_], outputs=[output_audio_])
    model_aud.summary()

    ## Model for video
    input_frame_ = Input(shape=(256, 512, 3))
    output_frame_ = Conv2D(64, 3, padding='same', activation='relu')(input_frame_)
    output_frame_ = Flatten()(output_frame_)
    output_frame_ = Dense(8, activation='softmax')(output_frame_)
    model_img = Model(inputs=[input_frame_], outputs=[output_frame_])
    model_img.summary()
However, I am stuck on an input-shape problem when performing late fusion. I built a fusion model from the two networks and was able to compile it, but fitting it fails:
    base_history = model_main.fit(
        {'input_frame_': img_train, 'input_audio_': X_train_aud},
        {'output_frame_': img_train_labels, 'output_audio_': y_train_aud},
        epochs=50,
        validation_data=(
            {'input_frame_': img_test, 'input_audio_': X_test_aud},
            {'output_frame_': img_test_labels, 'output_audio_': y_test_aud},
        ),
    )
I get the following error:

    ValueError: Input 0 of layer conv1d_5 is incompatible with the layer: expected axis -1 of input shape to have value 1 but received input with shape [None,256,3]
The input shapes are as follows:
    X_train_aud.shape, X_test_aud.shape, y_train_aud.shape, y_test_aud.shape
    ((1715, 40, 1), (736, 40, 1), (1715,), (736,))

    img_train.shape, img_test.shape
    ((1715, 256, 512, 3), (736, 256, 512, 3))

    img_train_labels.shape
    (1715,)
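The error message says that a Conv1D layer built for inputs whose last axis is 1 received a batch whose last axis is 3. That suggests an image batch reached the audio branch. A quick way to catch this before calling fit is to compare each array against the per-sample shape its input expects. The sketch below uses only NumPy; the helper name `check_input_shapes` and the keys `audio` and `frames` are hypothetical, and the expected shapes are the ones declared in the two `Input` layers above.

```python
import numpy as np


def check_input_shapes(expected, arrays):
    """Return a list of mismatch descriptions (empty when everything matches).

    expected: dict mapping an input name to its per-sample shape (no batch axis)
    arrays:   dict mapping the same names to arrays with a leading batch axis
    """
    problems = []
    for name, sample_shape in expected.items():
        if name not in arrays:
            problems.append(f"{name}: no array supplied")
            continue
        got = tuple(arrays[name].shape[1:])  # drop the batch axis
        if got != tuple(sample_shape):
            problems.append(
                f"{name}: expected per-sample shape {tuple(sample_shape)}, got {got}"
            )
    return problems


# Per-sample shapes taken from the two Input layers (hypothetical key names).
expected = {"audio": (40, 1), "frames": (256, 512, 3)}

# Small dummy batches standing in for the real training arrays.
audio_batch = np.zeros((2, 40, 1), dtype="float32")
frame_batch = np.zeros((2, 256, 512, 3), dtype="float32")

# Deliberately swapped, to mimic an image batch being routed to the audio input.
for line in check_input_shapes(expected, {"audio": frame_batch, "frames": audio_batch}):
    print(line)
```

If the helper reports a mismatch for data you believe is correct, the arrays are probably being matched to the wrong inputs, for example through dictionary keys that do not correspond to the input layers.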
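I am confused about how to fuse these audio and video feature vectors so that they are ready for processing. Any help in this regard would be greatly appreciated. Thanks in advance!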
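One possible direction, offered as a minimal sketch and not as a confirmed fix: keep both branches as they are, give each `Input` layer an explicit name, and average the two softmax outputs, which is a decision-level form of late fusion. Several things here are assumptions: the layers come from `tensorflow.keras`; the audio branch uses `Conv1D`; the labels are integer class ids from 0 to 7 and both label arrays describe the same samples; and `build_late_fusion_model`, `audio_input`, `frame_input` and the other layer names are hypothetical.

```python
import numpy as np
from tensorflow.keras import Model
from tensorflow.keras.layers import (
    Average, Conv1D, Conv2D, Dense, Dropout, Flatten, Input,
)


def build_late_fusion_model(audio_shape=(40, 1), frame_shape=(256, 512, 3),
                            num_classes=8):
    """Two CNN branches whose class probabilities are averaged (late fusion)."""
    # Audio branch: (steps, channels) input, so a 1-D convolution is used.
    audio_in = Input(shape=audio_shape, name="audio_input")
    a = Conv1D(64, 5, activation="relu")(audio_in)
    a = Dropout(0.2)(a)
    a = Flatten()(a)
    p_audio = Dense(num_classes, activation="softmax", name="audio_probs")(a)

    # Video-frame branch: (height, width, channels) input.
    frame_in = Input(shape=frame_shape, name="frame_input")
    f = Conv2D(64, 3, padding="same", activation="relu")(frame_in)
    f = Flatten()(f)
    p_frame = Dense(num_classes, activation="softmax", name="frame_probs")(f)

    # Late fusion: average the per-class probabilities of the two branches.
    fused = Average(name="fused_probs")([p_audio, p_frame])

    model = Model(inputs=[audio_in, frame_in], outputs=fused)
    # Assumes integer labels of shape (n_samples,), hence the sparse loss.
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


if __name__ == "__main__":
    # Smoke check on tiny random data; small frames keep it cheap.
    demo = build_late_fusion_model(frame_shape=(16, 32, 3))
    aud = np.random.rand(2, 40, 1).astype("float32")
    img = np.random.rand(2, 16, 32, 3).astype("float32")
    print(demo.predict([aud, img], verbose=0).shape)
```

Because the fused model has a single output, fit would take one label array, for example `model.fit([X_train_aud, img_train], y_train_aud, epochs=50, validation_data=([X_test_aud, img_test], y_test_aud))`, with the arrays listed in the same order as `inputs`. If dictionaries are passed instead, the keys need to match the `name` given to each `Input` layer, not the Python variable names.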