How do I handle the "indices must be sorted and unique" error in load_svmlight_file?

Problem description

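To summarize my problem:

I use scipy sparse matrices to fit sklearn models.
I want to convert them to files in svmlight/LIBSVM format (so far so good...).
I want to modify these files by adding features to them, and then convert the modified files back to scipy sparse matrices (this is where it goes wrong!).

The error I get is: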

ValueError: Feature indices in svmlight/LibSVM data file should be sorted and unique.

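After some testing on the side, I saw that the problem, or at least one of the problems, always occurs on file lines that contain a feature with index 0. When I delete the feature with index 0, or (manually) change that index to a different number, the code works. So I am wondering what a good workaround would be. Thanks in advance.

Here is the basic idea of the code: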
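The error message suggests that the loader expects the feature indices on each line to be strictly increasing. To find which lines of a modified file break that rule, a small check like the following may help. It is only a sketch: `find_unsorted_lines` is a hypothetical helper written for illustration (it is not part of sklearn), and the sample lines are made up.

```python
def find_unsorted_lines(lines):
    """Return (line_number, reason) for each line whose feature
    indices are not strictly increasing."""
    problems = []
    for lineno, line in enumerate(lines, start=1):
        # Drop a trailing comment and skip empty lines.
        body = line.split("#", 1)[0].strip()
        if not body:
            continue
        prev = None
        # The first token is the label; the rest are index:value pairs.
        for token in body.split()[1:]:
            key, _, _value = token.partition(":")
            if key == "qid":
                continue  # query id, not a feature index
            idx = int(key)
            if prev is not None and idx <= prev:
                reason = "duplicate" if idx == prev else "out of order"
                problems.append((lineno, f"index {idx} {reason} after {prev}"))
                break  # report only the first problem on each line
            prev = idx
    return problems


# Made-up sample lines in svmlight/LIBSVM text format.
sample = [
    "1 0:1 3:2 7:1",  # fine: strictly increasing
    "0 2:1 5:1 0:4",  # index 0 appended after larger indices
    "1 1:1 4:2 4:3",  # index 4 appears twice
]
for lineno, reason in find_unsorted_lines(sample):
    print(lineno, reason)
```

For a real file, the same function can be called as `find_unsorted_lines(open("trainfile.libsvm"))`.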

from sklearn import svm
from sklearn.datasets import dump_svmlight_file, load_svmlight_file
from sklearn.feature_extraction.text import CountVectorizer

# X, Y, train_index and test_index are assumed to be defined earlier.
X_train, X_test = X[train_index], X[test_index]
Y_train, Y_test = Y[train_index], Y[test_index]

tf_vectorizer = CountVectorizer()
X_train = tf_vectorizer.fit_transform(X_train)
X_test = tf_vectorizer.transform(X_test)

# X_train's type: <class 'scipy.sparse.csr.csr_matrix'>

dump_svmlight_file(X=X_train, y=Y_train, f="trainfile.libsvm")
dump_svmlight_file(X=X_test, y=Y_test, f="testfile.libsvm")

# *** modify trainfile.libsvm here (add features) ***

# load_svmlight_file reads one file; load_svmlight_files expects a list of paths.
X_train_new, Y_train_new = load_svmlight_file("trainfile.libsvm")

# Training phase
model = svm.SVC(kernel="linear")
model.fit(X_train_new, Y_train_new)

Solution

No effective solution to this problem has been found yet.
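One possible workaround, offered as a sketch rather than a confirmed fix, is to rewrite the modified file so that the feature indices on every line are sorted and unique before loading it again. This assumes the modification step appended features out of order or reused an existing index (such as 0). The helpers `normalize_line` and `normalize_file` are hypothetical names, and keeping the last value for a repeated index is an arbitrary choice.

```python
def normalize_line(line):
    """Rewrite one svmlight line so its feature indices are sorted and unique.

    Assumption: when an index appears more than once, the last value wins.
    """
    body, sep, comment = line.rstrip("\n").partition("#")
    tokens = body.split()
    if not tokens:
        return line.rstrip("\n")  # blank or comment-only line, keep as is
    head = [tokens[0]]  # the label comes first
    features = {}
    for token in tokens[1:]:
        key, _, value = token.partition(":")
        if key == "qid":
            head.append(token)  # keep the query id right after the label
        else:
            features[int(key)] = value
    pairs = [f"{idx}:{features[idx]}" for idx in sorted(features)]
    result = " ".join(head + pairs)
    if sep:
        result += " #" + comment
    return result


def normalize_file(src_path, dst_path):
    """Write a copy of src_path in which every line is normalized."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(normalize_line(line) + "\n")


print(normalize_line("0 2:1 5:1 0:4"))  # 0 0:4 2:1 5:1
print(normalize_line("1 1:1 4:2 4:3"))  # 1 1:1 4:3
```

After calling something like `normalize_file("trainfile.libsvm", "trainfile.fixed.libsvm")`, loading the fixed file with `load_svmlight_file` should no longer trip the sorted-and-unique check, provided that was the only problem with the file. When adding features by hand, giving them indices larger than any index already written by `dump_svmlight_file` avoids both collisions and reordering.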


libsvm machine-learning python svmlight text-classification
