AttributeError: 'cupy.core.core.ndarray' object has no attribute 'iloc'

Problem description

I am trying to split my data into training and validation sets, and for that I am using train_test_split from the cuml.preprocessing.model_selection module.

But I get this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-317-4e11838456ea> in <module>
----> 1 X_train,X_test,y_train,y_test = train_test_split(train_dfIF,train_y,test_size=0.20,random_state=42)

/opt/conda/lib/python3.7/site-packages/cuml/preprocessing/model_selection.py in train_test_split(X,y,test_size,train_size,shuffle,random_state,seed,stratify)
    454         X_train = X.iloc[0:train_size]
    455         if y is not None:
--> 456             y_train = y.iloc[0:train_size]
    457 
    458     if hasattr(X,"__cuda_array_interface__") or \

AttributeError: 'cupy.core.core.ndarray' object has no attribute 'iloc'

However, I am not using iloc anywhere in my own code.

The code is as follows:

from cuml.preprocessing.model_selection import train_test_split

X_train,X_test,y_train,y_test = train_test_split(train_dfIF,train_y,test_size=0.20,random_state=42)

Here train_dfIF is a cudf DataFrame and train_y is a cupy array.

Solution

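If your X argument is a DataFrame, you (currently) cannot pass an array as the y argument. The traceback shows why: train_test_split slices both X and y with .iloc, which a cupy array does not have. I recommend passing either two DataFrame/Series objects or two arrays, not one of each.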

from cuml.preprocessing.model_selection import train_test_split
import cudf
import cupy as cp

df = cudf.DataFrame({
    "a":range(5),"b":range(5)
})
y = cudf.Series(range(5))

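# train_test_split(df,y.values,test_size=0.20,random_state=42) # fails: DataFrame X with array y
X_train,X_test,y_train,y_test = train_test_split(df,y,test_size=0.20,random_state=42) # succeeds: DataFrame X with Series y
X_train,X_test,y_train,y_test = train_test_split(df.values,y.values,test_size=0.20,random_state=42) # succeeds: array X with array y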
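
As for why the error mentions iloc even though the question's code never calls it: the two statements visible in the traceback apply .iloc to both X and y. The sketch below reproduces that behaviour in plain Python so it can be run without a GPU. FakeFrame, FakeArray, split_like_traceback and try_split are hypothetical stand-ins invented for illustration; they are not cuDF, CuPy or cuML code, and the real function does more than this.

```python
# Stand-ins only: hypothetical classes, NOT part of cuDF, CuPy or cuML.

class FakeFrame:
    """Mimics a DataFrame/Series: positional slicing goes through .iloc."""

    def __init__(self, rows):
        self.rows = list(rows)

    @property
    def iloc(self):
        # Slicing the returned list plays the role of .iloc[a:b].
        return self.rows


class FakeArray:
    """Mimics a plain array: sliceable directly, but has no .iloc."""

    def __init__(self, rows):
        self.rows = list(rows)

    def __getitem__(self, key):
        return self.rows[key]


def split_like_traceback(X, y, train_size):
    """Simplified copy of the two statements shown in the traceback."""
    X_train = X.iloc[0:train_size]  # fine: X is frame-like
    y_train = y.iloc[0:train_size]  # raises AttributeError if y is array-like
    return X_train, y_train


def try_split(X, y, train_size):
    """Return the split, or the error message when the types are mixed."""
    try:
        return split_like_traceback(X, y, train_size)
    except AttributeError as exc:
        return str(exc)


# Frame-like X with array-like y: the same kind of failure as in the question.
mixed = try_split(FakeFrame(range(5)), FakeArray(range(5)), 4)
# Frame-like X with frame-like y: both slices succeed.
matched = try_split(FakeFrame(range(5)), FakeFrame(range(5)), 4)

print(mixed)
print(matched)
```

For the variables in the question, the equivalent fix would be to make train_y frame-like before the call, for example by wrapping it with cudf.Series(train_y), assuming your cuDF version accepts a cupy array there, or to pass train_dfIF.values together with train_y so that both arguments are arrays.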