How do I convert nominal data to numeric in Python when the dataset has missing values?

Problem description

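I am working with a binary classification dataset and want to convert its nominal data to numeric values. The dataset has missing values, and I do not want to drop them, because my goal is to fill them in with the KNN method. How can I convert the data to numeric while keeping the missing values?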

My code does not work when the dataset has missing values. Sample data:

age | class
------------
 1 |  NAN
 2 |  yes
 3 |  no
 4 |  NAN
 5 |  no
 6 |  NAN
 7 |  no
 8 |  yes
 9 |  no
10 |  NAN

Solution

Filter out the null values before calling unique?

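import numpy as np
import pandas as pd

df = pd.DataFrame([None, 'yes', 'no', None, None], columns=['class'])

# Build the label -> integer mapping from the non-null values only
mapping = {
    label: idx
    for idx, label in enumerate(np.unique(df.loc[df['class'].notnull(), 'class']))
}

# Labels missing from the mapping (the nulls) come out as NaN
df['class'] = df['class'].map(mapping)
print(df)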
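As an alternative to building the dictionary by hand, here is a minimal sketch using `pd.factorize`, which is a different technique from the `np.unique` approach above. It relies on `factorize` marking missing values with the sentinel `-1`; the column name `class_code` and the sample values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data; 'class_code' below is a hypothetical column name
df = pd.DataFrame({'class': [None, 'yes', 'no', None, 'no']})

# factorize gives missing values the code -1; sort=True orders labels alphabetically
codes, labels = pd.factorize(df['class'], sort=True)

# Turn the -1 sentinel back into NaN so the gaps stay visible as missing
df['class_code'] = np.where(codes == -1, np.nan, codes)

print(list(labels))
print(df)
```

Keeping `labels` around makes it possible to translate the integer codes back to the original strings later.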

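Output for the ten rows in the question (yes becomes 1.0, no becomes 0.0, missing values stay NaN):

   class
0    NaN
1    1.0
2    0.0
3    NaN
4    0.0
5    NaN
6    0.0
7    1.0
8    0.0
9    NaN

I don't know whether you have more classes, which would explain why you build mapping dynamically, but for this particular case a mapping written out by hand should presumably be enough.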
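For the two-class case in the question, a hand-written mapping can be sketched as follows. The name `fixed_mapping` and the choice of yes as 1 and no as 0 are assumptions for illustration; the rows reproduce the table from the question.

```python
import pandas as pd

# The ten rows from the question; None marks the missing class values
df = pd.DataFrame({
    'age': range(1, 11),
    'class': [None, 'yes', 'no', None, 'no', None, 'no', 'yes', 'no', None],
})

# Hypothetical fixed mapping for the two known labels
fixed_mapping = {'yes': 1, 'no': 0}

# map() leaves values that are not in the dictionary (the missing ones) as NaN
df['class'] = df['class'].map(fixed_mapping)

print(df)
```

Because the missing entries remain NaN, the numeric column can afterwards be handed to an imputer such as scikit-learn's `KNNImputer` to fill them in.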