Problem description
I am working with this dataset (I have already cleaned it, and it has no missing values).
Area No. of bedrooms Resale latitude longitude price Alaknanda Badarpur Bharat Vihar Bindapur Burari Chattarpur Chittaranjan Park Delhi Delhi Meerut Expressway Dwarka Mor Dwarka More Govindpuri Greater Kailash Hari Nagar Jamia Nagar Jasola Kalkaji Kamla Nagar Mahavir Enclave Mansa Ram Park Mayur Vihar Mayur Vihar II Model Town Mundka Munirka New Ashok Nagar Noida Road Okhla Om Nagar Om Vihar Palam Paschim Vihar Pitampura Preet Vihar Punjabi Bagh Rohini Sector 9 Rohini sector 24 Roop Nagar Sainik Farms Saket Sarita Vihar Sector 10 Dwarka Sector 11 Dwarka Sector 12 Dwarka Sector 13 Dwarka Sector 13 Rohini Sector 17 Dwarka Sector 18A Dwarka Sector 19 Dwarka Sector 2 Dwarka Sector 22 Dwarka Sector 22 Rohini Sector 23 Dwarka Sector 23 Rohini Sector 24 Rohini Sector 3 Dwarka Sector 4 Dwarka Sector 5 Dwarka Sector 6 Dwarka Sector 7 Dwarka Sector 9 Dwarka Sector-18 Dwarka Shahdara Shanti Park Dwarka Shastri Nagar Uttam Nagar Vasant Kunj Vikas Puri West End West Punjabi Bagh nawada
0 1200 2 1 28.584311 77.057693 105.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1000 3 0 28.619074 77.056686 60.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
2 1350 2 1 28.528574 77.288331 150.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 435 2 0 28.619074 77.056686 25.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
4 900 3 0 28.619310 77.033279 58.0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
4993 540 2 1 28.603176 77.063060 25.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4994 540 2 1 28.603176 77.063060 30.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4995 415 1 1 28.544790 77.051083 26.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4996 415 1 1 28.544790 77.051083 55.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4997 900 3 1 28.619074 77.056686 42.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
4157 rows × 77 columns
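A random forest regressor did not perform well on this data, so I decided to scale the features (Area, No. of bedrooms, Resale, latitude, longitude) and the target variable (price).
But after performing the scaling: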
import pandas as pd
from sklearn.preprocessing import StandardScaler

def scaleColumns(df, cols_to_scale):
    for col in cols_to_scale:
        scaler = StandardScaler()
        df[col] = pd.DataFrame(scaler.fit_transform(df[col].values.reshape((-1, 1))))
    return df

scaled_df = scaleColumns(df, ['Area', 'No. of bedrooms', 'latitude', 'longitude', 'price'])
scaled_df
I get:
Area No. of bedrooms Resale latitude longitude price Alaknanda Badarpur Bharat Vihar Bindapur Burari Chattarpur Chittaranjan Park Delhi Delhi Meerut Expressway Dwarka Mor Dwarka More Govindpuri Greater Kailash Hari Nagar Jamia Nagar Jasola Kalkaji Kamla Nagar Mahavir Enclave Mansa Ram Park Mayur Vihar Mayur Vihar II Model Town Mundka Munirka New Ashok Nagar Noida Road Okhla Om Nagar Om Vihar Palam Paschim Vihar Pitampura Preet Vihar Punjabi Bagh Rohini Sector 9 Rohini sector 24 Roop Nagar Sainik Farms Saket Sarita Vihar Sector 10 Dwarka Sector 11 Dwarka Sector 12 Dwarka Sector 13 Dwarka Sector 13 Rohini Sector 17 Dwarka Sector 18A Dwarka Sector 19 Dwarka Sector 2 Dwarka Sector 22 Dwarka Sector 22 Rohini Sector 23 Dwarka Sector 23 Rohini Sector 24 Rohini Sector 3 Dwarka Sector 4 Dwarka Sector 5 Dwarka Sector 6 Dwarka Sector 7 Dwarka Sector 9 Dwarka Sector-18 Dwarka Shahdara Shanti Park Dwarka Shastri Nagar Uttam Nagar Vasant Kunj Vikas Puri West End West Punjabi Bagh nawada
0 -0.156044 -0.846368 1 0.146719 0.197107 -0.154917 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 -0.361197 0.327590 0 0.154070 0.197058 -0.245661 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
2 -0.002180 -0.846368 1 0.134931 0.208280 -0.064172 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 -0.940754 -0.846368 0 0.154070 0.197058 -0.316239 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
4 -0.463774 0.327590 0 0.154120 0.195924 -0.249694 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
4993 NaN NaN 1 NaN NaN NaN 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4994 NaN NaN 1 NaN NaN NaN 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4995 NaN NaN 1 NaN NaN NaN 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4996 NaN NaN 1 NaN NaN NaN 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4997 NaN NaN 1 NaN NaN NaN 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
4157 rows × 77 columns
Many of the values are now NaN. How can I fix this?
Solution
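A likely cause, judging from the output in the question: the frame has 4157 rows but its index labels run up to 4997, so the index has gaps left over from cleaning. Wrapping the scaled array in a new pandas object gives that object a fresh 0..n-1 index, and pandas aligns on index labels during column assignment, so every row whose label has no match in the new object ends up as NaN. The sketch below reproduces this on a small made-up frame and shows one possible fix, which is to assign the NumPy array returned by `fit_transform` directly. The names `scale_columns` and `toy` and the toy values are assumptions for illustration, not part of the original post.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


def scale_columns(df, cols_to_scale):
    """Return a copy of df with the given columns standardized."""
    out = df.copy()
    scaler = StandardScaler()
    # fit_transform returns a plain NumPy array. Assigning it directly is
    # positional, so no index alignment takes place and no NaN appears.
    out[cols_to_scale] = scaler.fit_transform(out[cols_to_scale])
    return out


# Hypothetical toy frame whose index has gaps, as if rows had been dropped.
toy = pd.DataFrame(
    {"Area": [1200.0, 1000.0, 1350.0, 435.0],
     "price": [105.0, 60.0, 150.0, 25.0]},
    index=[0, 1, 5, 9],
)

# Reproduce the problem: the new Series is indexed 0..3, so labels 5 and 9
# find no matching value and become NaN after alignment.
broken = toy.copy()
scaled_area = StandardScaler().fit_transform(toy[["Area"]]).ravel()
broken["Area"] = pd.Series(scaled_area)
print(broken["Area"].isna().sum())

# The fix: no wrapper object, so every row keeps its scaled value.
scaled = scale_columns(toy, ["Area", "price"])
print(scaled.isna().sum().sum())
```

Other options that should work equally well are calling `df = df.reset_index(drop=True)` before scaling, or passing `index=df.index` when building the wrapper. Returning a copy rather than modifying `df` in place is a design choice made here so that the unscaled data stays available.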