How to return elements from pandas.value_counts

Problem description

y = pd.DataFrame([3,1,2,3,4],columns=['TARGET'])
y['TARGET'].value_counts()

Output:

3.0    2
4.0    1
2.0    1
1.0    1
Name: TARGET, dtype: int64

How can I return the individual elements of the output above (that is, the counts 2, 1, 1, 1)?

When I try the following code:

y['TARGET'].value_counts()[0]

I get the following error message:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self,key,method,tolerance)
   2645             try:
-> 2646                 return self._engine.get_loc(key)
   2647             except KeyError:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Float64HashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Float64HashTable.get_item()

KeyError: 0.0

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-59-63137bfef4a6> in <module>
----> 1 index['TARGET'].value_counts()[0]

~\Anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self,key)
    869         key = com.apply_if_callable(key,self)
    870         try:
--> 871             result = self.index.get_value(self,key)
    872 
    873             if not is_scalar(result):

~\Anaconda3\lib\site-packages\pandas\core\indexes\numeric.py in get_value(self,series,key)
    447 
    448         k = com.values_from_object(key)
--> 449         loc = self.get_loc(k)
    450         new_values = com.values_from_object(series)[loc]
    451 

~\Anaconda3\lib\site-packages\pandas\core\indexes\numeric.py in get_loc(self,tolerance)
    506         except (TypeError,NotImplementedError):
    507             pass
--> 508         return super().get_loc(key,method=method,tolerance=tolerance)
    509 
    510     @cache_readonly

~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self,tolerance)
   2646                 return self._engine.get_loc(key)
   2647             except KeyError:
-> 2648                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2649         indexer = self.get_indexer([key],tolerance=tolerance)
   2650         if indexer.ndim > 1 or indexer.size > 1:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Float64HashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Float64HashTable.get_item()

KeyError: 0.0

Why does this happen?

When I try:

y['TARGET'].value_counts()[1]

or

y['TARGET'].value_counts()[2]

and so on,

the lookup succeeds, but the elements come back in a mixed-up order. Does anyone know why this happens?

Solution

If you need to select by position in the Series, use Series.iat or Series.iloc:

s = y['TARGET'].value_counts()
print (s.iat[0])
2
print (s.iloc[0])
2

If you need to select by label (here 3, the label of the first entry), use Series.at or Series.loc:

print (s.at[3])
2

print (s.loc[3])
2

Plain indexing with [] works the same way here, because it also looks up by label:

print (s[3])
2
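
This also explains the KeyError in the question: the index of the value_counts() result holds the distinct values of TARGET rather than positions 0, 1, 2, ..., so [0] is treated as a label that does not exist. The snippet below is a minimal sketch of that behaviour, assuming the integer data from the question; the variable names (labels, lookup_failed, count_of_3, first_count) are only illustrative.

```python
import pandas as pd

y = pd.DataFrame([3, 1, 2, 3, 4], columns=['TARGET'])
s = y['TARGET'].value_counts()

# The index of s contains the distinct values of TARGET, not 0..n-1.
labels = sorted(s.index.tolist())

# With a numeric index, s[0] is a label lookup; 0 is not a label here,
# so pandas raises KeyError instead of returning the first row.
try:
    s[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True

count_of_3 = int(s.loc[3])     # by label: how many times the value 3 occurs
first_count = int(s.iloc[0])   # by position: the largest count

print(labels, lookup_failed, count_of_3, first_count)
```

By the same logic, s[1] and s[2] return the counts for the values 1 and 2, not the second and third rows, which is why the results looked out of order.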

Using .iloc

import pandas as pd
y = pd.DataFrame([3,1,2,3,4],columns=['TARGET'])
print(y['TARGET'].value_counts().iloc[0])  # output 2
print(y['TARGET'].value_counts().iloc[1])  # output 1
print(y['TARGET'].value_counts().iloc[2])  # output 1
print(y['TARGET'].value_counts().iloc[3])  # output 1
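
If all counts are needed at once rather than one at a time, the Series can be converted or iterated directly. This is a small sketch under the same assumptions as above; counts, values and by_value are illustrative names, and the relative order of values that share the same count may differ between pandas versions.

```python
import pandas as pd

y = pd.DataFrame([3, 1, 2, 3, 4], columns=['TARGET'])
s = y['TARGET'].value_counts()

counts = s.tolist()        # the counts, largest first
values = s.index.tolist()  # the distinct values, in the same order as counts

# Iterate over (value, count) pairs.
for value, count in s.items():
    print(value, count)

# Sort by the value itself instead of by count for a predictable order.
by_value = s.sort_index()
print(by_value.tolist())
```

Sorting with sort_index() is one way to avoid relying on how ties between equal counts are ordered.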
