Problem description
How can I get the last run of consecutive non-NaN values in a table? For example:
col
0 1.0
1 2.0
2 NaN
3 NaN
4 3.0
5 NaN
6 4.0
7 5.0
8 6.0
Desired result:
col
6 4.0
7 5.0
8 6.0
My solution:
df[df.col[::-1].isna().cumsum()[::-1].eq(0)]
Solution
Try this:
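# Index label of the last NaN row; passing it to iloc assumes the default RangeIndex,
# and index[-1] raises IndexError if the column contains no NaN at all
last_index = df[df.col.isna()].index[-1]
df.iloc[last_index + 1:]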
Output:
col
6 4.0
7 5.0
8 6.0
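For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the example frame and wraps the same idea in a helper. The name `trailing_non_nan` is hypothetical (it is not a pandas function), and the sketch assumes you want the whole frame back when the column has no NaN. It looks up the last NaN by position rather than by index label, so it should also behave with a non-default index.

```python
import numpy as np
import pandas as pd

# The example data from the question
df = pd.DataFrame({"col": [1.0, 2.0, np.nan, np.nan, 3.0, np.nan, 4.0, 5.0, 6.0]})


def trailing_non_nan(frame, column):
    """Return the rows after the last NaN in `column` (hypothetical helper)."""
    # Integer positions (not index labels) of every NaN in the column
    nan_positions = np.flatnonzero(frame[column].isna().to_numpy())
    if nan_positions.size == 0:
        # Assumption: with no NaN at all, the whole frame is the trailing run
        return frame
    # Slice by position, starting just after the last NaN
    return frame.iloc[nan_positions[-1] + 1:]


result = trailing_non_nan(df, "col")
print(result)
```

If the column ends with NaN, the result is an empty frame, which matches what the two-line version above would give.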
,
Here is another way to do it:
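import numpy as np
# Position of the last NaN, then everything after it
last_index = np.where(df.col.isna())[0][-1]
df.iloc[last_index+1:]
# Or, as a boolean mask: keep only the rows that have no NaN at or after them
df[df.loc[::-1,'col'].isna().cumsum()[::-1]==0]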
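In case the reversed cumsum one-liner is hard to read, this is a rough step-by-step breakdown of it on the example data. The intermediate variable names are mine, chosen only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col": [1.0, 2.0, np.nan, np.nan, 3.0, np.nan, 4.0, 5.0, 6.0]})

# True where the value is NaN
is_nan = df["col"].isna()

# Reverse, take a running count of NaNs, then reverse back:
# each row now holds the number of NaNs at or after that row
nan_count_from_end = is_nan[::-1].cumsum()[::-1]

# Rows with a count of 0 have no NaN at or after them,
# so they form the trailing run of valid values
mask = nan_count_from_end.eq(0)
result = df[mask]
print(result)
```

Because this is a boolean mask aligned on the index, it does not depend on the index being the default 0..n-1, and it returns the full frame when there is no NaN.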
,
This worked for me:
from urllib.request import urlopen
from bs4 import BeautifulSoup
import base64
import pytesseract as pyt
import requests
from PIL import Image
import matplotlib.pyplot as ptl
import numpy as np
pyt.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
login_url = 'http://www.root-me.org/?page=login&lang=fr'
payload = {
'var_login': 'email','password': 'pass'
}
with requests.Session() as s:
    response = requests.post(login_url,payload)
scrap_url= urlopen('http://challenge01.root-me.org/programmation/ch8/')
soup = BeautifulSoup(scrap_url)
img = soup.find('img')['src'].split(',')[1]
with open('captcha.png','wb') as guardar:
    decodificar = base64.b64decode(img)
    guardar.write(decodificar)
leer_img = Image.open('captcha.png','r')
ptl.imshow(np.asarray(leer_img))
texto_captcha = pyt.image_to_string(leer_img)
print(texto_captcha)