尝试从JSON中提取必填字段 蟒蛇

问题描述

我被困住了。 目的是使用API​​来获取用户名列表。具体来说,提交数量严格大于给定阈值的用户的用户名列表。必须按用户在结果中出现的顺序返回用户名列表。

这是我的JSON:

 - page_content {'page': '1','per_page': 10,'total': 15,'total_pages': 2,'data': [{'id': 1,'username': 'epaga','about':
   'Java developer / team leader at inetsoftware.de by day<p>iOS
   developer by
   night<p>http://www.mindscopeapp.com<p>http://inflightassistant.info<p>http://appstore.com/johngoering<p>[
   my public key: https://keybase.io/johngoering; my proof:
   https://keybase.io/johngoering/sigs/I1UIk7t3PjfB5v2GI-fhiOMvdkzn370_Z2iU5GitXa0
   ]<p>hnchat:oYwa7PJ4Yaf1Vw9Om4ju','submitted': 654,'updated_at':
   '2019-08-29T13:45:12.000Z','submission_count': 197,'comment_count':
   439,'created_at': 1301039509},{'id': 2,'username': 'patricktomas','about': '[ my public key: https://keybase.io/ptrcktms; my proof:
   https://keybase.io/ptrcktms/sigs/Z_woLEAc6ZrVtIAdZbAyp23r7vsL_clxNE3RE8DEmGQ
   ]','submitted': 9,'updated_at': '2019-01-29T22:47:01.000Z','submission_count': 6,'comment_count': 3,'created_at': 1255392958},{'id': 3,'username': 'saintamh','about': '','submitted': 8,'updated_at': '2019-08-21T10:04:13.000Z','submission_count': 4,'comment_count': 4,'created_at': 1267029423},{'id': 4,'username':
   'panny','submitted': 71,'updated_at':
   '2019-09-06T11:13:29.000Z','submission_count': 51,'comment_count':
   15,'created_at': 1510174513},{'id': 5,'username': 'olalonde','about':
   'olalonde@gmail.com<p>http://www.github.com/olalonde<p>CTO/Co-Founder
   @ https://binded.com','submitted': 4561,'updated_at':
   '2019-09-08T09:26:52.000Z','submission_count': 1032,'comment_count': 3045,'created_at': 1261051630},{'id': 6,'username': 'WisNorCan','about': 'bayesian optimist','submitted':
   177,'updated_at': '2019-07-26T01:40:10.000Z','submission_count':
   42,'comment_count': 107,'created_at': 1497196382},{'id': 7,'username': 'dmmalam','about': 'Cofounder OctaveWealth (YCS12)','submitted': 765,'updated_at': '2019-08-12T21:38:21.000Z','submission_count': 645,'comment_count': 115,'created_at':
   1312041112},{'id': 8,'username': 'replicatorblog','about':
   'https://twitter.com/josephflaherty<p>Formerly
   Wired:<p>https://www.wired.com/author/joseph-flaherty/<p>Now covering
   startups for Founder Collective,a fantastic VC
   firm:<p>http://www.foundercollective.com/','submitted': 1441,'updated_at': '2019-09-06T02:06:35.000Z','submission_count': 550,'comment_count': 790,'created_at': 1224455310},{'id': 9,'username': 'eightturn','about': 'twitter: @searchbound','submitted': 84,'updated_at': '2019-08-10T21:33:15.000Z','submission_count': 7,'comment_count': 75,'created_at':
   1405978844},{'id': 10,'username': 'vladikoff','about': '[ my
   public key: https://keybase.io/vladikoff; my proof:
   https://keybase.io/vladikoff/sigs/jxMsGDORM-qiAf0bQy91Uw4RYpHNeQa1bZD3WFdIZWo
   ]','submitted': 65,'updated_at': '2019-05-10T22:04:36.000Z','submission_count': 15,'comment_count': 50,'created_at':
   1298054029}]} type(page_content) <class 'dict'>
       page_content {'page': '2','data': [{'id': 11,'username': 'mpweiher','about':
   'http://blog.metaobject.com/<p>http://www.metaobject.com/<p>http://objective.st/','submitted': 5967,'updated_at': '2019-09-07T19:35:03.000Z','submission_count': 3342,'comment_count': 2577,'created_at':
   1333104319},{'id': 12,'username': 'coloneltcb','about': 'I work at
   Twilio,with Jol Franusic. Comments are my own.<p>[ my public key:
   https://keybase.io/selviano; my proof:
   https://keybase.io/selviano/sigs/fxZxeSakx-aPR2d0iICZU9FpHnDqPDW0Kz4s9OkrGS0
   ]','submitted': 2249,'updated_at': '2019-08-31T20:31:24.000Z','submission_count': 2137,'comment_count': 88,'created_at':
   1205248131},{'id': 13,'username': 'guelo','submitted': 4996,'updated_at': '2019-09-07T13:09:59.000Z','submission_count': 72,'comment_count': 4595,'created_at':
   1247879051},{'id': 14,'username': 'frederfred','about': 'web
   developer','submitted': 4,'updated_at': '2018-03-22T13:44:20.000Z','submission_count': 3,'comment_count': 1,'created_at': 1361951997},{'id': 15,'username': 'pkiller','submitted': 32,'updated_at': '2019-04-07T21:56:40.000Z','submission_count': 1,'comment_count': 31,'created_at': 1539079554}]} type(page_content)
   <class 'dict'>

我的代码:

import json
import requests
def getUsernames(threshold):
    username = []  
    data = requests.get("https://jsonmock.hackerrank.com/api/article_users/search?page={}")  
    response = json.loads(data.content.decode('utf-8'))    
    for page in range(0,response["total_pages"]):       
        page_response = requests.get("https://jsonmock.hackerrank.com/api/article_users/search?page={}".format(page + 1))    
        page_content = json.loads(page_response.content.decode('utf-8'))
        print ('page_content',page_content,'type(page_content)',type(page_content))
        for item in range(0,len(page_content["data"])):
             username.append(str(page_content["data"][item]["username"]))
    return "\n".join(username)

print(getUsernames(10))

请帮助-@-@-

解决方法

效率低下-您所做的第一个请求基本上是在您检查total_pages时就丢弃了。要遍历不需要range()的数据,请使用for循环,这样更容易:

def getUsernames(threshold):
    username = []
    page = 1
    while True:
        data = requests.get("https://jsonmock.hackerrank.com/api/article_users/search?page={}".format(page))
        page_content = json.loads(data.content.decode('utf-8'))
        # iterate through `data`
        for item in page_content["data"]:
            username.append(item["username"])
        
        # if we have enough usernames to hit threshold,quit
        if len(username) >= threshold:
            break
        
        # if we are on the last page,quit
        if page >= page_content['total_pages']:
            break
        
        # increase page and repeat the loop
        page += 1

    return "\n".join(username)
,

尝试一下

for user in page_content['data']:
    if user['submission_count'] > threshold:
        username.append(user['username'])
,

我刚刚尝试了您的代码,这似乎是正确的。只需获取usernames of users with submission count strictly greater than the given threshold,就需要设置条件。下面的代码应该可以工作:

import json
import requests


def getUsernames(threshold):
    username = []
    data = requests.get("https://jsonmock.hackerrank.com/api/article_users/search?page={}")
    response = json.loads(data.content.decode('utf-8'))
    for page in range(0,response["total_pages"]):
        page_response = requests.get("https://jsonmock.hackerrank.com/api/article_users/search?page={}".format(page + 1))
        page_content = json.loads(page_response.content.decode('utf-8'))
        # print ('page_content',page_content,'type(page_content)',type(page_content))
        for item in range(0,len(page_content["data"])):
            if page_content["data"][item]['submission_count'] > threshold: # You missed this condition
                username.append(str(page_content["data"][item]["username"]))
    return "\n".join(username)


print(getUsernames(10))

输出:

epaga
panny
olalonde
WisNorCan
dmmalam
replicatorblog
vladikoff
mpweiher
coloneltcb
guelo

相关问答

Selenium Web驱动程序和Java。元素在(x,y)点处不可单击。其...
Python-如何使用点“。” 访问字典成员?
Java 字符串是不可变的。到底是什么意思?
Java中的“ final”关键字如何工作?(我仍然可以修改对象。...
“loop:”在Java代码中。这是什么,为什么要编译?
java.lang.ClassNotFoundException:sun.jdbc.odbc.JdbcOdbc...