Problem description
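I have a long list of addresses that I need to geocode into coordinates, and I am using geopy in Python to do it. I wrote a loop that looks up the coordinates for each observation. The loop also handles two cases: a connection timeout (in which case it retries the geocoding) and an address for which no coordinates can be found (nothing is returned). The problem is that it is slow: I managed to geocode 1,000 observations in half an hour, so I would like to know whether there is a way to speed it up, for example by vectorizing.
I could reduce the wait time between retries, but then more attempts would fail.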
import time

import numpy as np
import pandas as pd
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent='local_agent')


def geocode_address(address):
    # Returns a geopy Location, or None when the address is not found
    return geolocator.geocode(address)


def try_address(address, attempts_remaining, wait_time):
    g = geocode_address(address)
    if g is None and attempts_remaining > 0:
        # Wait, then retry with one fewer attempt and a doubled wait time
        time.sleep(wait_time)
        # Propagate the result of the retry back to the caller
        return try_address(address, attempts_remaining - 1, wait_time + wait_time)
    return g

start_index = 0
# How often the program prints the status of the running program
status_rate = 100
# How many times the program tries to geocode an address before it gives up
attempts_to_geocode = 2
# Time it delays each time it does not find an address
wait_time = 3
# Variables used in the main for loop
results = []
Failed = 0
total_Failed = 0
progress = len(df) - start_index
# df is assumed to be an existing DataFrame with an "address" column.
# Every row appended to results has the form [address, latitude, longitude, status].
for i, address in enumerate(df["address"]):
    # Print how many addresses have been processed so far and how many of them failed.
    if (start_index + i) % status_rate == 0:
        total_Failed += Failed
        print("Completed {} of {}. Failed {} for this section and {} in total."
              .format(i + start_index, progress, Failed, total_Failed))
        Failed = 0
    # Try geocoding the address
    try:
        g = try_address(address, attempts_to_geocode, wait_time)
        if g is None:
            results.append([address, "", "", "None"])
            print("Gave up on address: " + address)
            Failed += 1
        else:
            results.append([address, g.latitude, g.longitude, "Nominatim"])
    # If we failed with an error such as a timeout, wait 5 seconds and try the address once more
    except Exception as e:
        print("Failed with error {} on address {}. Will try again.".format(e, address))
        try:
            time.sleep(5)
            g = geocode_address(address)
            if g is None:
                print("Did not find it.")
                results.append([address, "", "", "None"])
                Failed += 1
            else:
                print("Successfully found it.")
                results.append([address, g.latitude, g.longitude, "Nominatim"])
        except Exception as e:
            print("Failed with error {} on address {} again.".format(e, address))
            Failed += 1
            results.append([address, "", "", "Error"])
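
Vectorizing is unlikely to help here, because the time is spent waiting on network requests rather than in Python. The public Nominatim service also limits clients to at most one request per second under its usage policy, so 1,000 addresses cannot take much less than about 17 minutes against that server. What can be trimmed is wasted work: sending the same address more than once, and sleeping and retrying when the service simply has no match (a retry for a "not found" address usually returns None again). The following is a minimal sketch of that idea, not a definitive implementation. The helper name geocode_unique and its parameters (min_delay, retries, backoff, sleep) are hypothetical; the geocode argument is assumed to be any callable such as geolocator.geocode.

```python
import time


def geocode_unique(addresses, geocode, min_delay=1.0, retries=2, backoff=2.0,
                   sleep=time.sleep):
    """Geocode each distinct address once and return {address: result}.

    `geocode` is any callable that takes an address and returns a result,
    or None when nothing is found (for example `geolocator.geocode`).
    """
    cache = {}
    # dict.fromkeys removes duplicates while keeping first-seen order
    for address in dict.fromkeys(addresses):
        wait = min_delay
        result = None
        for attempt in range(retries + 1):
            try:
                result = geocode(address)
                break  # None means "not found"; it is not retried
            except Exception:
                # Broad catch, as in the question; retry only on errors
                if attempt == retries:
                    break  # give up, result stays None
                sleep(wait)
                wait *= backoff  # wait longer before the next retry
        cache[address] = result
        sleep(min_delay)  # throttle to respect the service's rate limit
    return cache


# Hypothetical usage with the objects from the question:
# coords = geocode_unique(df["address"], geolocator.geocode)
# df["location"] = df["address"].map(coords)
```

geopy also ships geopy.extra.rate_limiter.RateLimiter, which wraps a geocode callable with a minimum delay and retries on errors, and could replace the hand-written throttling in this sketch. Beyond that, going noticeably faster generally means using a geocoding service with a higher rate limit or hosting a Nominatim instance yourself.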