Find which polygon each point is in

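Problem description

I'm new to Python, so I apologize for my basic programming skills. I know I use too many for loops (I come from Matlab, and that habit is slowing me down).

I have millions of points (timestep, long, lat, pointID) and hundreds of irregular, non-overlapping polygons (vertex_long, vertex_lat, polygonID).

I want to know which polygon contains each point.

I was able to do it like this: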

from matplotlib import path
def inpolygon(lon_point,lat_point,lon_poly,lat_poly):
   shape = lon_point.shape
   lon_point = lon_point.reshape(-1)
   lat_point = lat_point.reshape(-1)
   lon_poly = lon_poly.values.reshape(-1)
   lat_poly = lat_poly.values.reshape(-1)
   points = [(lon_point[i],lat_point[i]) for i in range(lon_point.shape[0])]
   polys = path.Path([(lon_poly[i],lat_poly[i]) for i in range(lon_poly.shape[0])])
   return polys.contains_points(points).reshape(shape)

Then

import numpy as np
import pandas as pd
Areas_Lon = Areas.iloc[:,0]
Areas_Lat = Areas.iloc[:,1]
Areas_ID  = Areas.iloc[:,2]
Unique_Areas = np.unique(Areas_ID)

Areas_true=np.zeros((Areas_ID.shape[0],Unique_Areas.shape[0]))
for i in range(Areas_ID.shape[0]):
    for ii in range(Unique_Areas.shape[0]):
        Areas_true[i,ii]=(Areas_ID[i]==Unique_Areas[ii])

Areas_Lon_Vertex=np.zeros(Unique_Areas.shape[0],dtype=object)
Areas_Lat_Vertex=np.zeros(Unique_Areas.shape[0],dtype=object)
for i in range(Unique_Areas.shape[0]):
    Areas_Lon_Vertex[i]=(Areas_Lon[(Areas_true[:,i]==1)])
    Areas_Lat_Vertex[i]=(Areas_Lat[(Areas_true[:,i]==1)])

import f_inpolygon as inpolygon
Areas_in=np.zeros((Unique_Areas.shape[0],Points.shape[0]))
for i in range (Unique_Areas.shape[0]):
    for ii in range (Points.shape[0]):
        Areas_in[i,ii]=(inpolygon.inpolygon(Points[ii,2],Points[ii,3],Areas_Lon_Vertex[i],Areas_Lat_Vertex[i]))

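This way, the final result Areas_in has as many rows as there are polygons and as many columns as there are points. Each column (point) holds true = 1 in the row of the polygon that contains it (first given polygon ID -> first row, and so on).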
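
Because the polygons do not overlap, each column of Areas_in has at most one nonzero entry, so the matrix can be collapsed to a single polygon ID per point. A minimal sketch of that step, using made-up toy values in place of the real Areas_in and Unique_Areas (the helper name polygon_id_per_point and the -1 marker for "no polygon" are assumptions for illustration):

```python
import numpy as np

# Toy stand-ins (hypothetical values): 3 polygons x 4 points.
Unique_Areas = np.array([10, 20, 30])
Areas_in = np.array([
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
], dtype=float)

def polygon_id_per_point(areas_in, unique_areas, missing=-1):
    """Collapse a (n_polygons, n_points) 0/1 matrix to one ID per point."""
    hit = areas_in.any(axis=0)     # True where the point lies in some polygon
    row = areas_in.argmax(axis=0)  # row index of the first 1 in each column
    return np.where(hit, unique_areas[row], missing)

point_ids = polygon_id_per_point(Areas_in, Unique_Areas)
```

The last toy point lies in no polygon, so it receives the missing marker rather than the ID of the first row.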

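The code works, but it is very slow. When locating points in a regular grid or within a radius of a point, I successfully implemented a KDTree, which increased the speed dramatically, but I have not managed to do the same, or anything faster, for irregular non-overlapping polygons.

I have seen some related questions, but they ask whether a point is inside a polygon rather than which polygon a point is in.

Any ideas, please?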
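
One possible way to speed up the matplotlib approach itself is to test all points against a polygon in a single contains_points call, so that only the loop over the few hundred polygons remains. The sketch below is an illustration, not the original code: it assumes a DataFrame with columns named vertex_long, vertex_lat and polygonID as in the description, integer polygon IDs, and a hypothetical function name points_to_polygon_id. A bounding-box test filters out most points before the exact test.

```python
import numpy as np
import pandas as pd
from matplotlib.path import Path

def points_to_polygon_id(lon, lat, areas, missing=-1):
    """Return the polygonID containing each point, or `missing` if none does.

    `areas` is assumed to have columns vertex_long, vertex_lat, polygonID,
    with the vertices listed in order around each polygon.
    """
    pts = np.column_stack([lon, lat])
    result = np.full(len(pts), missing, dtype=np.int64)  # assumes integer IDs
    for poly_id, verts in areas.groupby("polygonID", sort=False):
        xy = verts[["vertex_long", "vertex_lat"]].to_numpy()
        # Cheap bounding-box test first; exact test only on the candidates.
        (xmin, ymin), (xmax, ymax) = xy.min(axis=0), xy.max(axis=0)
        cand = np.flatnonzero(
            (pts[:, 0] >= xmin) & (pts[:, 0] <= xmax)
            & (pts[:, 1] >= ymin) & (pts[:, 1] <= ymax)
        )
        if cand.size:
            inside = Path(xy).contains_points(pts[cand])
            result[cand[inside]] = poly_id
    return result

# Toy data (hypothetical): two unit squares side by side.
areas = pd.DataFrame({
    "vertex_long": [0, 1, 1, 0, 2, 3, 3, 2],
    "vertex_lat":  [0, 0, 1, 1, 0, 0, 1, 1],
    "polygonID":   [1, 1, 1, 1, 2, 2, 2, 2],
})
lon = np.array([0.5, 2.5, 5.0])
lat = np.array([0.5, 0.5, 5.0])
ids = points_to_polygon_id(lon, lat, areas)
```

This returns one ID per point directly instead of a polygons-by-points matrix, which also keeps memory use proportional to the number of points.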

Solution

Have you tried a GeoPandas spatial join?

Install the package with pip (pip install geopandas) or with conda (conda install -c conda-forge geopandas).

Then you should be able to read the data as a GeoDataFrame

import pandas as pd
import geopandas
from shapely.geometry import Point

# Points: read with pandas so that longitude/latitude are parsed as numbers
# (geopandas.read_file treats the columns of a plain CSV as text by default)
df = pd.read_csv("file_name1.csv")
# Polygons: sjoin needs real polygon geometries; you can read shp files too.
right_df = geopandas.read_file("file_name2.csv")

# Convert into geometry column
geometry = [Point(xy) for xy in zip(df['longitude'], df['latitude'])]

# Coordinate reference system: WGS84
crs = "EPSG:4326"
# Creating a geographic data frame
left_df = geopandas.GeoDataFrame(df, crs=crs, geometry=geometry)
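
If the polygons exist only as a vertex table like the one in the question, rather than as a shapefile, they have to be turned into geometries before joining. A minimal sketch of one way to do that, assuming columns named vertex_long, vertex_lat and polygonID and vertices listed in order around each polygon (the toy values are made up):

```python
import pandas as pd
import geopandas
from shapely.geometry import Polygon

# Toy vertex table (hypothetical values) in the layout from the question.
areas = pd.DataFrame({
    "vertex_long": [0, 1, 1, 0, 2, 3, 3, 2],
    "vertex_lat":  [0, 0, 1, 1, 0, 0, 1, 1],
    "polygonID":   [1, 1, 1, 1, 2, 2, 2, 2],
})

# One shapely Polygon per polygonID, built from its ordered vertices.
geoms = {
    poly_id: Polygon(verts[["vertex_long", "vertex_lat"]].to_numpy())
    for poly_id, verts in areas.groupby("polygonID", sort=False)
}

right_df = geopandas.GeoDataFrame(
    {"polygonID": list(geoms.keys())},
    geometry=list(geoms.values()),
    crs="EPSG:4326",
)
```

The resulting right_df has one row per polygon and can be used as the polygon side of a spatial join.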

Then you can apply sjoin

# `predicate` was called `op` before GeoPandas 0.10
jdf = geopandas.sjoin(left_df, right_df, how='inner', predicate='intersects', lsuffix='left', rsuffix='right')

The options for the predicate (called op in older GeoPandas versions) are:

intersects
contains
within

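When you join a Point geometry column (left) with a Polygon geometry column (right), intersects and within give the same result for points strictly inside a polygon. contains tests the opposite direction, so it applies when the polygons are on the left and the points on the right.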
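
A small end-to-end sketch of the join with made-up data may help to see the shape of the result. It assumes GeoPandas 0.10 or newer, where the parameter is called predicate; the column names pointID and polygonID and all coordinates are hypothetical.

```python
import geopandas
from shapely.geometry import Polygon

# Three toy points; the last one lies outside both polygons.
points = geopandas.GeoDataFrame(
    {"pointID": [101, 102, 103]},
    geometry=geopandas.points_from_xy([0.5, 2.5, 5.0], [0.5, 0.5, 5.0]),
    crs="EPSG:4326",
)

# Two non-overlapping unit squares.
polys = geopandas.GeoDataFrame(
    {"polygonID": [1, 2]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(2, 0), (3, 0), (3, 1), (2, 1)]),
    ],
    crs="EPSG:4326",
)

# how="left" keeps every point; points outside all polygons get NaN.
joined = geopandas.sjoin(points, polys, how="left", predicate="within")
result = joined[["pointID", "polygonID"]]
```

With how="inner" the unmatched point would be dropped instead, which is the behavior of the sjoin call shown earlier.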
