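Masking a netCDF file with a shapefile using GeoPandas in Python

Problem description

I have a netCDF file of the EDGAR emissions inventory and a shapefile of US Census data. I want to extract only the netCDF data that overlaps/intersects the NYC region of the shapefile, so that I can compute total emissions for NYC.

I have never used shapefiles or GeoPandas before, so please bear with me. I am able to read the shapefile, filter it to a specific region, and convert the netCDF data to a GeoDataFrame. I want to keep only the netCDF data that falls inside the filtered shapefile region for analysis.

Update: I tried using sjoin and clip, but when I run either command the resulting dataframe has no data, and when I plot the sjoin result I get the error "The GeoDataFrame you are attempting to plot is empty. Nothing has been displayed."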

import netCDF4
import numpy as np
from osgeo import gdal,osr,ogr
import matplotlib.pyplot as plt
import geopandas as gpd
import pandas as pd
import xarray as xr


# read in file path for shapefile
fp_shp = "C:/Users/cb_2018_us_ua10_500k/cb_2018_us_ua10_500k.shp"
# read in netcdf file path
ncs = "C:/Users/v50_N2O_2015.0.1x0.1.nc"

# Read in NETCDF as a pandas dataframe
# Xarray provides a simple method of opening netCDF files,and converting them to pandas dataframes
ds = xr.open_dataset(ncs)
edgar = ds.to_dataframe()

# the index in the df is a Pandas.MultiIndex. To reset it,use df.reset_index()
edgar = edgar.reset_index()

# Read shapefile using gpd.read_file()
shp = gpd.read_file(fp_shp)

# read the netcdf data file
#nc = netCDF4.Dataset(ncs,'r')

# quick check for shpfile plotting
shp.plot(figsize=(12,8));

# filter out shapefile for SPECIFIC city/region

# how to filter rows in DataFrame that contains string
# extract NYC from shapefile dataframe
nyc_shp = shp[shp['NAME10'].str.contains("New York")]

# export shapefile
#nyc_shp.to_file('NYC.shp',driver ='ESRI Shapefile')

# use geopandas points_from_xy() to transform Longitude and Latitude into a list of shapely.Point objects and set it as a geometry while creating the GeoDataFrame
edgar_gdf = gpd.GeoDataFrame(edgar,geometry=gpd.points_from_xy(edgar.lon,edgar.lat))

print(edgar_gdf.head())

# check CRS coordinates
nyc_shp.crs #shapefile
edgar_gdf.crs #geodataframe netcdf

# assign the shapefile's CRS to the points GeoDataFrame
# note: this only labels the existing lon/lat values with that CRS; it does not reproject or shift them
edgar_gdf.crs = nyc_shp.crs

# check the CRS again after assigning it
edgar_gdf.crs #geodataframe netcdf

# Clip points,lines,or polygon geometries to the mask extent.
mask = gpd.clip(edgar_gdf,nyc_shp)

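Solution

I figured it out! I needed to make sure my netCDF file used the same longitude convention as my shapefile. The netCDF longitudes ran over [0, 360], while the shapefile uses [-180, 180], so none of the points fell inside the NYC polygon. Before converting to a GeoDataFrame, I converted the netCDF longitudes to [-180, 180] to match. After that, the code above worked!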
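For anyone hitting the same problem, here is a minimal sketch of that longitude conversion. It is not necessarily the exact code I used. The helper name wrap_longitude, the sample values, and the emi_n2o column name are assumptions for illustration; in the real script the DataFrame would be the edgar frame from ds.to_dataframe().reset_index().

```python
import pandas as pd


def wrap_longitude(lon):
    """Map longitudes from the [0, 360) convention to [-180, 180).

    Works on scalars, NumPy arrays, and pandas Series.
    """
    return ((lon + 180.0) % 360.0) - 180.0


# Hypothetical stand-in for the `edgar` DataFrame built from the netCDF file.
# The column name `emi_n2o` is an assumption; use whatever your file provides.
edgar = pd.DataFrame({
    "lat": [40.75, 40.75, 40.75],
    "lon": [286.05, 10.05, 180.05],  # values in the 0..360 convention
    "emi_n2o": [1.0, 2.0, 3.0],
})

# Convert before building the GeoDataFrame with gpd.points_from_xy(...)
edgar["lon"] = wrap_longitude(edgar["lon"])

print(edgar)
```

Values that were already below 180 are left unchanged, and values from 180 upward are shifted by -360, so a point near New York at 286.05 ends up at roughly -73.95. Run this step before the gpd.GeoDataFrame(...) line in the question so that the point geometries are built from the corrected longitudes.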