Extract and merge several shapefiles from a zipfile without saving to disk

Problem description

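This is my first time working with zipfiles in Python :-/

The task at hand is as follows (the main requirement is not to write anything to disk).

The following URL is given: http://shapefiles.fews.net.s3.amazonaws.com/ALL_HFIC.zip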

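  • Fetch the zip file.
  • Extract from the zip archive the shapefiles whose file names contain Africa.
  • Merge all of them into a single shapefile (read every file into geopandas).
  • Convert the result to GeoJSON.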

This is the code structure I have so far, but I keep running into an attribute error:

AttributeError: 'ZipFile' object has no attribute 'seek'

import io
import zipfile
import pandas as pd     
import geopandas as gpd 

# util funcs
is_africa = lambda string: "Africa" in string                                
is_shape = lambda string: string.endswith('shp')

# get_zip() defined in module
filebytes = io.BytesIO(get_zip(url=URL).content)  

# get the zipfile object
myzipfile = zipfile.ZipFile(filebytes)
                                                               
# instantiate empty list where to store the shapefiles of interest.      
shapefiles = []

# below code adapted from: https://stackoverflow.com/questions/4917284/                                                                                               
with zipfile.ZipFile(zip_file,'r') as zf:                               
    for file_name in zf.namelist():                                      
        if is_africa(file_name) and is_shape(file_name):                 
            data = zf.read(file_name)                                    
            shapefiles.append(data)                                      


# below code adapted from https://stackoverflow.com/questions/48874113/                                                                         
gdf_africa = gpd.GeoDataFrame(pd.concat([gpd.read_file(i) for i in shapefiles],ignore_index=True),crs=gpd.read_file(shapefiles[0]).crs)   

gdf_africa.to_file("output.json",driver="GeoJSON")
                  

Solution

This code requests the zip file from the URL, reads it into an in-memory stream, and prints the names of the Africa shapefiles it contains.

import io
from zipfile import ZipFile

import requests

# util funcs
is_africa = lambda string: "Africa" in string
is_shape = lambda string: string.endswith('shp')

# instantiate empty list where to store the shapefiles of interest.
africa_data = []

response = requests.get('http://shapefiles.fews.net.s3.amazonaws.com/ALL_HFIC.zip')
# wrap the downloaded bytes in a file-like object so nothing is written to disk
with ZipFile(io.BytesIO(response.content)) as zf:
    for file_name in zf.namelist():
        if is_africa(file_name) and is_shape(file_name):
            print(file_name)

# Output
# ALL_HFIC/ALL_HFIC/East Africa/EA_200907_CS.shp
# ALL_HFIC/ALL_HFIC/East Africa/EA_200910_CS.shp
# ALL_HFIC/ALL_HFIC/East Africa/EA_201001_CS.shp
# ALL_HFIC/ALL_HFIC/East Africa/EA_201004_CS.shp
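
On the AttributeError from the question: one likely cause (an assumption, since the exact failing call is not shown) is passing an already-open ZipFile object to zipfile.ZipFile a second time. The constructor expects a path or a seekable file-like object, and a ZipFile object has no seek method. A small sketch with a throwaway in-memory archive, where the member name demo.txt and the helper reopen are made up for illustration:

```python
import io
import zipfile

# Build a tiny zip archive in memory (hypothetical content, for illustration only).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("demo.txt", "hello")
buffer.seek(0)

# First wrap: BytesIO is file-like, so this works.
opened = zipfile.ZipFile(buffer)


def reopen(obj):
    """Try to wrap obj in a ZipFile; return the error text, or None on success."""
    try:
        with zipfile.ZipFile(obj, "r"):
            return None
    except AttributeError as exc:
        return str(exc)


# Second wrap: a ZipFile is not file-like, so the constructor fails.
error_message = reopen(opened)

# Reusing the already-open object directly is enough.
names = opened.namelist()
```

If that is what happened, the fix is simply to iterate over the existing ZipFile object rather than wrapping it again.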

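I had never used shapefiles or geopandas before, and I spent the last 4 hours trying to understand how to work with them. I was able to output a JSON file, but I am not sure whether the data in it meets your needs.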

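import io
from json import dumps
from zipfile import ZipFile

import requests
import shapefile  # provided by the pyshp package

# util funcs
is_africa = lambda string: "Africa" in string
is_shape = lambda string: string.endswith('shp')

# instantiate empty list where to store the GeoJSON features of interest.
africa_data = []

response = requests.get('http://shapefiles.fews.net.s3.amazonaws.com/ALL_HFIC.zip')
with ZipFile(io.BytesIO(response.content)) as zf:
    for file_name in zf.namelist():
        if is_africa(file_name) and is_shape(file_name):
            # Read the geometry (.shp) and attribute (.dbf) members into memory.
            # Passing only file_name would make pyshp look for a file on disk.
            # Assumes a sibling .dbf with the same base name exists in the archive.
            shp = io.BytesIO(zf.read(file_name))
            dbf = io.BytesIO(zf.read(file_name[:-3] + 'dbf'))
            reader = shapefile.Reader(shp=shp, dbf=dbf)
            fields = reader.fields[1:]  # skip the leading DeletionFlag field
            field_names = [field[0] for field in fields]
            for sr in reader.shapeRecords():
                atr = dict(zip(field_names, sr.record))
                geom = sr.shape.__geo_interface__
                africa_data.append(dict(type="Feature", geometry=geom, properties=atr))

# Note: this last step writes the result to a file on disk.
with open("african_geo_data.json", "w") as geojson:
    geojson.write(dumps({"type": "FeatureCollection", "features": africa_data}, indent=2) + "\n")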
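
A shapefile is really a set of sibling files (.shp, .shx, .dbf, often .prj), so reading one from a zip in memory means collecting the bytes of each part. Below is a minimal standard-library sketch of that grouping step. The function name collect_shapefile_parts, the set of extensions, and the demo archive contents are all assumptions made for illustration, not part of the original answer:

```python
import io
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

# Extensions treated as shapefile parts here (an illustrative, non-exhaustive set).
SHAPEFILE_PARTS = {".shp", ".shx", ".dbf", ".prj"}


def collect_shapefile_parts(zf, keyword):
    """Map each matching layer (member path without extension) to {extension: bytes}."""
    layers = defaultdict(dict)
    for name in zf.namelist():
        path = PurePosixPath(name)
        suffix = path.suffix.lower()
        if keyword in name and suffix in SHAPEFILE_PARTS:
            layers[str(path.with_suffix(""))][suffix] = zf.read(name)
    # Keep only layers that actually contain a geometry file.
    return {stem: parts for stem, parts in layers.items() if ".shp" in parts}


# Demo with a synthetic archive; the member names and bytes are placeholders.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("data/East Africa/EA_demo.shp", b"shp-bytes")
    zf.writestr("data/East Africa/EA_demo.dbf", b"dbf-bytes")
    zf.writestr("data/Central Asia/CA_demo.shp", b"other")
buffer.seek(0)

with zipfile.ZipFile(buffer) as zf:
    layers = collect_shapefile_parts(zf, "Africa")
```

Each part can then be wrapped in io.BytesIO and handed to a reader that accepts file-like objects, for example the shp= and dbf= keyword arguments of pyshp's Reader.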

Sample from the JSON file:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "MultiPolygon",
        "coordinates": [
          [
            [
              [40.213226318000125, -10.277393340999765],
              [40.21355056800013, -10.279667853999932],
              [40.21699915800019, -10.27847569599988]
            ]
          ]
        ]
      },
      "properties": {
        "CS": 4.0,
        "HA0": 0.0
      }
    }
  ]
}
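
Since the question asks that nothing be written to disk, a FeatureCollection like the sample can also be kept in memory instead of being saved to a file. A rough sketch using only the standard library follows. The function name feature_collection_bytes and the single Point feature are hypothetical, and default=str is an assumed fallback for attribute values (such as dates) that json cannot encode natively:

```python
import io
import json


def feature_collection_bytes(features):
    """Serialize GeoJSON features to UTF-8 bytes without touching the filesystem."""
    collection = {"type": "FeatureCollection", "features": features}
    # default=str converts values json cannot encode (e.g. dates) to strings.
    return json.dumps(collection, indent=2, default=str).encode("utf-8")


# Hypothetical feature standing in for the records read from the shapefiles.
sample_features = [
    {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [40.2, -10.3]},
        "properties": {"CS": 4.0, "HA0": 0.0},
    }
]

payload = feature_collection_bytes(sample_features)

# A BytesIO wrapper behaves like an open file for code that expects one.
stream = io.BytesIO(payload)
round_trip = json.loads(stream.read().decode("utf-8"))
```

The resulting bytes can be returned from a web handler or uploaded directly, which keeps the whole pipeline off the local disk.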
