How can I modify and save rnaturalearthdata::countries50 without losing information?

Problem description

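I am new to geographic data. I have opened and modified the SpatialPolygonsDataFrame rnaturalearthdata::countries50. After using st_as_sf() and st_shift_longitude(), it is now a SpatialPointsDataFrame. What is the best way to save the file, together with the data attached to the SpatialPointsDataFrame, so that I don't lose any data? I read this post, and it seems there are differences regarding the geographic projection. I am worried that if I save it one way, I will lose information that cannot be recovered (as when I save a data frame to .csv instead of .RData and lose the name of the R object).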

Here is what I have considered:

library(dplyr) # for pipes

mySPDF <- rnaturalearthdata::countries50 %>% 
    st_as_sf() %>%
    st_shift_longitude()

library(sf)
st_write(mySPDF,"mySPDF.shp") # error
library(maptools)
writeSpatialShape(mySPDF,"mySPDF") # warning: 1: writeSpatialShape is deprecated; use rgdal::writeOGR or sf::st_write 

From the post referenced above:

library(rgdal)

writeOGR(obj=mySPDF,dsn="tempdir",layer="mySPDF",driver="ESRI Shapefile") # this is in geographical projection

So I would like to know how to save/export my SpatialPointsDataFrame so that I can open it again without losing information.

Solution

Just to clarify: mySPDF is not a SpatialPointsDataFrame but an sf object containing polygons, because you converted it with st_as_sf (the source, rnaturalearthdata::countries50, is a SpatialPolygonsDataFrame).

Since you say you are new to geographic data, I suggest using sf rather than sp, although you can easily convert between the two data types.
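
As a minimal sketch of that conversion, assuming the sp and sf packages are installed: the toy points and the object names (pts, pts_sf, pts_sp, pts_back) are hypothetical and not part of the question. It only illustrates that st_as_sf() and as(x, "Spatial") move an object between the two classes.

```r
library(sp)
library(sf)

# Toy data (hypothetical): three points with one attribute column
pts <- data.frame(
  name = c("a", "b", "c"),
  lon  = c(-70, 15, 120),
  lat  = c(12, 42, -5)
)

# Build an sf object from the data frame (WGS84 longitude/latitude)
pts_sf <- st_as_sf(pts, coords = c("lon", "lat"), crs = 4326)

# sf -> sp
pts_sp <- as(pts_sf, "Spatial")
class(pts_sp)

# sp -> sf
pts_back <- st_as_sf(pts_sp)
class(pts_back)
```
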

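I tried the following and it works with the package versions listed in the sessionInfo() output below; see whether it meets your needs: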

library(dplyr) # for pipes
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(sf)
#> Linking to GEOS 3.6.1, GDAL 2.2.3, PROJ 4.9.3

# Get the data (it may contain invalid geometries; see the warnings below)
mySPDF <- rnaturalearthdata::countries50 

class(mySPDF)
#> [1] "SpatialPolygonsDataFrame"
#> attr(,"package")
#> [1] "sp"

# To sf 
mySPDF_sf <- mySPDF %>% 
  st_as_sf() %>%
  st_shift_longitude()
#> Warning in CPL_wrap_dateline(x, options, quiet): GDAL Error 1:
#> TopologyException: Input geom 0 is invalid: Self-intersection at or near point
#> 16.123482736917911 -84.347832109632833 at 16.123482736917911 -84.347832109632833

#> Warning in CPL_wrap_dateline(x, options, quiet): GDAL Error 1:
#> TopologyException: Input geom 0 is invalid: Self-intersection at or near point
#> 16.123482736917911 -84.347832109632833 at 16.123482736917911 -84.347832109632833

# Check object
st_geometry(mySPDF_sf)
#> Geometry set for 241 features 
#> geometry type:  MULTIPOLYGON
#> dimension:      XY
#> bbox:           xmin: 5.684342e-14 ymin: -89.99893 xmax: 359.9953 ymax: 83.59961
#> CRS:            +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0
#> First 5 geometries:
#> MULTIPOLYGON (((290.1009 12.452, 290.1043 12.42...
#> MULTIPOLYGON (((74.89131 37.23164, 74.84023 37....
#> MULTIPOLYGON (((14.19082 -5.875977, 14.39863 -5...
#> MULTIPOLYGON (((296.9988 18.22178, 296.84 18.17...
#> MULTIPOLYGON (((20.06396 42.54727, 20.10352 42....

st_write(mySPDF_sf, "mySPDF.shp") # Emits warnings: some large pop_est values may not have been written
#> Writing layer `mySPDF' to data source `mySPDF.shp' using driver `ESRI Shapefile'
#> Writing 241 features with 63 fields and geometry type Multi Polygon.
#> Warning in CPL_write_ogr(obj, dsn, layer, driver,
#> as.character(dataset_options), : GDAL Message 1: Value 156050883 of field
#> pop_est of feature 22 not successfully written. Possibly due to too larger
#> number with respect to field width
#> Warning in CPL_write_ogr(obj,: GDAL Message 1: Value 198739269 of field
#> pop_est of feature 32 not successfully written. Possibly due to too larger
#> number with respect to field width
#> Warning in CPL_write_ogr(obj,: GDAL Message 1: Value 1338612970 of field
#> pop_est of feature 41 not successfully written. Possibly due to too larger
#> number with respect to field width
#> Warning in CPL_write_ogr(obj,: GDAL Message 1: Value 240271522 of field
#> pop_est of feature 96 not successfully written. Possibly due to too larger
#> number with respect to field width
#> Warning in CPL_write_ogr(obj,: GDAL Message 1: Value 1166079220 of field
#> pop_est of feature 98 not successfully written. Possibly due to too larger
#> number with respect to field width
#> Warning in CPL_write_ogr(obj,: GDAL Message 1: Value 127078679 of field
#> pop_est of feature 110 not successfully written. Possibly due to too larger
#> number with respect to field width
#> Warning in CPL_write_ogr(obj,: GDAL Message 1: Value 111211789 of field
#> pop_est of feature 139 not successfully written. Possibly due to too larger
#> number with respect to field width
#> Warning in CPL_write_ogr(obj,: GDAL Message 1: Value 149229090 of field
#> pop_est of feature 158 not successfully written. Possibly due to too larger
#> number with respect to field width
#> Warning in CPL_write_ogr(obj,: GDAL Message 1: Value 176242949 of field
#> pop_est of feature 167 not successfully written. Possibly due to too larger
#> number with respect to field width
#> Warning in CPL_write_ogr(obj,: GDAL Message 1: Value 140041247 of field
#> pop_est of feature 183 not successfully written. Possibly due to too larger
#> number with respect to field width
#> Warning in CPL_write_ogr(obj,: GDAL Message 1: Value 313973000 of field
#> pop_est of feature 226 not successfully written. Possibly due to too larger
#> number with respect to field width

# Check if I can reload the shape
st_read("mySPDF.shp")
#> Reading layer `mySPDF' from data source `C:\Users\ xx\AppData\Local\Temp\RtmpCW170K\reprex25781dc712ca\mySPDF.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 241 features and 63 fields
#> geometry type:  MULTIPOLYGON
#> dimension:      XY
#> bbox:           xmin: 5.684342e-14 ymin: -89.99893 xmax: 359.9953 ymax: 83.59961
#> CRS:            4326

sessionInfo()
#> R version 3.6.3 (2020-02-29)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18363)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=Spanish_Spain.1252  LC_CTYPE=Spanish_Spain.1252   
#> [3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C                  
#> [5] LC_TIME=Spanish_Spain.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] sf_0.9-2    dplyr_1.0.4
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.4.6            knitr_1.31              magrittr_1.5           
#>  [4] units_0.6-6             tidyselect_1.1.0        lattice_0.20-41        
#>  [7] R6_2.4.1                rlang_0.4.10            rnaturalearthdata_0.1.0
#> [10] stringr_1.4.0           highr_0.8               tools_3.6.1            
#> [13] grid_3.6.1              xfun_0.21               KernSmooth_2.23-16     
#> [16] e1071_1.7-3             DBI_1.1.0               class_7.3-16           
#> [19] htmltools_0.4.0         ellipsis_0.3.1          assertthat_0.2.1       
#> [22] yaml_2.2.1              digest_0.6.25           tibble_3.0.6           
#> [25] lifecycle_1.0.0         crayon_1.3.4            purrr_0.3.4            
#> [28] vctrs_0.3.6             glue_1.4.0              evaluate_0.14          
#> [31] rmarkdown_2.7           sp_1.4-1                stringi_1.4.6          
#> [34] compiler_3.6.1          pillar_1.4.3            generics_0.0.2         
#> [37] classInt_0.4-3          pkgconfig_2.0.3

Created on 2021-04-08 by the reprex package (v0.3.0)
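
The st_write() warnings above report that several pop_est values were not successfully written to the shapefile, so a different format may suit the goal of not losing information. The sketch below is one possible alternative, not something from the original answer. It uses a hypothetical toy object in place of mySPDF_sf, with a long column name and large population values, and round-trips it through an .rds file (R-native serialization) and a GeoPackage. The file paths and object names are assumptions chosen for illustration.

```r
library(sf)

# Toy stand-in for mySPDF_sf (hypothetical data): a long column name
# and large values of the kind the shapefile warnings complained about
toy <- st_as_sf(
  data.frame(
    very_long_column_name = c("x", "y"),
    pop_est = c(1338612970, 156050883),
    lon = c(105, 90),
    lat = c(35, 24)
  ),
  coords = c("lon", "lat"), crs = 4326
)

# Option 1: R-native serialization; reading it back restores the R object
rds_path <- tempfile(fileext = ".rds")
saveRDS(toy, rds_path)
toy_rds <- readRDS(rds_path)
identical(toy, toy_rds)

# Option 2: GeoPackage, an open format that other GIS software can read
gpkg_path <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg_path, quiet = TRUE)
toy_gpkg <- st_read(gpkg_path, quiet = TRUE)
names(toy_gpkg)
toy_gpkg$pop_est
```

An .rds file is only useful from R, while a GeoPackage can be shared with other tools; note that the geometry column may come back under a different name after the GeoPackage round trip.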