Sentinel3 OLCIchlPython上netcdf文件的平均值

问题描述

我在尝试使用Sentinel 3图像获取每月平均值时遇到了一些麻烦。 Python，Matlab，我们两个人陷入了这个问题。

主要原因是这些图像的信息不在单个netcdf文件中，它与坐标和乘积整齐地放置在一起。相反，它们全都放在一天文件夹中的单独文件中，例如 different .nc files with different information each,about one single satellite image。据我了解，SNAP使用xmlxs文件处理所有这些单独的.nc文件。

现在，尽管尝试合并和创建/编辑.nc文件是一个不错的主意，但我希望创建一个新的每日.nc文件，其中包括叶绿素，坐标以及可能添加的时间。稍后，我将合并这些新的变量，以便能够使用xarray每月平均。至少那是我的主意，但我不能做第一部分。这可能是一个显而易见的解决方案，但这是我使用xarray模块尝试过的方法

import os
import numpy as np
import xarray as xr
import netCDF4
from netCDF4 import Dataset
  
nc_folder = df_try.iloc[0] #folder where the image files are

#open dataset in xarray
nc_chl = xr.open_dataset(str(nc_folder['path']) + '/' + 'chl_nn.nc') #path to chlorophyll file 
nc_chl

n_coord =xr.open_dataset(str(nc_folder['path'])+ '/'+ 'geo_coordinates.nc') #path to coordinates file

n_time = xr.open_dataset(str(nc_folder['path'])+ '/' + 'time_coordinates.nc') #path to time file

ds_grid = [[nc_chl],[n_coord],[n_time]]

combined = xr.combine_nested(ds_grid,concat_dim=[None,None]) 
combined #dataset with all but not recognizing coordinates

ds = combined.rename({'latitude': 'lat','longitude': 'lon','time_stamp' : 'time'}).set_coords(['lon','lat','time']) #dataset recognizing coordinates as coordinates
ds

提供了一个数据集

维度：列4865行：4091

3个坐标（纬度，经度和时间）和chl变量。

现在，它不能保存到netcdf4中（我尝试过但是有一个错误），但是我还在想，是否有人知道另一种求平均值的方法？我有三年的图像（从2017年开始到2019年结束），我需要以不同的方式进行平均（每月，季节性...）。我目前的主要问题是叶绿素值与地理坐标是分开的，因此仅直接使用叶绿素文件是行不通的，只会造成混乱。

有什么建议吗？

解决方法

这里有两个选项：

使用xarray

在xarray中，您可以将它们添加为坐标。 geo_coordinates.nc文件中的坐标也是多维的，这有点棘手。

可能的解决方法如下：

import netCDF4 
import xarray as xr
import matplotlib.pyplot as plt

# paths
root = r'C:<your_path>\S3B_OL_2_WFR____20201015.SEN3\chl_nn.nc'        #set path to chl file
coor = r'C:<your_path>\S3B_OL_2_WFR____20201015.SEN3\geo_coordinates.nc' #set path to the coordinates file

# loading xarray datasets
ds = xr.open_dataset(root)
olci_geo_coords = xr.open_dataset(coor)

# extracting coordinates
lat = olci_geo_coords.latitude.data
lon = olci_geo_coords.longitude.data

# assign coordinates to the chl dataset (needs to refer to both the dimensions of our dataset)
ds = ds.assign_coords({"lon":(["rows","columns"],lon),"lat":(["rows",lat)})

# clip the image (add your own coordinates)
area_of_interest = ds.where((10 < ds.lon) & (ds.lon < 12) & (58 < ds.lat) & (ds.lat < 59),drop=True)

# simple plot with coordinates as axis
plt.figure(figsize=(15,15))
area_of_interest["CHL_NN"].plot(x="lon",y="lat")

更简单的方法是将它们作为变量添加到新数据集中：

# path to the folder
root = r'C:<your_path>\S3B_OL_2_WFR____20201015.SEN3\*.nc'        #set path to chl file

# create a dataset by combining nc files (coordinates will become variables)
ds = xr.open_mfdataset(root,combine = 'by_coords')

但是在这种情况下，当您绘制图像或对其进行裁剪时，您将无法直接使用坐标。

使用快照

在python中，snappy软件包可用并且基于SNAP工具箱（在JAVA上实现）。检查：https://senbox.atlassian.net/wiki/spaces/SNAP/pages/19300362/How+to+use+the+SNAP+API+from+Python

安装后（不幸的是，snappy仅支持python 2.7、3.3或3.4），您可以直接在python上使用可用的SNAP函数来汇总卫星图像并创建周/月平均值。然后，您将不需要合并lon，lat netcdf文件，因为您将在xfdumanifest.xml上工作，而SNAP会解决这个问题。

这是一个示例。它还执行聚合（在两个chl nc文件上计算的平均值）：

from snappy import ProductIO,WKTReader
from snappy import jpy
from snappy import GPF
from snappy import HashMap

# setting the aggregator method
aggregator_average_config = snappy.jpy.get_type('org.esa.snap.binning.aggregators.AggregatorAverage$Config')
agg_avg_chl = aggregator_average_config('CHL_NN')

# creating the hashmap to store the parameters
HashMap = snappy.jpy.get_type('java.util.HashMap')
parameters = HashMap()

#creating the aggregator array
aggregators = snappy.jpy.array('org.esa.snap.binning.aggregators.AggregatorAverage$Config',1)
#adding my aggregators in the list
aggregators[0] = agg_avg_chl

# set parameters
# output directory 
dir_out = 'level-3_py_dynamic.dim'
parameters.put('outputFile',dir_out)

# number of rows (directly linked with resolution)
parameters.put('numRows',66792) # to have about 300 meters spatial resolution

# aggregators list
parameters.put('aggregators',aggregators)

# Region to clip the aggregation on
wkt="POLYGON ((8.923302175377243 59.55648108694149,13.488748662344074 59.11388968719029,12.480488185001589 56.690625338725155,8.212366327767503 57.12425256476263,8.923302175377243 59.55648108694149))"
geom = WKTReader().read(wkt)
parameters.put('region',geom)

# Source product path 
path_15 = r"C:<your_path>\S3B_OL_2_WFR____20201015.SEN3\xfdumanifest.xml"
path_16 = r"C:\<your_path>\S3B_OL_2_WFR____20201016.SEN3\xfdumanifest.xml"
path = path_15 + "," + path_16
parameters.put('sourceProductPaths',path)

#result = snappy.GPF.createProduct('Binning',parameters,(source_p1,source_p2))

# create results
result = snappy.GPF.createProduct('Binning',parameters) #to be used with product paths specified in the parameters hashmap

print("results stored in: {0}".format(dir_out) )

我对这个主题很新，也很感兴趣，很高兴听到您/其他解决方案！

netcdf python satellite-image