python - Converting 2D lat/lon matrices to 1D lat/lon arrays

Tags: python matrix coordinates netcdf python-xarray

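I am working with a netCDF file that has no coordinates. The latitude/longitude values are stored as matrices in data variables, i.e. lat(y, x) and lon(y, x). My goal is to extract 1D latitude and longitude arrays and assign them as coordinates, since they have to be 1D arrays.

This is what the dataset looks like initially: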

<xarray.Dataset>
Dimensions:             (y: 10980, x: 10980)
Dimensions without coordinates: y, x
Data variables: (12/20)
    lon                 (y, x) float32 ...
    lat                 (y, x) float32 ...

For example, the lat variable looks like this:

<xarray.DataArray 'lat' (y: 10980, x: 10980)>
array([[41.52681 , 41.52681 , 41.526814, ..., 41.54671 , 41.54671 , 41.546715],
       [41.52672 , 41.526722, 41.526722, ..., 41.54662 , 41.546623, 41.546623],
       [41.52663 , 41.52663 , 41.526634, ..., 41.54653 , 41.54653 , 41.54653 ],
       ...,
       [40.538834, 40.538837, 40.538837, ..., 40.55806 , 40.55806 , 40.558064],
       [40.538746, 40.538746, 40.53875 , ..., 40.55797 , 40.557972, 40.557972],
       [40.538654, 40.53866 , 40.53866 , ..., 40.55788 , 40.55788 , 40.55788 ]],
      dtype=float32)
Dimensions without coordinates: y, x
Attributes:
    parameter:      lat
    standard_name:  latitude
    long_name:      latitude
    units:          degree_north

So, to convert both variables to 1D arrays, I do the following:

import xarray as xr
import rioxarray  # registers the .rio accessor on xarray objects

# First I open the dataset
file_to_input = 'landsat.nc'
nc1 = xr.open_dataset(file_to_input)

# Then I take the y axis from lat (first column):
lati = nc1.lat[:, 0]
# And the x axis from lon (first row):
long = nc1.lon[0, :]

# Then I assign them as 1D coordinates of the dataset:
nc1 = nc1.assign_coords({'x': long, 'y': lati})
nc1 = nc1.rio.set_spatial_dims('x', 'y')

# I set the proper CRS for the variable to export (EPSG:32631):
nc1var = nc1['ndci'].rio.set_crs("epsg:32631")

# And then I export it as a GeoTIFF:
nc1var.rio.to_raster('ndci.tiff')

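So far so good. The problem appears when I visualize the exported GeoTIFF: it is slightly shifted. In the attached image you can see the GeoTIFF displaced slightly downward relative to the basemap. I tried the same method with other TIFFs that are known to be correctly georeferenced and the same shift occurred, so I think it is related to the way I go from the 2D lat/lon grids to 1D arrays.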
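
The printed lat array already hints at where this goes wrong: along the first row, latitude runs from 41.52681 to 41.546715 (about 0.02°), while consecutive rows differ by only about 0.0001°. Latitude is therefore not constant along a row, so a single column such as lat[:, 0] cannot describe the whole grid. A minimal sketch of that check follows; the helper name axis_spread and the rotated synthetic grid are illustrative assumptions, not part of the original dataset.

```python
import numpy as np

def axis_spread(lat2d, lon2d):
    """Return (max spread of lat along any row, max spread of lon along any column).

    If both values are ~0, every row shares one latitude and every column one
    longitude, so lat2d[:, 0] and lon2d[0, :] describe the whole grid.
    """
    lat_spread = float(np.ptp(lat2d, axis=1).max())  # variation along x within a row
    lon_spread = float(np.ptp(lon2d, axis=0).max())  # variation along y within a column
    return lat_spread, lon_spread

# Synthetic stand-in (assumption): a regular grid rotated by 1 degree,
# so latitude drifts along each row, as in the printed lat array.
xs, ys = np.meshgrid(np.linspace(0, 1, 50), np.linspace(1, 0, 50))
theta = np.deg2rad(1.0)
lon2d = 3.0 + xs * np.cos(theta) - ys * np.sin(theta)
lat2d = 40.5 + xs * np.sin(theta) + ys * np.cos(theta)

lat_spread, lon_spread = axis_spread(lat2d, lon2d)
print(f"lat varies by up to {lat_spread:.4f} deg along a row")
print(f"lon varies by up to {lon_spread:.4f} deg along a column")
```

On the real file the same function could be called as axis_spread(nc1.lat.values, nc1.lon.values); a spread much larger than the row-to-row step means the grid is not aligned with the lat/lon axes.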


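I think this could be done with a pyproj Transformer or something similar, but I don't know how to use it to go from the 2D lat/lon grids to 1D arrays. Any help would be much appreciated!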
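
For reference, one possible shape of the pyproj approach is sketched below. It assumes the pixels form a regular grid in EPSG:32631 (the CRS named above), so that after projecting the 2D lon/lat values, x is nearly constant down each column and y nearly constant along each row. The function name projected_axes and the synthetic 10 m grid are hypothetical; only Transformer.from_crs and Transformer.transform are pyproj calls.

```python
import numpy as np
from pyproj import Transformer

def projected_axes(lon2d, lat2d, epsg=32631):
    """Project 2D lon/lat to `epsg` and collapse the result to 1D x/y axes.

    Assumes the grid is regular in the target CRS (an assumption, check the
    returned residual before trusting the axes).
    """
    to_proj = Transformer.from_crs("EPSG:4326", f"EPSG:{epsg}", always_xy=True)
    x2d, y2d = to_proj.transform(np.asarray(lon2d, dtype="float64"),
                                 np.asarray(lat2d, dtype="float64"))
    x1d = x2d.mean(axis=0)  # one x per column
    y1d = y2d.mean(axis=1)  # one y per row
    # Largest departure of the projected points from the 1D axes, in metres
    residual = max(np.abs(x2d - x1d).max(), np.abs(y2d - y1d[:, None]).max())
    return x1d, y1d, float(residual)

# Demo on a synthetic 10 m UTM grid converted to lon/lat and back
to_geo = Transformer.from_crs("EPSG:32631", "EPSG:4326", always_xy=True)
xs = 400000.0 + 10.0 * np.arange(5)
ys = 4600000.0 - 10.0 * np.arange(4)
xm, ym = np.meshgrid(xs, ys)
lon2d, lat2d = to_geo.transform(xm, ym)

x1d, y1d, residual = projected_axes(lon2d, lat2d)
print(x1d, y1d, residual)
```

If the residual is far below one pixel, x1d and y1d could be passed to assign_coords in place of the lon/lat slices, which also keeps the coordinates in the same units as the CRS set with rio.set_crs. Projecting the full 10980 x 10980 arrays in float64 needs roughly 1 GB per array, so strided slices may be preferable for a first check.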

Update

Download the dataset here

(.rar - 241 MB, .nc file - 1.34 GB)

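Best answer

I suggest the following code, which uses nearest-neighbour interpolation to put the data on a regular grid. Other interpolation methods would also work; this is simply the simplest one, and it alters the original values the least.

Also note that I reduce the original resolution to save memory; if you do not need that, remove that part of the script.

Here is the code: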

#!/usr/bin/env ipython
# ---------------------
import numpy as np
from scipy.interpolate import griddata
from netCDF4 import Dataset
# --------------------------------------------
def nc_varget(fin, vin):
    """Read variable `vin` from the netCDF file `fin` into memory."""
    with Dataset(fin) as f:
        return f.variables[vin][:]
# --------------------------------------------
fin = 'S2A_reduced_dataset.nc'
xin = nc_varget(fin, 'lon')
yin = nc_varget(fin, 'lat')
zin = nc_varget(fin, 'ndci')
# --------------------------------------------
# I will reduce the size further to test the code (matrices with 10k X 10k points are too big for my laptop and testing purposes)
xskip, yskip = 25,25
xin = xin[::yskip,::xskip]
yin = yin[::yskip,::xskip]
zin = zin[::yskip,::xskip]
# --------------------------
# let us take some info from the original coordinates:
# dx is the mean lon step along x, dy the mean lat step along y
x0,x1,dx = np.min(xin),np.max(xin),np.abs(np.mean(np.diff(xin)))
y0,y1,dy = np.min(yin),np.max(yin),np.abs(np.mean(np.diff(yin.T)))
# --------------------------
# let us make new (regular) coordinates:
xout = np.arange(x0,x1+dx,dx)
yout = np.arange(y0,y1+dy,dy)  # extend by dy here (the original used dx)
# --------------------------
xm,ym = np.meshgrid(xout,yout)
zo = griddata((xin.flatten(),yin.flatten()),zin.flatten(),(xm,ym),'nearest')
# ---------------------------------------------------------------------------
# let us save results as netCDF:
import xarray as xr
df = xr.DataArray(zo,dims=['lat','lon'])
df.coords['lon'] = xout
df.coords['lat'] = yout
df.to_netcdf('test.nc')
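
The regular axes in the script are built with np.arange and a floating-point step. The NumPy documentation notes that with a non-integer step the result is often not consistent and that np.linspace is usually the better choice. A minimal alternative sketch, where regular_axis is a hypothetical helper and the numbers are made up for illustration:

```python
import numpy as np

def regular_axis(start, stop, step):
    """Evenly spaced axis from start that reaches stop to within half a step."""
    start, stop, step = float(start), float(stop), float(step)
    n = int(round((stop - start) / step)) + 1  # explicit number of grid points
    return np.linspace(start, start + (n - 1) * step, n)

# Illustrative values only, not taken from the dataset
xout = regular_axis(2.0, 3.0, 0.1)
yout = regular_axis(40.5, 41.5, 0.25)
print(len(xout), len(yout))
```

In the script this would replace the two np.arange calls, e.g. xout = regular_axis(x0, x1, dx), which fixes the number of points up front instead of leaving it to floating-point rounding at the end of the range.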

Regarding "python - Converting 2D lat/lon matrices to 1D lat/lon arrays", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/71616481/
