1 year ago

#352118


TylerSingleton

accessing geospatial raster data with limited memory

I am following the Rasterio documentation to access geospatial raster data downloaded from here, a large TIFF image. Unfortunately, I do not have enough memory, so NumPy throws an ArrayMemoryError:

numpy.core._exceptions._ArrayMemoryError: Unable to allocate 77.8 GiB for an array with shape (1, 226112, 369478) and data type uint8

My code is as follows:

import rasterio
import rasterio.features
import rasterio.warp


file_path = r'Path\to\file\ESACCI-LC-L4-LC10-Map-10m-MEX.tif'
with rasterio.open(file_path) as dataset:

    # Read the dataset's valid data mask as a ndarray.
    mask = dataset.dataset_mask()

    # Extract feature shapes and values from the array.
    for geom, val in rasterio.features.shapes(
            mask, transform=dataset.transform):

        # Transform shapes from the dataset's own coordinate
        # reference system to CRS84 (EPSG:4326).
        geom = rasterio.warp.transform_geom(
            dataset.crs, 'EPSG:4326', geom, precision=6)

        # Print GeoJSON shapes to stdout.
        print(geom)

I need a way to store the NumPy array on disk, so I looked into numpy.memmap, but I do not understand how to apply it here. Additionally, I do not need the full geospatial data. I am only interested in the latitude, longitude, and land cover type, as I plan to merge this with another dataset.
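One possible alternative to a memmap, sketched here under assumptions rather than taken from the Rasterio documentation, is to walk the raster in fixed-size tiles so that only one tile is in memory at a time. The helper name `iter_tiles` and the tile size of 4096 are hypothetical. Each tuple is ordered the same way as the arguments of `rasterio.windows.Window(col_off, row_off, width, height)`.

```python
def iter_tiles(height, width, tile=4096):
    """Yield (col_off, row_off, win_width, win_height) tuples covering a raster.

    Hypothetical helper: edge tiles are shrunk so no tile extends
    past the raster bounds.
    """
    for row_off in range(0, height, tile):
        win_height = min(tile, height - row_off)
        for col_off in range(0, width, tile):
            win_width = min(tile, width - col_off)
            yield col_off, row_off, win_width, win_height


# Raster size from the error message: 226112 rows by 369478 columns.
tiles = list(iter_tiles(226112, 369478))

# With uint8 data (1 byte per pixel), the largest tile needs this many bytes.
largest_tile_bytes = max(w * h for _, _, w, h in tiles)
```

Each tuple could then be unpacked into `Window(*t)` and passed to `dataset.read(1, window=...)`, with the per-tile results written out before the next tile is read. Rasterio also provides `dataset.block_windows(1)`, which iterates over the file's native blocks and may be a better fit if the file is internally tiled.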

Using python 3.9.

Edit: I updated my code to try using a window.

from rasterio.windows import Window

with rasterio.open(file_path) as dataset:
    mask = dataset.read(1, window=Window(0, 0, 226112, 369478))
    ...

I can obviously adjust the window and load the file in sections now. However, I do not understand why this almost halved the memory required, from 77.8 GiB to 47.6 GiB.
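A possible explanation, offered as a sketch rather than a confirmed diagnosis: the array shape (1, 226112, 369478) is (bands, rows, columns), while `Window` takes `(col_off, row_off, width, height)`. `Window(0, 0, 226112, 369478)` therefore asks for a width of 226112 and a height of 369478, which swaps the two dimensions. Assuming the read is clipped to the dataset bounds, the result would be a 226112 by 226112 array, and the arithmetic below matches both figures.

```python
GIB = 2 ** 30

# (height, width) taken from the array shape (1, 226112, 369478); uint8 = 1 byte per pixel.
rows, cols = 226112, 369478
full_gib = rows * cols / GIB

# Window(0, 0, 226112, 369478) requests width=226112 and height=369478.
req_width, req_height = 226112, 369478

# Assumption: the requested window is clipped to the dataset bounds.
clipped_width = min(req_width, cols)
clipped_height = min(req_height, rows)
window_gib = clipped_width * clipped_height / GIB

print(f"full raster: {full_gib:.1f} GiB, clipped window: {window_gib:.1f} GiB")
```

If that is what happened, the window covered only the left 226112 columns of the raster rather than the whole image. A window covering the full raster would be `Window(0, 0, 369478, 226112)`, which would need the original 77.8 GiB again.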

python-3.x

rasterio

numpy-memmap

0 Answers
