Geovio Jupyter Notebooks Documentation

Overview

This document provides information about how user Jupyter notebooks work in Geovio. It covers the input environment, the process for uploading custom files such as pre-trained machine learning models, and the libraries that are prepared based on the provided Dockerfile configuration.

Geovio Basic Object Types

Here is a list of the basic object types that can be used with the Geovio API:

  • Project (API Workspace)
    An isolated user environment that contains all user data, including areas of interest, mosaics, and analytic maps. Projects also enable team role management and access control.

  • Area (API AreaInterest)
    A geometry feature collection defining an area of interest (AOI) for analysis and processing.

  • Image (API Mosaic)
    Raw raster imagery related to an AOI, stored as Cloud Optimized GeoTIFF (COG) for efficient access and processing.

  • Analysis (API Analysis)
    Analytic outputs derived from processing, such as index maps, comparison maps, difference maps, and other results.

  • API Dataset
    Vector geospatial data used as overlays or attachments for image analysis. Can include features from external sources like shapefiles, GeoJSON, or other vector formats.

  • Job (API JobDef)
    A user workflow template in Jupyter Notebook format. It can be executed repeatedly in processing pipelines, for example on specific mosaics. The outputs can be any results created using the Geovio API.

Execution Environment

  • Containerized environment: all notebooks run inside a Docker container with pre-installed dependencies.
  • Free tier accounts: 2 GB RAM and 1 CPU for notebook execution, with a total of 30 processing minutes per account for testing and experiments.
  • Paid accounts can access more resources, including GPU-enabled execution (contact us for details).
  • Important restrictions: notebooks cannot access the internet or external data sources; they can only work with the Geovio API and user data according to authorization rules.

Input Environment

Geovio's Jupyter notebooks are designed to facilitate machine learning and geospatial data processing workflows. The environment is pre-configured with all necessary tools, libraries, and variables to simplify workflow execution.

Key Environment Variables

  • JWT_TOKEN
    Access token used for user data authorization within Geovio APIs.
    For local integration, the user can use a token from the Geovio Application Settings.

  • GEOVIO_API_URL
    Internal URL to access the main Geovio API.
    For local integration, this can be replaced with the Geovio API Swagger.

  • TRANSFER_API_URL
    Internal URL for the file transfer API, used for uploading/downloading mosaics and other data.
    For local integration, this can be replaced with the Geovio Transfer API Swagger.

  • INPUT_PARAMETERS
    JSON object dynamically assigned by the application for the notebook execution.
    Example: for a job started for a user-specified mosaic with ID 10:
    {"mosaic": 10}

  • WORKSPACE
    Unique identifier of a project (API Workspace) object that encapsulates user data (jobs, mosaics, AOIs, etc.).
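
The variables above can be read once at the top of a notebook. The following is a minimal sketch, assuming they are exposed as ordinary process environment variables and that INPUT_PARAMETERS holds a JSON-encoded string. The helper names load_geovio_env and auth_headers are hypothetical, the URLs in the example are placeholders, and the Bearer scheme is an assumption that should be checked against the Geovio API Swagger.

```python
import json
import os


def load_geovio_env(environ=os.environ):
    """Collect the Geovio notebook variables into a plain dict.

    Assumption: the variables are process environment variables and
    INPUT_PARAMETERS is a JSON string such as '{"mosaic": 10}'.
    """
    return {
        "jwt_token": environ["JWT_TOKEN"],
        "api_url": environ["GEOVIO_API_URL"].rstrip("/"),
        "transfer_api_url": environ["TRANSFER_API_URL"].rstrip("/"),
        "workspace": environ["WORKSPACE"],
        # Fall back to an empty object when no parameters were assigned.
        "input_parameters": json.loads(environ.get("INPUT_PARAMETERS", "{}")),
    }


def auth_headers(jwt_token):
    # Assumption: the API accepts the token as a standard Bearer header.
    return {"Authorization": f"Bearer {jwt_token}"}


# Stand-in values for illustration only; inside Geovio, call
# load_geovio_env() with no arguments to read the real environment.
example = load_geovio_env({
    "JWT_TOKEN": "placeholder-token",
    "GEOVIO_API_URL": "http://geovio-api.example/",        # placeholder URL
    "TRANSFER_API_URL": "http://geovio-transfer.example/",  # placeholder URL
    "WORKSPACE": "1",
    "INPUT_PARAMETERS": '{"mosaic": 10}',
})
mosaic_id = example["input_parameters"].get("mosaic")
headers = auth_headers(example["jwt_token"])
```

Parsing INPUT_PARAMETERS in one place keeps the rest of the notebook independent of how a job was started: a notebook launched for a mosaic reads the ID from the parsed object instead of hard-coding it.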

Data Zip for Uploading Custom Files

To upload custom files, including pre-trained machine learning models, users are required to package their files into a ZIP archive. This ZIP file should contain all necessary components, such as:

  • Pre-trained model files (e.g., .pth files for PyTorch models)
  • Any additional configuration files or scripts needed for the model

Once the ZIP file is prepared, it can be uploaded through the Jupyter notebook interface, allowing users to seamlessly integrate their custom models into the Geovio platform.

When you upload a ZIP file (for example, data.zip containing a pre-trained machine learning model or other resources), it is automatically mounted into the container at /workdir/data.zip, so your files are available at a predictable path in your Jupyter notebook.
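
Because the archive is mounted as a single file, a notebook usually needs to unpack it before use. A minimal sketch using only the standard library follows. The helper name extract_data_zip and the destination directory /workdir/data are assumptions, not part of the Geovio platform; pick any location the notebook can write to.

```python
import zipfile
from pathlib import Path


def extract_data_zip(zip_path="/workdir/data.zip", dest="/workdir/data"):
    """Extract the uploaded archive and return the paths of the extracted files.

    Assumption: `dest` is a writable directory chosen by the notebook author.
    """
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        # Directory entries end with "/"; keep only regular files.
        names = [n for n in archive.namelist() if not n.endswith("/")]
    return [dest_dir / name for name in names]


# Inside a Geovio notebook (not executed here):
# files = extract_data_zip()
# weights = [p for p in files if p.suffix == ".pth"]
```

A .pth file found this way could then be loaded with torch.load(path, map_location="cpu"), which avoids requiring a GPU on accounts that do not have one.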

Libraries and Versions

The Jupyter notebooks in Geovio use a variety of libraries to support geospatial analysis and machine learning tasks. The container image installs these libraries at the versions listed below wherever a version is pinned. The key libraries and their versions are:

Base Image

  • python:3.10

System Packages

  • gdal-bin, libgdal-dev, python3-gdal — for geospatial raster/vector operations
  • build-essential — compilation tools
  • jq, curl — command-line utilities

GDAL Environment

  • GDAL_VERSION=3.6.0

Python Packages

  • Jupyter ecosystem: jupyter, ipykernel, nbformat, papermill
  • Geospatial processing: rasterio, gdal==3.6.2, geopandas==1.0.1, shapely==2.0.5
  • Machine Learning: torch==2.8.0, scikit-learn, segmentation_models_pytorch==0.5.0, joblib

These libraries are essential for performing tasks such as data manipulation and geospatial analysis within the Jupyter notebooks in Geovio.
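
To confirm what a given container actually provides, a notebook can compare the installed distributions against the pinned versions listed above. This is a minimal sketch using importlib.metadata from the standard library; the EXPECTED table and the helper name check_versions are illustrative, and the distribution names are assumed to match the package names as written in the list.

```python
from importlib import metadata

# Pinned versions copied from the package list above; None means "not pinned".
EXPECTED = {
    "gdal": "3.6.2",
    "geopandas": "1.0.1",
    "shapely": "2.0.5",
    "torch": "2.8.0",
    "segmentation_models_pytorch": "0.5.0",
    "rasterio": None,
    "scikit-learn": None,
}


def check_versions(expected=EXPECTED):
    """Return {package: (installed, expected, ok)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build suffixes such as "+cu128" when comparing.
        base = installed.split("+")[0] if installed else None
        ok = installed is not None and (wanted is None or base == wanted)
        report[name] = (installed, wanted, ok)
    return report


for name, (installed, wanted, ok) in check_versions().items():
    print(f"{name}: installed={installed} expected={wanted or 'any'} ok={ok}")
```

Running this as the first cell makes version drift visible before a long job starts, which matters when processing minutes are limited.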