
Deployment of the Database, API and Frontend⚓︎

The Heiplanet platform is deployed using Docker Compose, which orchestrates multiple containerized services to create a complete surveillance system. Access to GitHub Container Registry (GHCR) images requires appropriate GitHub permissions and authentication tokens.

System Architecture⚓︎

The deployment consists of three main containerized services:

1. PostgreSQL/PostGIS Database⚓︎

  • Image: postgis/postgis:17-3.5 (public image)
  • Purpose: Stores all spatial and temporal disease surveillance data
  • Features:
      • PostgreSQL 17 with PostGIS 3.5 for advanced geospatial queries
      • NUTS region geometries and definitions
      • Time-series disease transmission data (R0 values)
      • Efficient spatial indexing for fast geographic queries

2. Web Frontend⚓︎

  • Image: ghcr.io/ssciwr/heiplanet-frontend:<tag>
  • Purpose: Interactive web interface for data visualization
  • Features:
      • Geographic map-based visualization
      • Temporal data exploration
      • Regional comparison tools
      • Accessible via web browser on port 80

The <tag> can be a specific version number, latest for the most recent release, or a branch name for development versions.

3. Python Backend⚓︎

  • Image: ghcr.io/ssciwr/heiplanet-db:<tag> or locally built
  • Purpose: Multi-function backend service providing:
      • Database ORM: SQLAlchemy-based object-relational mapping
      • REST API: FastAPI service processing frontend requests
      • Data Pipeline: ETL (Extract, Transform, Load) processes for ingesting surveillance data
      • Data Validation: Quality checks and integrity verification

The <tag> can be a version number, latest, or a branch name. Using a locally built image allows customization of the data configuration, enabling you to modify which datasets are ingested or add new data sources.
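For orientation, the three services above are wired together roughly as follows in docker-compose.yaml. This is a simplified sketch, not the actual file: service names, volumes, the production-runner service, and environment wiring in the real compose file may differ.

```yaml
services:
  db:
    image: postgis/postgis:17-3.5
    env_file: .env          # POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB
  api:
    image: ghcr.io/ssciwr/heiplanet-db:latest
    env_file: .env          # DB_URL, WAIT_FOR_DB, IP_ADDRESS
    depends_on:
      - db
  frontend:
    image: ghcr.io/ssciwr/heiplanet-frontend:latest
    ports:
      - "80:80"             # only the frontend is published to the host
    depends_on:
      - api
```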

Development Environment⚓︎

The development environment allows you to run the complete Heiplanet platform locally for testing, development, and experimentation with test datasets.

Environment Configuration⚓︎

Create a .env file in the heiplanet-db root directory with the following environment variables:

# Database credentials
POSTGRES_USER=<user>
POSTGRES_PASSWORD=<password>
POSTGRES_DB=<db-name>

# Backend database connection
DB_USER=<user>
DB_PASSWORD=<password>
DB_HOST=db
DB_PORT=5432
DB_URL=postgresql://<user>:<password>@db:5432/<db-name>

# Startup configuration
WAIT_FOR_DB=true

# CORS configuration (for production deployment)
IP_ADDRESS=0.0.0.0

Configuration notes:

  • Replace <user>, <password>, and <db-name> with your chosen credentials
      • Example: POSTGRES_USER=heiplanet, POSTGRES_PASSWORD=secure_password, POSTGRES_DB=heiplanet_dev
  • DB_HOST=db refers to the database container name in the Docker network
  • WAIT_FOR_DB=true ensures the backend waits for the database to be ready before starting
  • IP_ADDRESS configures Cross-Origin Resource Sharing (CORS) for web requests
      • For local development: leave as 0.0.0.0 or omit
      • For production: set to your server's public IP address
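If your password contains characters such as @, :, or /, they must be percent-encoded inside DB_URL or the URL will parse incorrectly. A small illustrative helper (not part of the repository) showing how to build a safe URL:

```python
from urllib.parse import quote

def build_db_url(user: str, password: str, host: str, port: int, db: str) -> str:
    # Percent-encode user and password so characters like '@' or ':' don't break the URL
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{db}"
    )

print(build_db_url("heiplanet", "secure_password", "db", 5432, "heiplanet_dev"))
# → postgresql://heiplanet:secure_password@db:5432/heiplanet_dev
```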

Starting the Development Environment⚓︎

Step 1: Initialize Database with Test Data⚓︎

Build the database tables and populate with test/development data:

docker compose up --abort-on-container-exit production-runner

This command:

  • Creates the database schema (tables, indexes, constraints)
  • Downloads test datasets defined in the configuration
  • Processes data through the bronze/silver/gold pipeline
  • Inserts data into the PostgreSQL database
  • Exits automatically when data loading completes

Note: The first run will take several minutes as it downloads and processes data files.
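The bronze/silver/gold pattern mentioned above can be sketched as three stages. This is a hypothetical illustration of the pattern; the function and field names are invented, and the actual pipeline lives in the heiplanet-db repository:

```python
def bronze(raw_rows):
    """Bronze: ingest raw records unmodified."""
    return list(raw_rows)

def silver(rows):
    """Silver: clean and validate, e.g. drop rows without an R0 value."""
    return [r for r in rows if r.get("r0") is not None]

def gold(rows):
    """Gold: aggregate into the shape the API serves, e.g. mean R0 per NUTS region."""
    totals = {}
    for r in rows:
        count, s = totals.get(r["nuts_id"], (0, 0.0))
        totals[r["nuts_id"]] = (count + 1, s + r["r0"])
    return {region: s / count for region, (count, s) in totals.items()}

raw = [
    {"nuts_id": "DE11", "r0": 1.2},
    {"nuts_id": "DE11", "r0": 0.8},
    {"nuts_id": "FR10", "r0": None},  # dropped in the silver stage
]
print(gold(silver(bronze(raw))))
# → {'DE11': 1.0}
```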

Step 2: Start API and Frontend Services⚓︎

Once data loading is complete, start the application services:

docker compose up api

This starts:

  • FastAPI backend (internal port 8000)
  • Frontend web application (port 80)
  • Database server (internal to Docker network)

Accessing the Application⚓︎

Open your web browser and navigate to:

  • Frontend: http://localhost:80 or http://127.0.0.1:80

The API is accessible to the frontend through the internal Docker network.

Advanced Development Options⚓︎

Direct API Access⚓︎

To test the API directly (useful for debugging or development), expose port 8000 by modifying docker-compose.yaml:

services:
  api:
    ports:
      - "8000:8000"  # Add this line

After making this change, the API documentation will be available at:

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc
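Once port 8000 is exposed, a quick reachability check can be scripted with the Python standard library. This is a generic probe, not a utility from the repository:

```python
import urllib.request

def endpoint_status(url: str, timeout: float = 5.0):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except OSError:
        return None

# Example (assumes the API container is running with port 8000 exposed):
# endpoint_status("http://localhost:8000/docs")  # 200 when the API is up
```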

Direct Database Access⚓︎

Similarly, you can expose the database port for external database clients:

services:
  db:
    ports:
      - "5432:5432"  # Add this line

This allows connecting with tools like pgAdmin, DBeaver, or psql:

psql -h localhost -p 5432 -U <user> -d <db-name>

Alternative: Minimal Container Development⚓︎

If you lack GitHub permissions for GHCR access or want a simpler setup with just the database and API (without the frontend), follow this standalone procedure.

This approach manually creates and links containers instead of using Docker Compose.

Prerequisites: Navigate to the heiplanet-db/ directory

Step 1: Start PostgreSQL/PostGIS Database⚓︎

Create and start a database container:

docker run -d \
  --name postgres_heiplanet \
  -p 5432:5432 \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=heiplanet_db \
  postgis/postgis:17-3.5

This creates a database accessible at localhost:5432 with:

  • Username: postgres
  • Password: postgres
  • Database name: heiplanet_db

Step 2: Build and Start API Container⚓︎

Build the backend image and start the API:

# Build the image locally
docker build -t heiplanet-db .

# Start the API container linked to the database
docker run -d \
  --name heiplanet_api \
  -p 8000:8000 \
  --link postgres_heiplanet:db \
  -e IP_ADDRESS=0.0.0.0 \
  -e DB_URL=postgresql+psycopg2://postgres:postgres@db:5432/heiplanet_db \
  heiplanet-db

Local run without docker compose⚓︎

To run with a local Python process and only PostGIS in Docker, use the following workflow from the heiplanet-db/ root directory.

1. Configure .env⚓︎

Create/update .env with:

POSTGRES_USER=postgres
POSTGRES_PASSWORD=heiplanet-db-secret-name
POSTGRES_DB=heiplanet_db

DB_USER=postgres
DB_PASSWORD=heiplanet-db-secret-name
DB_HOST=127.0.0.1
DB_PORT=5432
DB_URL=postgresql://postgres:heiplanet-db-secret-name@127.0.0.1:5432/heiplanet_db
WAIT_FOR_DB=true
IP_ADDRESS=127.0.0.1
BATCH_SIZE=10000
MAX_WORKERS=4
VAR_TIME_CHUNK=6
VAR_LAT_CHUNK=45
VAR_LON_CHUNK=90

2. Start PostgreSQL/PostGIS⚓︎

# Load the variables from .env into your shell first, e.g.: set -a; . ./.env; set +a
docker run --name my-postgres \
  -e POSTGRES_USER=$POSTGRES_USER \
  -e POSTGRES_PASSWORD=$POSTGRES_PASSWORD \
  -e POSTGRES_DB=$POSTGRES_DB \
  -p 5432:5432 \
  --shm-size=512mb \
  -d postgis/postgis:17-3.5

3. Populate database tables⚓︎

python heiplanet_db/production.py

Alternative step 3: Populate the Database from the API Container⚓︎

If you started the heiplanet_api container as described above, you can instead run the data ingestion script inside it:

docker exec -it heiplanet_api python3 /heiplanet_db/production.py

Either way, this processes and loads test data into the database. Depending on the configuration file used, this may take several minutes; with the currently reduced input files (Grid and NUTS only), database setup is typically fast.

4. Run the API locally⚓︎

cd heiplanet_db
fastapi dev

The API will be accessible at http://localhost:8000

Accessing the Services⚓︎

  • API Documentation: http://localhost:8000/docs
  • Database: localhost:5432 (connect with any PostgreSQL client)

Stopping and Cleaning Up⚓︎

# Stop containers
docker stop heiplanet_api postgres_heiplanet

# Remove containers
docker rm heiplanet_api postgres_heiplanet

# Remove volumes (optional - deletes all data)
docker volume prune

Building the Image Locally⚓︎

Building a local Docker image is necessary when:

  • Testing a custom data configuration file
  • Modifying the data ingestion pipeline
  • Developing new features for the backend
  • Working offline without GHCR access

Build the Image⚓︎

From the heiplanet-db/ root directory:

docker build -t heiplanet-db .

This creates a local image tagged as heiplanet-db containing:

  • Python backend application
  • Data processing pipeline
  • Configuration files
  • Database ORM and API code

Using the Local Image with Docker Compose⚓︎

Option 1: Modify docker-compose.yaml

Change the image reference in docker-compose.yaml:

# Before
image: ghcr.io/ssciwr/heiplanet-db:latest

# After
image: heiplanet-db

Option 2: Use Build Configuration

Alternatively, uncomment the build: lines in docker-compose.yaml to force local builds:

services:
  api:
    build: .
    # image: ghcr.io/ssciwr/heiplanet-db:latest

Then run:

docker compose build
docker compose up

Publishing to GitHub Container Registry (GHCR)⚓︎

For maintainers with push access, publish your local image to GHCR:

Step 1: Authenticate with GHCR

export CR_PAT=<your-github-personal-access-token>
echo $CR_PAT | docker login ghcr.io -u <github-username> --password-stdin

Step 2: Tag the Image

docker image tag heiplanet-db ghcr.io/ssciwr/heiplanet-db:latest

# For version-specific tags
docker image tag heiplanet-db ghcr.io/ssciwr/heiplanet-db:v1.2.3

Step 3: Push to GHCR

docker push ghcr.io/ssciwr/heiplanet-db:latest
docker push ghcr.io/ssciwr/heiplanet-db:v1.2.3

Note: Ensure your GitHub Personal Access Token has write:packages and read:packages permissions.

Production environment⚓︎

Deployment Configuration Options⚓︎

The system provides three pre-configured deployment options based on data volume and temporal coverage. Choose the configuration that best matches your deployment requirements and available resources.

Configuration Comparison⚓︎

Feature           | Small                       | Medium                                | Large
Config File       | production_config_small.yml | production_config_medium.yml          | production_config.yml
Temporal Coverage | Single month (July 2025)    | 3 months (May-July 2025)              | Full year (2025)
Use Case          | Testing, demos, development | Seasonal analysis, typical production | Comprehensive analysis, research
Deployment Time   | Fastest (~5-10 min)         | Moderate (~15-25 min)                 | Slower (~30-60 min)
Storage Required  | Minimal (~500 MB)           | Moderate (~1-2 GB)                    | Higher (~3-5 GB)
Data Resolution   | 0.1° global grid            | 0.1° global grid                      | 0.1° global grid
NUTS Aggregation  | Yes                         | Yes                                   | Yes
Recommended For   | Development environments    | Production deployments                | Research and analysis

Small Deployment⚓︎

Configuration file: heiplanet_db/data/production_config_small.yml

This minimal deployment is suitable for testing, demonstrations, or resource-constrained environments.

Included data:

  • NUTS region definitions (European geographic boundaries)
  • West Nile virus transmission suitability (R0) for a single month (July 2025)
  • High-resolution global grid (0.1° resolution)
  • NUTS-aggregated regional data

Characteristics:

  • Fastest deployment time
  • Minimal storage requirements
  • Ideal for development and testing

Medium Deployment⚓︎

Configuration file: heiplanet_db/data/production_config_medium.yml

This balanced deployment provides seasonal data coverage suitable for typical use cases.

Included data:

  • NUTS region definitions (European geographic boundaries)
  • West Nile virus transmission suitability (R0) for summer months (May-July 2025)
  • High-resolution global grid (0.1° resolution)
  • NUTS-aggregated regional data

Characteristics:

  • Moderate deployment time
  • Covers critical transmission season
  • Recommended for production deployments with seasonal focus

Large Deployment⚓︎

Configuration file: heiplanet_db/data/production_config.yml

This comprehensive deployment provides full annual data coverage for detailed analysis.

Included data:

  • NUTS region definitions (European geographic boundaries)
  • West Nile virus transmission suitability (R0) for the entire year (2025)
  • High-resolution global grid (0.1° resolution)
  • NUTS-aggregated regional data

Characteristics:

  • Longer deployment time
  • Higher storage requirements
  • Provides complete annual transmission dynamics
  • Recommended for comprehensive analysis and research

Historical Data Configuration (Advanced)⚓︎

Configuration file: heiplanet_db/data/production_config_45yrs.yml

An additional configuration is available for long-term historical analysis covering 45 years of data (1980-2025).

Included data:

  • NUTS region definitions
  • West Nile virus transmission suitability (R0) for 1980-2025
  • Lower-resolution global grid (0.5° resolution due to data volume)
  • NUTS-aggregated regional data
  • Requires local data files (not available via remote download)

Characteristics:

  • Extensive deployment time (several hours)
  • Significant storage requirements (>20 GB)
  • Enables long-term trend analysis and climate change impact assessment
  • Intended for advanced research applications

Note: This configuration requires data files to be available locally (host: "local" in the configuration), as the dataset is too large for standard remote hosting.

Deploying with a Configuration⚓︎

To deploy the system in production with your chosen configuration:

  1. Select your configuration by specifying the appropriate config file when building your Docker image. You can either:

     a. Copy your chosen configuration file to production_config.yml:

        cp heiplanet_db/data/production_config_small.yml heiplanet_db/data/production_config.yml
        # or production_config_medium.yml, or keep the default production_config.yml

     b. Modify the production.py script to reference your chosen configuration file directly.

  2. Build the Docker image with your selected configuration (see above):

     docker build -t heiplanet-db .

  3. Run the deployment using docker compose:

     docker compose up --abort-on-container-exit production-runner
     docker compose up api

Note: You can also create custom configuration files based on these templates to include specific datasets tailored to your research needs. The configuration files follow a YAML structure defining data sources, file locations, and metadata for each dataset to be ingested into the database.

Troubleshooting⚓︎

Common Issues and Solutions⚓︎

Database Connection Errors⚓︎

Symptom: Backend fails to connect to database with connection refused or could not connect errors.

Solutions:

  • Verify the database container is running: docker ps | grep postgres
  • Check database logs: docker logs <postgres-container-name>
  • Ensure WAIT_FOR_DB=true is set in your .env file
  • Verify database credentials match between .env and database initialization
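To distinguish "the database is not up yet" from a credentials problem, you can first probe raw TCP reachability with the standard library. This is a generic helper (not a repo utility); supply the host and port from your .env:

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds; return False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False

# Example: wait_for_port("localhost", 5432) — True means the port answers,
# so any remaining connection errors point at credentials or the database name.
```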

Stale Volume Data⚓︎

Symptom: Old data persists after configuration changes, or database initialization fails.

Solution: Remove Docker volumes to start fresh:

# Stop and remove containers and volumes
docker compose down -v

# Or manually prune volumes
docker volume prune

# For aggressive cleanup (removes ALL unused Docker resources)
docker system prune --force --volumes

Warning: This deletes all data in the database. Export important data first if needed.

Network Connectivity Issues⚓︎

Symptom: Containers can't communicate, frontend can't reach API, or API can't reach database.

Solution: Reset Docker networking:

# Stop services
docker compose down

# Restart services (Docker recreates networks)
docker compose up -d

For persistent issues:

# Remove all networks
docker network prune

# Restart Docker daemon (Linux)
sudo systemctl restart docker

Port Already in Use⚓︎

Symptom: Error: bind: address already in use when starting containers.

Solution: Identify and stop conflicting services:

# Find process using port 80 (frontend)
sudo lsof -i :80

# Find process using port 8000 (API)
sudo lsof -i :8000

# Find process using port 5432 (database)
sudo lsof -i :5432

# Kill the process or change ports in docker-compose.yaml
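If lsof is unavailable on your system, the same check can be done from Python with a generic stdlib helper:

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something already accepts connections on host:port."""
    with socket.socket() as s:
        s.settimeout(1.0)
        return s.connect_ex((host, port)) == 0

# Example: port_in_use(80) is True when another service holds the frontend port,
# in which case stop that service or change the port mapping in docker-compose.yaml.
```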

Out of Disk Space⚓︎

Symptom: Container crashes during data loading or image build fails.

Solution: Free up Docker disk space:

# Check Docker disk usage
docker system df

# Remove unused images
docker image prune -a

# Remove all stopped containers
docker container prune

# Complete cleanup
docker system prune -a --volumes

Permission Denied Errors⚓︎

Symptom: Cannot access files or directories inside containers.

Solution: Check file permissions and ownership:

# Fix ownership of data directories
sudo chown -R $USER:$USER .data_heiplanet_db/

# Ensure Docker has proper permissions
sudo usermod -aG docker $USER
# Log out and back in for group changes to take effect

Image Pull Authentication Failures⚓︎

Symptom: Error response from daemon: unauthorized when pulling GHCR images.

Solution: Authenticate with GitHub Container Registry:

# Create GitHub Personal Access Token with read:packages permission
export CR_PAT=<your-token>
echo $CR_PAT | docker login ghcr.io -u <github-username> --password-stdin

Data Loading Takes Too Long⚓︎

Symptom: production-runner container runs for extended periods without completing.

Potential causes:

  • Large configuration file (e.g., using full year or historical data)
  • Slow network connection for data downloads
  • Insufficient system resources

Solutions:

  • Use a smaller configuration file for testing (e.g., production_config_small.yml)
  • Check available disk space and memory
  • Monitor container logs: docker logs -f <container-name>
  • Ensure a stable internet connection for data downloads

CORS Errors in Browser⚓︎

Symptom: Frontend shows CORS policy errors in browser console.

Solution: Configure IP_ADDRESS in .env file:

# For local development
IP_ADDRESS=0.0.0.0

# For production deployment
IP_ADDRESS=<your-server-public-ip>

Restart the API container after changes:

docker compose restart api

Getting Additional Help⚓︎

If issues persist:

  1. Check container logs: docker logs <container-name>
  2. Review the GitHub Issues
  3. Consult the Issues documentation for known problems
  4. Report new bugs with full error messages and system information