Getting Started for Production

Follow this page if you are interested in deploying or running the production environment locally.

Prerequisites

Configuring the Stack

Clone the project

git clone https://github.com/aims-group/metagrid.git

If you go to the root project directory for metagrid, you'll see there is a manage_metagrid.sh script.

This script provides convenience functions for tasks related to running Metagrid. When preparing for production, you'll need to edit the file 'metagrid_config' located in the 'metagrid_configs' folder in the root directory. To do so, run the script:

./manage_metagrid.sh

Then, in the script's option menu, select the "Configure Metagrid" option.

This will open up an editor where you can enter the configuration parameters as needed (refer to table below). Once you save the config file and close the editor, the script will automatically copy the configuration to the necessary locations and save a backup using the current timestamp. If you change parameters in the future or need to revert your settings, you can do so by running the "Restore Backup Config" option in the manage_metagrid.sh script.

NOTE: You can easily generate a secret key with Python using this command:

python3 -c 'import secrets; print(secrets.token_hex(100))'
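The same standard-library module can also produce an unpredictable value for DJANGO_ADMIN_URL. The snippet below is a minimal sketch rather than part of Metagrid; the function names and the 32-byte length for the admin URL are assumptions chosen for illustration.

```python
import secrets


def generate_secret_key(num_bytes: int = 100) -> str:
    """Return a hex-encoded random key, mirroring the one-liner above."""
    return secrets.token_hex(num_bytes)


def generate_admin_url(num_bytes: int = 32) -> str:
    """Return a URL-safe random path segment suitable for DJANGO_ADMIN_URL."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print lines that can be pasted into the config file.
    print(f"DJANGO_SECRET_KEY={generate_secret_key()}")
    print(f"DJANGO_ADMIN_URL={generate_admin_url()}")
```

Each run prints fresh values, so generate them once and keep them in the saved configuration.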

Config Parameters

Environment Variable	Description	Documentation	Type	Example
=========== TRAEFIK CONFIG =============
`DOMAIN_NAME`	The domain linked to the server hosting the Metagrid site.		string	`DOMAIN_NAME=esgf-dev1.llnl.gov` Local environment: `DOMAIN_NAME=localhost`
`PUBLIC_URL`	OPTIONAL The domain subdirectory that is used to serve the front-end. Leave blank if you want users to access the app from the domain directly.		string	`PUBLIC_URL=metagrid`
`DOMAIN_SUBDIRECTORY`	OPTIONAL The domain subdirectory that is proxied to the Django site (e.g. esgf-dev1.llnl.gov/metagrid-backend). Omit the slash and match the backend rules' `PathPrefix` in `traefik.yml`.		string	`DOMAIN_SUBDIRECTORY=metagrid-backend`
=========== BACKEND CONFIG =============
`DJANGO_SECRET_KEY`	A secret key for a particular Django installation. This is used to provide cryptographic signing, and should be set to a unique, unpredictable value.	Link	string	`DJANGO_SECRET_KEY=YAFKApvifkIFTw0DDNQQdHI34kyQdyWH89acWTogCfm4SGRz2x`
`DJANGO_ADMIN_URL`	The URL path used to access the Django Admin page. Set it to a unique, unpredictable value (not `admin/`) and take note of it so you can reach the admin page later. With the settings shown here, the admin site is at https://esgf-dev1.llnl.gov/metagrid-backend/example_admin_url_87261847395. Log in with the credentials created for the Django superuser (explained further below).		string	`DJANGO_ADMIN_URL=example_admin_url_87261847395`
`DJANGO_ALLOWED_HOSTS`	A list of strings representing the host/domain names that this Django site can serve. This is a security measure to prevent HTTP Host header attacks, which are possible even under many seemingly-safe web server configurations.	Link	array of strings	`DJANGO_ALLOWED_HOSTS=esgf-dev1.llnl.gov` Local environment: `DJANGO_ALLOWED_HOSTS=localhost`
`KEYCLOAK_URL`	The URL of your hosted Keycloak server. It must end with `/auth`.	Link	string	`KEYCLOAK_URL=https://keycloak.metagrid.com/auth`
`KEYCLOAK_REALM`	The name of the Keycloak realm you want to use.	Link	string	`KEYCLOAK_REALM=esgf`
`KEYCLOAK_CLIENT_ID`	The id for the Keycloak client, which is the entity that can request Keycloak to authenticate a user.		string	`KEYCLOAK_CLIENT_ID=metagrid-backend`
========== FRONTEND CONFIG =============
`REACT_APP_METAGRID_API_URL`	The URL for the MetaGrid API used to query projects, users, etc.		string	`REACT_APP_METAGRID_API_URL=https://esgf-dev1.llnl.gov/metagrid-backend` Local environment: `REACT_APP_METAGRID_API_URL=http://localhost:8000`
`REACT_APP_WGET_API_URL`	The URL for the ESGF wget API to generate a wget script for downloading selected datasets.	Link	string	`REACT_APP_WGET_API_URL=https://pcmdi8vm.llnl.gov/wget`
`REACT_APP_ESGF_NODE_URL`	The URL for the ESGF Search API node used to query datasets, files, and facets.	Link	string	`REACT_APP_ESGF_NODE_URL=https://esgf-node.llnl.gov`
`REACT_APP_ESGF_NODE_STATUS_URL`	The URL for the ESGF node status API used to query node status.		string	`REACT_APP_ESGF_NODE_STATUS_URL=https://aims4.llnl.gov/prometheus/api/v1/query?query=probe_success%7Bjob%3D%22http_2xx%22%2C+target%3D~%22.%2Athredds.%2A%22%7D`
`REACT_APP_KEYCLOAK_URL`	The URL of your hosted Keycloak server. It must end with `/auth`.		string	`REACT_APP_KEYCLOAK_URL=https://keycloak.metagrid.com/auth`
`REACT_APP_KEYCLOAK_REALM`	The name of the Keycloak realm you want to use.		string	`REACT_APP_KEYCLOAK_REALM=esgf`
`REACT_APP_KEYCLOAK_CLIENT_ID`	The id for the Keycloak client, which is an entity that can request Keycloak to authenticate a user.		string	`REACT_APP_KEYCLOAK_CLIENT_ID=frontend`
`REACT_APP_HOTJAR_ID`	OPTIONAL Your site's ID. This is the ID which tells Hotjar which site settings it should load and where it should save the data collected.	Link	number	`REACT_APP_HOTJAR_ID=1234567`
`REACT_APP_HOTJAR_SV`	OPTIONAL The snippet version of the Tracking Code you are using. This is only needed if Hotjar ever updates the Tracking Code and needs to discontinue older ones. Knowing which version your site includes allows the Hotjar team to contact you and inform you accordingly.	Link	number	`REACT_APP_HOTJAR_SV=6`
`REACT_APP_GOOGLE_ANALYTICS_TRACKING_ID`	OPTIONAL Google Analytics tracking id.	Link	string	`REACT_APP_GOOGLE_ANALYTICS_TRACKING_ID=UA-000000-01`
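
Before starting the containers, it can help to check that the non-optional parameters have values. The following is a minimal sketch, not part of manage_metagrid.sh: the function names are hypothetical, and the REQUIRED tuple is an illustrative subset of the table above that should be adjusted to the services and authentication method you deploy.

```python
import re

# Illustrative subset of the non-optional rows in the table (an assumption).
REQUIRED = (
    "DOMAIN_NAME",
    "DJANGO_SECRET_KEY",
    "DJANGO_ADMIN_URL",
    "DJANGO_ALLOWED_HOSTS",
    "REACT_APP_METAGRID_API_URL",
    "REACT_APP_WGET_API_URL",
)


def parse_config(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comment lines."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop a trailing inline comment such as "VALUE   # note".
        value = re.split(r"\s+#", value, maxsplit=1)[0].strip()
        values[key.strip()] = value
    return values


def missing_required(values: dict[str, str], required=REQUIRED) -> list[str]:
    """Return the required names that are absent or empty."""
    return [name for name in required if not values.get(name)]
```

To use it, read metagrid_configs/metagrid_config into a string, pass it to parse_config, and inspect the list returned by missing_required.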

Example Production Configuration - v1.0.10

# =====================TRAEFIK CONFIG=====================


DOMAIN_NAME=esgf-dev1.llnl.gov

PUBLIC_URL=   # Not used, should this be deprecated
REACT_APP_PREVIOUS_URL=metagrid
DOMAIN_SUBDIRECTORY=metagrid-backend

# =====================BACKEND CONFIG====================

# General

DJANGO_SETTINGS_MODULE=config.settings.production
DJANGO_SECRET_KEY=
DJANGO_ADMIN_URL=TG_-ztsXwL7NOx6cTqZ_bjINGF_R1VuOTI8FocOfAfs
DJANGO_ALLOWED_HOSTS=esgf-dev1.llnl.gov,198.128.245.131,localhost  #include the IP

# Security

DJANGO_SECURE_SSL_REDIRECT=False

# django-cors-headers

CORS_ORIGIN_WHITELIST=https://localhost:3000,https://esgf-dev1.llnl.gov  

# django-all-auth  - Configure your Keycloak here

KEYCLOAK_URL=https://esgf-login.ceda.ac.uk/
KEYCLOAK_REALM=esgf
KEYCLOAK_CLIENT_ID=esgf-dev1-metagrid

# postgres

POSTGRES_HOST=postgres
POSTGRES_PORT=5432
POSTGRES_DB=postgres
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres

# =====================FRONTEND CONFIG====================

# Redirect the frontend to home page when old subdirectory is used (optional)

REACT_APP_PREVIOUS_URL=metagrid   # Leave blank for a new install

# MetaGrid API

# https://github.com/aims-group/metagrid/tree/master/backend

REACT_APP_METAGRID_API_URL=https://esgf-dev1.llnl.gov/metagrid-backend

# Globus

REACT_APP_GLOBUS_REDIRECT=https://esgf-dev1.llnl.gov/cart/items
REACT_APP_CLIENT_ID=   #  Generate a client ID at Globus for your Native App
REACT_APP_GLOBUS_NODES=aims3.llnl.gov,esgf-data1.llnl.gov,esgf-data2.llnl.gov

# ESGF wget API

# https://github.com/ESGF/esgf-wget

REACT_APP_GLOBUS_SCRIPT_URL=https://greyworm1-rh7.llnl.gov/globusscript

# ESGF Search API
# https://esgf.github.io/esg-search/ESGF_Search_RESTful_API.html

#REACT_APP_WGET_API_URL=https://esgf-fedtest.llnl.gov/esg-search/wget
REACT_APP_WGET_API_URL=https://greyworm1-rh7.llnl.gov/wget
#REACT_APP_WGET_API_URL=https://esgf-node.llnl.gov/esg-search/wget

# ESGF Search API

# https://esgf.github.io/esg-search/ESGF_Search_RESTful_API.html

REACT_APP_AUTHENTICATION_METHOD=globus
REACT_APP_SEARCH_URL=https://esgf-fedtest.llnl.gov/esg-search/search
REACT_APP_ESGF_SOLR_URL=https://esgf-fedtest.llnl.gov/solr

# ESGF Node Status API

# https://github.com/ESGF/esgf-utils/blob/master/node_status/query_prom.py
REACT_APP_ESGF_NODE_STATUS_URL=https://aims4.llnl.gov/prometheus/api/v1/query?query=probe_success%7Bjob%3D%22http_2xx%22%2C+target%3D~%22.%2Athredds.%2A%22%7D

# Keycloak  - same settings as above but used on the frontend side

# https://github.com/keycloak/keycloak

REACT_APP_KEYCLOAK_URL=https://esgf-login.ceda.ac.uk/
REACT_APP_KEYCLOAK_REALM=esgf
REACT_APP_KEYCLOAK_CLIENT_ID=esgf-dev1-metagrid


# Django Auth URLs
REACT_APP_DJANGO_LOGIN_URL=http://esgf-dev1.llnl.gov/metagrid-backend/login/globus/
REACT_APP_DJANGO_LOGOUT_URL=http://esgf-dev1.llnl.gov/metagrid-backend/proxy/globus-logout/

# Authentication Method - switch to keycloak or globus
REACT_APP_AUTHENTICATION_METHOD=keycloak

# https://docs.djangoproject.com/en/4.2/ref/settings/#logout-redirect-url
DJANGO_LOGIN_REDIRECT_URL=http://esgf-dev1.llnl.gov/search
DJANGO_LOGOUT_REDIRECT_URL=http://esgf-dev1.llnl.gov/search

# https://app.globus.org/settings/developers/registration/confidential_client
#  Generate these at Globus, this is a confidential client
GLOBUS_CLIENT_KEY=25e75a79-7d31-41bd-b1df-0668f7a42d91
#c111e306-ad45-49ef-af54-6b107ab592ff
GLOBUS_CLIENT_SECRET=



# react-hotjar

# https://github.com/abdalla/react-hotjar

REACT_APP_HOTJAR_ID=2079136
REACT_APP_HOTJAR_SV=6

# react-ga

# https://github.com/react-ga/react-ga
#  Get a tracking ID from Google if you want to enable Analytics
REACT_APP_GOOGLE_ANALYTICS_TRACKING_ID=
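
In the example above, REACT_APP_PREVIOUS_URL and REACT_APP_AUTHENTICATION_METHOD are each assigned twice, and which assignment takes effect depends on how the file is loaded. The sketch below shows one way to spot such repeats before deploying; find_duplicate_keys is a hypothetical helper, not something shipped with Metagrid.

```python
from collections import defaultdict


def find_duplicate_keys(text: str) -> dict[str, list[int]]:
    """Map each key assigned more than once to the line numbers where it appears."""
    seen = defaultdict(list)
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Commented-out assignments and blank lines do not count.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        seen[key].append(lineno)
    return {key: lines for key, lines in seen.items() if len(lines) > 1}
```

An empty result means every active key is assigned exactly once.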

Building and Running Services

Once you've finished the configuration, you are ready to start the service containers. Using the manage_metagrid.sh script, you can start or stop all or specific docker containers by selecting the appropriate option in the menu. To start or stop a container manually, go to the specific service directory (for example, frontend or backend), then run the command below:

docker compose -f docker-compose.prod.yml up --build

To run the stack and detach the containers, run:

docker compose -f docker-compose.prod.yml up --build -d

Post Build Steps

After running the containers for the first time, you should perform some other steps as well to finalize the production deployment of Metagrid.

1. Traefik Notes

Traefik is a modern HTTP reverse proxy and load balancer that makes deploying microservices easy. — https://github.com/traefik/traefik

Once configured and running, Traefik will get you a valid certificate from Let's Encrypt and update it automatically. If you start the services manually, start Traefik before the backend or frontend. Using the script to run all containers will build and run them in the appropriate order.

2. Back-end Notes

2.1 HTTPS is On by Default

If you are not using a subdomain of the domain name set in the project, then remember to put your staging/production IP address in the DJANGO_ALLOWED_HOSTS environment variable before you deploy the back-end. Failure to do this will mean you will not have access to your back-end services through the HTTP protocol.
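
To check ahead of time whether a host name or IP address is covered by a DJANGO_ALLOWED_HOSTS value, the matching can be approximated as below. This is a simplified sketch of Django's documented rules (exact match, a leading dot for subdomains, and `*` for any host), not Django's own code; it assumes the value is a comma-separated string as in the example configuration, and both function names are hypothetical.

```python
def parse_allowed_hosts(value: str) -> list[str]:
    """Split a comma-separated DJANGO_ALLOWED_HOSTS value into lowercase entries."""
    return [item.strip().lower() for item in value.split(",") if item.strip()]


def host_allowed(host: str, allowed_value: str) -> bool:
    """Return True if host (without port) matches any allowed entry."""
    host = host.lower()
    for pattern in parse_allowed_hosts(allowed_value):
        if pattern == "*" or pattern == host:
            return True
        # ".example.com" matches example.com and any of its subdomains.
        if pattern.startswith(".") and (host.endswith(pattern) or host == pattern[1:]):
            return True
    return False
```

For instance, host_allowed("198.128.245.131", "esgf-dev1.llnl.gov,198.128.245.131,localhost") evaluates to True, while an address missing from the list evaluates to False.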

Access to the Django admin is set up by default to require HTTPS in production or once live.

The Traefik reverse proxy used in the default configuration will get you a valid certificate from Let's Encrypt and update it automatically. All you need to do to enable this is to make sure that your DNS records are pointing to the server Traefik runs on.

You can read more about this feature and how to configure it at Automatic HTTPS in the Traefik docs.

2.2 Run Django migrations

In production, you must apply Django migrations manually since they are not automatically applied to the database when you rebuild the docker-compose containers. To do so, with the backend docker container running, run the command below in the backend directory:

docker compose -f docker-compose.prod.yml run --rm django python manage.py migrate

NOTE: If this step is skipped, you may see issues loading the project drop-down and search table results.

2.3 Updating the database

In production, if the list of projects, specific groups, or facets needs to change, the PostgreSQL database must be updated. Running migrations may not be possible without clearing the database tables and rebuilding them to include the new changes. A script is provided to simplify the update process; to use it, follow the steps below:

Edit the initial_projects_data file with the changes that you need to make: ~/metagrid/backend/metagrid/initial_projects_data.py
Change directory to: ~/metagrid/backend
Make backups and/or test changes locally to make sure the changes are correct before updating production. Start by running the local backend container:

docker compose -f docker-compose.yml up --build -d

Then run the updateProjects script to update existing tables without removing user data:

./updateProjects.sh

Otherwise if you wish to clear the tables and start fresh, then run:

./updateProjects.sh --clear

When satisfied with the results, stop the local container and perform the same steps again with the production backend container running.

2.4 Helpful Commands

Run Command in Running Container

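To run a command in a service's container (frontend, backend, or traefik), go to the appropriate directory and run the command below, which targets the django service of the backend. Note that docker compose run starts a one-off container for the command, and --rm removes it afterwards: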

docker compose -f docker-compose.prod.yml run --rm django [command]

Creating a Superuser

With the backend docker container running, run the command below in the backend directory to create a superuser. This is useful for logging into the Django Admin page to manage the database.

docker compose -f docker-compose.prod.yml run --rm django python manage.py createsuperuser

4. Supervisor

Supervisor is a client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems. — http://supervisord.org/index.html

Once you are ready with your initial setup, you want to make sure that your application is run by a process manager to survive reboots and auto restarts in case of an error.

Although we recommend using Supervisor, you can use the process manager you are most familiar with. All it needs to do is run docker compose -f docker-compose.prod.yml up for traefik, backend, and frontend.

4.1 Install Supervisor

Ubuntu/Debian

sudo apt install supervisor -y

CentOS

sudo yum update -y
sudo yum install epel-release
sudo yum update
sudo yum -y install supervisor

4.2 Enable Supervisor

sudo systemctl start supervisord
sudo systemctl enable supervisord
sudo systemctl status supervisord

4.3 Create Supervisor configuration files

You can use the .ini configuration files below as starting points and configure where necessary.

The directory in which to store the .ini files varies by OS:

For Ubuntu/Debian: /etc/supervisor/conf.d/
For CentOS: /etc/supervisor.d/

metagrid-traefik.ini

[program:metagrid-traefik]
command=docker compose -f docker-compose.prod.yml up
directory=/home/<username>/metagrid/traefik
redirect_stderr=true
autostart=true
autorestart=true
priority=10

metagrid-backend.ini

[program:metagrid-backend]
command=docker compose -f docker-compose.prod.yml up
directory=/home/<username>/metagrid/backend
redirect_stderr=true
autostart=true
autorestart=true
priority=10

metagrid-frontend.ini

[program:metagrid-frontend]
command=docker compose -f docker-compose.prod.yml up
directory=/home/<username>/metagrid/frontend
redirect_stderr=true
autostart=true
autorestart=true
priority=10
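
Since the three files differ only in the service name and directory, they can be generated rather than written by hand. The snippet below is a minimal sketch using Python's configparser; render_ini and the project_root argument are hypothetical names, and the option values simply repeat those shown above.

```python
import configparser
import io

SERVICES = ("traefik", "backend", "frontend")


def render_ini(service: str, project_root: str) -> str:
    """Return the text of a Supervisor program section for one Metagrid service."""
    # interpolation=None keeps any '%' characters in paths literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser[f"program:metagrid-{service}"] = {
        "command": "docker compose -f docker-compose.prod.yml up",
        "directory": f"{project_root}/{service}",
        "redirect_stderr": "true",
        "autostart": "true",
        "autorestart": "true",
        "priority": "10",
    }
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    # Print each file's contents; replace the placeholder path with your own.
    for name in SERVICES:
        print(f"# metagrid-{name}.ini")
        print(render_ini(name, "/home/<username>/metagrid"))
```

Writing the output to the directory for your OS is left to the reader, since that step usually requires root privileges.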

4.4 Load configurations and start the processes

sudo supervisorctl reread
sudo supervisorctl update
sudo supervisorctl start all

4.5 Check the status

sudo supervisorctl status

Example output

metagrid-backend                 RUNNING   pid 9359, uptime 1 day, 0:07:28
metagrid-frontend                RUNNING   pid 6819, uptime 1 day, 0:42:53
metagrid-traefik                 RUNNING   pid 9871, uptime 1 day, 0:03:27
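
For automated checks, the status output can be parsed to find any program that is not running. The following is a small sketch that assumes the column layout shown in the example output (name first, state second); both function names are hypothetical.

```python
def parse_status(output: str) -> dict[str, str]:
    """Map each program name to its state from `supervisorctl status` output."""
    states = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            states[parts[0]] = parts[1]
    return states


def not_running(output: str) -> list[str]:
    """Return the sorted names of programs whose state is not RUNNING."""
    return sorted(
        name for name, state in parse_status(output).items() if state != "RUNNING"
    )
```

The output text can be captured with subprocess.run(["sudo", "supervisorctl", "status"], capture_output=True, text=True).stdout and passed to not_running.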

4.6 Restart or Stop Containers

If you need to manually restart or stop production services for some reason, you must first stop supervisor from restoring the services as soon as they're stopped.

sudo supervisorctl stop all # Stops supervisor from restoring containers

Then either use the manage_metagrid.sh script to stop services, or go to the directory of each service that you want to stop. For example, to stop the frontend and backend services for a rebuild, run the following from the project root:

cd ./backend # Shutting off backend service
docker compose -f docker-compose.prod.yml down # Shut down the container
cd ../frontend # Shutting off frontend service
docker compose -f docker-compose.prod.yml down

When you are ready to restore services, you can do so manually by running docker compose in each service's directory:

docker compose -f docker-compose.prod.yml up --build # Start the container

Or let supervisor restore all:

sudo supervisorctl start all # Will restore supervisor service and any stopped containers

Helpful Docker-Compose Commands

These commands can be run on any docker-compose.prod.yml file.

Check logs

docker compose -f docker-compose.prod.yml logs

Check status of containers

docker compose -f docker-compose.prod.yml ps