Apache Superset is a fantastic Business Intelligence tool: it is open source and feature-rich (a wide range of charts, integration with many database management systems, etc.), and it has nothing to envy of well-known commercial alternatives (Tableau, PowerBI, BusinessObjects, and the like).
Unfortunately, fantastic does not mean flawless, particularly when it comes to documentation, which some may find a bit sparse.
To address that shortcoming, let's look together at how to quickly set up a Superset instance suited to production.
Prerequisites
- Have a production server running Linux (regardless of the distribution) with:
  - docker
  - git
  - caddy (or another reverse proxy of your choice)
  - ports 443 and 80 open at the firewall level
- A domain name (e.g., bi.myawesomecompany.com) that points to your server
Installation
Cloning the superset repo
Start by cloning the repository directly on your production server (yes, this is unusual):
cd $HOME
git clone https://github.com/apache/superset.git
cd superset
Then select the desired version via its tag.
git checkout tags/3.0.0
Editing configuration files
Now it is time to customize a few configuration files. Checking out a tag leaves git in a detached HEAD state, so commits made at this point would not belong to any branch. You therefore have the choice of:
- documenting all your modifications to reproduce them exactly in case of reinstallation
- or forking the project and creating a branch to commit your modifications
It’s up to you.
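If you go the branch route, a minimal sketch could look like the following. The function name `branch_from_tag` and the branch name `prod-3.0.0` are placeholders of mine, not something the Superset project prescribes; pushing the branch to your own fork is left out.

```shell
# Create a local branch from a release tag so that configuration edits can be
# committed. The branch name is a placeholder; pick your own.
# Usage: branch_from_tag <repo-dir> <tag> <branch-name>
branch_from_tag() {
    git -C "$1" checkout --quiet -b "$3" "tags/$2"
}

# Example (hypothetical branch name):
#   branch_from_tag "$HOME/superset" 3.0.0 prod-3.0.0
```

Once on that branch, `git commit` records your configuration changes like any other work.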
The docker/.env-non-dev file (read: non-dev = prod) allows you to define a set of environment variables that will be used by the docker containers we will start later.
Add the following:
# We don't want demo data in production
SUPERSET_LOAD_EXAMPLES=no
# A random string used to sign session cookies (generate your own; never reuse a published value)
SUPERSET_SECRET_KEY=4Sido8BkIjs54Vz2XyVD5GJIvANVIAT399dRESjdmr4vm92n
# To prevent XSS attacks (among other things)
TALISMAN_ENABLED=yes
# Number of workers: the higher the value, the fewer intermittent chart refresh failures you will have in your dashboards (adjust according to your server's power).
SERVER_WORKER_AMOUNT=64
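Since the secret key above is published in this article, you will want your own. One possible way to produce it is sketched below; it assumes `head`, `base64` and `/dev/urandom` are available on the server, and the function name `generate_secret_key` is purely illustrative. If OpenSSL is installed, `openssl rand -base64 42` should give an equivalent result.

```shell
# Generate a random secret: 42 random bytes, base64-encoded (56 characters).
generate_secret_key() {
    head -c 42 /dev/urandom | base64 | tr -d '\n'
}

# Print a line ready to paste into docker/.env-non-dev.
printf 'SUPERSET_SECRET_KEY=%s\n' "$(generate_secret_key)"
```

Keep the generated value out of any public repository, especially if you chose to fork the project.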
Also, make some adjustments in docker/pythonpath_dev/superset_config.py to enable alerting and the template engine (necessary for creating datasets with dynamic filtering):
FEATURE_FLAGS = {"ALERT_REPORTS": True, "ENABLE_TEMPLATE_PROCESSING": True}
Disable telemetry by replacing the following line in the docker-compose-non-dev.yml file:
x-superset-image: &superset-image apachesuperset.docker.scarf.sh/apache/superset:${TAG:-latest-dev}
with
x-superset-image: &superset-image apache/superset:${TAG:-latest-dev}
And switch to the latest stable version of PostgreSQL by replacing:
image: postgres:14
with
image: postgres:16
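If you would rather script these two substitutions than edit the file by hand (handy when reinstalling), here is a rough sketch using `sed`. The function name `patch_compose` is an assumption of mine, and the patterns assume the lines look exactly like the ones quoted above; check the result with `git diff` afterwards.

```shell
# Rewrite a compose file in place: drop the scarf.sh registry prefix from the
# Superset image and bump the PostgreSQL image from 14 to 16.
# Usage: patch_compose <compose-file>
patch_compose() {
    compose_file=$1
    compose_tmp=$(mktemp)
    sed -e 's|apachesuperset\.docker\.scarf\.sh/apache/superset|apache/superset|' \
        -e 's|image: postgres:14$|image: postgres:16|' \
        "$compose_file" > "$compose_tmp" && cat "$compose_tmp" > "$compose_file"
    compose_status=$?
    rm -f "$compose_tmp"
    return "$compose_status"
}

# Example: patch_compose docker-compose-non-dev.yml
```

Writing to a temporary file and copying it back avoids relying on `sed -i`, whose syntax differs between implementations.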
Startup
Instantiate and start the docker containers:
docker compose -f docker-compose-non-dev.yml up -d
Superset is now accessible on your production server at http://127.0.0.1:8088.
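The containers can take a little while to initialize on first start. A small polling helper such as the one below can tell you when the web server answers. This is only a sketch: `wait_for_http` is a name I made up, it assumes `curl` is installed, and the `/health` path is my assumption about Superset's health endpoint (verify it on your version).

```shell
# Poll a URL until it answers with a successful HTTP status.
# Usage: wait_for_http <url> [attempts] [delay-seconds]
wait_for_http() {
    url=$1
    attempts=${2:-30}
    delay=${3:-2}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # --fail makes curl return non-zero on HTTP errors (4xx/5xx).
        if curl --silent --fail --output /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example: wait_for_http http://127.0.0.1:8088/health && echo "Superset is up"
```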
Reverse proxy
Configure a reverse proxy to secure the connection, for example, using caddy:
Edit /etc/caddy/Caddyfile:
bi.myawesomecompany.com {
    reverse_proxy http://127.0.0.1:8088
}
Then restart caddy:
sudo service caddy restart
First Login
Log in at https://bi.myawesomecompany.com with the username / password: admin / admin
Change this default password immediately.
Backup and Restoration
When editing the docker-compose-non-dev.yml configuration file, you may have noticed that a PostgreSQL database is being instantiated.
Backup and restoration for Superset therefore only need to consider this database.
You can perform a hot backup with a simple command:
docker exec superset_db pg_dump -U superset superset | xz > backup.sql.xz
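For recurring backups, you may prefer dated files and a quick integrity check. The sketch below is one way to do it under a few assumptions: the function names `backup_name` and `backup_superset` are mine, the container is named `superset_db` as above, and the target file name must end in `.xz`. The dump is written uncompressed first so that a `pg_dump` failure is not hidden by the pipe.

```shell
# Build a dated backup file name, e.g. superset-20240131-235959.sql.xz
backup_name() {
    printf 'superset-%s.sql.xz\n' "$(date +%Y%m%d-%H%M%S)"
}

# Dump the database, compress it, then verify the archive.
# docker exec runs without a TTY so the dump's line endings are left untouched.
# Usage: backup_superset [file.sql.xz]
backup_superset() {
    out=${1:-$(backup_name)}
    sql=${out%.xz}
    docker exec superset_db pg_dump -U superset superset > "$sql" \
        && xz --force "$sql" \
        && xz --test "$out"
}

# Example: backup_superset
```

Calling `backup_superset` from a cron job gives you a series of restorable snapshots; pruning old files is left to you.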
To restore, first start only the PostgreSQL container, so that no Superset instance is connected to the database while it is being restored:
docker compose -f docker-compose-non-dev.yml down
docker compose -f docker-compose-non-dev.yml up -d db
docker exec superset_db dropdb -U superset superset
docker exec superset_db createdb -U superset superset
xz -dc backup.sql.xz | docker exec -i superset_db psql -U superset -d superset
docker compose -f docker-compose-non-dev.yml up -d