Automate Postgres Database Backup and Upload to S3

Posted by Vivek Shukla on Oct 20, 2023 under PostgreSQL

A database is one of the most critical components of an application. To be prepared for any disaster, we must take regular backups of it.

In this tutorial we will create a bash script that automates Postgres database backups, for both a Docker container and a native Postgres installation, and uploads them to an S3-based storage system.

What we will cover

  1. Write a db backup automation script for Docker and non-Docker based systems

  2. Upload the backup file to AWS S3

  3. Set up a cron job for daily backups

Steps

  1. Move to your project folder, where docker-compose.yml lives.

  2. Take the database backup inside the container

  3. Copy the backup file from the container filesystem to our desired location

  4. Delete the backup file inside the container

  5. Upload the backup file to S3

  6. Delete the local backup file

The code snippet below covers steps 1 to 4. It is not optimal yet; later we will modify it so that it is easy to automate and reuse.

# 1. move to the project folder
cd project_dir

# 2. dump the database inside the container
docker compose exec db_service_name sh -c "pg_dump -U db_user db_name > filename"

# 3. copy the dump out of the container
docker compose cp db_service_name:filename destination/filename

# 4. remove the dump from the container
docker compose exec db_service_name rm filename

Upload to S3

To upload a file to S3, we need the AWS CLI installed; you can follow the installation guide in the AWS docs.
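
If you're on Linux x86_64, the install from the AWS docs looks like this at the time of writing (check the docs for your platform):

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install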

After installing the AWS CLI, run aws configure and provide your access key ID and secret access key. You can use any S3-based storage system here.
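
For reference, aws configure prompts for four values; everything below is a placeholder:

$ aws configure
AWS Access Key ID [None]: AKIAXXXXXXXXXXXXXXXX
AWS Secret Access Key [None]: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Default region name [None]: us-east-1
Default output format [None]: json

With credentials in place, a one-off upload looks like this: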

aws s3 cp filename s3://your-bucket

If you are using some other S3-based storage system, you'll have to pass a custom --endpoint-url argument here; make sure you read your provider's docs for the correct value.
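
For example, a sketch with a hypothetical provider endpoint (the URL is a placeholder; use the one from your provider's docs):

aws s3 cp filename s3://your-bucket --endpoint-url https://s3.example-provider.com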

Full Script

We have laid out all the steps and the code to accomplish them; now let's modify the script so that it is easy to automate and reuse.

The filename is built from the system's current datetime, so we avoid ambiguity and know exactly how old each backup file is.
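
For example, with the placeholder value db_name from the script below, a run at noon on Oct 20, 2023 would produce a file named:

db_name_2023-10-20-12:00:00.sql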

For Docker Compose

#!/bin/bash

############## ✅ Update These Values ##################
directory="your-project-dir"
home="/home/<user>"

db_service_name="db"
db_user="db_user"
db_name="db_name"

#####################################################
############ 🔴 CHANGE WITH CAUTION 🔴 #############
date=$(date +"%Y-%m-%d-%H:%M:%S")
filename="${db_name}_${date}.sql"

cd "$home"
cd "$directory"

# dump the database inside the container
docker compose exec "$db_service_name" sh -c "pg_dump -U $db_user $db_name > $filename"

# copy the dump out of the container, then delete it there
docker compose cp "$db_service_name:$filename" "$home/$filename"
docker compose exec "$db_service_name" rm "$filename"

cd "$home"
#####################################################
#####################################################

############## ✅ CAN CHANGE BELOW ##################
aws s3 cp "$filename" s3://your-bucket

rm "$filename"

At the top, you have to update the variable values based on your project and directory structure; the main code uses them to back up the database.

At the end of the script, the aws command copies the backup file to S3. Update the bucket name here, and if you are using a non-AWS S3 provider, add the --endpoint-url argument with the appropriate value.
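
A backup is only useful if it restores cleanly. Since pg_dump is run without a custom format flag, the dump is plain SQL and can be replayed with psql. A minimal sketch for the Docker setup, assuming the same service and variable names as above (the S3 key is a placeholder):

# download a backup from S3 (the key is a placeholder)
aws s3 cp s3://your-bucket/db_name_2023-10-20-12:00:00.sql backup.sql

# replay the plain-SQL dump into the container's database
# -T disables the pseudo-TTY so stdin redirection works
docker compose exec -T db_service_name psql -U db_user db_name < backup.sql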

For Native Postgres Installation

#!/bin/bash

############## ✅ Update These Values ##################
home="/home/<user>"

db_user="db_user"
db_name="db_name"

#####################################################
############ 🔴 CHANGE WITH CAUTION 🔴 ##############
date=$(date +"%Y-%m-%d-%H:%M:%S")
filename="${db_name}_${date}.sql"

cd "$home"
pg_dump -U "$db_user" "$db_name" > "$filename"

#####################################################

############## ✅ CAN CHANGE BELOW ##################
aws s3 cp "$filename" s3://your-bucket

rm "$filename"

When running pg_dump against a native installation, Postgres tries to authenticate the user in the following order:

  1. Peer authentication: it checks whether the current (logged-in) system user is the same as the user passed to pg_dump.

  2. If peer authentication fails, Postgres looks for a .pgpass file in the home directory for the password.

  3. If both fail, it may prompt for a password interactively, which would stall an unattended script.

So, make sure you rely on the first method and run the script as the same system user as the one passed to pg_dump.
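
If peer authentication is not an option, a minimal .pgpass sketch (all values are placeholders; the field order is hostname:port:database:username:password):

# ~/.pgpass — one connection per line
localhost:5432:db_name:db_user:your_password

Note that libpq ignores the file unless its permissions are restricted:

chmod 600 ~/.pgpass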

Setup Cron Job

To automate the backup and upload of our database, we will use crontab. It comes built-in with almost all Linux distros and is very easy to set up.

Give your script executable permission:

chmod +x db_backup.sh

Open crontab:

crontab -e

This will open crontab in your preferred editor. Use the statement below to run the backup script every day at 12 PM (noon).

0 12 * * * /home/<user>/db_backup.sh >> cron.log 2>&1
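
For reference, the five leading fields are minute, hour, day of month, month, and day of week:

# ┌───────── minute (0)
# │  ┌────── hour (12 → noon)
# │  │ ┌──── day of month (any)
# │  │ │ ┌── month (any)
# │  │ │ │ ┌ day of week (any)
# 0 12 * * *  /home/<user>/db_backup.sh >> cron.log 2>&1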

Create the cron.log file:

touch cron.log

Use the full path if cron.log is not in the current user's home folder.
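
To confirm that the job was saved, list the installed entries:

crontab -l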
