How to automate backups of a PostgreSQL server using Barman/barman-cloud-backup to S3
I was surprised not to find many up-to-date instructions on this. I have a few basic requirements:
- Back up daily to an S3 bucket
- Keep a certain number of backups
- Run automatically, preferably using systemd rather than cron, as it’s easier to set up and troubleshoot
- Use a user with least privileges on the database, operating system, and in AWS/S3
- Send the results of each backup activity to healthchecks.io
After a bit of playing around, I decided to use Barman for the backups – it’s significantly easier to configure and use than pgBackRest, and has native support for backing up to S3, point-in-time restore, and more. The major downside compared to, say, running pg_dump every night, is that it requires an identical setup to restore to – identical hardware and PostgreSQL version. Least privilege in the database is tricky – to be able to back up things like roles, the account basically needs full access to all schemas. The Barman documentation says it should run as the same user as PostgreSQL, postgres.
Step 1: Create an S3 bucket
This one’s pretty simple. Just follow the instructions on the Amazon website.
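If you’d rather script it, the same thing can be done with the AWS CLI (installed in step 5) – a minimal sketch, with the bucket name and region as placeholders:
aws s3api create-bucket \
    --bucket <container_name> \
    --region eu-west-2 \
    --create-bucket-configuration LocationConstraint=eu-west-2
(For us-east-1, omit the --create-bucket-configuration flag.)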
Step 2: Create an IAM Policy to grant access to the bucket
We use an IAM policy to provide only the specific access that the service account needs. Go to the IAM console, select “Policies” on the left, and “Create new”. This is the template. Substitute <container_name> for your bucket name, obviously:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:PutObjectAcl"
            ],
            "Resource": [
                "arn:aws:s3:::<container_name>/*"
            ]
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::<container_name>"
            ]
        }
    ]
}
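If you prefer the CLI here too, the policy can be created from a file – a sketch, assuming you’ve saved the JSON above as backup-policy.json and picked the hypothetical name postgres-backup-s3:
aws iam create-policy \
    --policy-name postgres-backup-s3 \
    --policy-document file://backup-policy.json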
Step 3: Create a new S3 user, assign the policy and generate credentials
In the IAM console, select Users > Create user. Give them a unique name. Do NOT grant console access. Click Next. On the “Set permissions” page, select “Attach policies directly” and attach the policy you just created. It’s easier if you “Filter by Type” and select “Customer managed”. Select Next, then Review and Create. Let’s assume you’ve created a user called backup_user.
Once you’ve created backup_user, click on their name in the list and go to the “Security Credentials” tab. Click “Create Access Key” and then select “Other” from the list of options. We need a long-lived key, so this is the best approach (unless you want to go and re-authenticate them every month). Create the access key, then copy and note down both the Access Key ID and Secret Access Key. Do this now – you won’t be able to access them again and you’ll need to regenerate them.
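For reference, the CLI equivalent – the policy ARN is whatever your create-policy call returned:
aws iam create-user --user-name backup_user
aws iam attach-user-policy --user-name backup_user \
    --policy-arn arn:aws:iam::<account_id>:policy/postgres-backup-s3
# note the AccessKeyId and SecretAccessKey in the output – the secret can’t be retrieved later
aws iam create-access-key --user-name backup_user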
Step 4: Create a new check on healthchecks.io
I use healthchecks.io to keep track of all the scheduled tasks and processes I’m expecting to run. Log in and create a new health check. Note the URL.
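You can ping the check manually first to make sure the URL works – healthchecks.io will show the ping in its log:
curl -fsS -m 10 --retry 3 https://hc-ping.com/<UUID>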
Step 5: Install AWS CLI on the server
I found that this mostly went as expected. I followed the instructions on the AWS website; however, as I’d hardened my server using the Ubuntu CIS hardening baseline, I had to set some additional permissions:
sudo chmod -R 755 /usr/local/aws-cli
Step 6: Authenticate your new user with IAM credentials
Run aws configure. Enter the Access Key ID and Secret Access Key recorded in the step above. This generates a file at ~/.aws/credentials which contains these details. Later we’ll copy this to our postgres user’s home directory – but first we need to test our backup.
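Before moving on, it’s worth a quick sanity check that the credentials work and can see the bucket:
# should return the ARN of backup_user
aws sts get-caller-identity
# should list the (currently empty) bucket without an access error
aws s3 ls s3://<container_name>/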
Step 7: Install prerequisites for the python-snappy compression library
We’re going to use the snappy compression algorithm because of its significant performance improvement over the defaults, while still achieving approximately a 2:1 compression ratio (saving on both egress and S3 storage costs). First, install the required library and pip:
sudo apt-get install libsnappy-dev python3-pip
Then we install the package – we’ll need to do this again for the postgres user later, as the package is installed to user packages, not site packages.
pip install python-snappy
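A quick way to confirm the install worked – python-snappy is imported as snappy:
python3 -c "import snappy" && echo "python-snappy OK"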
Step 8: Download and install Barman
Barman is super easy to install. In my server setup, I added the PostgreSQL repos to my server – if you haven’t added the repo, follow the instructions there (which are slightly different to the ones on the PostgreSQL wiki), then simply install it – we’ll also install the cloud CLI, allowing us to back up to S3:
sudo apt-get install barman barman-cli-cloud
Although the documentation says we should configure Barman specifically for local backup by setting backup_method to local-rsync for our local server in a specific configuration file, we don’t actually need to do that – barman-cloud-backup is a standalone script that simply uses Barman. We can quickly test our backup. Note I’ve already set up a .pgpass file for the postgres_admin user:
rob@pg:~$ sudo -E barman-cloud-backup -v --cloud-provider aws-s3 --snappy --host localhost -U postgres_admin -d postgres s3://<container_name>/barman pg
2023-11-07 23:01:40,171 [1139430] INFO: Found credentials in shared credentials file: ~/.aws/credentials
2023-11-07 23:01:40,749 [1139430] INFO: Starting backup '20231107T230140'
2023-11-07 23:01:41,408 [1139430] INFO: Uploading 'pgdata' directory '/mnt/postgres/postgresql/15/main' as 'data.tar.snappy'
2023-11-07 23:01:51,430 [1139436] INFO: Upload process started (worker 1)
2023-11-07 23:01:51,428 [1139435] INFO: Upload process started (worker 0)
2023-11-07 23:01:51,533 [1139436] INFO: Found credentials in shared credentials file: ~/.aws/credentials
2023-11-07 23:01:51,534 [1139435] INFO: Found credentials in shared credentials file: ~/.aws/credentials
2023-11-07 23:01:51,680 [1139435] INFO: Uploading 'barman/pg/base/20231107T230140/data.tar.snappy', part '1' (worker 0)
2023-11-07 23:01:58,138 [1139436] INFO: Uploading 'barman/pg/base/20231107T230140/data.tar.snappy', part '2' (worker 1)
...
2023-11-07 23:12:38,601 [1139436] INFO: Uploading 'barman/pg/base/20231107T230140/data.tar.snappy', part '278' (worker 1)
2023-11-07 23:12:41,232 [1139430] INFO: Uploading 'pg_control' file from '/mnt/postgres/postgresql/15/main/global/pg_control' to 'data.tar.snappy' with path 'global/pg_control'
2023-11-07 23:12:41,248 [1139430] INFO: Uploading 'config_file' file from '/etc/postgresql/15/main/postgresql.conf' to 'data.tar.snappy' with path 'postgresql.conf'
2023-11-07 23:12:41,249 [1139430] INFO: Uploading 'hba_file' file from '/etc/postgresql/15/main/pg_hba.conf' to 'data.tar.snappy' with path 'pg_hba.conf'
2023-11-07 23:12:41,249 [1139430] INFO: Uploading 'ident_file' file from '/etc/postgresql/15/main/pg_ident.conf' to 'data.tar.snappy' with path 'pg_ident.conf'
2023-11-07 23:12:41,250 [1139430] INFO: Stopping backup '20231107T230140'
2023-11-07 23:12:41,545 [1139430] INFO: Restore point 'barman_20231107T230140' successfully created
2023-11-07 23:12:41,546 [1139430] INFO: Uploading 'backup_label' file to 'data.tar.snappy' with path 'backup_label'
2023-11-07 23:12:41,546 [1139430] INFO: Marking all the uploaded archives as 'completed'
2023-11-07 23:12:41,547 [1139435] INFO: Uploading 'barman/pg/base/20231107T230140/data.tar.snappy', part '279' (worker 0)
2023-11-07 23:12:41,745 [1139436] INFO: Completing 'barman/pg/base/20231107T230140/data.tar.snappy' (worker 1)
2023-11-07 23:12:41,880 [1139430] INFO: Calculating backup statistics
2023-11-07 23:12:41,886 [1139430] INFO: Uploading 'barman/pg/base/20231107T230140/backup.info'
2023-11-07 23:12:42,016 [1139430] INFO: Backup end at LSN: 52/B91715B0 (0000000100000052000000B9, 001715B0)
2023-11-07 23:12:42,017 [1139430] INFO: Backup completed (start time: 2023-11-07 23:01:40.749792, elapsed time: 11 minutes, 1 second)
2023-11-07 23:12:42,021 [1139435] INFO: Upload process stopped (worker 0)
2023-11-07 23:12:42,022 [1139436] INFO: Upload process stopped (worker 1)
Step 9: Share AWS credentials with the postgres user
Barman and barman-cloud-backup both require read access to the PostgreSQL storage, so we need to run our backup job as the postgres user. To make this work, we’ll copy our AWS credentials to them:
sudo mkdir ~postgres/.aws
sudo cp ~/.aws/credentials ~postgres/.aws/credentials
sudo chmod 0600 ~postgres/.aws/credentials
sudo chown -R postgres: ~postgres/.aws
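To confirm the copy worked, check that the postgres user can now authenticate to AWS (-H makes sure $HOME points at their credentials file):
sudo -H -u postgres aws sts get-caller-identity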
Step 10: Install python-snappy as user postgres
We quickly need to log in and install the python-snappy package for the postgres user. First, log in as them:
sudo -i -u postgres
If you get this error on logging in as the user:
rob@pg:~$ sudo -i -u postgres
sudo: unable to change directory to /var/lib/postgresql: No such file or directory
you’ll need to create the user’s home directory. First, log out of the postgres user, then check the home directory:
rob@pg:~$ getent passwd postgres
postgres:x:116:122:PostgreSQL administrator,,,:/var/lib/postgresql:/bin/bash
then create it:
sudo mkdir -p /var/lib/postgresql
sudo chown postgres:postgres /var/lib/postgresql
Then once logged in as them, install the package:
pip install python-snappy
Step 11: Create backup service
We want the backup to run on a schedule, so we’ll create three files: a backup script, a systemd service, and a systemd timer to trigger it. Firstly, we’ll check the home directory of the postgres user:
postgres@pg:~$ getent passwd postgres
postgres:x:116:122:PostgreSQL administrator,,,:/var/lib/postgresql:/bin/bash
We can see that it’s /var/lib/postgresql. If yours is different, adjust these scripts. We need to use the absolute path because ~ won’t be expanded when running as a service. Create this file with sudo nano ~postgres/backup-script.sh, obviously substituting your S3 bucket, Healthchecks.io UUID and retention policy. We’re using peer authentication to allow the postgres user to sign in without a password:
#!/bin/bash
# Variables
BACKUP_DIR="/var/lib/postgresql/backup"
DATE_SUFFIX=$(date +%F_%H-%M-%S)
LOG_FILE="$BACKUP_DIR/barman_backup_log_$DATE_SUFFIX.txt"
S3_BUCKET="s3://<container_name>/barman"
HEALTHCHECK_URL="https://hc-ping.com/<UUID>"
SERVER_NAME="pg"
RETENTION_POLICY="RECOVERY WINDOW OF 30 DAYS" # Adjust the retention policy as needed
RETAIN_LOG_DAYS=7
# create backup temp dir if it doesn't exist
mkdir -p "$BACKUP_DIR"
# Redirect all output to log file
exec > "$LOG_FILE" 2>&1
# Function to send log to healthchecks.io
send_log() {
local url="$1"
curl -fsS --retry 3 -m 10 -X POST -H "Content-Type: text/plain" --data-binary "@$LOG_FILE" "$url"
}
# Perform backup with Barman
barman-cloud-backup -v --cloud-provider aws-s3 --snappy -d postgres --port 1234 "$S3_BUCKET" "$SERVER_NAME" || {
send_log "$HEALTHCHECK_URL/fail"
exit 1
}
# Delete old backups according to retention policy
barman-cloud-backup-delete --cloud-provider aws-s3 --retention-policy "$RETENTION_POLICY" "$S3_BUCKET" "$SERVER_NAME" || {
send_log "$HEALTHCHECK_URL/fail"
exit 1
}
# Notify healthchecks.io of success and send log
send_log "$HEALTHCHECK_URL"
# Finally, delete old log files in BACKUP_DIR
find "$BACKUP_DIR" -type f -name 'barman_backup_log_*.txt' -mtime +$RETAIN_LOG_DAYS -exec rm -f {} \;
Make sure that the postgres user owns it and it’s executable:
sudo chown -R postgres: ~postgres/backup-script.sh
sudo chmod +x ~postgres/backup-script.sh
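Before relying on systemd, you can run the script once by hand as the postgres user (-H again, so $HOME points at their AWS credentials):
sudo -H -u postgres ~postgres/backup-script.sh
# all output goes to the log file, so check the exit code
echo $?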
Create this file as /etc/systemd/system/barman-cloud-backup.service:
[Unit]
Description=Barman Cloud Backup Service
[Service]
Type=oneshot
ExecStart=/var/lib/postgresql/backup-script.sh
User=postgres
Test the service with sudo systemctl start barman-cloud-backup. You can check the status using systemctl too – although you’ll need to do it from a second terminal, as the service is non-forking and start blocks until it completes. Here we can see that the service is running:
rob@pg:~$ sudo systemctl status barman-cloud-backup
● barman-cloud-backup.service - Barman Cloud Backup Service
Loaded: loaded (/etc/systemd/system/barman-cloud-backup.service; static)
Active: activating (start) since Tue 2023-11-07 23:34:26 UTC; 16s ago
Main PID: 1143268 (backup-script.s)
Tasks: 2 (limit: 2220)
Memory: 50.7M
CPU: 10.325s
CGroup: /system.slice/barman-cloud-backup.service
├─1143268 /bin/bash /var/lib/postgresql/backup-script.sh
└─1143271 /usr/bin/python3 /usr/bin/barman-cloud-backup -v --cloud-provider aws-s3 --snappy -d postgres --port 1234 s3://<container_name>/barman pg
Nov 07 23:34:26 pg systemd[1]: Starting Barman Cloud Backup Service...
We can check the log file too:
rob@pg:~$ sudo ls -ltr ~postgres/backup
total 36
-rw-r--r-- 1 postgres postgres 36296 Nov 7 23:48 barman_backup_log_2023-11-07_23-34-25.txt
rob@pg:~$ sudo cat ~postgres/backup/barman_backup_log_2023-11-07_23-34-25.txt
2023-11-07 23:34:26,705 [1143271] INFO: Found credentials in shared credentials file: ~/.aws/credentials
2023-11-07 23:34:27,263 [1143271] INFO: Starting backup '20231107T233427'
2023-11-07 23:34:33,420 [1143271] INFO: Uploading 'pgdata' directory '/mnt/postgres/postgresql/15/main' as 'data.tar.gz'
2023-11-07 23:35:04,095 [1143316] INFO: Upload process started (worker 1)
2023-11-07 23:35:04,097 [1143315] INFO: Upload process started (worker 0)
2023-11-07 23:35:04,213 [1143316] INFO: Found credentials in shared credentials file: ~/.aws/credentials
2023-11-07 23:35:04,220 [1143315] INFO: Found credentials in shared credentials file: ~/.aws/credentials
2023-11-07 23:35:04,352 [1143316] INFO: Uploading 'barman/pg/base/20231107T233427/data.tar.gz', part '1' (worker 1)
2023-11-07 23:35:35,917 [1143315] INFO: Uploading 'barman/pg/base/20231107T233427/data.tar.gz', part '2' (worker 0)
2023-11-07 23:36:03,578 [1143316] INFO: Uploading 'barman/pg/base/20231107T233427/data.tar.gz', part '3' (worker 1)
...
Eventually, the backup will complete and we can check it in Healthchecks.io. We can also use barman-cloud-backup-list to list the backups:
rob@pg:~$ barman-cloud-backup-list s3://<container_name>/barman pg
Backup ID End Time Begin Wal Archival Status Name
20231023T132628 2023-10-23 13:33:25 000000010000004E00000060
20231103T130531 2023-11-03 13:22:11 000000010000005200000081
20231103T135700 2023-11-03 14:11:59 000000010000005200000083
20231107T211340 2023-11-07 21:28:10 0000000100000052000000B1
20231107T230140 2023-11-07 23:12:41 0000000100000052000000B9
20231107T231341 2023-11-07 23:22:42 0000000100000052000000BB
20231107T234029 2023-11-07 23:48:10 0000000100000052000000C1
Step 12: Configure barman-cloud-backup to run on a schedule
Create the timer as /etc/systemd/system/barman-cloud-backup.timer:
[Unit]
Description=Run Barman Cloud Backup every 6 hours
[Timer]
OnCalendar=*-*-* 00/6:00:00
Persistent=true
[Install]
WantedBy=timers.target
Install the timer with:
sudo systemctl enable barman-cloud-backup.timer
sudo systemctl start barman-cloud-backup.timer
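Check the timer is registered and see when it will next fire:
systemctl list-timers barman-cloud-backup.timer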
Step 13: Configure PostgreSQL to use barman-cloud-wal-archive to archive WAL files to S3
Barman uses WAL archives for a restore. By configuring PostgreSQL to ship WAL archives directly to S3, we can achieve near-zero data loss on failure. We’ll do this by setting the archive_mode and archive_command configuration items in /etc/postgresql/15/main/postgresql.conf to the following values:
archive_mode = on
archive_command = 'barman-cloud-wal-archive --snappy s3://<container_name>/barman pg %p'
archive_mode tells PostgreSQL to process completed WAL segments with the archive_command. That means that as each WAL segment completes, it is uploaded to S3.
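Note that changing archive_mode requires a full PostgreSQL restart (a change to archive_command alone only needs a reload). Once it’s back up, the pg_stat_archiver view shows whether archiving is succeeding:
sudo systemctl restart postgresql
# archived_count should tick up as WAL segments complete; last_failed_wal should stay empty
sudo -u postgres psql -c "SELECT archived_count, last_archived_wal, last_failed_wal FROM pg_stat_archiver;"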
Step 14: Verify your backup works
I’ll write another page on this. For now, check out the barman-cloud-restore documentation.
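As a rough sketch of what a restore looks like – barman-cloud-restore pulls a base backup (by the ID shown in barman-cloud-backup-list) into an empty directory, after which you’d replay WALs with barman-cloud-wal-restore:
barman-cloud-restore --cloud-provider aws-s3 s3://<container_name>/barman pg 20231107T230140 /path/to/empty/data/dir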
And that’s it! Check healthchecks.io for exceptions, check your S3 storage costs, and periodically test a restore!