Use Barman to back up PostgreSQL on an Azure VM to Azure Blob storage
In a previous post, I created a Barman backup script to back up PostgreSQL running in a VM to AWS S3. If you host your PostgreSQL server in Azure, this gets expensive quickly because you pay egress bandwidth fees to Microsoft. In this article, I’ll show you how to use Azure Blob storage instead.
Step 1: Install Barman, barman-cli-cloud, snappy etc.
See the original article.
Step 2: Install Azure CLI instead of AWS S3 CLI
Follow the instructions on the Microsoft website.
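On Ubuntu/Debian, Microsoft’s documented one-liner is the following (as always, review a script before piping it to bash):

```shell
# Microsoft's install script for Debian/Ubuntu
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
# Verify the install
az version
```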
Step 3: Install the azure-storage-blob Python package
Log in as the postgres user and use pip to install azure-storage-blob:
sudo -u postgres /bin/bash
pip install azure-storage-blob
Step 4: Create an Azure storage account and a container
Follow the Microsoft documentation to create a new storage account in the same region as your VM (to avoid inter-region data transfer fees), then create a container in it. I used standard storage, allowed only private networks (i.e. connecting from my VM via a private endpoint), and disabled soft deletes. I then created a container called backup. Note that disabling public networks will prevent you from browsing the container in the portal – you can change this later via the GUI.
Be sure to create your storage account in the same availability zone as your VM. Microsoft is introducing inter-AZ bandwidth charges for VMs, and it seems likely that similar charges will eventually apply to other services too.
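If you prefer the CLI to the portal, the same setup can be sketched with az – the account name, resource group and region below are placeholders, so substitute your own:

```shell
# Create the storage account in the same region (and zone) as the VM
az storage account create \
  --name yourstorageaccount \
  --resource-group your-rg \
  --location uksouth \
  --sku Standard_LRS

# Create the "backup" container in it
az storage container create \
  --account-name yourstorageaccount \
  --name backup \
  --auth-mode login
```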
Step 5: Assign a managed identity to the VM and grant it access to the container
Here, we’re basically following the instructions from the Microsoft website.
First, enable a system-managed identity for the VM: go to the VM in Azure, and under “Security” find the “Identity” panel. Under “System Assigned”, turn the status to “on”.
Go to the Azure Storage Account you created above. Navigate to the storage account itself (not the container in it), and then to “Access Control (IAM)”. Choose the option to add a Role Assignment. For the Role, select Storage Blob Data Contributor. On the “Members” tab, search for the managed identity of the VM. Save the assignment.
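Both of these steps can also be done from the az CLI; a sketch, assuming the VM is called pg in resource group your-rg (both placeholders):

```shell
# Enable a system-assigned managed identity on the VM and capture its principal ID
PRINCIPAL_ID=$(az vm identity assign \
  --resource-group your-rg --name pg \
  --query systemAssignedIdentity -o tsv)

# Grant that identity "Storage Blob Data Contributor" on the storage account
SCOPE=$(az storage account show \
  --resource-group your-rg --name yourstorageaccount \
  --query id -o tsv)
az role assignment create \
  --assignee "$PRINCIPAL_ID" \
  --role "Storage Blob Data Contributor" \
  --scope "$SCOPE"
```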
Step 6: Test that the credentials work by running a manual backup
We need to do this as the postgres user, so that it has access to the PostgreSQL database files. First, log in as the postgres user with sudo -u postgres /bin/bash, then run a manual backup:
postgres@pg:~$ barman-cloud-backup -v --cloud-provider azure-blob-storage --azure-credential=managed-identity --snappy -d postgres "azure://yourstorageaccount.blob.core.windows.net/backup" "server_name"
2023-12-10 12:15:04,993 [327423] INFO: Authenticating to Azure with shared key
2023-12-10 12:15:05,068 [327423] INFO: Request URL: 'https://yourstorageaccount.blob.core.windows.net/backup?restype=REDACTED&comp=REDACTED'
Request method: 'GET'
Request headers:
'x-ms-version': 'REDACTED'
'Accept': 'application/xml'
'User-Agent': 'azsdk-python-storage-blob/12.19.0 Python/3.10.12 (Linux-6.2.0-1018-azure-x86_64-with-glibc2.35)'
'x-ms-date': 'REDACTED'
'x-ms-client-request-id': 'c0613fee-9755-11ee-a617-979c6065e555'
'Authorization': 'REDACTED'
No body was attached to the request
2023-12-10 12:15:05,138 [327423] INFO: Response status: 200
Response headers:
'Transfer-Encoding': 'chunked'
'Content-Type': 'application/xml'
'Server': 'Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0'
'x-ms-request-id': 'c33dbf1e-e01e-004d-0a62-2b4fc2000000'
'x-ms-client-request-id': 'c0613fee-9755-11ee-a617-979c6065e555'
'x-ms-version': 'REDACTED'
'Date': 'Sun, 10 Dec 2023 12:15:08 GMT'
2023-12-10 12:15:05,151 [327423] INFO: Starting backup '20231210T121505'
2023-12-10 12:15:05,194 [327423] INFO: Uploading 'pgdata' directory '/mnt/postgres/postgresql/15/main' as 'data.tar.snappy'
2023-12-10 12:15:05,522 [327428] INFO: Upload process started (worker 0)
2023-12-10 12:15:05,523 [327428] INFO: Authenticating to Azure with shared key
2023-12-10 12:15:05,541 [327429] INFO: Upload process started (worker 1)
2023-12-10 12:15:05,542 [327429] INFO: Authenticating to Azure with shared key
2023-12-10 12:15:05,545 [327428] INFO: Uploading 'PG/base/20231210T121505/data.tar.snappy', part '1' (worker 0)
...
2023-12-10 12:19:11,568 [327423] INFO: Backup end at LSN: 63/9A000138 (00000002000000630000009A, 00000138)
2023-12-10 12:19:11,568 [327423] INFO: Backup completed (start time: 2023-12-10 12:15:05.151810, elapsed time: 4 minutes, 6 seconds)
2023-12-10 12:19:11,569 [327429] INFO: Upload process stopped (worker 1)
2023-12-10 12:19:11,569 [327428] INFO: Upload process stopped (worker 0)
From this we can see that the backup was successful: two workers ran, and it took just over four minutes. We’re now ready to create our script. Create it as ~postgres/backup-script.sh:
#!/bin/bash
# Variables
BACKUP_DIR="/var/lib/postgresql/backup"
DATE_SUFFIX=$(date +%F_%H-%M-%S)
LOG_FILE="$BACKUP_DIR/barman_backup_log_$DATE_SUFFIX.txt"
AZURE_CONTAINER="azure://yourstorageaccount.blob.core.windows.net/backup" # Replace with your storage account name and container
HEALTHCHECK_URL="https://hc-ping.com/<slug>"
SERVER_NAME="pg"
RETENTION_POLICY="RECOVERY WINDOW OF 30 DAYS" # Adjust the retention policy as needed
RETAIN_LOG_DAYS=7
# Create the backup temp dir if it doesn't exist
mkdir -p "$BACKUP_DIR"
# Redirect all output to log file
exec > "$LOG_FILE" 2>&1
# Function to send log to healthchecks.io
send_log() {
local url="$1"
curl -fsS --retry 3 -m 10 -X POST -H "Content-Type: text/plain" --data-binary "@$LOG_FILE" "$url"
}
# Perform backup with Barman
# Don't use verbose (-v) as the output would be too long for healthchecks.io
barman-cloud-backup --cloud-provider=azure-blob-storage --azure-credential=managed-identity --snappy -p 31432 -d postgres "$AZURE_CONTAINER" "$SERVER_NAME" || {
send_log "$HEALTHCHECK_URL/fail"
exit 1
}
# Delete old backups according to retention policy
barman-cloud-backup-delete --cloud-provider=azure-blob-storage --azure-credential=managed-identity --retention-policy "$RETENTION_POLICY" "$AZURE_CONTAINER" "$SERVER_NAME" || {
send_log "$HEALTHCHECK_URL/fail"
exit 1
}
# Notify healthchecks.io of success and send log
send_log "$HEALTHCHECK_URL"
# Finally, delete old log files in BACKUP_DIR
find "$BACKUP_DIR" -type f -name 'barman_backup_log_*.txt' -mtime +$RETAIN_LOG_DAYS -exec rm -f {} \;
Remember to make it executable with chmod +x backup-script.sh. Create the $BACKUP_DIR directory (in this script, it’s /var/lib/postgresql/backup because /var/lib/postgresql is the home directory of the postgres user). While logged in as postgres, use mkdir -p ~postgres/backup.
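The log-cleanup line at the end of the script is worth understanding: find’s -mtime +7 matches files last modified more than seven days ago. You can sanity-check it in a throwaway directory (this demo uses GNU touch to backdate a file):

```shell
# Create a scratch directory with one stale and one recent log
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/barman_backup_log_2023-12-01_00-00-00.txt"
touch "$DEMO_DIR/barman_backup_log_2023-12-09_00-00-00.txt"
# Backdate the first so -mtime +7 treats it as stale
touch -d '10 days ago' "$DEMO_DIR/barman_backup_log_2023-12-01_00-00-00.txt"
# Same cleanup command as the script, with RETAIN_LOG_DAYS=7
find "$DEMO_DIR" -type f -name 'barman_backup_log_*.txt' -mtime +7 -exec rm -f {} \;
ls "$DEMO_DIR"   # only the recent log remains
```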
Step 7: Create a backup service
We now need to create our backup service. Use this content as /etc/systemd/system/barman-cloud-backup.service:
[Unit]
Description=Barman Cloud Backup Service
[Service]
Type=oneshot
ExecStart=/var/lib/postgresql/backup-script.sh
User=postgres
There’s no need to ‘enable’ the service because it has no [Install] section, i.e. nothing triggers it at boot – we’ll use a timer later to trigger it. But for now, let’s test it:
sudo systemctl start barman-cloud-backup
You’ll see nothing for a while, then the command prompt will return. We can check the logs in $BACKUP_DIR (configured in the script). At the end, we can use systemctl to check for success:
rob@pg:~$ sudo systemctl status barman-cloud-backup.service
○ barman-cloud-backup.service - Barman Cloud Backup Service
Loaded: loaded (/etc/systemd/system/barman-cloud-backup.service; static)
Active: inactive (dead) since Sun 2023-12-10 12:57:01 UTC; 42s ago
TriggeredBy: ○ barman-cloud-backup.timer
Process: 330631 ExecStart=/var/lib/postgresql/backup-script.sh (code=exited, status=0/SUCCESS)
Main PID: 330631 (code=exited, status=0/SUCCESS)
CPU: 2min 41.835s
Dec 10 12:52:52 pg systemd[1]: Starting Barman Cloud Backup Service...
Dec 10 12:57:01 pg systemd[1]: barman-cloud-backup.service: Deactivated successfully.
Dec 10 12:57:01 pg systemd[1]: Finished Barman Cloud Backup Service.
Dec 10 12:57:01 pg systemd[1]: barman-cloud-backup.service: Consumed 2min 41.835s CPU time.
Finally, we can check on healthchecks.io to see that the ping was successful. We can also use barman-cloud-backup-list to see the backups:
postgres@pg:~$ barman-cloud-backup-list --cloud-provider=azure-blob-storage --azure-credential=managed-identity azure://yourstorageaccount.blob.core.windows.net/backup pg
Backup ID End Time Begin Wal Archival Status Name
20231210T124733 2023-12-10 12:51:39 00000002000000630000009F
20231210T125252 2023-12-10 12:56:58 0000000200000063000000A2
Step 8: Create a timer
Refer to the original article.
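For completeness, a minimal timer unit might look like this (assuming a nightly 02:00 run – adjust OnCalendar to taste) as /etc/systemd/system/barman-cloud-backup.timer:

```ini
[Unit]
Description=Nightly Barman Cloud Backup

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

Enable it with sudo systemctl enable --now barman-cloud-backup.timer.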
Step 9: Configure WAL archiving to Azure
See the main article for most of the settings, e.g. archive_mode, wal_level, archive_timeout. We’ll edit archive_command in /etc/postgresql/15/main/postgresql.conf to call barman-cloud-wal-archive:
archive_command = 'barman-cloud-wal-archive --snappy --cloud-provider=azure-blob-storage --azure-credential=managed-identity azure://yourstorageaccount.blob.core.windows.net/backup pg %p'
Now restart PostgreSQL:
sudo systemctl restart [email protected]
If we browse to the container in the storage account, we can see that it has a folder for the server, then a base
folder containing timestamped backups, and a wals
folder containing the WAL files.
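To confirm archiving is actually working, you can force a WAL segment switch and check PostgreSQL’s archiver statistics (this assumes you can run psql locally as the postgres user):

```shell
# Force PostgreSQL to close the current WAL segment, triggering archive_command
sudo -u postgres psql -c "SELECT pg_switch_wal();"
# failed_count should stay at 0 and last_archived_wal should advance
sudo -u postgres psql -c "SELECT archived_count, last_archived_wal, failed_count FROM pg_stat_archiver;"
```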
Step 10: Test your backup
As always with a backup, test that it works by restoring. Check out my guide to restoring a Barman backup from S3 – the process is basically the same, and you can adjust the parameters using the setup and steps above.
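As a starting point, a restore from Blob storage uses barman-cloud-restore with the same provider and credential flags. This sketch assumes the backup ID from the listing above and an empty target directory:

```shell
# Pick a backup ID from barman-cloud-backup-list, then restore it
barman-cloud-restore \
  --cloud-provider=azure-blob-storage \
  --azure-credential=managed-identity \
  "azure://yourstorageaccount.blob.core.windows.net/backup" \
  pg 20231210T125252 /var/lib/postgresql/restore-test
# WAL files still need to be fetched during recovery, e.g. with
# restore_command = 'barman-cloud-wal-restore --cloud-provider=azure-blob-storage --azure-credential=managed-identity azure://yourstorageaccount.blob.core.windows.net/backup pg %f %p'
```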