Migrate your TimescaleDB from timescale.com to self-hosted
I’ve been using timescale.com for about a year, but it’s quite expensive for a hobbyist (their cheapest plan in the UK is about $50/month), so I thought I’d try to implement something cheaper. I know I won’t get the cool automatic DR failover or the sexy continuous backups – but it’s not really mission critical if I lose a small amount of the history of my home energy consumption. Timescale publish some instructions, but they weren’t complete and didn’t work for me.
Step 1: install base OS
I use Linode. I chose a Linode 2GB, which comes in at around $12/month for 1 shared CPU and 2GB RAM. This is about double what I had on Timescale, which was 0.5 CPU. I added a 50GB storage volume, which would allow me to rebuild my Linode without worrying about losing my database data, bringing my total to $17/month – but this isn’t necessary.
I installed Ubuntu 22.04 LTS, and set up default automatic updates, disabled SSH password auth etc. using my standard setup scripts. I decided to use ZFS as the file system for the data drive to take advantage of its native error-checking capabilities. It does have some performance implications, but for a single drive with a relatively small database, I think this is an acceptable tradeoff, especially when compared to the significant performance hit of Btrfs. I called the additional volume postgres, then mounted it at /mnt/postgres.
First, install ZFS and create the ZFS pool:
sudo apt-get install zfsutils-linux
sudo zpool create postgres_storage /dev/disk/by-id/scsi-0Linode_Volume_postgres
If the volume was previously used for something else, ZFS will warn you:
/dev/disk/by-id/scsi-0Linode_Volume_postgres contains a filesystem of type 'ext4'
You’ll need to erase the existing file system before creating the new ZFS pool. Obviously this is destructive and you’ll lose all the data on the drive:
sudo wipefs --all --force /dev/disk/by-id/scsi-0Linode_Volume_postgres
Create a new ZFS dataset and mount it:
sudo zfs create postgres_storage/data
sudo zfs set mountpoint=/mnt/postgres postgres_storage/data
There’s no need to modify /etc/fstab as ZFS will automatically manage mounting of datasets within a pool at boot time.
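Before moving on, it’s worth a quick sanity check that the pool is healthy and the dataset is mounted where we expect – these are all read-only commands:

```shell
# All read-only; safe to run at any time
zpool status postgres_storage    # pool state should be ONLINE with no errors
zfs list postgres_storage/data   # dataset should show MOUNTPOINT /mnt/postgres
df -h /mnt/postgres              # and it should actually be mounted
```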
Step 2: install PostgreSQL and move data to other drive (and change the port)
Next, I had to install PostgreSQL and TimescaleDB. First, install PostgreSQL itself. Timescale’s latest supported version is 15. I had to change the instructions slightly from the PostgreSQL website because they use the legacy apt keyring.
# create repository info
sudo sh -c 'echo "deb https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
# Import the repository signing key - note this instruction differs from the PostgreSQL website.
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/postgres.gpg
# fix permissions on the new key
sudo chmod 644 /etc/apt/trusted.gpg.d/*.gpg
# Update the package lists:
sudo apt-get update
# Install version 15 PostgreSQL:
sudo apt-get -y install postgresql-15
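A quick check that the cluster came up – pg_lsclusters ships with Ubuntu’s postgresql-common packaging:

```shell
pg_lsclusters
# should show something like (port still on the 5432 default at this point):
# Ver Cluster Port Status Owner    Data directory              Log file
# 15  main    5432 online postgres /var/lib/postgresql/15/main /var/log/postgresql/postgresql-15-main.log
```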
OK, great! Now let’s move the database storage to our new ZFS volume. First, I need to stop the service:
sudo systemctl stop postgresql
then open the config file to edit:
sudo nano /etc/postgresql/15/main/postgresql.conf
I also want to change the port that PostgreSQL listens on, so find the port line and edit it:
port = 15433
and point PostgreSQL at the new location by updating the data_directory property:
data_directory = '/mnt/postgres/postgresql/15/main'
Finally, copy the database files across, update permissions and restart PostgreSQL:
sudo cp -ra /var/lib/postgresql /mnt/postgres/
sudo chown -R postgres:postgres /mnt/postgres/
sudo chmod -R 700 /mnt/postgres/
sudo systemctl start postgresql
Verify the new location:
rob@server:~$ sudo -u postgres psql -c "show data_directory;"
data_directory
----------------------------------
/mnt/postgres/postgresql/15/main
(1 row)
If there are any problems, check the logs with sudo cat /var/log/postgresql/postgresql-15-main.log
Delete the old storage so that we don’t get confused:
sudo rm -r /var/lib/postgresql/15/main
To make it easy to remember the port (i.e. so that I don’t need to set it every time), I set it as an export:
echo "export PGPORT=15433" >> ~/.bashrc
source ~/.bashrc
Step 3: Install TimescaleDB
First, log in to the new server, then add the TimescaleDB repo:
echo "deb https://packagecloud.io/timescale/timescaledb/$(lsb_release -i -s | tr '[:upper:]' '[:lower:]')/ $(lsb_release -c -s) main" | sudo tee /etc/apt/sources.list.d/timescaledb.list
#fix permissions
sudo chmod 644 /etc/apt/sources.list.d/*.list
Then add the TimescaleDB GPG key:
wget --quiet -O - https://packagecloud.io/timescale/timescaledb/gpgkey | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/timescaledb.gpg
# fix permissions
sudo chmod 644 /etc/apt/trusted.gpg.d/*.gpg
Update and install:
sudo apt update
sudo apt install timescaledb-2-postgresql-15
To finish off, run timescaledb-tune to set up the extension.
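Something like this should do it on Ubuntu – I’m assuming defaults here: --yes accepts every suggestion unattended (read them first if you’re cautious), and --conf-path points the tool at Ubuntu’s config location in case it can’t find it via pg_config:

```shell
# timescaledb-tune ships with the timescaledb package
sudo timescaledb-tune --yes --conf-path /etc/postgresql/15/main/postgresql.conf
# restart so the tuned settings take effect
sudo systemctl restart postgresql
```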
Increase max_locks_per_transaction to 1024 in the config file /etc/postgresql/*/main/postgresql.conf. First, uncomment it, if commented out:
sudo sed -i.bak -E '/^[[:space:]]*#?[[:space:]]*max_locks_per_transaction[[:space:]]*=[[:space:]]*[0-9]+/s/^([[:space:]]*)#?/\1/' /etc/postgresql/*/main/postgresql.conf
Then set the value to 1024:
sudo sed -i.bak -E '/^[[:space:]]*max_locks_per_transaction[[:space:]]*=[[:space:]]*[0-9]+/s/[0-9]+/1024/' /etc/postgresql/*/main/postgresql.conf
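Those two regexes are fairly dense; if you want to convince yourself what they do before touching the real config, you can replay them on a scratch file:

```shell
# Replay both sed passes on a throwaway file rather than the real config
tmp=$(mktemp)
echo '#max_locks_per_transaction = 64' > "$tmp"
# pass 1: strip the leading comment marker, keeping any indentation
sed -i -E '/^[[:space:]]*#?[[:space:]]*max_locks_per_transaction[[:space:]]*=[[:space:]]*[0-9]+/s/^([[:space:]]*)#?/\1/' "$tmp"
# pass 2: swap the old value for 1024
sed -i -E '/^[[:space:]]*max_locks_per_transaction[[:space:]]*=[[:space:]]*[0-9]+/s/[0-9]+/1024/' "$tmp"
cat "$tmp"   # max_locks_per_transaction = 1024
rm -f "$tmp"
```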
Restart PostgreSQL with sudo systemctl restart postgresql
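After the restart, it’s worth confirming the new value actually took effect – this asks the running server rather than trusting the file edit:

```shell
# should print 1024
sudo -u postgres psql -p 15433 -c "SHOW max_locks_per_transaction;"
```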
If you have any other extensions installed, be sure to install them now. For example, I also have PostGIS installed, so I followed the instructions on the PostGIS website – it uses the same repo as PostgreSQL, so I just needed to install the package with sudo apt-get install postgresql-15-postgis-3 and I was done.
Step 4: back up your existing DB
This part is pretty easy. Stop any processes which are using the database so you don’t lose data. Then run this command, substituting your own details for -h, -p and -U. I believe tsdb is the default database name on timescale.com, but check yours. Do this from the new server – the slowest part of the process is the restore, so we want the files locally. That also lets you redo the restore if it fails without incurring another round of egress charges.
pg_dump -Fd -j 1 -h c99aarsjk6.b099as2kb2.tsdb.cloud.timescale.com -p 15433 -U tsdbadmin -d tsdb -f _dump
I found that trying to run more than one job in parallel just failed with memory errors. I ended up with about 500GB of data for a ~2,500GB database.
Step 5: Create roles in the database
This was trickier. You have to run a few more magic steps on the new server.
Timescale.com doesn’t let you dump roles, because roles are cluster-wide. But if you’re migrating from your own server, you can do it with pg_dumpall --roles-only -U postgres_admin -h localhost -f roles.sql. You’ll need to filter roles.sql to only include the roles you’re interested in – but it’s much easier.
We need to create the database and roles manually, as pg_restore doesn’t create roles, only assigns privileges. First, start psql and connect to my hosted timescale.com database:
psql -h c99aarsjk6.b099as2kb2.tsdb.cloud.timescale.com -p 15433 -U tsdbadmin -d tsdb
You can find your roles with the \du
command e.g.
tsdb=> \du
Role name | Attributes | Member of
------------------+-----------------------------------+------------------
conditionseditor | Cannot login | {}
functionrestapi | | {conditionseditor}
functionwriter | | {conditionseditor}
postgres | Superuser, Create role, Create DB,| {}
| Replication, Bypass RLS |
postgres_admin | Superuser, Create DB | {}
tsdbadmin | | {}
tsdbexplorer | Cannot login | {}
You’ll need to create these roles and grant the memberships between them. Naturally, you should use long, complex passwords. To connect to the new database, create a postgres_admin user with a password. We do this by connecting as the postgres user:
sudo -u postgres psql
then create the user with a password:
CREATE ROLE postgres_admin;
ALTER ROLE postgres_admin WITH SUPERUSER INHERIT NOCREATEROLE CREATEDB LOGIN NOREPLICATION NOBYPASSRLS PASSWORD 'supersecretpassword';
Quit with \q.
Create a .pgpass file to store the password for login, and set permissions on it to secure it. The docs say you can use * for hostname, port, database or username.
echo "localhost:*:*:postgres_admin:supersecretpassword" | tee ~/.pgpass
chmod 600 ~/.pgpass
We can then connect to the database using psql and our new postgres_admin user. You need to use -h so that you can specify -U.
psql -h localhost -U postgres_admin postgres
then create the roles:
CREATE ROLE conditionseditor WITH NOLOGIN;
CREATE ROLE functionrestapi WITH LOGIN PASSWORD '<pw_here>';
CREATE ROLE functionwriter WITH LOGIN PASSWORD '<pw_here>';
GRANT conditionseditor TO functionrestapi;
GRANT conditionseditor TO functionwriter;
CREATE ROLE tsdbadmin WITH LOGIN PASSWORD '<pw_here>';
CREATE ROLE tsdbexplorer NOLOGIN;
Don’t forget the admin users (tsdb… etc.).
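It’s worth re-running \du on the new server at this point and comparing it against the earlier output from timescale.com:

```shell
# psql happily runs backslash commands via -c
psql -h localhost -U postgres_admin -d postgres -c '\du'
```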
Next, create your database, switch to it, and run the timescaledb_pre_restore function, which does some magic. Inside psql:
CREATE DATABASE tsdb;
\c tsdb
CREATE EXTENSION IF NOT EXISTS timescaledb;
SELECT timescaledb_pre_restore();
then quit with \q.
Step 6: actually restore the data
This step takes ages. I found that I needed to avoid parallelism or I received referential integrity errors. Here’s the command which worked – obviously run from the new server, as that’s where I backed up to before. I do this inside tmux in case my SSH connection drops:
pg_restore -Fd -j1 -h localhost -U postgres_admin -d tsdb -S postgres_admin --disable-triggers --exit-on-error --verbose _dump
You need the --disable-triggers flag, and if you have that, you also need the -S flag to specify the superuser to run the disable commands as. We add --exit-on-error so that we don’t lose track of things. When finished, log in to psql, run the post-restore script and update statistics using ANALYZE – this last step took a while too:
SELECT timescaledb_post_restore();
ANALYZE;
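Back in the shell, a couple of quick checks that the restore actually landed – this assumes TimescaleDB 2.x, where the timescaledb_information.hypertables view is available:

```shell
# list hypertables and their chunk counts; an empty result means something went wrong
psql -h localhost -U postgres_admin -d tsdb -c "SELECT hypertable_name, num_chunks FROM timescaledb_information.hypertables;"
# then spot-check a row count against a table you know, e.g.:
# psql -h localhost -U postgres_admin -d tsdb -c "SELECT count(*) FROM <your_table>;"
```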
Step 7: Allow external network connections
By default, PostgreSQL doesn’t allow external connections. We can test that by using our connection string from another machine e.g.:
~ psql "postgres://functionwriter:<password>@server.example.com:15433/tsdb?sslmode=require"
psql: error: connection to server at "server.example.com" (2606:2800:220:1:248:1893:25c8:1946), port 15433 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
connection to server at "server.example.com" (1.2.3.4), port 15433 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
To fix this, on the server, we need to adjust postgresql.conf to listen on our external addresses, and update pg_hba.conf to accept authentication requests from them too.
Set listen_addresses to '*'. First, uncomment the listen_addresses line:
sudo sed -i.bak -E '/^[[:space:]]*#?[[:space:]]*listen_addresses[[:space:]]*=[[:space:]]*/s/^#//' /etc/postgresql/*/main/postgresql.conf
Then set the value to '*':
sudo sed -i.bak -E "/^[[:space:]]*listen_addresses[[:space:]]*=[[:space:]]*/s/=.*/= '*'/" /etc/postgresql/*/main/postgresql.conf
Next, allow scram-sha-256 authentication from 0.0.0.0/0 for IPv4 and ::/0 for IPv6 by adding them explicitly to the config file:
echo -e "# Allow all IP addresses with scram-sha-256 authentication\n# IPv4\nhost all all 0.0.0.0/0 scram-sha-256\n# IPv6\nhost all all ::/0 scram-sha-256" | sudo tee -a /etc/postgresql/*/main/pg_hba.conf
Then restart PostgreSQL:
sudo systemctl restart postgresql
I can test the connection from a different machine:
~ psql "postgres://functionwriter:<password>@server.example.com:15433/tsdb?sslmode=require"
psql (16.0, server 15.4 (Ubuntu 15.4-2.pgdg22.04+1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.
tsdb=>
That’s pretty much it! Now I just need to update the connection string in my apps, and I’m good to go.
Next steps
I should probably sort a few other things out:
- Backups – these don’t come for free any more. I need to figure something out, probably using pg_dump again.
- Install fail2ban with a custom filter for PostgreSQL
- Automated updates – ideally, I’d like new minor versions of TimescaleDB to be installed automatically. Can I automate the psql command?
- Automated updates – how to manage major version upgrades?
- Updates to PostgreSQL – e.g. PostgreSQL 16 just came out.
- Of course I set up a firewall on Linode, but can I do more?
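For the backups point, a bare-minimum starting sketch is a nightly pg_dump driven by cron – the paths below are hypothetical, and this is nowhere near a proper backup strategy (no WAL archiving, no point-in-time recovery):

```shell
# /etc/cron.d/tsdb-backup -- hypothetical file; jobs run as the postgres system
# user, which connects over the local socket (peer auth), so no password needed.
# Create the target directory first: sudo -u postgres mkdir -p /mnt/postgres/backups
0 3 * * * postgres pg_dump -Fc -p 15433 -d tsdb -f /mnt/postgres/backups/tsdb-$(date +\%F).dump
# prune dumps older than a week
30 3 * * * postgres find /mnt/postgres/backups -name 'tsdb-*.dump' -mtime +7 -delete
```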
2 thoughts on “Migrate your TimescaleDB from timescale.com to self hosted”
For backups take a look at barman. It’s the daddy of postgres backup systems. Point in time recovery too https://pgbarman.org
Thanks for the pointer! I think I’ve managed to set this up: https://blog.cynexia.com/how-to-automate-backups-of-a-postgresql-server/