Migrate your TimescaleDB from timescale.com to self-hosted

I’ve been using timescale.com for about a year, but it’s quite expensive for a hobbyist (their cheapest plan in the UK is about $50/month), so I thought I’d try to implement something cheaper. I know I won’t get the cool automatic DR failover or the sexy continuous backups – but it’s not really mission critical if I lose a small amount of the history of my home energy consumption. Timescale publish some instructions, but they weren’t complete, and didn’t work for me.

  1. Step 1: Install base OS
  2. Step 2: Install PostgreSQL and move data to another drive (and change the port)
  3. Step 3: Install TimescaleDB
  4. Step 4: Back up your existing DB
  5. Step 5: Create roles in the database
  6. Step 6: Actually restore the data
  7. Step 7: Allow external network connections
  8. Next steps

Step 1: Install base OS

I use Linode. I chose a Linode 2GB, which comes in at around $12/month for 1 shared CPU and 2GB of RAM. This is about double what I had on Timescale, which was 0.5 CPU. I added a 50GB block storage volume, which would allow me to rebuild my Linode without worrying about losing my database data, bringing my total to $17/month – but this isn’t necessary.

I installed Ubuntu 22.04 LTS, set up automatic updates, disabled SSH password auth, etc. using my standard setup scripts. I decided to use ZFS as the file system for the data drive to take advantage of its native error checking. It does have some performance implications, but for a single drive with a relatively small database, I think this is an acceptable tradeoff, especially when compared to the significant performance hit of Btrfs. I called the additional volume postgres, then mounted it at /mnt/postgres.

First, install ZFS and create a ZFS pool:

sudo apt-get install zfsutils-linux
sudo zpool create postgres_storage /dev/disk/by-id/scsi-0Linode_Volume_postgres

If the volume was previously used for something else, ZFS will warn you:
/dev/disk/by-id/scsi-0Linode_Volume_postgres contains a filesystem of type 'ext4'
You’ll need to erase the existing file system before creating the new ZFS pool. Obviously this is destructive and you’ll lose all the data on the drive:
sudo wipefs --all --force /dev/disk/by-id/scsi-0Linode_Volume_postgres

Create a new ZFS dataset and mount it:

sudo zfs create postgres_storage/data
sudo zfs set mountpoint=/mnt/postgres postgres_storage/data

There’s no need to modify /etc/fstab as ZFS will automatically manage mounting of datasets within a pool at boot time.
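
If you want a quick sanity check, zfs list and zpool status will confirm the mountpoint and the health of the pool (the error checking is the reason we chose ZFS in the first place):

# confirm the dataset is mounted where we expect
sudo zfs list -o name,mountpoint,used,avail postgres_storage/data

# check pool health
sudo zpool status postgres_storage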

Step 2: Install PostgreSQL and move data to another drive (and change the port)

Next, I had to install PostgreSQL and TimescaleDB. First, install PostgreSQL itself. Timescale’s latest supported version is 15. I had to change the instructions slightly from the PostgreSQL website because they use the legacy apt keyring.

# create repository info
sudo sh -c 'echo "deb https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'

# Import the repository signing key - note this instruction differs from the one on the PostgreSQL website.
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo gpg --dearmour -o /etc/apt/trusted.gpg.d/postgres.gpg

# fix permissions on the new key
sudo chmod 644 /etc/apt/trusted.gpg.d/*.gpg

# Update the package lists:
sudo apt-get update

# Install version 15 PostgreSQL:
sudo apt-get -y install postgresql-15

OK, great! Now let’s move the database storage to our new ZFS volume. First, I need to stop the service:

sudo systemctl stop postgresql

Then open the config file for editing:

sudo nano /etc/postgresql/15/main/postgresql.conf

I also want to change the port that PostgreSQL listens on, so find the port line and edit it:

port = 15433

and point PostgreSQL at the new location by updating the data_directory property:

data_directory = '/mnt/postgres/postgresql/15/main'

Finally, move the database, update permissions and restart PostgreSQL:

sudo cp -ra /var/lib/postgresql /mnt/postgres/
sudo chown -R postgres:postgres /mnt/postgres/
sudo chmod -R 700 /mnt/postgres/
sudo systemctl start postgresql

Verify the new location:

rob@server:~$ sudo -u postgres psql -c "show data_directory;"
          data_directory          
----------------------------------
 /mnt/postgres/postgresql/15/main
(1 row)

If there are any problems, check the logs with sudo cat /var/log/postgresql/postgresql-15-main.log

Delete the old storage so that we don’t get confused:

sudo rm -r /var/lib/postgresql/15/main

To make the port easy to remember (i.e. so that I don’t need to specify it every time), I set it as an export:

echo "export PGPORT=15433" >> ~/.bashrc 
source ~/.bashrc
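
To double-check that the cluster really is on the new port and data directory, pg_lsclusters (part of the Debian/Ubuntu PostgreSQL packaging) gives a one-line summary – expect something along these lines:

rob@server:~$ pg_lsclusters
Ver Cluster Port  Status Owner    Data directory                   Log file
15  main    15433 online postgres /mnt/postgres/postgresql/15/main /var/log/postgresql/postgresql-15-main.log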

Step 3: Install TimescaleDB

Still on the new server, add the TimescaleDB repo:

echo "deb https://packagecloud.io/timescale/timescaledb/$(lsb_release -i -s | tr '[:upper:]' '[:lower:]')/ $(lsb_release -c -s) main" | sudo tee /etc/apt/sources.list.d/timescaledb.list

#fix permissions
sudo chmod 644 /etc/apt/sources.list.d/*.list

Then add the TimescaleDB GPG key:

wget --quiet -O - https://packagecloud.io/timescale/timescaledb/gpgkey | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/timescaledb.gpg

# fix permissions
sudo chmod 644 /etc/apt/trusted.gpg.d/*.gpg

Update and install:

sudo apt update
sudo apt install timescaledb-2-postgresql-15

The extension needs to be preloaded, so edit the config file and add it. Assuming there are no other preloaded extensions, just run this:

echo "shared_preload_libraries = 'timescaledb'" >> /etc/postgresql/15/main/postgresql.conf

To finish off, run timescaledb-tune to set up the extension:

sudo timescaledb-tune

Increase max_locks_per_transaction to 1024 in the config file /etc/postgresql/*/main/postgresql.conf. First, uncomment it if it’s commented out:

sudo sed -i.bak -E '/^[[:space:]]*#?[[:space:]]*max_locks_per_transaction[[:space:]]*=[[:space:]]*[0-9]+/s/^([[:space:]]*)#?/\1/' /etc/postgresql/*/main/postgresql.conf

Then set the value to 1024:

sudo sed -i.bak -E '/^[[:space:]]*max_locks_per_transaction[[:space:]]*=[[:space:]]*[0-9]+/s/[0-9]+/1024/' /etc/postgresql/*/main/postgresql.conf

Restart PostgreSQL with sudo systemctl restart postgresql.
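
It’s worth confirming the settings took effect after the restart – something like this should report the values we just set:

sudo -u postgres psql -c "show shared_preload_libraries;"   # should include timescaledb
sudo -u postgres psql -c "show max_locks_per_transaction;"  # should report 1024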

If you have any other extensions installed, be sure to install them now. For example, I also have PostGIS installed, so I followed the instructions on the PostGIS website – it uses the same repo as PostgreSQL, so I just needed to install the package with sudo apt-get install postgresql-15-postgis-3 and I was done.

Step 4: Back up your existing DB

This part is pretty easy. Stop any processes which are using the database so you don’t lose data. Then run this command, substituting your own details for -h, -p and -U. I believe tsdb is the default database name on timescale.com, but check yours. Do this from the new server – the slowest part of the process is the restore, so we want the files locally. That also lets you redo the restore if it fails, without having to pull the data across the network (and pay for the transfer) again.

pg_dump -Fd -j 1 -h c99aarsjk6.b099as2kb2.tsdb.cloud.timescale.com -p 15433 -U tsdbadmin -d tsdb -f _dump

I found that trying to run more than one job in parallel just failed with memory errors. I ended up with about 500MB of dump files for a ~2,500MB database.

Step 5: Create roles in the database

This was trickier. You have to run a few more magic steps on the new server.

Timescale.com doesn’t let you dump roles because roles are cluster-wide. But if you’re migrating from your own server, you can do it with the command pg_dumpall --roles-only -U postgres_admin -h localhost -f roles.sql.
You’ll need to filter roles.sql to include only the roles you’re interested in – but it’s much easier.
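
If you do go down the roles.sql route, a rough sketch of the filtering (the role names here are just the ones from my own database) is a grep – pg_dumpall writes one statement per line, so this keeps the CREATE ROLE / ALTER ROLE / GRANT lines you care about:

# keep only the statements mentioning the roles we want
grep -E 'conditionseditor|functionrestapi|functionwriter|tsdbadmin|tsdbexplorer' roles.sql > roles_filtered.sql

# then apply them on the new server as the postgres superuser
sudo -u postgres psql -f roles_filtered.sql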

We need to create the database and roles ourselves, as pg_restore doesn’t create roles, only assigns privileges. First, start psql and connect to my hosted timescale.com database:

psql -h c99aarsjk6.b099as2kb2.tsdb.cloud.timescale.com -p 15433 -U tsdbadmin -d tsdb

You can list your roles with the \du command, e.g.:

tsdb=> \du
   Role name      |            Attributes             |    Member of    
------------------+-----------------------------------+------------------
 conditionseditor | Cannot login                      | {}               
 functionrestapi  |                                   | {conditionseditor}
 functionwriter   |                                   | {conditionseditor}
 postgres         | Superuser, Create role, Create DB,| {}               
                  | Replication, Bypass RLS           |                  
 postgres_admin   | Superuser, Create DB              | {}               
 tsdbadmin        |                                   | {}               
 tsdbexplorer     | Cannot login                      | {}               

You’ll need to create these roles and grant the memberships between them. Naturally, you should use long, complex passwords. To connect to the new database, create a postgres_admin user with a password. We do this by connecting as the postgres user:

sudo -u postgres psql

then create the user with a password:

CREATE ROLE postgres_admin;
ALTER ROLE postgres_admin WITH SUPERUSER INHERIT NOCREATEROLE CREATEDB LOGIN NOREPLICATION NOBYPASSRLS PASSWORD 'supersecretpassword';

Quit with \q.

Create a .pgpass file to store the password for login and set permissions on it to secure it. The docs say you can use * for hostname, port, database or username.

echo "localhost:*:*:postgres_admin:supersecretpassword" | tee ~/.pgpass
chmod 600 ~/.pgpass

We can then connect to the database using psql and our new postgres_admin user. You need -h localhost so that the connection goes over TCP and uses password authentication (matching the .pgpass entry), rather than peer authentication on the local socket.

psql -h localhost -U postgres_admin postgres

then create the roles:

CREATE ROLE conditionseditor WITH NOLOGIN;
CREATE ROLE functionrestapi WITH LOGIN PASSWORD '<pw_here>';
CREATE ROLE functionwriter WITH LOGIN PASSWORD '<pw_here>';
GRANT conditionseditor TO functionrestapi;
GRANT conditionseditor TO functionwriter;
CREATE ROLE tsdbadmin WITH LOGIN PASSWORD '<pw_here>';
CREATE ROLE tsdbexplorer NOLOGIN;

Don’t forget the admin users (tsdb… etc.).

Next, create your database, switch to it, and run timescaledb_pre_restore(), which does some magic. Inside psql:

CREATE DATABASE tsdb;
\c tsdb
CREATE EXTENSION IF NOT EXISTS timescaledb;
SELECT timescaledb_pre_restore();

then quit with \q.

Be sure to create the extension at the same version as on your current database server. Check the version with \dx timescaledb, then use CREATE EXTENSION IF NOT EXISTS timescaledb VERSION 'xxx'; where xxx is the version in your current environment.
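
In other words (the version number below is just an example – use whatever \dx reports on your current server):

-- on the current (timescale.com) server, check the installed version
tsdb=> \dx timescaledb
-- e.g. reports: timescaledb | 2.11.2 | ...

-- on the new server, use this instead of the plain CREATE EXTENSION above
CREATE EXTENSION IF NOT EXISTS timescaledb VERSION '2.11.2';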

Step 6: Actually restore the data

This step takes ages. I found that I needed to avoid parallelism or I got referential integrity errors. Here’s the command which worked – obviously run from the new server, as that’s where I backed up to before. I do this inside tmux in case my SSH connection drops:

pg_restore -Fd -j1 -h localhost -U postgres_admin -d tsdb -S postgres_admin --disable-triggers --exit-on-error --verbose _dump

You need --disable-triggers, and if you use that, you also need the -S flag to specify the superuser to run the disable commands as. We add --exit-on-error so that we don’t lose track of things. When finished, log in to psql, run the post-restore function, and update statistics using ANALYZE – this last step took a while too:

SELECT timescaledb_post_restore();
ANALYZE;
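
A couple of optional sanity checks once that finishes (still inside psql, connected to tsdb) – the timescaledb_information views are part of TimescaleDB 2.x:

-- confirm the hypertables and their chunks came across
SELECT hypertable_name, num_chunks FROM timescaledb_information.hypertables;

-- and check the extension version matches what you pinned earlier
\dx timescaledb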

Step 7: Allow external network connections

By default, PostgreSQL doesn’t allow external connections. We can test that by using our connection string from another machine e.g.:

 ~  psql "postgres://functionwriter:<password>@server.example.com:15433/tsdb?sslmode=require"
psql: error: connection to server at "server.example.com" (2606:2800:220:1:248:1893:25c8:1946), port 15433 failed: Connection refused
	Is the server running on that host and accepting TCP/IP connections?
connection to server at "server.example.com" (1.2.3.4), port 15433 failed: Connection refused
	Is the server running on that host and accepting TCP/IP connections?

To allow external connections, we need to adjust postgresql.conf on the server to listen on our external addresses, and update pg_hba.conf to accept authentication requests from them too.

Set listen_addresses to '*'. First, uncomment the listen_addresses line:

sudo sed -i.bak -E '/^[[:space:]]*#?[[:space:]]*listen_addresses[[:space:]]*=[[:space:]]*/s/^#//' /etc/postgresql/*/main/postgresql.conf

Then set the value to '*':

sudo sed -i.bak -E "/^[[:space:]]*listen_addresses[[:space:]]*=[[:space:]]*/s/=.*/= '*'/" /etc/postgresql/*/main/postgresql.conf

Next, allow scram-sha-256 authentication from 0.0.0.0/0 for IPv4 and ::/0 for IPv6 by adding them explicitly to pg_hba.conf:

echo -e "# Allow all IP addresses with scram-sha-256 authentication\n# IPv4\nhost    all             all             0.0.0.0/0               scram-sha-256\n# IPv6\nhost    all             all             ::/0                    scram-sha-256" | sudo tee -a /etc/postgresql/*/main/pg_hba.conf

Then restart PostgreSQL:

sudo systemctl restart postgresql

I can test the connection from a different machine:

 ~  psql "postgres://functionwriter:<password>@server.example.com:15433/tsdb?sslmode=require"
psql (16.0, server 15.4 (Ubuntu 15.4-2.pgdg22.04+1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

tsdb=> 

That’s pretty much it! Now I just need to update the connection string in my apps, and I’m good to go.

Next steps

I should probably sort a few other things out:

  • Backups – these don’t come for free any more. I need to figure something out, probably using pg_dump again (see the sketch after this list).
  • Install fail2ban with a custom filter for PostgreSQL.
  • Automated updates – typically, I’d like new minor versions of TimescaleDB to be installed automatically. Can I automate the psql command?
  • Automated updates – how to manage major version upgrades?
  • Updates to PostgreSQL – e.g. PostgreSQL 16 just came out.
  • Of course I set up a firewall on Linode, but can I do more?
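
For the backups item, a minimal sketch might be a nightly pg_dump driven by cron, reusing the postgres_admin role and ~/.pgpass entry from step 5 – the paths and retention here are just placeholders:

#!/bin/bash
# /usr/local/bin/backup_tsdb.sh – minimal nightly dump (example only)
set -euo pipefail

BACKUP_DIR=/mnt/postgres/backups           # adjust to taste
STAMP=$(date +%Y%m%d)

mkdir -p "$BACKUP_DIR"
# the password is picked up from ~/.pgpass because we connect via -h localhost
pg_dump -Fd -j1 -h localhost -p 15433 -U postgres_admin -d tsdb -f "$BACKUP_DIR/tsdb_$STAMP"

# keep roughly a week of dumps
find "$BACKUP_DIR" -maxdepth 1 -name 'tsdb_*' -mtime +7 -exec rm -rf {} +

# crontab entry (run as the user that owns ~/.pgpass), e.g.:
# 0 2 * * * /usr/local/bin/backup_tsdb.sh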

The Multiplier Effect: How The Aggregation of Marginal Delays Derails Projects

The Aggregation of Marginal Gains is an improvement model attributed to Dave Brailsford. When he took over as the head of British Cycling in 2002, the team was near the bottom of the rankings. Brailsford’s approach was to look at their processes in tiny detail and improve each part by 1%, the logic being that these would all add up to a significant improvement. He talks about shipping mattresses between hotels so that athletes get a better night’s sleep, or fastidiously cleaning the team truck to prevent dust and debris undermining the fine tuning of the bikes. And it was a success – in the 2008 and 2012 Olympics, the team won 70% of the gold medals for track cycling.

I want to propose a contrasting notion – the Aggregation of Marginal Delays – the slow accumulation of tiny lags and delays in a project that add up to a significant slip in delivery performance. These delays are often so small and (at the time) inconsequential that team members just brush them off. Perhaps you need to get approval from three people with busy calendars – it might take you a few days to get into their diaries. Frustrating – but we’re all busy, right? Maybe you need to request something from another team, which takes half a day to be released to you. Annoying – but the other team’s process is clearly laid out on their website – didn’t you read it? That person you needed to get advice from has taken the afternoon off to watch their kid’s nativity play. Who’d begrudge them that?

But these delays, each one small and explainable, add up, both quantitatively, and culturally. None of them is worth escalating – by the time you get this in front of someone who could change it, the delay is in the past. But the cumulative impact of a few hours here, half a day there, across dozens of events, across months of work, is significant. And it sets the tone for how things are done – we, as an organisation, start to feel that it’s an acceptable state of affairs – like I said, all of the causes of the delays are reasonable, none are malicious, or the result of incompetence. And because of that, when there are delays which could be avoided, they’re often not.

I’m afraid I don’t have a silver bullet here. We’ve tried lots of things to make the impact of these delays visible, but none have worked. In most cases, the cure is worse than the disease, creating massive overhead. Here’s what we’ve tried:

  • Immediate escalation – but this made people nervous, and often there was nothing to be done – are we really going to summon that team member back from their child’s school play to answer our question?
  • Flagging potential bottlenecks up front – although it did help somewhat to remind team members to consider when they might have to book things in advance, too much forward planning is somewhat wasteful, and hinders the agility of the team.
  • Pushing accountability down to teams – we already do this as much as we can. But the culture in our part of the organisation, which wants to move fast, is at odds with central IT services providing IT to the wider company, which needs to deliver a secure and reliable service which meets everyone’s needs. So we can’t make all the decisions.
  • Capture data on areas with consistent delays – we tried to use this to systematically improve the processes in those areas, but often the teams which own them weren’t interested in change. They felt our proposed ‘improvements’ would introduce too much risk. And it was quite burdensome to track every request too.

So what can you do? In short, treat it like any other Continuous Improvement exercise – gather data, plan, execute, review.

  • Introduce a system to track delays – partial data is better than none.
  • Hold other teams to their SLAs, and use your data to demonstrate this.
  • Run a “Delay Spotlight” in your team meetings, where team members can raise frustrating delays, and teams can brainstorm improvements. Focus pilots on areas you can control, rather than trying to change central teams.

The Aggregation of Marginal Delays is an inherent challenge in large, complex organisations, but one that we can chip away at through collective analysis, data, communication and a mindset of continuous improvement. Remember that Marginal Gains accumulate too.

FIX: The key(s) in the keyring /etc/apt/trusted.gpg.d/???.gpg are ignored as the file is not readable by user ‘_apt’ executing apt-key

When running apt-get update I was seeing these errors:

W: http://security.ubuntu.com/ubuntu/dists/jammy-security/InRelease: The key(s) in the keyring /etc/apt/trusted.gpg.d/postgres.gpg are ignored as the file is not readable by user '_apt' executing apt-key.

I was getting this error after migrating keys from the old, legacy store to the shiny new one.

A quick inspection shows that the new keys have different permissions from the existing ones:

rob@localhost:~$ ls -alh /etc/apt/trusted.gpg.d/
total 40K
drwxr-xr-x 2 root root 4.0K Oct 13 14:00 .
drwxr-xr-x 8 root root 4.0K Oct 13 13:54 ..
-rw-r--r-- 1 root root 1.2K Sep  6 05:46 akamai-ubuntu-launchpad-ubuntu-ppa.gpg
-rw-r----- 1 root root 3.5K Oct 13 14:00 postgres.gpg
-rw-r----- 1 root root 2.8K Oct 13 14:00 timescaledb.gpg
-rw-r--r-- 1 root root 2.8K Mar 26  2021 ubuntu-keyring-2012-cdimage.gpg
-rw-r--r-- 1 root root 1.7K Mar 26  2021 ubuntu-keyring-2018-archive.gpg
-rw-r--r-- 1 root root 2.3K Oct 13 08:59 ubuntu-pro-cis.gpg
-rw-r--r-- 1 root root 2.2K Oct 13 08:57 ubuntu-pro-esm-apps.gpg
-rw-r--r-- 1 root root 2.3K Oct 13 08:57 ubuntu-pro-esm-infra.gpg

The fix is pretty simple. Pick one of the pre-existing GPG keys, and copy the permissions to all the other keys in the folder. In my case, I chose the ubuntu-pro-cis.gpg key, but you can pick any that doesn’t report the permissions error. Pass it as a --reference argument to the chmod command:

rob@localhost:~$ sudo chmod --reference=/etc/apt/trusted.gpg.d/ubuntu-pro-cis.gpg /etc/apt/trusted.gpg.d/*.gpg
rob@localhost:~$ ls -alh /etc/apt/trusted.gpg.d/
total 40K
drwxr-xr-x 2 root root 4.0K Oct 13 14:00 .
drwxr-xr-x 8 root root 4.0K Oct 13 13:54 ..
-rw-r--r-- 1 root root 1.2K Sep  6 05:46 akamai-ubuntu-launchpad-ubuntu-ppa.gpg
-rw-r--r-- 1 root root 3.5K Oct 13 14:00 postgres.gpg
-rw-r--r-- 1 root root 2.8K Oct 13 14:00 timescaledb.gpg
-rw-r--r-- 1 root root 2.8K Mar 26  2021 ubuntu-keyring-2012-cdimage.gpg
-rw-r--r-- 1 root root 1.7K Mar 26  2021 ubuntu-keyring-2018-archive.gpg
-rw-r--r-- 1 root root 2.3K Oct 13 08:59 ubuntu-pro-cis.gpg
-rw-r--r-- 1 root root 2.2K Oct 13 08:57 ubuntu-pro-esm-apps.gpg
-rw-r--r-- 1 root root 2.3K Oct 13 08:57 ubuntu-pro-esm-infra.gpg

Problem solved!

FIX: Key is stored in legacy trusted.gpg keyring

While running apt-get update I was seeing errors:

W: https://packagecloud.io/timescale/timescaledb/ubuntu/dists/jammy/InRelease: Key is stored in legacy trusted.gpg keyring (/etc/apt/trusted.gpg), see the DEPRECATION section in apt-key(8) for details.

Although the warning is annoying, it doesn’t stop things updating. I understand the reasons why the legacy keyring is being removed.

Migrate existing keys to the new keyring

First, list the keys:

sudo apt-key list

In my case, I’ve got two – one for PostgreSQL and one for TimescaleDB. You will probably see a bunch of extra keys here too.

rob@localhost:~$ sudo apt-key list
Warning: apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)).
/etc/apt/trusted.gpg
--------------------
pub   rsa4096 2011-10-13 [SC]
      B97B 0AFC AA1A 47F0 44F2  44A0 7FCC 7D46 ACCC 4CF8
uid           [ unknown] PostgreSQL Debian Repository

pub   rsa4096 2018-10-19 [SCEA]
      1005 FB68 604C E9B8 F687  9CF7 59F1 8EDF 47F2 4417
uid           [ unknown] https://packagecloud.io/timescale/timescaledb (https://packagecloud.io/docs#gpg_signing) <support@packagecloud.io>
sub   rsa4096 2018-10-19 [SEA]

Export each key by copying the last 8 characters of its fingerprint to use as the key ID. Because I have two keys to export, I did this twice, giving each key a unique filename under /etc/apt/trusted.gpg.d/:

rob@localhost:~$ sudo apt-key export ACCC4CF8 | sudo gpg --dearmour -o /etc/apt/trusted.gpg.d/postgres.gpg
Warning: apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)).
rob@localhost:~$ sudo apt-key export 47F24417 | sudo gpg --dearmour -o /etc/apt/trusted.gpg.d/timescaledb.gpg
Warning: apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)).

Bingo – package updates now work! But if they don’t, you might see this error:

W: http://security.ubuntu.com/ubuntu/dists/jammy-security/InRelease: The key(s) in the keyring /etc/apt/trusted.gpg.d/postgres.gpg are ignored as the file is not readable by user '_apt' executing apt-key.

In which case, check out this other post explaining how to solve it.