Common DB schema change mistakes

Common DB schema change mistakes | Database Lab · Instant clones of PostgreSQL databases · Postgres.ai.

Common DB schema change mistakes
'rake db:migrate' – a command known to any Ruby developer. But how many times did we step on the same rake?

In his article "Lesser Known PostgreSQL Features", @be_haki describes 18 Postgres features many people don't know. I enjoyed that article, and it inspired me to write about "anti-features" – things that everyone should avoid when working in probably the riskiest field of application development – so-called "schema migrations".

This is one of my favorite topics in the field of relational databases. We all remember how MongoDB entered the stage with two clear messages: "web-scale" (let's have sharding out-of-the-box) and "schemaless" (let's avoid designing schemas and allow full flexibility). In my opinion, both buzzwords are an oversimplification, but if you have experience in reviewing and deploying schema changes in relational databases, you probably understand the level of difficulty, risks, and pain of scaling the process of making schema changes. My personal score: 1000+ migrations designed/reviewed/deployed during 17+ years of using Postgres in my own companies and when consulting others such as GitLab, Chewy, Miro. Here I'm going to share what I've learned, describing some mistakes I've made or observed – so probably next time you'll avoid them.

Moreover, a strong desire to help people avoid such mistakes led me to invent of the Database Lab Engine – a technology for thin cloning of databases, essential for development and testing. With it, you can clone a 10 TiB database in 10 seconds, test schema changes, and understand the risks before deployment. Most cases discussed in this article can be easily detected by such testing, and it can be done automatically in CI/CD pipelines.

As usual, I'll be focusing on OLTP use cases (mobile and web apps), for which query execution that exceeds 1 second is normally considered too slow. Some cases discussed here are hard to notice in small databases with low activity. But I'm pretty confident that you'll encounter most of them when your database grows to ~10 TiB in size and its load reaches ~104–105 transactions per second (of course, some cases will be seen – unless deliberately prevented – much, much earlier).

I advise you to read GitLab's great documentation – their Migration Style Guide is full of wisdom written by those who have experience in deploying numerous Postgres schema changes in a fully automated fashion to a huge number of instances, including GitLab.com itself.

I also encourage everyone to watch PGCon-2022 – one of the key Postgres conferences; this time, it's happening online again. On Thursday, May 26, I'll give two talks, and one of them is called "Common DB schema change mistakes", you can find the slide deck here. If you missed it, no worries – [@DLangille](https://twitter.com/DLangille), who has organized the conference since 2006 (thank you, Dan!), promises to publish talk videos in a few weeks.

Terminology

The term "DB migration" may be confusing, it's often used to describe the task of switching from one database system to another, moving database and minimizing possible negative effects (such as long downtime).

In this article, I'm going to talk about the second meaning of the term – DB schema changes having the following properties:
continue reading →

Read in full here:

This thread was posted by one of our members via one of our news source trackers.

Corresponding tweet for this thread:

Share link for this tweet.