Hello @pzac! Thanks for reading the book and providing this feedback. These contradictory statements you pointed out are confusing. I’ll work on rewording that section of the Introduction. I’d also like to address the physical storage topic here and add some links.
While PostgreSQL physical storage is beyond the scope of the book, and there are dedicated resources for it[1], this section does try to provide an overview. It’s important to me that it’s accurate and useful; otherwise it should be cut. Concepts like pages, tuples (row versions), and TOAST help readers build a mental model of what’s happening and get familiar with terms they’ll encounter in documentation, even if they mostly work with the “abstractions” of tables, columns, and indexes.
I’ll try to communicate those two points more clearly here. There are at least two scenarios where user data doesn’t fit in a page.
One scenario is when a new or updated row version needs to be written, but the page it would go into is full or doesn’t have enough free space left for it.
If the file has filled up, meaning no existing page has room, PostgreSQL adds a new empty page to the end of the file, which increases the file size.[1]
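To make that first scenario concrete, here is a toy sketch in Python. It is not how PostgreSQL is implemented (the real thing tracks free space with dedicated structures and accounts for page headers and line pointers); the function name `place_row` and the list-of-free-bytes model are made up for illustration.

```python
PAGE_SIZE = 8192  # default PostgreSQL page size in bytes


def place_row(pages_free, row_size):
    """Toy model: return the index of the page that receives the row.

    pages_free is a list holding the free bytes of each page in the file.
    Page headers and per-row overhead are ignored (simplifying assumption).
    """
    if row_size > PAGE_SIZE:
        raise ValueError("row does not fit in a single page")
    # Reuse the first existing page that still has enough room.
    for i, free in enumerate(pages_free):
        if free >= row_size:
            pages_free[i] -= row_size
            return i
    # No page has room: append a new empty page to the end of the file.
    pages_free.append(PAGE_SIZE - row_size)
    return len(pages_free) - 1


pages = [100, 50]              # two nearly full pages
print(place_row(pages, 80))    # fits in page 0
print(place_row(pages, 200))   # no room anywhere, so a third page is added
```

The point is only the fallback at the end: when nothing fits, the file grows by one page.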
The second scenario is user data that’s variable in length or simply large, like JSON text data. PostgreSQL allows up to 1 GB of data to be stored in a single field (one column value in one row),[2] which is a lot!
Since PostgreSQL uses 8 kB pages by default,[5] a 1 GB value can’t fit in a single page. How does that work?
PostgreSQL uses a system called “TOAST” (The Oversized-Attribute Storage Technique), which handles storing large values that don’t fit within the 8 kB page size. TOAST uses a special TOAST table behind the scenes, and the data is split into chunks and spread across multiple pages.[3] This is handled transparently for users and typically adds little overhead.
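For a rough sense of scale, here is a back-of-the-envelope sketch in Python. It assumes the defaults described in the TOAST docs[3]: chunks of about 2000 bytes, sized so that four chunk rows fit on a page. It ignores compression and per-row overhead, so treat the numbers as estimates; the function name `toast_estimate` is just for illustration.

```python
import math

CHUNK_SIZE = 2000      # approximate default TOAST chunk size in bytes (assumption)
CHUNKS_PER_PAGE = 4    # chunk size is chosen so four chunk rows fit on an 8 kB page


def toast_estimate(value_bytes):
    """Estimate (chunks, pages) needed to store one out-of-line value.

    Ignores compression and tuple/page header overhead (simplifying assumptions).
    """
    chunks = math.ceil(value_bytes / CHUNK_SIZE)
    pages = math.ceil(chunks / CHUNKS_PER_PAGE)
    return chunks, pages


print(toast_estimate(10_000))     # a 10 kB JSON document: (5, 2)
print(toast_estimate(1024 ** 3))  # the 1 GB maximum: (536871, 134218)
```

So even a modest 10 kB value already spans more than one page, and a value at the 1 GB limit would be spread over well over a hundred thousand pages.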
If you’re like me, besides reading docs, you might also want to learn technical topics in other formats. The Postgres.fm podcast covered TOAST[4] in a past episode, and it’s worth a listen.
I’ll revise that section of the book and post revisions back here.
Thank you for this feedback. Please continue to bring up anything that’s contradictory or otherwise confusing. Putting a spotlight on it helps me reconsider the wording and hopefully improve its educational quality.
[1]: The Internals of PostgreSQL : Chapter 1 Database Cluster, Databases, and Tables 1.3. Internal Layout of a Heap Table File
[2]: PostgreSQL: Documentation: 16: Appendix K. PostgreSQL Limits
[3]: PostgreSQL: Documentation: 16: 73.2. TOAST
[4]: Postgres FM | TOAST
[5]: PostgreSQL: Documentation: 16: 73.6. Database Page Layout