Saturday 23 November 2019

Which size fits me?

If you have ever used postgreSQL and wondered about the size of the tables, this post might help. A postgreSQL table is no simple table rather it has many things associated with it like index tables, toast tables, then there are other important stuff like free space map and visibility map. Now postgreSQL provides multiple options for measuring the size of table and there are interesting distinctions among them.

\dt+ table-name

This is one of the most common way to find the size of a table. On your psql client you may simple type it and it will return the information about the mentioned table including its size in kB. Internally, this size is same as
select * from pg_size_pretty(pg_relation_size('table-name')) -- for versions older than 9
select * from pg_size_pretty(pg_table_size('table-name')) -- for later versions
This brings us to our next function.

 

pg_table_size

This is the size of the table which includes the size of its heap, any associated toast table, free space map, and visibility map.  So, this is the actual size of the relation on the disk. Note that this does not include the size of any indexes associated with this table.

pg_relation_size

This is the measure to provide the disk space used by one particular fork -- main, init, fsm, or vm. Now, when you write pg_relation_size('table-name') it will be same as pg_relation_size('table-name', 'main'). So, if one wants to measure the space by one fork, this is the function to call.

Now, what about if we want to know the table size with all of its indexes, for that we move to this next function.

pg_total_relation_size

This is the function which gives you the total size of the relation (as in pg_table_size) plus the size of all the indexes associated with it.

Here is the time to introduce an interesting extension in this regards. 

pgstattuple

If you are little more into the table size and how postgreSQL stores it, you might have come across this extension. Basically, after installing this extension when you run
select * from pgstattuple('table-name')
it gives you several metrics relating to the size-metrics, and first one among them is table_len. Now, this table_len is same as the output of \dt+. 

Which one is for you?

Now when you know about different size functions, let's find out which suits your purpose best.

Are you trying to find the bloat size in your table?

pgstattuple would be a good place to get that. It gives you the number of dead tuples, and the size of total dead tuples, free space, etc. so it gives a clearer picture of your table actual and bloat size. There is a lighter version of it also called pgstattuple_approx, with lesser locks and accuracy, it might serve good when you aren't bothered about the last MB.

Are you interested in only finding the size of an index?

An interesting thing to remember here is that postgreSQL indexes are also saved as relations, so you may very well use the pg_relation/table_size with the index name as its argument to get the size of required index only.

Additionally, pgstattuple has pgstatindex function for this. So, you may also use select * from pgstatindex('index-name'). The difference between them would be same as explained above for the corresponding relation functions.

Are you trying to find out the size of only the toast table(s) associated with your table?

If you know the name of the toast table you are interested in, then you may use  any of the above mentioned functions as per the requirement. In case you are struggling about finding the associated toast table, try this,
select relname from pg_class where oid = (select  reltoastrelid from pg_class where relname='table-name');
As a shortcut, the name of your toast table is in the format pg_toast.pg_toast_oid_of_parent_table

Have fun with all your size endeavors!

No comments:

Post a Comment

Learnings from benchmarking PostgreSQL on TPC-H

After making its mark in the OLTP world,  PostgreSQL is moving towards catering the needs of OLAP environment. Hence, the recent advancem...