Wednesday, 31 October 2018

Using parallel sequential scan in PostgreSQL

Parallel sequential scan is the first parallel access method in PostgreSQL and is introduced in version 9.6.  The committer of this feature and my colleague at EnterpriseDB Robert Haas wrote an awesome blog on it, there is another great blog by another PostgreSQL committer and my colleague Amit Kapila. Both of these blogs explain this access method, its design, usage, and related parameters. 

Still, I could not help but notice that there are curiosities around the usage of this access method. Every now and then I could see a complaint saying parallel sequential scan is not getting selected or it is degrading the performance of a query.  So, I decided to write this blog to cater more practical scenarios and specifically focus on its less talked about aspect  -- where parallel sequential scan would (should) not improve the performance.

Before diving into the details of parallel SeqScan, let's first understand the basic infrastructure and terminology related to it in PostgreSQL. The processes that run in parallel and scan the tuples of a relation are called parallel workers or workers in short. There is one special worker namely leader which co-ordinates and collects the output of the scan from each  of the worker. This worker may or may not participate in scanning the relation depending on it's load in dividing and combining processes. End users can also control the involvement of leader in relation scan by GUC parameter parallel_leader_participation, it is a boolean parameter. 

Now, let's understand the concept of parallel scan in PostgreSQL by a simple example.
  • Let there be a table T (a int, b int) containing 100 tuples
  • Let's say we have two workers and one leader,
  • Cost of scanning one tuple is 10
  • Cost of communicating a tuple from worker to leader is 20
  • Cost of dividing the tuples among workers is 30
  • For simplicity, let's assume that leader gives 50 tuples to each of the worker
Now, let's analyse if parallel scan will be faster than non parallel scan,

Cost of SeqScan = 10*100 = 1000
Cost of Parallel SeqScan = 30 + (50 * 10)  + (50 * 20) * 2 = 2530

Here, we can see that though the cost of scanning the tuples is halved yet the cost of combining the total result is enough to make the overall cost of parallel SeqScan higher than non parallel SeqScan.

Now, let's say we want to list only the tuples which have a > 80, and there are only 20 (say) such tuples, then cost of SeqScan will remain same, but cost of parallel SeqScan can be given as,

Cost of Parallel SeqScan = 30 + (50 * 10) + (10 * 20) * 2 =  730

Hence, parallel SeqScan is likely to improve the performance of queries that require scanning a large amount of data but only few of them satisfy the selection criteria. To generalise this,

Cost of SeqScan = Cost of scanning one tuple * number of tuples
Cost of parallel SeqScan = Cost of dividing the work among workers + cost of combining the work
                                            from workers + cost of work done by a worker * number of workers

Let's dive into it a bit more,

Cost of dividing the work among workers is fairly constant depending on the relation size
Cost of combining the the work from workers = cost of communicating the selected tuples from each
                                                                              worker to the leader
Cost of work done by a worker = cost of scanning a tuple * number of tuples the respective worker
                                                      has received

Now, we can see that the cost of combining the work is dependent on the number of tuples received by each worker. Now, for the queries where all or almost all of the tuples are in the final result, we will paying more cost than its non parallel flavour, first in scanning the tuple and second in combining it to the final result.

In PostgreSQL, the cost determining the cost of dividing the work among workers is given as parallel_setup_cost, the cost of communicating the tuple from worker to leader is given by parallel_tuple_cost, and the number of workers is upper bounded by the GUC max_parallel_workers_per_gather.

So, if you are using a system a high frequency multiple processor then lowering the parallel_setup_cost and parallel_tuple_cost will help in selection of parallel scans. If there are not many processes running in parallel, then increasing max_parallel_workers_per_gather can leverage more parallel processes to improve the query performance. Another point to note is that the number of workers is further capped by max_worker_processes.

1 comment:

  1. minor math error:
    30 + (50 * 10) + (10 * 20) * 2 = 730 // false
    30 + (50 * 10) + (10 * 20) * 2 = 930 // true


My experience at PGConf Europe 2018

It was my first time at PGConf Europe this year, like many other firsts it was special, hence the blog. Let's start with some ...