Details
Description
Given
- Postgres database
- Perform the task query: taskService.createTaskQuery().taskUnassigned().withoutCandidateGroups().withoutCandidateUsers().count()
Problem
- Consider the following query and the query plan:
-- QUERY select count(distinct RES.ID_) from ACT_RU_TASK RES WHERE (1 = 1 and RES.ASSIGNEE_ IS NULL and (RES.ASSIGNEE_ is null and RES.ID_ not in ( select TASK_ID_ from ACT_RU_IDENTITYLINK ARIEL WHERE ARIEL.TYPE_ = 'candidate'))) -- QUERY PLAN Aggregate (cost=28712821.87..28712821.88 rows=1 width=8) (actual time=575362.672..575362.672 rows=1 loops=1) -> Gather (cost=2342.79..28712795.40 rows=10587 width=37) (actual time=293340.304..575364.177 rows=3 loops=1) Workers Planned: 2 Workers Launched: 2 -> Parallel Bitmap Heap Scan on act_ru_task res (cost=1342.79..28710736.70 rows=4411 width=37) (actual time=304801.026..575332.787 rows=1 loops=3) Recheck Cond: ((assignee_ IS NULL) AND (assignee_ IS NULL)) Filter: (NOT (SubPlan 1)) Rows Removed by Filter: 14033 Heap Blocks: exact=1683 -> Bitmap Index Scan on act_idx_task_assignee (cost=0.00..1340.15 rows=21173 width=0) (actual time=116.300..116.300 rows=42364 loops=1) Index Cond: ((assignee_ IS NULL) AND (assignee_ IS NULL)) SubPlan 1 -> Materialize (cost=0.00..6158.60 rows=139634 width=37) (actual time=0.004..27.630 rows=60595 loops=42102) -> Seq Scan on act_ru_identitylink ariel (cost=0.00..4369.43 rows=139634 width=37) (actual time=0.003..29.409 rows=60688 loops=22205) Filter: ((type_)::text = 'candidate'::text) Rows Removed by Filter: 1 Planning time: 0.632 ms Execution time: 575366.077 ms
- The query performs not well and the costs are with ~ 28,7 Mio. rather high
- NOT IN (SELECT ...) does not optimize well
- It uses a plain sub plan which has a bad performance [1]
Solutions
1. Rewrite the query with the help of a correlated subquery: NOT EXISTS (SELECT ...)
-- REWRITTEN QUERY select count(distinct RES.ID_) from ACT_RU_TASK RES WHERE (1 = 1 and RES.ASSIGNEE_ IS NULL and (RES.ASSIGNEE_ is null and NOT EXISTS ( select 1 from ACT_RU_IDENTITYLINK I WHERE I.TYPE_ = 'candidate' AND I.TASK_ID_ = RES.ID_ ) ) ) -- QUERY PLAN Aggregate (cost=13624.60..13624.61 rows=1 width=8) (actual time=138.484..138.484 rows=1 loops=1) -> Hash Anti Join (cost=7256.58..13615.49 rows=3645 width=37) (actual time=117.066..138.453 rows=3 loops=1) Hash Cond: ((res.id_)::text = (i.task_id_)::text) -> Seq Scan on act_ru_task res (cost=0.00..4569.86 rows=20607 width=37) (actual time=0.007..24.543 rows=42226 loops=1) Filter: ((assignee_ IS NULL) AND (assignee_ IS NULL)) Rows Removed by Filter: 44501 -> Hash (cost=4388.79..4388.79 rows=141183 width=37) (actual time=72.261..72.261 rows=142190 loops=1) Buckets: 65536 Batches: 4 Memory Usage: 2911kB -> Seq Scan on act_ru_identitylink i (cost=0.00..4388.79 rows=141183 width=37) (actual time=0.003..31.867 rows=142190 loops=1) Filter: ((type_)::text = 'candidate'::text) Rows Removed by Filter: 1 Planning time: 0.473 ms Execution time: 138.647 ms
2. Rewrite the query to use a LEFT JOIN with NULL check
-- REWRITTEN QUERY SELECT COUNT(distinct RES.id_) FROM ACT_RU_TASK RES LEFT JOIN ACT_RU_IDENTITYLINK I ON I.task_id_ = RES.id_ AND I.type_ = 'candidate' WHERE I.id_ IS NULL AND RES.assignee_ IS NULL -- QUERY PLAN Aggregate (cost=9856.38..9856.39 rows=1 width=8) (actual time=123.516..123.516 rows=1 loops=1) -> Hash Right Join (cost=5096.96..9856.38 rows=1 width=37) (actual time=122.414..123.481 rows=3 loops=1) Hash Cond: ((link.task_id_)::text = (task.id_)::text) Filter: (link.id_ IS NULL) Rows Removed by Filter: 57826 -> Seq Scan on act_ru_identitylink link (cost=0.00..4388.79 rows=141183 width=74) (actual time=0.008..31.148 rows=141212 loops=1) Filter: ((type_)::text = 'candidate'::text) Rows Removed by Filter: 1 -> Hash (cost=4569.86..4569.86 rows=42168 width=37) (actual time=36.236..36.236 rows=42099 loops=1) Buckets: 65536 Batches: 1 Memory Usage: 3349kB -> Seq Scan on act_ru_task task (cost=0.00..4569.86 rows=42168 width=37) (actual time=0.009..22.638 rows=42099 loops=1) Filter: (assignee_ IS NULL) Rows Removed by Filter: 44193 Planning time: 0.829 ms Execution time: 123.567 ms
Thoughts
- Both solutions perform compared to the original query way better
- The costs are reduced by the factor of ~ 2.000
- Investigate the performance of the solutions for non-postgres databases to avoid performance degradation
- The implementation effort of solution 1 might be less compared to solution 2
- In the withoutCandidateUser/Group case a LEFT JOIN is needed
- The query taskService.createTaskQuery().withCandidateUser().withoutCandidateGroup().or().priority(5).dueDate(now).endOr().count() would need an additional LEFT JOIN
[1] https://wiki.postgresql.org/wiki/Don't_Do_This#Don.27t_use_NOT_IN