Skip to main content
  1. Posts/

The Table I Wanted to Query Was Not in the Execution Plan

·1667 words·8 mins
liuzhilong62
Author
liuzhilong62
PostgreSQL DBA. Writing about database internals, production cases, and source code analysis.

Problem: The Queried Table Did Not Appear in the Execution Plan
#

SQL:

SELECT
  *
FROM
  (
    SELECT
      A.column1 as "column1",
      -- many A columns omitted in between
      A.column99 as "column99"
    from
      table_a A
      left join (
        SELECT
          lzl_id
        from
          table_a AA
          inner join table_b BB ON AA.lzl_key = BB.lzl_id
        where
          AA.column_code = '1'
        GROUP BY
          lzl_id
      ) B ON B.lzl_id = A.lzl_key
    where
      A.flagflagflag = '1'
      AND A.typetypetype = '2'
  ) TEMP
limit
  100
offset
  1000

Execution plan:

                                                            QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=2.84..5.68 rows=1 width=1105) (actual time=0.038..0.039 rows=0 loops=1)
   Buffers: shared hit=2
   ->  Seq Scan on table_a a  (cost=0.00..2.84 rows=1 width=1105) (actual time=0.036..0.037 rows=0 loops=1)
         Filter: (((flagflagflag)::text = '1'::text) AND ((typetypetype)::text = '2'::text))
         Rows Removed by Filter: 38
         Buffers: shared hit=2
 Planning Time: 0.184 ms
 Execution Time: 0.066 ms

As you can see, the SQL itself is fairly complex. Logically, the SQL queries 3 tables / accesses 2 tables total. I can understand table_a appearing in the execution plan, but table_b, which needed to be queried, wasn’t in the execution plan at all! The execution plan was simply a sequential scan of table_a.

The Analytical Journey
#

In the middle of the analysis, I actually considered many possibilities, but the most likely one was logical optimization — that is, the PostgreSQL optimizer determined that table_b didn’t need to be queried.

Observing the SQL, I noticed that the final query only selected columns from table_a, without any columns from table_b. Adding any column from the intermediate table B made the SQL execution plan appear “normal” — it accessed table_b:

explain SELECT
  *
FROM
  (
    SELECT
      A.column1 as "column1",
      -- many A columns omitted in between
      A.column99 as "column99",
      B.lzl_id -- added a column from intermediate table B
    from
      table_a A
      left join (
        SELECT
          lzl_id
        from
          table_a AA
          inner join table_b BB ON AA.lzl_key = BB.lzl_id
        where
          AA.column_code = '1'
        GROUP BY
          lzl_id
      ) B ON B.lzl_id = A.lzl_key
    where
      A.flagflagflag = '1'
      AND A.typetypetype = '2'
  ) TEMP
limit
  100
offset
  1000
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=14.69..17.67 rows=1 width=1113)
   ->  Nested Loop Left Join  (cost=11.72..14.69 rows=1 width=1113)
         Join Filter: (bb.lzl_id = a.lzl_key)
         ->  Seq Scan on table_a a  (cost=0.00..2.84 rows=1 width=1113)
               Filter: (((flagflagflag)::text = '1'::text) AND ((typetypetype)::text = '2'::text))
         ->  Group  (cost=11.72..11.74 rows=5 width=8)
               Group Key: bb.lzl_id
               ->  Sort  (cost=11.72..11.73 rows=5 width=8)
                     Sort Key: bb.lzl_id
                     ->  Nested Loop  (cost=0.15..11.66 rows=5 width=8)
                           ->  Seq Scan on table_a aa  (cost=0.00..2.70 rows=1 width=8)
                                 Filter: ((company_code)::text = '1'::text)
                           ->  Index Only Scan using idx_table_b_lzl_id on table_b bb  (cost=0.15..8.83 rows=13 width=8)
                                 Index Cond: (lzl_id = aa.lzl_key)

This seems related to LEFT JOIN, but a quick thought makes it seem incorrect — after all, the results from the right table should affect the final query result, so the right table shouldn’t be skipped. Let’s try a simple LEFT JOIN:

explain select lzlleft.a from lzlleft left join lzlright on lzlleft.a=lzlright.a;
                             QUERY PLAN                             
--------------------------------------------------------------------
 Hash Left Join  (cost=1.04..15.47 rows=320 width=4)
   Hash Cond: (lzlleft.a = lzlright.a)
   ->  Seq Scan on lzlleft  (cost=0.00..13.20 rows=320 width=4)
   ->  Hash  (cost=1.02..1.02 rows=2 width=4)
         ->  Seq Scan on lzlright  (cost=0.00..1.02 rows=2 width=4)

The right table is scanned. But, in intermediate table B, there’s the keyword GROUP BY. If we remove GROUP BY, then table_b is accessed regardless of whether we query columns from B.

Let’s add a GROUP BY in our test table and see the result:

> select * from lzlleft;
 a |  b  
---+-----
 1 | zzz
(1 row)

Time: 0.259 ms
> select * from lzlright;
 a |   b   
---+-------
 1 | qwer
 1 | poiuy 
 
> select lzlright.b from lzlleft full join  lzlright on lzlleft.b=lzlright.b group by lzlright.b;
   b    
--------
 [null]
 poiuy
 qwer
(3 rows)

This is where I realized that the result set from GROUP BY must have a certain property — uniqueness.

Let’s add GROUP BY in the test table:

 explain select lzlleft.a from lzlleft left join  (select  a from  lzlright group by a) c on lzlleft.a=c.a;
                        QUERY PLAN                        
----------------------------------------------------------
 Seq Scan on lzlleft  (cost=0.00..13.20 rows=320 width=4)

The right table is not queried!

Based on the principle of right-table uniqueness, we can also have some fun variations:

-- distinct ensures right-table uniqueness
> explain select lzlleft.a from lzlleft left join  (select distinct a from  lzlright) c on lzlleft.a=c.a;
                        QUERY PLAN                        
----------------------------------------------------------
 Seq Scan on lzlleft  (cost=0.00..13.20 rows=320 width=4) 
-- unique index ensures right-table uniqueness, even with just select a from lzlright
>  explain select lzlleft.a from lzlleft left join  (select  a from  lzlright) c on lzlleft.a=c.a;
                              QUERY PLAN                               
-----------------------------------------------------------------------
 Hash Left Join  (cost=17.20..49.12 rows=512 width=4)
   Hash Cond: (lzlleft.a = lzlright.a)
   ->  Seq Scan on lzlleft  (cost=0.00..13.20 rows=320 width=4)
   ->  Hash  (cost=13.20..13.20 rows=320 width=4)
         ->  Seq Scan on lzlright  (cost=0.00..13.20 rows=320 width=4)
(5 rows)

Time: 0.510 ms
> create unique index idx_right on lzlright(a);
CREATE INDEX
Time: 3.576 ms
> explain select lzlleft.a from lzlleft left join  (select  a from  lzlright) c on lzlleft.a=c.a;
                        QUERY PLAN                        
----------------------------------------------------------
 Seq Scan on lzlleft  (cost=0.00..13.20 rows=320 width=4)
(1 row)

Here’s a summary of the analysis: when the right table’s data is unique and only the left table’s data is being queried, there’s no need to actually access the right table. So this is not a bug, but a feature of the PostgreSQL optimizer — and it makes logical sense.

Source Code Analysis
#

No source code analysis this time~

The optimizer source code is just too difficult. I only looked at some optimizer source code comments. Search for the keyword unique-ify, and you’ll find this:

 * Also, this routine and others in this module accept the special JoinTypes
 * JOIN_UNIQUE_OUTER and JOIN_UNIQUE_INNER to indicate that we should
 * unique-ify the outer or inner relation and then apply a regular inner
 * join.  These values are not allowed to propagate outside this module,
 * however.  Path cost estimation code may need to recognize that it's
 * dealing with such a case --- the combination of nominal jointype INNER
 * with sjinfo->jointype == JOIN_SEMI indicates that. 

Special JoinTypes: JOIN_UNIQUE_INNER and JOIN_UNIQUE_OUTER — they try to unique-ify the outer and inner relations and then treat them as an inner join. Path cost estimation needs to consider this scenario.

Comparison with Oracle and MySQL Optimizers
#

Let’s compare whether Oracle and MySQL optimizers have similar logical optimization improvements.

-- Oracle
create table lzlleft(a number);
create table lzlright(a number);
 select lzlleft.a from lzlleft left join  (select  distinct a from  lzlright) c on lzlleft.a=c.a;
-- GROUP BY uniqueness
SQL>  select lzlleft.a from lzlleft left join  (select  a from  lzlright group by a) c on lzlleft.a=c.a; 

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 3533354041

---------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |     1 |    26 |     5  (20)| 00:00:01 |
|*  1 |  HASH JOIN OUTER     |          |     1 |    26 |     5  (20)| 00:00:01 |
|   2 |   TABLE ACCESS FULL  | LZLLEFT  |     1 |    13 |     2   (0)| 00:00:01 |
|   3 |   VIEW               |          |     1 |    13 |     3  (34)| 00:00:01 |
|   4 |    HASH GROUP BY     |          |     1 |    13 |     3  (34)| 00:00:01 |
|   5 |     TABLE ACCESS FULL| LZLRIGHT |     1 |    13 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("LZLLEFT"."A"="C"."A"(+))
-- DISTINCT uniqueness
SQL>  select lzlleft.a from lzlleft left join  (select  distinct a from  lzlright) c on lzlleft.a=c.a;

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 3859658234

---------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |     1 |    26 |     5  (20)| 00:00:01 |
|*  1 |  HASH JOIN OUTER     |          |     1 |    26 |     5  (20)| 00:00:01 |
|   2 |   TABLE ACCESS FULL  | LZLLEFT  |     1 |    13 |     2   (0)| 00:00:01 |
|   3 |   VIEW               |          |     1 |    13 |     3  (34)| 00:00:01 |
|   4 |    HASH UNIQUE       |          |     1 |    13 |     3  (34)| 00:00:01 |
|   5 |     TABLE ACCESS FULL| LZLRIGHT |     1 |    13 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("LZLLEFT"."A"="C"."A"(+))
-- MySQL
create table lzlleft(a int primary key);
create table lzlright(a int primary key);
-- GROUP BY uniqueness
explain select lzlleft.a from lzlleft left join  (select  a from  lzlright group by a) c on lzlleft.a=c.a; 
+----+-------------+------------+------------+-------+---------------+-------------+---------+-----------------+------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key         | key_len | ref             | rows | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------+-------------+---------+-----------------+------+----------+-------------+
|  1 | PRIMARY     | lzlleft    | NULL       | index | NULL          | PRIMARY     | 4       | NULL            |    1 |   100.00 | Using index |
|  1 | PRIMARY     | <derived2> | NULL       | ref   | <auto_key0>   | <auto_key0> | 4       | lzldb.lzlleft.a |    2 |   100.00 | Using index |
|  2 | DERIVED     | lzlright   | NULL       | index | PRIMARY       | PRIMARY     | 4       | NULL            |    1 |   100.00 | Using index |
+----+-------------+------------+------------+-------+---------------+-------------+---------+-----------------+------+----------+-------------+
-- DISTINCT uniqueness
explain select lzlleft.a from lzlleft left join  (select  distinct a from  lzlright) c on lzlleft.a=c.a;
+----+-------------+------------+------------+-------+---------------+-------------+---------+-----------------+------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key         | key_len | ref             | rows | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------+-------------+---------+-----------------+------+----------+-------------+
|  1 | PRIMARY     | lzlleft    | NULL       | index | NULL          | PRIMARY     | 4       | NULL            |    1 |   100.00 | Using index |
|  1 | PRIMARY     | <derived2> | NULL       | ref   | <auto_key0>   | <auto_key0> | 4       | lzldb.lzlleft.a |    2 |   100.00 | Using index |
|  2 | DERIVED     | lzlright   | NULL       | index | PRIMARY       | PRIMARY     | 4       | NULL            |    1 |   100.00 | Using index |
+----+-------------+------------+------------+-------+---------------+-------------+---------+-----------------+------+----------+-------------+

In summary, neither Oracle nor MySQL performs the optimization of eliminating the right table in a LEFT JOIN when only left-table columns are queried and the right table is unique — they both access the right table.

The PostgreSQL optimizer really has some impressive tricks.

Related

A Classic Case of Long Transaction, Table Bloat, and LIMIT Issues

·1309 words·7 mins
Slow Primary Key Update — Problem Analysis # A simple primary key update took over 1 second to execute. Due to high concurrency, the CPU was completely maxed out: 2024-04-01 10:19:36.084 CST,"lzlopr","lzl",158751,"10.33.78.149:51502",66055a6b.26c1f,172,"UPDATE",2024-03-28 19:54:19 CST,528/19816630,970251337,LOG,00000,"duration: 1218.688 ms plan: Query Text: update table_a set (omitted...)=$6 where column_id =$7 Update on table_a (cost=0.40..5.49 rows=1 width=2774) -> Index Scan using pk_id on table_a (cost=0.40..5.49 rows=1 width=2774) Index Cond: ((column_id)::text = $7)",,,,,,,,,"PostgreSQL JDBC Driver","client backend" The SQL itself is very simple — an update with a condition on the primary key. Looking at the execution plan, it used the pk_id primary key index, so there was no problem with the plan itself; the issue wasn’t a plan change.

Analyzing a 5MB SQL That Consumed 70GB of Memory

·2217 words·11 mins
Process Memory Analysis # "WAL writer process (PID 66902) was terminated by signal 6: Aborted",,,,,,,,,"","postmaster" The log shows postmaster process 66902 was killed. Checking OS-level process memory: since top doesn’t show PPID and ps doesn’t show USS, we need both: USER PID PPID PRI %CPU %MEM VSZ RSS WCHAN S STARTED TIME COMMAND postgres 211276 66478 19 8.7 10.6 57488380 56389972 - R 17:13:03 00:02:47 postgres: BIND postgres 211277 66478 19 7.8 9.6 52294700 51127480 - R 17:13:03 00:02:31 postgres: BIND postgres 222749 66478 19 22.7 9.3 51320000 49073368 - R 17:35:33 00:02:09 postgres: BIND postgres 39513 66478 19 2.9 6.8 38651084 36354736 ep_poll S 16:13:03 00:02:43 postgres: idle Using PPID to identify high-memory backend processes. Let’s examine process 211276:

Case Study: Analyzing Occasional Slow INSERT VALUES

·2017 words·10 mins
The business team reported that INSERT VALUES occasionally became slow. By the time I checked the active sessions, the slow write problem had already subsided. Later, I discovered that the slow write problem lasted less than half a minute, with INSERT VALUES taking 1-2 seconds. I wrote a script to capture active session information and managed to get the session data: wait_event | count ---------------------+------- [null] | 11 WALRead | 1 DataFileRead | 2 BgWriterMain | 1 WALWrite | 40 AutoVacuumMain | 1 ClientRead | 385 LogicalLauncherMain | 1 The most abnormal wait event was WALWrite with 40 sessions.