Improve backfill performance by andrew-farries · Pull Request #389 · xataio/pgroll

Improve backfill performance #389


Merged
merged 6 commits into main from backfill-inner-loop-improvements on Oct 10, 2024

Conversation

@andrew-farries (Collaborator) commented Oct 7, 2024

Improve backfill performance by lifting the call to pgroll.latest_version() out of the up and down trigger functions. The result of pgroll.latest_version() is instead passed as a config parameter to the function template and used as a constant inside the trigger function.

Trigger functions are invoked once per row in the affected table during migration start. Although each execution of pgroll.latest_version() is fast in isolation, lifting it out of this 'inner loop' has a significant effect on backfill durations.
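For illustration, this is the kind of lookup that now runs once per migration rather than once per row; its result is substituted into the generated trigger functions as a constant (a minimal sketch, with the schema name and naming taken from the example triggers below):

One-time latest_version lookup (sketch)
-- Executed once at migration start; previously this ran inside every trigger invocation.
SELECT 'public' || '_' || latest_version AS latest_schema
  FROM "pgroll".latest_version('public');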

The trigger functions used to look like this:

Old trigger function
DECLARE
  "id" "public"."products"."id"%TYPE := NEW."id";
  "name" "public"."products"."name"%TYPE := NEW."name";
  "price" "public"."products"."price"%TYPE := NEW."price";
  latest_schema text;
  search_path text;
BEGIN
  SELECT 'public' || '_' || latest_version
    INTO latest_schema
    FROM "pgroll".latest_version('public');

  SELECT current_setting
    INTO search_path
    FROM current_setting('search_path');

  IF search_path != latest_schema THEN
    NEW."_pgroll_new_description" = UPPER(name);
  END IF;

  RETURN NEW;
END;

They now look like this:

New trigger function
DECLARE
  "duration" "public"."movies"."_pgroll_new_duration"%TYPE := NEW."_pgroll_new_duration";
  "id" "public"."movies"."id"%TYPE := NEW."id";
  latest_schema text;
  search_path text;
BEGIN
  SELECT current_setting
    INTO search_path
    FROM current_setting('search_path');

  IF search_path = 'public_02_add_column' THEN
    NEW."duration" = EXTRACT(EPOCH FROM duration::interval)/60;
  END IF;

  RETURN NEW;
END;

The statement executed for each batch looks like this:

Per batch SQL statement
WITH batch AS (
  SELECT id FROM movies ORDER BY id LIMIT 1000 FOR NO KEY UPDATE
), update AS (
  UPDATE movies SET id=movies.id FROM batch WHERE movies.id = batch.id RETURNING movies.id
)
SELECT LAST_VALUE(id) OVER() FROM update
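
To reproduce the plans shown below, the same per-batch statement can be wrapped in EXPLAIN ANALYZE (a usage sketch; the table name and batch size are taken from the example above):

Running explain analyze on the per-batch statement
EXPLAIN ANALYZE
WITH batch AS (
  SELECT id FROM movies ORDER BY id LIMIT 1000 FOR NO KEY UPDATE
), update AS (
  UPDATE movies SET id=movies.id FROM batch WHERE movies.id = batch.id RETURNING movies.id
)
SELECT LAST_VALUE(id) OVER() FROM update;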

Removing the call to pgroll.latest_version() from the up/down triggers has the following effect on the execution of the per-batch SQL statement used to perform backfills:

Before:

explain analyze for old triggers
WindowAgg  (cost=1222.07..1254.57 rows=1000 width=4) (actual time=626.087..626.349 rows=1000 loops=1)                                             
 CTE batch                                                                                                                                       
   ->  Limit  (cost=0.29..107.44 rows=1000 width=10) (actual time=0.083..6.263 rows=1000 loops=1)                                                
         ->  LockRows  (cost=0.29..3214.93 rows=30000 width=10) (actual time=0.082..5.942 rows=1000 loops=1)                                     
               ->  Index Scan using movies_pkey on movies  (cost=0.29..2914.93 rows=30000 width=10) (actual time=0.019..2.013 rows=1000 loops=1) 
 CTE update                                                                                                                                      
   ->  Update on movies movies_1  (cost=1092.00..1114.62 rows=1000 width=38) (actual time=12.715..621.034 rows=1000 loops=1)                     
         ->  Hash Join  (cost=1092.00..1114.62 rows=1000 width=38) (actual time=9.275..18.730 rows=1000 loops=1)                                 
               Hash Cond: (batch.id = movies_1.id)                                                                                               
               ->  CTE Scan on batch  (cost=0.00..20.00 rows=1000 width=32) (actual time=0.090..8.113 rows=1000 loops=1)                         
               ->  Hash  (cost=717.00..717.00 rows=30000 width=10) (actual time=9.094..9.095 rows=30000 loops=1)                                 
                     Buckets: 32768  Batches: 1  Memory Usage: 1487kB                                                                            
                     ->  Seq Scan on movies movies_1  (cost=0.00..717.00 rows=30000 width=10) (actual time=0.005..4.060 rows=30000 loops=1)      
 ->  CTE Scan on update  (cost=0.00..20.00 rows=1000 width=4) (actual time=12.716..625.786 rows=1000 loops=1)                                    
Planning Time: 0.278 ms                                                                                                                           
Trigger _pgroll_trigger_movies__pgroll_new_duration: time=286.197 calls=1000   <----------------                                                    
Trigger _pgroll_trigger_movies_duration: time=273.070 calls=1000  <---------------------------                                            
Execution Time: 626.851 ms                                                                                                                        

After:

explain analyze for new triggers
WindowAgg  (cost=1214.55..1247.05 rows=1000 width=4) (actual time=61.331..61.609 rows=1000 loops=1)                                               
 CTE batch                                                                                                                                       
   ->  Limit  (cost=0.29..99.92 rows=1000 width=10) (actual time=0.100..7.212 rows=1000 loops=1)                                                 
         ->  LockRows  (cost=0.29..2989.39 rows=30000 width=10) (actual time=0.099..7.085 rows=1000 loops=1)                                     
               ->  Index Scan using movies_pkey on movies  (cost=0.29..2689.39 rows=30000 width=10) (actual time=0.022..1.293 rows=1000 loops=1) 
 CTE update                                                                                                                                      
   ->  Update on movies movies_1  (cost=1092.00..1114.62 rows=1000 width=38) (actual time=14.347..60.762 rows=1000 loops=1)                      
         ->  Hash Join  (cost=1092.00..1114.62 rows=1000 width=38) (actual time=14.120..22.796 rows=1000 loops=1)                                
               Hash Cond: (batch.id = movies_1.id)                                                                                               
               ->  CTE Scan on batch  (cost=0.00..20.00 rows=1000 width=32) (actual time=0.107..8.023 rows=1000 loops=1)                         
               ->  Hash  (cost=717.00..717.00 rows=30000 width=10) (actual time=13.995..13.996 rows=30000 loops=1)                               
                     Buckets: 32768  Batches: 1  Memory Usage: 1487kB                                                                            
                     ->  Seq Scan on movies movies_1  (cost=0.00..717.00 rows=30000 width=10) (actual time=0.003..4.883 rows=30000 loops=1)      
 ->  CTE Scan on update  (cost=0.00..20.00 rows=1000 width=4) (actual time=14.348..61.195 rows=1000 loops=1)                                     
Planning Time: 0.219 ms                                                                                                                           
Trigger _pgroll_trigger_movies__pgroll_new_duration: time=11.956 calls=1000    <------------------------                                                                   
Trigger _pgroll_trigger_movies_duration: time=11.077 calls=1000     <--------------------------                                                                                
Execution Time: 61.982 ms                                                                                                                         

From a user perspective, this change reduces backfill durations by ~80%.

@andrew-farries marked this pull request as ready for review October 8, 2024 07:32
@andrew-farries requested a review from exekias October 8, 2024 07:32
@exekias (Member) left a comment


Awesome improvement @andrew-farries! I wonder how much time is going into grabbing the search path, that's probably a quicker operation, right? may be worth testing at least so we know the overhead created by the trigger

Remove the parameter from all implementations of the `Operation`
interface.
Don't invoke `pgroll.latest_version` inside the trigger function -
instead use the value from the config struct.
@andrew-farries force-pushed the backfill-inner-loop-improvements branch from c9a84ac to df40355 on October 10, 2024 09:02
@andrew-farries (Collaborator, Author) commented

> Awesome improvement @andrew-farries! I wonder how much time is going into grabbing the search path, that's probably a quicker operation, right? may be worth testing at least so we know the overhead created by the trigger

It's not free for sure, but I don't see a way around having to check this in every invocation of the trigger at the moment. Unless we find a way to remove the need for clients to specify the schema on the search path on connection, I think we are stuck with this check in the trigger for now.
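
To quantify that remaining cost, a rough, hypothetical micro-benchmark (not part of this PR) is to repeat the search_path lookup once per row, matching the 30000-row example table:

Estimating the search_path lookup cost (sketch)
-- Hypothetical micro-benchmark: call current_setting('search_path') once per row
-- to approximate the per-invocation overhead that remains in the triggers.
EXPLAIN ANALYZE
SELECT current_setting('search_path')
  FROM generate_series(1, 30000);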

@andrew-farries merged commit d7f8ec7 into main Oct 10, 2024
48 checks passed
@andrew-farries deleted the backfill-inner-loop-improvements branch October 10, 2024 09:17