1 year ago

#383269

test-img

Sandeep

Efficient way to update a table in Data ware house which is used for reporting

We have a data ware house running on PostgreSQl, the database size is of 1.5TB.

We are using DROP & RECREATE logic on all our scripts to update the tables and use it for our reporting.Our data layer is updating every 45 mins to 60 minutes, and from this data layer we are building our business layer tables which is used for our reporting.

So basically our business layer table scripts looks like below,

BEGIN;
DROP TABLE IF EXISTS table1_temp CASCADE;
CREATE TABLE table1_temp AS

 --All calculations, Insertions & Update

DROP TABLE IF EXISTS table1;
ALTER TABLE table1_temp RENAME TO table1;
COMMIT;
ANALYZE table1;

We run all our business layer scripts using JENKINS, and have different pipelines,But due to the above logic, often many of the important scripts are BLOCKING each other and the jobs are getting aborted after the set timeout period(Usually the statement DROP TABLE IF EXISTS table1; is not executing and often gets into DEADLOCK).

So i would like to ask for recommendation on replacing the above DROP & RECREATE logic with an efficient way to update the table.

Any suggestions will be of very helpful.

postgresql

data-warehouse

0 Answers

Your Answer

Accepted video resources