Large-scale update question - forum topic by rtefft

I have a table with 700M rows and an evenly distributed 1-column integer NUPI, plus a 3-column integer USI (the 1st column of the USI is also the PI).  Queries run fine, but when we push updates or deletes to this table in ETL jobs, we have problems.  We use a stage table with a structure identical to the 700M-row target table, and a joined UPDATE statement to update the corresponding records in the target table each night.  There are on average 1-3M rows to apply from the stage table.  The single UPDATE statement runs 3+ hours despite Explains estimating 10-12 minutes.  We have looked at partitioning, but the wide spread of the incoming 1-3M rows prevents partition elimination.
From a conceptual point of view, what would be a good approach for pushing millions of updates/deletes to a 700M-row table like this?
The incoming stage and target tables have the same structure:


CREATE MULTISET TABLE deck_prod_tbls.inventory_Event_usage ,NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO
     (
      Event_Gnbr DECIMAL(15,0) NOT NULL,
      Inv_Gnbr DECIMAL(15,0) NOT NULL,
      Usage_Parm_Code VARCHAR(8) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
      ...(other cols)...
      )
PRIMARY INDEX XPI_Inventory_Event_Usage ( Inv_Gnbr )
UNIQUE INDEX ( Event_Gnbr ,Inv_Gnbr ,Usage_Parm_Code )
INDEX ( Source_Data_Samp_Ord ,Source_Data_Samp_Part_Ord ,Source_Event_Db_Id ,
Source_Event_Id ,Source_Name );
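
For context on the partitioning remark above, a range-partitioned variant of the target would look something like the sketch below. The partitioning column (Source_Rev_Dt) and the monthly granularity are purely illustrative assumptions, not part of the actual design; the point is that with the incoming 1-3M rows scattered across many ranges, partition elimination does not kick in.

CREATE MULTISET TABLE TABLESDB.INVENTORY_EVENT_USAGE_PPI ,NO FALLBACK
     (
      Event_Gnbr DECIMAL(15,0) NOT NULL,
      Inv_Gnbr DECIMAL(15,0) NOT NULL,
      Usage_Parm_Code VARCHAR(8) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
      Source_Rev_Dt DATE NOT NULL  -- hypothetical partitioning column
      -- remaining columns omitted for brevity
     )
PRIMARY INDEX XPI_Inventory_Event_Usage ( Inv_Gnbr )
PARTITION BY RANGE_N(Source_Rev_Dt BETWEEN DATE '2010-01-01'
                     AND DATE '2030-12-31' EACH INTERVAL '1' MONTH)
UNIQUE INDEX ( Event_Gnbr ,Inv_Gnbr ,Usage_Parm_Code );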

The stage table, with 1-3M rows, is STAGEDB.INV_EVENT_USAGE_CH; the 700M-row target table is TABLESDB.INVENTORY_EVENT_USAGE.  We see a similar issue with deletes.
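
(A side note on "identical structure": in Teradata a stage copy is often created directly from the target, e.g. as sketched below. Whether these tables were actually built this way is an assumption; the result is worth verifying with SHOW TABLE, since index inheritance depends on the exact form used.)

-- illustrative only: copies the target's definition without rows
CREATE MULTISET TABLE STAGEDB.INV_EVENT_USAGE_CH AS
       TABLESDB.INVENTORY_EVENT_USAGE
WITH NO DATA;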

update t
  from TABLESDB.INVENTORY_EVENT_USAGE t,
    STAGEDB.INV_EVENT_USAGE_CH c
  set
    Incremental_Usage_Qty       = c.Incremental_Usage_Qty,
    Rec_Status_Code             = c.Rec_Status_Code,
    Source_Create_Dt            = c.Source_Create_Dt,
    Source_Data_Samp_Ord        = c.source_data_samp_ord,
    Source_Data_Samp_Part_Ord   = c.source_data_samp_part_ord,
    Source_Data_Type_DB_ID      = c.source_data_type_db_id,
    Source_Data_Type_ID         = c.source_data_type_id,
    Source_Event_DB_ID          = c.source_event_db_id,
    Source_Event_Id             = c.source_event_id,
    Source_Event_Inv_Id         = c.source_event_inv_id,
    Source_Inv_DB_ID            = c.source_inv_db_id,
    Source_Inv_Id               = c.source_inv_id,
    Source_Rev_DB_ID            = c.source_rev_db_id,
    Source_Rev_Dt               = c.source_rev_dt,
    Usage_Since_New_Qty         = c.usage_since_new_qty,
    Usage_Since_Overhaul_Qty    = c.usage_since_overhaul_qty,
    Source_Name                 = c.source_name,
    Last_Alter_Ts               = c.last_alter_ts
  where t.event_gnbr      = c.event_gnbr
    and t.inv_gnbr        = c.inv_gnbr
    and t.usage_parm_code = c.usage_parm_code;

  --
  --  Delete
  --
  delete TABLESDB.INVENTORY_EVENT_USAGE
  where TABLESDB.INVENTORY_EVENT_USAGE.event_gnbr      = STAGEDB.INV_EVENT_USAGE_DEL.event_gnbr
    and TABLESDB.INVENTORY_EVENT_USAGE.inv_gnbr        = STAGEDB.INV_EVENT_USAGE_DEL.inv_gnbr
    and TABLESDB.INVENTORY_EVENT_USAGE.usage_parm_code = STAGEDB.INV_EVENT_USAGE_DEL.usage_parm_code;
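
  For readability only, the same delete can be written with a correlated EXISTS subquery; this is logically equivalent and just makes the join condition explicit (no claim that it changes the plan or the runtime):

  delete from TABLESDB.INVENTORY_EVENT_USAGE
  where exists (
    select 1
    from STAGEDB.INV_EVENT_USAGE_DEL d
    where TABLESDB.INVENTORY_EVENT_USAGE.event_gnbr      = d.event_gnbr
      and TABLESDB.INVENTORY_EVENT_USAGE.inv_gnbr        = d.inv_gnbr
      and TABLESDB.INVENTORY_EVENT_USAGE.usage_parm_code = d.usage_parm_code );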

Each of the two statements above (the update and the delete) runs 3+ hours when about 2M rows are affected... insanely slow.
Thanks in advance,
Rich
