Design change / Perf Optimization Suggestion required

Table: RETAIL_SHIPMENT / Records - 7 Billion / Partitions - 120 Weekly Partitions
Table: CURR_YAGO_CAL - Holds weekend dates and their corresponding year-ago dates. Sample records for 2 dates:

Cal_Dt      Yago_Dt
12/26/2009  12/27/2008
12/26/2009  12/26/2009
1/2/2010    1/3/2009
1/2/2010    1/2/2010

I have a query that looks at some metrics for a date and its year-ago date. When I filter on CAL_DT and join the shipment table on CAL_DT, the optimizer accesses only one partition of RETAIL_SHIPMENT, which is what we need.
The problem arises when we filter on CAL_DT but join RETAIL_SHIPMENT on YAGO_DT: the optimizer then does a complete table scan, which takes a very long time.
Below are the two versions of the query and their corresponding explain plans. Please compare step 4 of each explain and the last two lines of each query (the join condition and the WHERE clause). Can anyone share a way to resolve this, or propose a different design that reads just 2 partitions of RETAIL_SHIPMENT? Please ignore the low/high confidence levels for now: we have collected all the relevant stats, and the explains are from DEV, hence the low row counts. Any help would be appreciated.

Query 1 (join on CAL_DT; reads a single partition):

SEL count(1) FROM
MIM_TBL.CURR_YAGO_CAL TMP
INNER JOIN
(
SELECT RTL_ID RetailID,
CATG_ID CATGID,
Shipment_Dt ,
MAX(Catg_Store_Selling_13Wk_Cnt) AS CatgStoreCount
FROM Mim_tbl.retail_shipment_t
GROUP BY 1,2,3
) CATG_STORE_AGG
ON CATG_STORE_AGG.Shipment_Dt = TMP.CAL_DT
WHERE TMP.CAL_DT = '2013-05-11'

Explain plan for Query 1:

1) First, we lock a distinct MIM_TBL."pseudo table" for read on a

RowHash to prevent global deadlock for MIM_TBL.TMP.

2) Next, we lock a distinct Mim_tbl."pseudo table" for read on a

RowHash to prevent global deadlock for Mim_tbl.retail_shipment_t.

3) We lock MIM_TBL.TMP for read, and we lock

Mim_tbl.retail_shipment_t for read.

4) We do an all-AMPs SUM step to aggregate from a single partition of

Mim_tbl.retail_shipment_t with a condition of (

"Mim_tbl.retail_shipment_t.Shipment_Dt = DATE '2013-05-11'") with

a residual condition of ("Mim_tbl.retail_shipment_t.Shipment_Dt =

DATE '2013-05-11'") , grouping by field1 (

Mim_tbl.retail_shipment_t.Rtl_Id

,Mim_tbl.retail_shipment_t.Catg_Id

,Mim_tbl.retail_shipment_t.Shipment_Dt). Aggregate Intermediate

Results are computed locally, then placed in Spool 3. The size of

Spool 3 is estimated with low confidence to be 91,369 rows (

3,746,129 bytes). The estimated time for this step is 0.11

seconds.

5) We execute the following steps in parallel.

1) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by

way of an all-rows scan into Spool 1 (used to materialize

view, derived table or table function CATG_STORE_AGG)

(all_amps) (compressed columns allowed), which is built

locally on the AMPs. The size of Spool 1 is estimated with

low confidence to be 91,369 rows (2,284,225 bytes). The

estimated time for this step is 0.02 seconds.

2) We do an all-AMPs RETRIEVE step from MIM_TBL.TMP by way of a

traversal of index # 4 without accessing the base table with

a residual condition of ("MIM_TBL.TMP.Cal_Dt = DATE

'2013-05-11'") into Spool 8 (all_amps) (compressed columns

allowed), which is duplicated on all AMPs. The size of Spool

8 is estimated with high confidence to be 12 rows (204 bytes).

The estimated time for this step is 0.01 seconds.

6) We do an all-AMPs JOIN step from Spool 8 (Last Use) by way of an

all-rows scan, which is joined to Spool 1 (Last Use) by way of an

all-rows scan with a condition of ("CATG_STORE_AGG.SHIPMENT_DT =

DATE '2013-05-11'"). Spool 8 and Spool 1 are joined using a

product join, with a join condition of ("SHIPMENT_DT = Cal_Dt").

The result goes into Spool 7 (all_amps) (compressed columns

allowed), which is built locally on the AMPs. The size of Spool 7

is estimated with low confidence to be 91,369 rows (1,370,535

bytes). The estimated time for this step is 0.03 seconds.

7) We do an all-AMPs SUM step to aggregate from Spool 7 (Last Use) by

way of a cylinder index scan. Aggregate Intermediate Results are

computed globally, then placed in Spool 9. The size of Spool 9 is

estimated with high confidence to be 1 row (23 bytes). The

estimated time for this step is 0.02 seconds.

8) We do an all-AMPs RETRIEVE step from Spool 9 (Last Use) by way of

an all-rows scan into Spool 5 (group_amps), which is built locally

on the AMPs. The size of Spool 5 is estimated with high

confidence to be 1 row (25 bytes). The estimated time for this

step is 0.00 seconds.

9) Finally, we send out an END TRANSACTION step to all AMPs involved

in processing the request.

-> The contents of Spool 5 are sent back to the user as the result of

statement 1. The total estimated time is 0.18 seconds.

Query 2 (join on YAGO_DT; all-rows scan):

SEL count(1) FROM
MIM_TBL.CURR_YAGO_CAL TMP
INNER JOIN
(
SELECT RTL_ID RetailID,
CATG_ID CATGID,
Shipment_Dt ,
MAX(Catg_Store_Selling_13Wk_Cnt) AS CatgStoreCount
FROM Mim_tbl.retail_shipment_t
GROUP BY 1,2,3
) CATG_STORE_AGG
ON CATG_STORE_AGG.Shipment_Dt = TMP.YAGO_DT
WHERE TMP.CAL_DT = '2013-05-11'

Explain plan for Query 2:

1) First, we lock a distinct MIM_TBL."pseudo table" for read on a

RowHash to prevent global deadlock for MIM_TBL.TMP.

2) Next, we lock a distinct Mim_tbl."pseudo table" for read on a

RowHash to prevent global deadlock for Mim_tbl.retail_shipment_t.

3) We lock MIM_TBL.TMP for read, and we lock

Mim_tbl.retail_shipment_t for read.

4) We do an all-AMPs SUM step to aggregate from

Mim_tbl.retail_shipment_t by way of an all-rows scan with no

residual conditions , grouping by field1 (

Mim_tbl.retail_shipment_t.Rtl_Id

,Mim_tbl.retail_shipment_t.Catg_Id

,Mim_tbl.retail_shipment_t.Shipment_Dt). Aggregate Intermediate

Results are computed locally, then placed in Spool 3. The input

table will not be cached in memory, but it is eligible for

synchronized scanning. The size of Spool 3 is estimated with high

confidence to be 7,381,353 rows (302,635,473 bytes). The

estimated time for this step is 5 minutes and 16 seconds.

5) We execute the following steps in parallel.

1) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by

way of an all-rows scan into Spool 1 (used to materialize

view, derived table or table function CATG_STORE_AGG)

(all_amps) (compressed columns allowed), which is built

locally on the AMPs. The size of Spool 1 is estimated with

high confidence to be 7,381,353 rows (184,533,825 bytes).

The estimated time for this step is 1.31 seconds.

2) We do an all-AMPs RETRIEVE step from MIM_TBL.TMP by way of an

all-rows scan with a condition of ("MIM_TBL.TMP.Cal_Dt = DATE

'2013-05-11'") into Spool 8 (all_amps) (compressed columns

allowed), which is duplicated on all AMPs. The size of Spool

8 is estimated with high confidence to be 12 rows (204 bytes).

The estimated time for this step is 0.01 seconds.

6) We do an all-AMPs JOIN step from Spool 8 (Last Use) by way of an

all-rows scan, which is joined to Spool 1 (Last Use) by way of an

all-rows scan. Spool 8 and Spool 1 are joined using a single

partition hash_ join, with a join condition of ("SHIPMENT_DT =

Yago_Dt"). The result goes into Spool 7 (all_amps) (compressed

columns allowed), which is built locally on the AMPs. The size of

Spool 7 is estimated with low confidence to be 102,518 rows (

1,537,770 bytes). The estimated time for this step is 0.39

seconds.

7) We do an all-AMPs SUM step to aggregate from Spool 7 (Last Use) by

way of a cylinder index scan. Aggregate Intermediate Results are

computed globally, then placed in Spool 9. The size of Spool 9 is

estimated with high confidence to be 1 row (23 bytes). The

estimated time for this step is 0.02 seconds.

8) We do an all-AMPs RETRIEVE step from Spool 9 (Last Use) by way of

an all-rows scan into Spool 5 (group_amps), which is built locally

on the AMPs. The size of Spool 5 is estimated with high

confidence to be 1 row (25 bytes). The estimated time for this

step is 0.00 seconds.

9) Finally, we send out an END TRANSACTION step to all AMPs involved

in processing the request.

-> The contents of Spool 5 are sent back to the user as the result of

statement 1. The total estimated time is 5 minutes and 17 seconds.
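
One possible rewrite, as an untested sketch: in the YAGO_DT version, step 4 aggregates the whole shipment table because no condition on Shipment_Dt reaches the derived table before the GROUP BY. The variant below restricts Shipment_Dt inside the derived table with a subquery on the calendar. It assumes Shipment_Dt is the partitioning column of retail_shipment_t, which the "single partition" wording in the CAL_DT explain suggests. Whether the optimizer actually eliminates partitions for this form has to be confirmed with EXPLAIN.

```sql
-- Sketch only: not run against the tables above.
-- Assumption: Shipment_Dt is the partitioning column of retail_shipment_t.
SEL COUNT(1)
FROM MIM_TBL.CURR_YAGO_CAL TMP
INNER JOIN
(
    SELECT RTL_ID  RetailID,
           CATG_ID CATGID,
           Shipment_Dt,
           MAX(Catg_Store_Selling_13Wk_Cnt) AS CatgStoreCount
    FROM Mim_tbl.retail_shipment_t
    -- Restrict the date column before aggregating, so the
    -- GROUP BY no longer has to read every partition.
    WHERE Shipment_Dt IN
          (SELECT Yago_Dt
           FROM MIM_TBL.CURR_YAGO_CAL
           WHERE Cal_Dt = DATE '2013-05-11')
    GROUP BY 1,2,3
) CATG_STORE_AGG
ON CATG_STORE_AGG.Shipment_Dt = TMP.YAGO_DT
WHERE TMP.CAL_DT = DATE '2013-05-11';
```

If the subquery form still produces an all-rows scan, a fallback is to fetch the Yago_Dt values in a first request and pass them into the derived table as literal dates, since a literal condition on Shipment_Dt is what yields the single-partition access in the CAL_DT version.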

Design change / Perf Optimization Suggestion required - forum topic by mjasrotia
