How to validate whether my PI selection is right?

The thing about the Primary Index is that it ISN'T all about data distribution. If you select a PI that perfectly distributes the data across the AMPs, but is never used as a join column or in a query predicate, you're doing yourself and your database a disservice. A key priority of the PI is that it should define the most common access path; some data skew is perfectly acceptable. There are probably as many definitions of acceptable skew as there are people who can spell "skew", so don't look for one "perfect" answer. Personally I try to keep data skew below 20-25 percent.

When you define the same PI on two commonly joined tables, you enable your TD system to do AMP-local joins, with no redistribution. You can see this in the explain plan. Remember, rows have to be on the same AMP to be joined. If you have two tables with different PIs, the optimizer will pick the smaller of the two tables (if it can, and assuming you have current statistics collected), do a full table scan (FTS) on it, bring it into spool, re-hash the spooled rows on the join column (the PI of the larger table), push them back down to the AMPs according to the new hash, and only THEN join your rows. If you can define the same PI on two frequently joined tables and give up some skew, you can eliminate the FTS, re-hash and redistribution.

Consider the Sales Order Header and Sales Order Line Item tables, a fairly common occurrence. The natural identifier for the Header table, and by default the initial PI selection, is something like SalesOrderHeaderID. The Line Item table also has a natural identifier, but in this case it happens to be a composite key, composed of SalesOrderHeaderID (to relate each line back to its parent header row) and LineItemID (to differentiate each line item within the order). If you select (SalesOrderHeaderID, LineItemID) as a composite PI on the Line Item table, your line item rows now hash to AMPs with no relation to where their header rows live: perfectly distributed, but NOT co-located.
The query that joins headers and line items now has to do a full table scan of the line item table, rehash it on SalesOrderHeaderID, redistribute it, and only then join it. All you've achieved is even data distribution at the cost of more complex processing. If you accept some skew and define the PI of the Line Item table as SalesOrderHeaderID alone, the child rows hash to the same AMPs as their parent rows. Run the same query and you've now got AMP-local operations.
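
If it helps to see the trade-off in numbers, here is a rough simulation of the header/line item example, not anything Teradata ships. It assumes a made-up system of 8 AMPs, uses SHA-256 as a stand-in for Teradata's row hash, invents the order data, and measures skew as (1 - avg/max) * 100, which is only one of the definitions people use. The names amp_for, skew_percent and colocated_fraction are my own, purely for illustration.

```python
import hashlib
from collections import Counter

NUM_AMPS = 8  # hypothetical system size, chosen only for the illustration


def amp_for(*pi_values):
    """Map primary index values to an AMP number.

    Stand-in only: SHA-256 modulo NUM_AMPS, NOT Teradata's real row hash.
    """
    key = "|".join(str(v) for v in pi_values).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % NUM_AMPS


def skew_percent(amp_numbers):
    """Skew as (1 - avg/max) * 100 over rows per AMP; 0 means perfectly even."""
    counts = Counter(amp_numbers)
    per_amp = [counts.get(amp, 0) for amp in range(NUM_AMPS)]
    avg = sum(per_amp) / NUM_AMPS
    return (1 - avg / max(per_amp)) * 100


# Invented data: 1000 orders, each with between 1 and 10 line items.
header_ids = range(1, 1001)
line_items = [(h, n) for h in header_ids for n in range(1, (h % 10) + 2)]


def colocated_fraction(line_item_pi):
    """Share of line item rows that land on the same AMP as their header row."""
    hits = sum(1 for h, n in line_items if line_item_pi(h, n) == amp_for(h))
    return hits / len(line_items)


def composite_pi(header_id, line_id):
    # PI = (SalesOrderHeaderID, LineItemID)
    return amp_for(header_id, line_id)


def header_only_pi(header_id, line_id):
    # PI = (SalesOrderHeaderID) alone; LineItemID is ignored for hashing
    return amp_for(header_id)


for name, pi in [("composite PI", composite_pi), ("header-only PI", header_only_pi)]:
    amps = [pi(h, n) for h, n in line_items]
    print(f"{name}: skew {skew_percent(amps):.1f}%, "
          f"co-located with header {colocated_fraction(pi):.1%}")
```

With the header-only PI every line item sits on its header's AMP by construction, whereas with the composite PI a row only lands there by chance. The skew figures depend entirely on the invented data and the stand-in hash, so treat them as a feel for the trade-off rather than a prediction for a real system.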

How to validate whether my PI selection is right? - response (3) by VandeBergB
