Harpreet,
Your option four does indeed eliminate the duplicate row and PI checks, but that also makes it the option most susceptible to duplicate rows. The duplicate tuple check then has to be handled by some other mechanism: a Change Data Capture process, the ELT logic, or a Unique Secondary Index. The USI will maintain uniqueness without affecting the distribution of the data across the AMPs.
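As a rough sketch of the NUPI-plus-USI approach (the database, table, and column names here are made up for illustration, and I'm assuming the table is MULTISET with a non-unique PI), it could look something like this:

```sql
-- Hypothetical table: names and columns are placeholders, not from your model.
-- MULTISET plus a non-unique PI means no duplicate row check and no PI uniqueness check.
CREATE MULTISET TABLE sales_db.order_line
(
    order_id     INTEGER      NOT NULL,
    line_no      SMALLINT     NOT NULL,
    customer_id  INTEGER      NOT NULL,
    amount       DECIMAL(12,2)
)
PRIMARY INDEX (customer_id);   -- NUPI chosen for the access/join path

-- The USI enforces uniqueness on the logical key.
-- Rows are still distributed by the hash of customer_id, not by the USI columns.
CREATE UNIQUE INDEX (order_id, line_no) ON sales_db.order_line;
```

With this in place, an insert that repeats an existing (order_id, line_no) pair should be rejected by the USI, while the row distribution is still driven only by customer_id.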
At some point you'll hear or read an Alison Torres presentation: the first concern when picking the PI on a Teradata table is the primary access path, with the caveat that any skew stays minor. For that reason, you'll see a lot of tables with NUPIs and USIs on them. Depending upon the size of the table, some quick testing will show you that the increased I/O of maintaining the USI subtable is offset by getting AMP-local joins with large and/or frequently joined tables.
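For the skew caveat, one quick test is to count how many rows each AMP would receive for a candidate PI column. This is only a sketch; sales_db.order_line and customer_id are placeholder names for your own table and candidate column:

```sql
-- Rows per AMP if customer_id (hypothetical candidate column) were the PI.
-- A roughly even spread of row counts suggests acceptable skew.
SELECT HASHAMP(HASHBUCKET(HASHROW(customer_id))) AS amp_no,
       COUNT(*)                                  AS row_count
FROM   sales_db.order_line
GROUP  BY 1
ORDER  BY 2 DESC;
```

If a handful of AMPs sit far above the rest, the candidate column is probably too skewed to use as the PI, however attractive it is as an access path.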