For the query below, the conditions imanagerid = 100 and iempid = 1 are being applied after the join rather than before it.
select coalesce(t1.managerid,10) as imanagerid
,coalesce(t1.empid,1) as iempid
from test.t1
full outer join
test.t2
on t1.managerid = t2.managerid
and t1.empid = t2.empid
where imanagerid = 100
and iempid = 1 ;
Here is the EXPLAIN output:
1) First, we lock a distinct test."pseudo table" for read on a
RowHash to prevent global deadlock for test.t2.
2) Next, we lock a distinct test."pseudo table" for read on a RowHash
to prevent global deadlock for test.t1.
3) We lock test.t2 for read, and we lock test.t1 for read.
4) We do an all-AMPs RETRIEVE step from test.t2 by way of an all-rows
scan with no residual conditions into Spool 2 (all_amps), which is
redistributed by the hash code of (test.t2.empid) to all AMPs.
Then we do a SORT to order Spool 2 by row hash. The size of Spool
2 is estimated with low confidence to be 2 rows (42 bytes). The
estimated time for this step is 0.01 seconds.
5) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of a
RowHash match scan, which is joined to test.t1 by way of a RowHash
match scan with no residual conditions. Spool 2 and test.t1 are
full outer joined using a merge join, with condition(s) used for
non-matching on left table ("(NOT (empid IS NULL )) AND (NOT
(managerid IS NULL ))"), on right table ("(NOT (test.t1.empid IS
NULL )) AND (NOT (test.t1.managerid IS NULL ))"), with a join
condition of ("(test.t1.empid = empid) AND (test.t1.managerid =
managerid)"). The result goes into Spool 3 (all_amps), which is
built locally on the AMPs. The size of Spool 3 is estimated with
low confidence to be 2 rows (42 bytes). The estimated time for
this step is 0.04 seconds.
6) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
an all-rows scan with a condition of ("((( CASE WHEN (NOT (empid
IS NULL )) THEN (empid) ELSE (1) END ))= 1) AND ((( CASE WHEN (NOT
(managerid IS NULL )) THEN (managerid) ELSE (10) END ))= 100)")
into Spool 1 (group_amps), which is built locally on the AMPs.
The size of Spool 1 is estimated with low confidence to be 2 rows
(86 bytes). The estimated time for this step is 0.04 seconds.
7) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 0.09 seconds.
Is this the rule for outer joins? I expected the WHERE conditions to be applied as a filter before the join, but step 6 of the EXPLAIN shows them being applied to the joined spool instead.
Could you please explain this behaviour?
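
To make the question concrete, here is a minimal Python sketch of the two evaluation orders. It is not Teradata code and says nothing about how the optimizer works internally. The sample rows and the helper names (full_outer_join, project, where) are made up for illustration. It only models the logical behaviour: a WHERE clause on the COALESCE expressions evaluated on the joined rows, compared with filtering t1 before the join.

```python
# Hypothetical sample data: (managerid, empid) rows for each table.
t1 = [(100, 1), (200, 2)]
t2 = [(100, 1), (300, 3)]


def full_outer_join(left, right):
    """Full outer join on (managerid, empid); None stands for a missing side."""
    joined, matched_right = [], set()
    for l in left:
        # A row containing NULL (None) never matches, as in SQL equality.
        hits = [j for j, r in enumerate(right) if None not in l and r == l]
        if hits:
            joined.extend((l, right[j]) for j in hits)
            matched_right.update(hits)
        else:
            joined.append((l, None))  # unmatched left row
    # Unmatched right rows are kept with a NULL left side.
    joined.extend((None, r) for j, r in enumerate(right) if j not in matched_right)
    return joined


def project(joined):
    """coalesce(t1.managerid, 10) as imanagerid, coalesce(t1.empid, 1) as iempid"""
    out = []
    for l, _ in joined:
        managerid, empid = l if l is not None else (None, None)
        out.append((10 if managerid is None else managerid,
                    1 if empid is None else empid))
    return out


def where(rows):
    """where imanagerid = 100 and iempid = 1"""
    return [row for row in rows if row == (100, 1)]


# Order shown in the EXPLAIN: join first, then filter the joined rows.
filter_after_join = where(project(full_outer_join(t1, t2)))

# Order expected in the question: filter t1 first, then join.
filter_before_join = project(full_outer_join([r for r in t1 if r == (100, 1)], t2))

print(filter_after_join)   # [(100, 1)]
print(filter_before_join)  # [(100, 1), (10, 1)]
```

With this sample data the two orders give different results: filtering t1 first still lets the unmatched t2 row through the full outer join, and COALESCE then turns its NULLs into (10, 1). Under that assumption, filtering only before the join would not return the same rows as the query above.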