Wednesday, August 11, 2010

Oracle FULL JOIN or UNION ALL

Someone was kind enough to argue with me recently about doing a UNION ALL instead of a FULL JOIN in Oracle (in a MERGE). This same poop chute also posted a spam link that probably led to some Trojan downloader.

To make a long post short, Oracle usually does what it thinks is the best way of fetching data and it is spot on most of the time. Hints and fresh statistics can be used when mother optimizer is not right.

Consider this table:

CREATE TABLE megacrap AS
SELECT
LEVEL AS mega_id
, TO_CHAR(TRUNC(SYSDATE)-LEVEL,'MM/DD/YYYY') AS mega_varchar
, TRUNC(SYSDATE,'YEAR')+LEVEL AS mega_date
FROM DUAL
CONNECT BY ROWNUM BETWEEN 1 AND 4000;
ALTER TABLE megacrap ADD(CONSTRAINT pk_megacrap PRIMARY KEY(mega_id));

Updated with this MERGE:

MERGE INTO megacrap m
USING (
SELECT
NVL(v1.mega_id,t1.mega_id) AS mega_id
, v1.mega_varchar
, v1.mega_date
FROM megacrap t1
FULL JOIN (
SELECT
LEVEL AS mega_id
, TO_CHAR(TRUNC(SYSDATE)-LEVEL,'MM/DD/YYYY') AS mega_varchar
, DECODE(LEVEL,420,NULL,TRUNC(SYSDATE,'YEAR')+LEVEL) AS mega_date
FROM DUAL
CONNECT BY ROWNUM BETWEEN 1 AND 4100 --- 100 new rows
) v1 ON v1.mega_id = t1.mega_id
WHERE v1.mega_id IS NULL OR v1.mega_id NOT IN(1,2,69) --- delete 1,2,69
) u
ON (m.mega_id = u.mega_id)
WHEN MATCHED THEN
UPDATE SET
m.mega_varchar = u.mega_varchar
, m.mega_date = u.mega_date
WHERE DECODE(m.mega_varchar,u.mega_varchar,1,0) = 0 OR DECODE(m.mega_date,u.mega_date,1,0) = 0
DELETE WHERE m.mega_varchar IS NULL
WHEN NOT MATCHED THEN
INSERT VALUES (
u.mega_id
, u.mega_varchar
, u.mega_date
);

Guess what the plan looks like: MERGE STATEMENT

  • MERGE MEGACRAP
    • VIEW
      • HASH JOIN OUTER
        • VIEW
          • UNION ALL
            • HASH JOIN RIGHT OUTER
              • VIEW
                • COUNT
                  • CONNECT BY WITHOUT FILTERING
                    • FAST DUAL
              • INDEX PK_MEGACRAP FAST FULL SCAN
            • NESTED LOOPS ANTI
              • VIEW
                • COUNT
                  • CONNECT BY WITHOUT FILTERING
                    • FAST DUAL
              • INDEX PK_MEGACRAP UNIQUE SCAN
        • TABLE ACCESS MEGACRAP FULL

Mother optimizer decided a UNION ALL was the best option. Now which SQL statement do you want to code and maintain? The FULL JOIN above or the UNION ALL with GROUP BY or UNION ALL with GROUP BY and ANTI JOIN?

Oracle Database SQL Language Reference 11g Release 2 (11.2)