Wednesday, August 11, 2010

Oracle FULL JOIN or UNION ALL

Someone was kind enough to argue with me recently about doing a UNION ALL instead of a FULL JOIN in Oracle (in a MERGE). This same poop chute also posted a spam link that probably led to some Trojan downloader.

To make a long post short: Oracle usually picks what it thinks is the best way of fetching the data, and it is spot on most of the time. Hints and fresh statistics can be used when mother optimizer gets it wrong.

Consider this table:

CREATE TABLE megacrap AS
SELECT
LEVEL AS mega_id
, TO_CHAR(TRUNC(SYSDATE)-LEVEL,'MM/DD/YYYY') AS mega_varchar
, TRUNC(SYSDATE,'YEAR')+LEVEL AS mega_date
FROM DUAL
CONNECT BY ROWNUM BETWEEN 1 AND 4000;
ALTER TABLE megacrap ADD(CONSTRAINT pk_megacrap PRIMARY KEY(mega_id));

Now update it with this MERGE:

MERGE INTO megacrap m
USING (
SELECT
NVL(v1.mega_id,t1.mega_id) AS mega_id
, v1.mega_varchar
, v1.mega_date
FROM megacrap t1
FULL JOIN (
SELECT
LEVEL AS mega_id
, TO_CHAR(TRUNC(SYSDATE)-LEVEL,'MM/DD/YYYY') AS mega_varchar
, DECODE(LEVEL,420,NULL,TRUNC(SYSDATE,'YEAR')+LEVEL) AS mega_date
FROM DUAL
CONNECT BY ROWNUM BETWEEN 1 AND 4100 --- 100 new rows
) v1 ON v1.mega_id = t1.mega_id
WHERE v1.mega_id IS NULL OR v1.mega_id NOT IN(1,2,69) --- delete 1,2,69
) u
ON (m.mega_id = u.mega_id)
WHEN MATCHED THEN
UPDATE SET
m.mega_varchar = u.mega_varchar
, m.mega_date = u.mega_date
WHERE DECODE(m.mega_varchar,u.mega_varchar,1,0) = 0 OR DECODE(m.mega_date,u.mega_date,1,0) = 0
DELETE WHERE m.mega_varchar IS NULL
WHEN NOT MATCHED THEN
INSERT VALUES (
u.mega_id
, u.mega_varchar
, u.mega_date
);

Guess what the plan looks like:

MERGE STATEMENT
  • MERGE MEGACRAP
    • VIEW
      • HASH JOIN OUTER
        • VIEW
          • UNION ALL
            • HASH JOIN RIGHT OUTER
              • VIEW
                • COUNT
                  • CONNECT BY WITHOUT FILTERING
                    • FAST DUAL
              • INDEX PK_MEGACRAP FAST FULL SCAN
            • NESTED LOOPS ANTI
              • VIEW
                • COUNT
                  • CONNECT BY WITHOUT FILTERING
                    • FAST DUAL
              • INDEX PK_MEGACRAP UNIQUE SCAN
        • TABLE ACCESS MEGACRAP FULL

Mother optimizer decided a UNION ALL was the best option. Now which SQL statement do you want to code and maintain: the FULL JOIN above, a UNION ALL with GROUP BY, or a UNION ALL with GROUP BY and an ANTI JOIN?
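
For comparison, the outer-join-plus-anti-join flavor the optimizer chose would look roughly like this if you had to hand-write the USING block yourself (a sketch only; the 1,2,69 filter is left out for brevity). It is exactly the kind of thing I would rather let the optimizer write for me:

WITH new_rows AS (
SELECT
LEVEL AS mega_id
, TO_CHAR(TRUNC(SYSDATE)-LEVEL,'MM/DD/YYYY') AS mega_varchar
, DECODE(LEVEL,420,NULL,TRUNC(SYSDATE,'YEAR')+LEVEL) AS mega_date
FROM DUAL
CONNECT BY ROWNUM BETWEEN 1 AND 4100
)
--- all existing rows, with new values where they exist
SELECT t1.mega_id, v1.mega_varchar, v1.mega_date
FROM megacrap t1
LEFT JOIN new_rows v1 ON v1.mega_id = t1.mega_id
UNION ALL
--- plus the brand new rows (anti join)
SELECT v1.mega_id, v1.mega_varchar, v1.mega_date
FROM new_rows v1
WHERE NOT EXISTS (SELECT 1 FROM megacrap t1 WHERE t1.mega_id = v1.mega_id);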

Oracle Database SQL Language Reference 11g Release 2 (11.2)

Tuesday, April 20, 2010

The Oracle CBO statistical time bomb

Ever have a query or process that runs fine for years and years and years but then one day just slows to a crawl, going from seconds to hours in execution time? If so, you have run into what I call the Oracle CBO statistical time bomb.

Oracle 10g/11g has a daily job that looks for "stale" and missing optimizer statistics on table and index objects. Those optimizer statistics are what the optimizer uses to choose access paths, join methods, and index versus table scans for a SQL query. The default threshold for "stale" is 10 percent of the rows changed (fixed in 10g; configurable in 11g).
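
If you want to see how close a table is creeping toward that 10 percent threshold, a query along these lines works (a sketch reading USER_TAB_MODIFICATIONS and USER_TAB_STATISTICS; you may need to run DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO first so the modification counts are current):

SELECT
s.table_name
, s.num_rows
, m.inserts + m.updates + m.deletes AS changed_rows
, ROUND(100 * (m.inserts + m.updates + m.deletes) / NULLIF(s.num_rows,0), 2) AS pct_changed
, s.stale_stats
FROM user_tab_statistics s
LEFT JOIN user_tab_modifications m ON m.table_name = s.table_name AND m.partition_name IS NULL
WHERE s.object_type = 'TABLE'
ORDER BY pct_changed DESC NULLS LAST;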

Let's assume we have this really, really big invoice table that grows at a constant rate over time. This table has an index on the financial quarter. Each quarter roughly 25,000 new rows are inserted, and then updated 3-5 times each (as status changes, etc. whatever), and some idiot needs to clean up a couple hundred bogus rows each month. Here is the table for 2010:

Financial  Total      Insert  Update  Delete  Changed  Percent
Quarter    Rows       Rows    Rows    Rows    Rows     Changed
2010 Q1    1,000,000  25,000  80,000  169     105,169  10.51%  - daily job gathers statistics
2010 Q2    1,024,836  25,000  80,000  169     105,169  10.26%  - daily job gathers statistics
2010 Q3    1,049,672  25,000  80,000  169     105,169  10.01%  - daily job gathers statistics
2010 Q4    1,074,508  25,000  80,000  169     105,169   9.78%  - not stale yet, still have 2010 Q3 statistics

Without current statistics, the Oracle cost-based optimizer may decide one index and access method is more efficient than another (when it really, really, really is not).

I'll need an Oracle Guru to correct me but here is what happens in 2010 Q4:

Your super-optimized, meticulously coded query reads the invoices by quarter and by another column whose values come out of a massive subquery.

For 2010 Q1, Q2, and Q3 the CBO sees 25,000 rows behind the quarter index and likely fewer rows behind an index on another column (customer number, perhaps) if it runs the massive subquery first. The CBO gets it right: the massive subquery executes first, then the table is read via the index on the other column.

For 2010 Q4 the CBO thinks there are no (or few) rows for Q4 (when there are really 25,000), so it reads the invoices via the quarter index and runs the massive subquery in a nested loop for each row, because the cost of the quarter-index read times the cost of the massive subquery looks lower than the plan above (when it really is not). Instead of doing that nested loop on a couple of rows (yawn), it does that nested loop massive subquery 25,000 times. Yikes.

What to do? Plan ahead.

Go to Tahiti and research DBMS_STATS in the Oracle Database PL/SQL Packages and Types Reference. Know your tables and defuse the time bomb before it goes off. Let the daily job handle the truly stale statistics. Tables with an index on a period of time (quarter, year, etc.) or a code based on a period of time (i.e. the Spring 2010 line of Mountain Dew with Ginseng and Xanax) should be researched as candidates for statistics gathering outside of Oracle's daily job. Gather table statistics before the big processing periods.
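
For the invoice example, that boils down to something like this run right before the quarter-end crunch (schema and table names here are hypothetical):

BEGIN
DBMS_STATS.GATHER_TABLE_STATS(
ownname          => 'APP_OWNER'   --- hypothetical schema
, tabname          => 'INVOICE'     --- hypothetical table
, estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE
, cascade          => TRUE          --- gather the index statistics too
);
END;
/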


Oracle and SSIS; ORA-00903 during verification

Can you tell I have been squeezing my pain and hate up into a nice little ball, instead of talking about my feelings while lying in some lush, peaceful green meadow, sipping chamomile tea and popping anti-depressants like candy?

This issue is one of those classic "what the **** is going on here" errors.

Here is what I am suffering with:

  • Windows XP and Windows Server 2003
  • SQL Server 2008 Integration Services
  • Native OLE DB\Oracle Provider for OLE DB, Oracle 11g R1 or Oracle 10g
  • OLE DB Source (Oracle) using a query.

The Problem

Getting ORA-00903 (table or view not found) during the verification step when executing the Data Flow's OLE DB Oracle source, even though at design time (in the OLE DB Source Editor dialog) the SQL parses correctly AND pulls back test data in the source AND the query runs successfully everywhere else using the same credentials.

The Solution

Drink.

Smoke.

Break something.

Then:

  • in Oracle, CREATE a VIEW for the query (SQL sketch after this list)
  • if the query uses a user function, also in Oracle, GRANT EXECUTE ON schema.function TO the view's schema owner WITH GRANT OPTION
  • in Oracle, GRANT SELECT on the VIEW to the user SSIS connects as (if necessary)
  • in SSIS, use the Advanced Editor on the OLE DB source; Component Properties; AccessMode = OpenRowset, OpenRowset = SCHEMA.VIEW (the standard editor will not pull up the view and you are limited to its drop-down)
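
The Oracle side of that, roughly (every name below is hypothetical; swap in your own schemas and objects):

CREATE OR REPLACE VIEW ssis_schema.invoice_extract_v AS
SELECT
t.invoice_id
, func_schema.format_invoice(t.invoice_id) AS invoice_label --- the user function case
, t.invoice_date
FROM app_schema.invoice t
WHERE t.invoice_date >= TRUNC(SYSDATE,'YEAR');

--- needed only because the view calls func_schema.format_invoice
GRANT EXECUTE ON func_schema.format_invoice TO ssis_schema WITH GRANT OPTION;

--- needed only if SSIS logs in as someone other than the view owner
GRANT SELECT ON ssis_schema.invoice_extract_v TO ssis_user;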

There that was painless, right?

I have been told that shutting down Visual Studio sometimes helps (it did not here). Shutting down Visual Studio does work when you switch databases in the connection manager and then get ORA-00903 table or view not found in the OLE DB source. One co-worker stated he rebooted his machine and then things worked (that did not help here either).

I'm sure I was doing something stupid like trying to use a subquery or SYSDATE, but WTF: it parsed, it pulled back test data, and it works everywhere except the verification step when debugging.

****!

Oracle and SSIS; SQL tasks and procedures

I am posting today to hopefully prevent others from the hell I have been through. I really have no idea why I continue to do this as my psyche at this point is so eroded that I should be deriving pleasure from the suffering of others like some serial killer in the making.

Getting Oracle and Microsoft to play nice is like trying to get Sean Hannity and Nancy Pelosi to share a quiet evening together making babies while discussing health care and what to spend their money on. If that reference is dated then consider the choice of wiping with toilet paper or the large grain sandpaper and then using the sandpaper; or perhaps shoving a sea urchin up your nose then pulling it out the other nostril.

Here is what I am suffering with:

  • Windows XP and Windows Server 2003
  • SQL Server 2008 Integration Services
  • Native OLE DB\Oracle Provider for OLE DB, Oracle 11g R1 or Oracle 10g
  • Execute SQL Task trying to execute a PROCEDURE or PACKAGE.PROCEDURE that has input or output parameters

Connection Manager

The only non-default property is RetainSameConnection set to True (connections to Oracle are expensive). Personally, I see no need for resource pooling but that is another post.

Execute SQL Task

General

  • ConnectionType: OLE DB
  • SQLSourceType: Direct input
  • SQLStatement:
    {call your_stupid_package.dumb_ass_procedure(?,?)}
    or
    BEGIN your_stupid_package.dumb_ass_procedure(?,?); END;
  • IsQueryStoredProcedure: False (usually greyed out; it can be set via Properties but has no bearing on anything)

Parameter Mapping; add each parameter:

  • Variable Name: whatever User::variable you have already defined
  • Direction: Input or Output
  • Data Type: VARCHAR (I have not been able to get anything else to work yet)
  • Parameter Name: the left-to-right sequence number of the parameter, starting at zero (so if you only have one parameter, the name is 0)
  • Parameter Size: the size of the variable
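
For reference, here is a hypothetical shape of the package the statements above would call; the parameter names are made up, but both are VARCHAR2 to line up with the VARCHAR mapping, with the first ? mapped as Input (Parameter Name 0) and the second as Output (Parameter Name 1):

CREATE OR REPLACE PACKAGE your_stupid_package AS
PROCEDURE dumb_ass_procedure(p_in IN VARCHAR2, p_out OUT VARCHAR2);
END your_stupid_package;
/

CREATE OR REPLACE PACKAGE BODY your_stupid_package AS
PROCEDURE dumb_ass_procedure(p_in IN VARCHAR2, p_out OUT VARCHAR2) IS
BEGIN
p_out := 'got ' || p_in; --- stand-in logic so the example compiles
END dumb_ass_procedure;
END your_stupid_package;
/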

Conclusion

I would cite references, but the vast majority of searches dead-end with single posts in forums where some poor slob went begging for help and found none; or sometimes found a reply from another kindred spirit suffering through the same agony. Those who did have solutions were not passing parameters, or were using an Expression to build the query string with the parameters embedded (doing their own SQL injection, which also does not help anyone using output parameters).

The above works. With a couple of warnings, but it works.

Thursday, February 25, 2010

Oracle MERGE, type 2 dimension

I decided to look into this topic while learning SQL Server Integration Services and debating with my caffeine-soaked brain whether the slowly changing dimension task would be more efficient than a T-SQL MERGE statement. I began wondering why the hell we were planning on shipping about 40 gigabytes per day out of Oracle into SQL Server tables that are truncated (emptied) and rebuilt, then queried to determine what 2 kilobytes' worth of changes took place that day. Why on earth would we want to stress out that multi-core, refrigerator-sized server when an underpowered server, network, and storage can do all the work?

Fact tables and dimensions are part of data warehousing. A slowly changing dimension is a dimension that slowly changes over time (duh). There are three types of slowly changing dimensions:

  • Type 1: Update the old values with the new values (do not give a crap about history)
  • Type 2: Create a new row in the dimension with a new primary (surrogate) key, and optionally update existing rows
  • Type 3: Update the old values with the new values, and add additional data to the table (date of the change, keep one level of history)

You would figure type 3 would come before type 2, but who cares? This dimension uses a generated surrogate key (a sequence) for uniqueness and begin/end dates to track when each row version was effective; a NULL end date marks the current version of the foreign primary key value, and all other rows hold historical data and attributes.

The same logic used to maintain slowly changing dimensions can also be applied to tracking historical changes to tables.

The code that follows assumes only additions and changes are taking place; after all, nobody ever deletes data, do they? The dimension table is simple (no clustering, partitioning, bitmap indexes, or compression) and is purely for example:

CREATE TABLE crap_dimension
( dim_pk_surrogate NUMBER
, dim_version_beg_date DATE
, dim_version_end_date DATE
, dim_pk_foreign VARCHAR2(10)
, dim_description VARCHAR2(60)
, dim_bulb VARCHAR2(05)
, dim_sum VARCHAR2(08)
);
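
The MERGE below pulls surrogate keys from dim_pk_surrogate_seq, which never gets created above; assume something minimal like this exists (starting at 1, so the 0 placeholder in the USING query can never collide):

CREATE SEQUENCE dim_pk_surrogate_seq START WITH 1 INCREMENT BY 1;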

The MERGE SQL statement is best used for inserts and updates. Here is the single statement used for the entire dimension update:

MERGE /*+ APPEND */
INTO crap_dimension m
USING (
--- Step 1 BEG :: Select new and changed data for the USING section of the MERGE
SELECT
DECODE(v2.dim_upsert_level,1,v2.dim_pk_surrogate,0) AS dim_pk_surrogate
, v1.dim_pk_foreign
, v1.dim_description
, v1.dim_bulb
, v1.dim_sum
FROM (
--- Step 1a BEG :: Collect data from your OLTP or ODS or POS
SELECT
crap_id AS dim_pk_foreign
, crap_desc AS dim_description
, crap_poop AS dim_bulb
, crap_turd AS dim_sum
FROM crap_oltp_or_ods_source
--- Step 1a END
--- Step 1b BEG :: Remove identical data attributes already in the dimension
MINUS
SELECT
dim_pk_foreign
, dim_description
, dim_bulb
, dim_sum
FROM crap_dimension
WHERE dim_version_end_date IS NULL
--- Step 1b END
) v1
--- Step 1c BEG :: two rows are needed for updates (end-date the existing row, insert the new version); LEFT JOIN to a Cartesian product of the current dimension keys and levels 1-2
LEFT JOIN (
SELECT
dim_pk_surrogate
, dim_pk_foreign
, dim_upsert_level
FROM crap_dimension
, (SELECT LEVEL AS dim_upsert_level FROM DUAL CONNECT BY LEVEL BETWEEN 1 AND 2)
WHERE dim_version_end_date IS NULL
) v2 ON v2.dim_pk_foreign = v1.dim_pk_foreign
--- Step 1c END
--- Step 1 END
) u
ON (m.dim_pk_surrogate = u.dim_pk_surrogate)
--- Step 2 BEG :: If update, mark the end date
WHEN MATCHED THEN UPDATE
SET m.dim_version_end_date = SYSDATE
--- Step 2 END
--- Step 3 BEG :: Insert new data and the new versions of changed data
WHEN NOT MATCHED THEN INSERT
VALUES (
dim_pk_surrogate_seq.NEXTVAL
, SYSDATE
, NULL
, u.dim_pk_foreign
, u.dim_description
, u.dim_bulb
, u.dim_sum
)
--- Step 3 END
;

I tried to break down the statement into steps.

Step 1a collects data from your ODS or OLTP and can be a simple query or a gigantic, complicated parallel query with a hundred joins and aggregate steps. The important thing to remember is we only want to do this expensive step once. When 1a completes we will have a fairly large result set.

Step 1b subtracts all identical data attribute rows that already exist in the dimension from the previous result set. This should significantly reduce the working set of rows at the expense of a full table scan through our dimension. When 1b completes the working set will contain new data and changed data from the source tables.

Step 1c creates the data for updates. If data in the source has changed, two rows are needed for the dimension: one to update the existing row and mark it historical, and one to insert as the new current row. The surrogate and foreign primary keys of all current rows are Cartesian joined to the values 1 and 2 (one full table scan plus a hash join instead of two full table scans) and then left joined to the previous result set. New data from the source will not match and remains a single row with a NULL surrogate primary key, which the DECODE in Step 1 turns into 0. Changed data from the source matches and joins to both generated rows, producing the two rows needed for the update and the insert. After the DECODE, the surrogate primary key is 0 for rows to be inserted (assuming the sequence starts at 1 or higher) and carries the existing surrogate value for rows to be updated.

Step 1 finishes with the rows containing changes to be applied to the dimension. Step 2 end-dates the historical rows by simply setting the end date to the current date. Step 3 inserts the new data from the source and the new current version of each changed row.
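
To make Step 1c concrete, here is a made-up example. Say foreign key 'ABC123' is already the current row under surrogate key 57 and its description changes in the source. Step 1b keeps the changed row, Step 1c joins it to both generator levels, and u ends up with:

dim_pk_surrogate  dim_pk_foreign  dim_description
57                ABC123          New description   <- matches, Step 2 end-dates surrogate 57
0                 ABC123          New description   <- no match, Step 3 inserts it with the next sequence value

A brand-new foreign key produces only the surrogate-0 row and is simply inserted.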