Monday 27 July 2009

UNION ALL - Performance & Accuracy

Time for my first on topic Oracle titbit - UNION ALL

Take the following example. I had two mutually exclusive tables, fairly flat and uncomplicated. Both queries returned 17 rows.

I wanted all rows returned, regardless of uniqueness.

The original solution used UNION.
SELECT col_a, col_b, col_c
FROM table_1
UNION
SELECT col_a, col_b, col_c
FROM table_2

I knew in this case not only was this not optimal, it was also inaccurate. This case should have used UNION ALL
SELECT col_a, col_b, col_c
FROM table_1
UNION ALL
SELECT col_a, col_b, col_c
FROM table_2

Such a minor difference in syntax, but in this case ~30% faster and 34 rows returned.
Had two of these records had the same column values, less than 34 rows would have been returned. Sure, in other scenarios this may be accurate, but in others this may cause data integrity issues, throw out reports or lose data.

In terms of speed, my basic test harness iterated these two queries 50000 times and got the following results:
7.27 secs -- union
.0001454 secs per iteration

5.54 secs -- union all
.0001108 secs per iteration

Should this query only be executed once per day, probably not an issue in terms of speed. However if it were required thousands of times per minute then this would probably make an impact.

I think every little bit helps. Take the anology of Aussie Rules - Australia's favourite sport. Commentators say all those one percenters count, they can make the difference to the end result. Imagine taking 100 separate Oracle efficiency tips and bundling them within your software. Collectively, how much wait time will it ultimately save the user?

No comments: