Monday 19 October 2009

Building simple test scenarios

When I talk to people learning about how to write queries in Oracle - particularly business analysts - a typical conversation would consist of basics such as single row functions, joins and aggregates. However when the subject of DDL is raised, such as CREATE TABLE, I often get the reaction - "Oh, I don't need to know that".

I beg to differ. Regularly I create little tables to test theories with; confirm behaviour; and demonstrate test cases - in fact my schema is littered with them. I think this is an important skill, particularly the latter. If you're having trouble getting some task done, it is often helpful to simplify the problem to the lowest common denominator - a basic test case. Colleagues, DBAs, Oracle support - they'll all ask for the same thing. Visit asktom.oracle.com, he requests and uses them all the time.

So I think one should fluent with creating and dropping small test tables. Take the example of demonstrating the handling of nulls by the AVG group function.
-- always good to have drop at the top so you can repeat the test from scratch
-- ... and clean up your schema!
drop table avg_test;
create table avg_test (a number);
insert into avg_test values (1);
insert into avg_test values (2);
insert into avg_test values (3);
insert into avg_test values (null);
select sum(a)
      ,avg(a)
      ,avg(nvl(a,0))
      ,sum(a)/count(a) my_avg
      ,sum(a)/count(*) my_Avg2
      ,count(a)
      ,count(*)
      ,count(1)
      ,count(null)
from avg_test;

    SUM(A)     AVG(A) AVG(NVL(A,0))     MY_AVG    MY_AVG2
---------- ---------- ------------- ---------- ----------
         6          2           1.5          2        1.5

  COUNT(A)   COUNT(*)   COUNT(1) COUNT(NULL)
---------- ---------- ---------- -----------
         3          4          4           0
So by creating a basic table with just one column, I can insert four rows for this particular case that allows me to demonstrate & confirm the behaviour of avg vs sum/count; and how count(*) differs from count of a particular field. This obviously harmonises with the oft forgot sentence in the documentation:
If you specify the asterisk (*), then this function returns all rows, including duplicates and nulls.

With this in mind, I can then relate back to my actual data and see how the following two queries can differ in behaviour just because of the expression I've used within the count function.
SELECT  count(*), daily_rate
FROM resources
GROUP by daily_rate
HAVING COUNT(*) > 1;

  COUNT(*) DAILY_RATE
---------- ----------
         2        100
         4
         2        120

SELECT  count(daily_rate), daily_rate
FROM resources
GROUP by daily_rate
HAVING COUNT(daily_rate) > 1;

COUNT(DAILY_RATE) DAILY_RATE
----------------- ----------
                2        100
                2        120
We've always got to consider nulls, even with aggregated functions!

In the context of a person learning how Oracle behaves, the simple process of creating small test tables can be quite valuable.

No comments: