## Wednesday, 8 August 2012

### Fun with SQL analytical functions

I had an interesting SQL problem at work recently, and I came up with a solution that I'm not sure is completely ideal - so I thought I'd attempt to replicate it here.

Some of the complexity is lost while I obscure and simplify the problem (it involved a hierarchical query), but I think the key elements remain.

Consider a table of codes with an order sequence.
```create table my_values (code varchar2(10), order_seq number(5));
insert into my_values values ('A', 10);
insert into my_values values ('A', 20);
insert into my_values values ('A', 30);
insert into my_values values ('B', 40);
insert into my_values values ('B', 50);
insert into my_values values ('C', 60);
insert into my_values values ('C', 70);
insert into my_values values ('C', 80);
insert into my_values values ('D', 90);```
My requirement was to see the next and previous code. For instance, when listing B records, I wanted to see A and C in the same row - the solution screamed analytical functions so I started with my favourite:
```select code, order_seq
,row_number() over (partition by code order by order_seq) rn
from my_values
order by order_seq;

CODE       ORDER_SEQ RN
---------- --------- --
A                 10  1
A                 20  2
A                 30  3
B                 40  1
B                 50  2
C                 60  1
C                 70  2
C                 80  3
D                 90  1```
I struck the results down to just the first row for each code, incorporating lag/lead to get the info I needed.
```select s.code
,lag(s.code) over (order by s.order_seq) my_lag
from
(
select code, order_seq
,row_number() over (partition by code
order by order_seq) rn
from my_values
order by order_seq
) s
where s.rn = 1
order by order_seq;

---------- ---------- ----------
A                     B
B          A          C
C          B          D
D          C                    ```
Finally, to combine my results I created an in-line view with a subquery factoring statement.
```with sub as (
select s.code
,lag(s.code) over (order by s.order_seq) my_lag
from
(
select code, order_seq
,row_number() over (partition by code
order by order_seq) rn
from my_values
order by order_seq
) s
where s.rn = 1)
from my_values m, sub
where m.code = sub.code;

---------- --------- ---------- ----------
A                 10            B
A                 20            B
A                 30            B
B                 40 A          C
B                 50 A          C
C                 60 B          D
C                 70 B          D
C                 80 B          D
D                 90 C                     ```
So my question is - any ideas for a simpler solution?

Maxim said...

Not really simpler, just another way

select
code
,order_seq
,last_value(preceding ignore nulls) over(order by order_seq) preceding
,last_value(following ignore nulls) over(order by order_seq desc) following
from (
select t.*
, nullif(lead(code) over(order by order_seq),code) following
, nullif(lag(code) over(order by order_seq),code) preceding
from my_values t
) t
order by order_seq

Best regards

Maxim

Timo Raitalaakso said...

The Maxims solution may be improved to do only two window sorts:

select
code
,order_seq
,last_value(preceding ignore nulls) over(order by order_seq) preceding
,first_value(following ignore nulls) over(order by order_seq rows between current row and unbounded following) following
from (
select t.*
, nullif(lead(code) over(order by order_seq),code) following
, nullif(lag(code) over(order by order_seq),code) preceding
from my_values t
) t
order by order_seq
;

And with a dirty interpretation of the test data it is possible with only one window sort:

select t.*
, nullif(first_value(code) over(order by ascii(code) range 1 preceding),code) pr
, nullif(last_value(code) over(order by ascii(code) range between current row and 1 following),code) fo
from my_values t
;