Monday, December 2, 2013

Analytical Function In ORACLE

Syntax:
Function(arg1,..., argn) OVER ( [PARTITION BY <...>] [ORDER BY <....>] [<window_clause>] )

Your Choice of Window
An analytic function might or might not include a windowing clause. A windowing clause is a set of parameters or keywords that defines the group (or window) of rows within a particular partition that will be evaluated for analytic function computation.
Narrowing Your Viewpoint
In its most basic form, a window can be specified in one of three mutually exclusive ways. Table 1 shows the types of parameters that can be passed to the ROWS or RANGE windowing clauses.
Windowing Clause Parameter
Description
current row
The window begins and ends with the current row being processed.
UNBOUNDED PRECEDING
The window begins with the first row of the current partition and ends with the current row being processed.
numeric expression PRECEDING
ROWS clause – The window begins with the row that is numeric expression rows preceding the current row and ends with the current row being processed.
RANGE clause – The window begins with the row whose ORDER BY value is numeric expression rows less than, or preceding, the current row and ends with the current row being processed.

Table 1:
Windowing clause parameters

Sql to find top nth salary

select * from (
                        select empno, ename, job, sal, dense_rank() over (order by sal) ra_nk
                        from emp
                    )
where ra_nk = :n_th_sal

select * from (
                        select empno, ename, job, sal, rank() over (order by sal) ra_nk
                        from emp
                    )
where ra_nk = :n_th_sal

Note:
·         rank skip the sequence if it finds more than one matching where as dense rank wont skip the sequence.
·         Analytical functions not allowed in WHERE or HAVING clauses of a select statement
·         In order by clause on analytical statement, we can use NULLS LAST, NULLS FIRST to display nulls at the end or beginning
ROW_NUMBER

select ename, job, sal, row_number() over (order by 1) row_num
from emp


NTILE

select first_name, last_name, salary, ntile (4) over ( order by salary)
 from employee

How NTILE Works:

In the above query we are using order by salary in ascending order. In the table employee we have in total 11 rows. 11/4 gives nearly 4.
So first 4 rows assigned with 1, next 4 rows with 2, next 4 rows with 3 and remaining rows have 4 has NTILE value.


PERCENT_RANK

select  first_name, last_name, salary, rank() over (order by salary) ra_nk, percent_rank() over ( order by salary) p_r
from employee
Percent_rank (PR) = (Rank-1)/(Number of rows-1) is the formula used for calculation of PR.

CUME_DIST

select  first_name, last_name, salary, rank() over (order by salary) ra_nk, percent_rank () over (order by salary) p_r,round (cume_dist() over ( order by salary), 2) c_d
from employee
Cumulative Distribution = the highest rank for that row (cr)/number of rows (nr)

Sql to compare salary of employee with avg.sql department wise

select empno, ename, sal, deptno, round(avg(sal) over (partition by deptno order by deptno),2) avg_sal , sal - round(avg(sal) over (partition by deptno order by deptno),2) diff,
sign ( sal - round(avg(sal) over (partition by deptno order by deptno),2)) si_gn,
decode( sign (sal - round(avg(sal) over (partition by deptno order by deptno),2)), -1, 'Less Than Avg. Sal', 'More Than Avg.Sal') sal_status
from emp

SQL uses PRECEDING & FOLLOWING Clauses

select ename, sal, deptno, sum(sal) over (partition by deptno order by sal rows 1 preceding) sum_rows
from emp
select ename, sal, deptno, sum(sal) over (partition by deptno order by sal rows between 0 preceding and 1 following ) sum_rows
from emp

Sql to find what fraction of total salary goes to each individual

select ename, sal, deptno, round ( ratio_to_report(sal) over () , 2) ratio
from emp

Sql uses FIRST_VALUE -  LAST_VALUE

select ename, sal, deptno, first_value(sal) over (order by deptno rows between 2 preceding and 1 following) f_val
from emp
select ename, sal, deptno, last_value(sal) over (order by deptno rows between 2 preceding and 1 following) f_val
from emp

SQL to find Running Total till the current row

select empno, ename, deptno, sal, sum(sal) over ( order by sal rows between unbounded preceding and current row) running_total
from  emp

SQL to find Running Total of all the rows

select empno, ename, deptno, sal, sum(sal) over ( order by sal rows between unbounded preceding and unbounded following) running_total
from  emp
Note: the unbounded following clause fetches rows till the last row. Combination of unbounded preceding and unbounded following works like a sum() on the given column

SQL to find Running Total from current row to till the last row

select empno, ename, deptno, sal, sum(sal) over ( order by sal rows between current row and unbounded following) running_total
from  emp

Write a SQL to display ename, sal, previous salary and next salary from emp table

select ename, job, deptno, sal, lag(sal) over (order by sal) prev_sal, lead(sal) over (order by sal) next_sal
from emp


LISTAGG in 11g

LISTAGG ( column | expression,
delimiter ) WITHIN GROUP (ORDER BY column | expression)

LISTAGG performs as an analytic function if you add an OVER clause:
OVER (PARTITION BY column | expression)

select department_id, listagg(first_name,'-') within group (order by first_name)
from employee
group by department_id
select first_name || '-' || last_name, salary,  department_id, listagg(first_name,'-') within group (order by first_name)  over (partition by department_id) listagg_analytic
from employee

PIVOT & UNPIVOT
Sql to display sum(sal) department wise in column format

select department_id, sum(salary)
from employee
group by department_id

Sql to display sum(sal) department wise in row format
select * from (
                        select department_id, salary from employee
                    )
 pivot (sum(salary) for department_id in (10,20,30,40,50))

select * from (
                        select department_id, salary, to_char(hire_date, 'dd-mon-yyyy') from employee
                    )
 pivot (sum(salary) for department_id in (10,20,30,40,50))

Displaying the sum total salaries of employees per department for a particular year

select * from (
                        select department_id, salary, to_char(hire_date, 'yyyy') hire_date from employee
                    )
 pivot (sum(salary) for (department_id,hire_date) in ((10,2004),(20,2008),(30,2010),(40,2007))

Pivoting on and displaying multiple aggregate columns

select * from (
                        select department_id, salary,  hire_date from employee
                    )
 pivot (sum(salary) sal, max(hire_date) hdate for (department_id) in (10,20,30,40))

Unpivoting the pivoted Data

CREATE TABLE pivoted_emp_data AS
    select *
      from (select department_id, hire_date, salary
              from employee)
      PIVOT (SUM(salary) sum_sals,
             MAX(hire_date) latest_hire
      FOR department_id IN (10 AS Acc, 20 AS Pay, 30 AS IT, NULL))

select hire_date, salary
      from pivoted_emp_data
         UNPIVOT INCLUDE NULLS
          ((hire_date, salary)
             FOR department_id IN (
             (acc_latest_hire, acc_sum_sals) AS 'Accounting',
             (pay_latest_hire, pay_sum_sals) AS 'Payroll',
             (it_latest_hire, it_sum_sals) AS 'IT',
             (null_latest_hire, null_sum_sals) AS 'Unassigned'
        ))
   order by hire_date, salary


SQL to find the count of records by using GROUP BY

select count(*), certified, region
from employ
group by certified, region

ROLLUP:

Along with the above results if we want subtotal with respect to each group and overall total then we have to use ROLLUP in GROUP BY Clause

select count(*), certified, region
from employ
group by rollup(certified,region)
Total of n,Y records -> 3+4 = 7
Total of n records -> 3+1 = 4
Total of Y records -> 2+1 = 3

Displaying the above result along with its row numbers or serial numbers

select rownum rno , count(*), certified, region
from employ
group by rollup(certified,region)

The above sql gives “ORA-00979: not a GROUP BY expression” Error so we have to go for row_number analytic function.

select row_number() over (order by certified, region) r_no , count(*), certified, region
from employ
group by rollup(certified,region)

Note: Every time we use row_number(), we have to specify the order by clause after over statement row_number() over (order by certified, region), in absence of this statement we will get error as ORA-30485: missing ORDER BY expression in the window specification

CUBE (A ROLLUP Extension):
If we observe the above result after using ROLLUP, we see sub totals calculated only for the first column in group by statement i.e. CERTIFIED, Now if we want sub totals for all the columns in the group by we have to go for CUBE.

select row_number() over (order by certified, region) r_no , count(*), certified, 
region
from employ
group by cube(certified,region)

GroUPinG WITH rollUP AND CUbe

If we observe the above image, row no #3, 6, 7, 8, 9 have null values either in certified or in region columns. This means they are summary columns.
GROUPING function helps to flag these rows which contain nulls. After flagging these values we can use wither DECODE or CASE to change null to some meaningful text.

CUBE + GROUPING

select row_number() over (order by certified, region) r_no , count(*), certified, region, grouping(certified), grouping(region)
from employ
group by cube(certified,region)

select row_number() over (order by certified, region) r_no , count(*), certified, region, grouping(certified), grouping(region)
,decode(grouping(certified),0,null,'Count By Certified')  "Count Certified"  , decode(grouping(region),0,null,'Count By Region') "Count Region"
from employ
group by cube(certified,region)

ROLLUP + GROUPING

select row_number() over (order by certified, region) r_no , count(*), certified, region, grouping(certified), grouping(region)
,decode(grouping(certified),0,null,'Count By Certified')  "Count Certified"  , decode(grouping(region),0,null,'Count By Region') "Count Region"
from employ
group by rollup(certified,region)