Syntax:
Function(arg1,..., argn)
OVER ( [PARTITION BY <...>] [ORDER BY <....>]
[<window_clause>] )
Your Choice of Window
An analytic function might or might not include a windowing
clause. A windowing clause is a set of parameters or keywords that defines the
group (or window) of rows within a particular partition that will
be evaluated for analytic function computation.
Narrowing Your Viewpoint
In its most basic form, a window can be specified in one of three mutually
exclusive ways. Table 1 shows the types of parameters that can be passed to the
ROWS or RANGE windowing clauses.
Windowing Clause Parameter
|
Description
|
current row
|
The window begins and ends with the current
row being processed.
|
UNBOUNDED PRECEDING
|
The window begins with the first row of the
current partition and ends with the current row being processed.
|
numeric expression PRECEDING
|
ROWS clause – The window begins with the
row that is numeric expression rows preceding the current row and ends with
the current row being processed.
RANGE clause – The window begins with the row whose ORDER BY value is numeric expression rows less than, or preceding, the current row and ends with the current row being processed. |
Table 1: Windowing clause parameters
Sql to find top nth salary
select * from (
select empno, ename, job, sal, dense_rank() over (order by sal) ra_nk
from emp
)
where ra_nk = :n_th_sal
select * from (
select empno, ename, job, sal, rank() over (order by sal) ra_nk
from emp
)
where ra_nk = :n_th_sal
Note:
·
rank
skip the sequence if it finds more than one matching where as dense rank wont
skip the sequence.
·
Analytical
functions not allowed in WHERE or HAVING clauses of a select statement
·
In
order by clause on analytical statement, we can use NULLS LAST, NULLS FIRST to
display nulls at the end or beginning
ROW_NUMBER
select ename, job, sal, row_number() over (order by 1) row_num
from emp
NTILE
select first_name, last_name, salary, ntile (4) over ( order by salary)
from employee
How
NTILE Works:
In the above query we are using order by salary in
ascending order. In the table employee we have in total 11 rows. 11/4 gives
nearly 4.
So first 4 rows assigned with 1, next 4 rows with 2, next
4 rows with 3 and remaining rows have 4 has NTILE value.
PERCENT_RANK
select
first_name, last_name, salary, rank() over (order by salary) ra_nk, percent_rank() over ( order by salary) p_r
from employee
Percent_rank (PR) = (Rank-1)/(Number of
rows-1) is the formula used for calculation of PR.
CUME_DIST
select
first_name, last_name, salary, rank() over (order by salary) ra_nk, percent_rank () over (order by salary) p_r,round (cume_dist() over ( order by salary), 2) c_d
from employee
Cumulative Distribution = the highest rank
for that row (cr)/number of rows (nr)
Sql to compare salary of employee with avg.sql
department wise
select empno, ename, sal, deptno, round(avg(sal) over (partition by deptno order by deptno),2) avg_sal , sal - round(avg(sal) over (partition by deptno order by deptno),2) diff,
sign ( sal - round(avg(sal) over (partition by deptno order by deptno),2)) si_gn,
decode( sign (sal - round(avg(sal) over (partition by deptno order by deptno),2)), -1, 'Less Than Avg. Sal', 'More Than Avg.Sal') sal_status
from emp
SQL uses PRECEDING & FOLLOWING Clauses
select ename, sal, deptno, sum(sal) over (partition by deptno order by sal rows 1 preceding) sum_rows
from emp
select ename, sal, deptno, sum(sal) over (partition by deptno order by sal rows between 0 preceding and 1 following ) sum_rows
from emp
Sql to find what fraction of total salary goes to
each individual
select ename, sal, deptno, round ( ratio_to_report(sal) over () , 2) ratio
from emp
Sql uses FIRST_VALUE - LAST_VALUE
select ename, sal, deptno, first_value(sal) over (order by deptno rows between 2 preceding and 1 following) f_val
from emp
select ename, sal, deptno, last_value(sal) over (order by deptno rows between 2 preceding and 1 following) f_val
from emp
SQL to find Running Total till the current row
select empno, ename, deptno, sal, sum(sal) over ( order by sal rows between unbounded preceding and current row) running_total
from
emp
SQL to find Running Total of all the rows
select empno, ename, deptno, sal, sum(sal) over ( order by sal rows between unbounded preceding and unbounded following) running_total
from
emp
Note: the unbounded following clause
fetches rows till the last row. Combination of unbounded preceding and
unbounded following works like a sum() on the given column
SQL to find Running Total from current row to till
the last row
select empno, ename, deptno, sal, sum(sal) over ( order by sal rows between current row and unbounded following) running_total
from
emp
Write a SQL to display ename, sal, previous salary
and next salary from emp table
select ename, job, deptno, sal, lag(sal) over (order by sal) prev_sal, lead(sal) over (order by sal) next_sal
from emp
LISTAGG in 11g
LISTAGG ( column | expression,
delimiter ) WITHIN GROUP (ORDER BY column | expression)
LISTAGG performs as an analytic function if you add an OVER clause:
OVER (PARTITION BY column | expression)
select department_id, listagg(first_name,'-') within group (order by first_name)
from employee
group by
department_id
select first_name || '-' || last_name, salary, department_id, listagg(first_name,'-') within group (order by first_name) over (partition by department_id) listagg_analytic
from employee
PIVOT & UNPIVOT
Sql to display sum(sal)
department wise in column format
select department_id, sum(salary)
from employee
group by
department_id
Sql to display sum(sal)
department wise in row format
select * from (
select department_id, salary from employee
)
pivot (sum(salary) for department_id
in (10,20,30,40,50))
select * from (
select department_id, salary, to_char(hire_date, 'dd-mon-yyyy') from employee
)
pivot (sum(salary) for department_id in (10,20,30,40,50))
Displaying the sum total
salaries of employees per department for a particular year
select * from (
select department_id, salary, to_char(hire_date, 'yyyy') hire_date from employee
)
pivot (sum(salary) for (department_id,hire_date) in ((10,2004),(20,2008),(30,2010),(40,2007))
Pivoting on and displaying
multiple aggregate columns
select * from (
select department_id, salary, hire_date from employee
)
pivot (sum(salary) sal, max(hire_date) hdate for (department_id) in (10,20,30,40))
Unpivoting the pivoted Data
CREATE TABLE pivoted_emp_data AS
select *
from (select department_id, hire_date, salary
from employee)
PIVOT (SUM(salary) sum_sals,
MAX(hire_date) latest_hire
FOR department_id IN (10 AS Acc, 20 AS Pay, 30 AS IT, NULL))
select hire_date, salary
from pivoted_emp_data
UNPIVOT INCLUDE NULLS
((hire_date, salary)
FOR department_id IN (
(acc_latest_hire, acc_sum_sals) AS 'Accounting',
(pay_latest_hire, pay_sum_sals) AS 'Payroll',
(it_latest_hire, it_sum_sals) AS 'IT',
(null_latest_hire, null_sum_sals) AS 'Unassigned'
))
order by
hire_date, salary
SQL to find the count of records by using GROUP BY
select count(*), certified, region
from employ
group by
certified, region
ROLLUP:
Along with the above results if we want subtotal
with respect to each group and overall total then we have to use ROLLUP in
GROUP BY Clause
select count(*), certified, region
from employ
group by rollup(certified,region)
Total of n,Y records -> 3+4 = 7
|
Total of n records -> 3+1 = 4
|
Total of Y records -> 2+1 = 3
|
Displaying the above result along with its row
numbers or serial numbers
select rownum rno , count(*), certified, region
from employ
group by rollup(certified,region)
The above sql gives
“ORA-00979: not a GROUP BY expression” Error so we have to go for row_number
analytic function.
select row_number() over (order by certified, region) r_no , count(*), certified, region
from employ
group by rollup(certified,region)
Note: Every time
we use row_number(), we have to specify the order by clause after over
statement row_number() over (order by
certified, region), in absence of this
statement we will get error as ORA-30485: missing ORDER BY expression in the
window specification
CUBE (A ROLLUP Extension):
If we observe the above result after using ROLLUP,
we see sub totals calculated only for the first column in group by statement
i.e. CERTIFIED, Now if we want sub totals for all the columns in the group by
we have to go for CUBE.
select row_number() over (order by certified, region) r_no , count(*), certified,
region
from employ
group by cube(certified,region)
GroUPinG WITH rollUP AND CUbe
If we observe the above image, row no #3, 6, 7, 8, 9 have
null values either in certified or in region columns. This means they are
summary columns.
GROUPING function helps to flag these rows which contain nulls.
After flagging these values we can use wither DECODE or CASE to change null to
some meaningful text.
CUBE + GROUPING
select row_number() over (order by certified, region) r_no , count(*), certified, region, grouping(certified), grouping(region)
from employ
group by cube(certified,region)
select row_number() over (order by certified, region) r_no , count(*), certified, region, grouping(certified), grouping(region)
,decode(grouping(certified),0,null,'Count By Certified') "Count Certified" , decode(grouping(region),0,null,'Count By Region') "Count
Region"
from employ
group by cube(certified,region)
ROLLUP + GROUPING
select row_number() over (order by certified, region) r_no , count(*), certified, region, grouping(certified), grouping(region)
,decode(grouping(certified),0,null,'Count By Certified') "Count Certified" , decode(grouping(region),0,null,'Count By Region') "Count
Region"
from employ
group by rollup(certified,region)