Window funcitons in SQL
Syntax
of Window functions:
<<WindowFunction>>
OVER
(
PARTITION
BY <<List of columns>>
ORDER
BY <<List of columns>> <<ASC|DESC>>
ROWS|RANGE
---This is optional and default is RANGE
)
<<WindowFunciton>>
specifies which window function we need to use
OVER
clause require three items,
a.
Partitioning criteria
b.
Ordering of the data Ascending or Descending
c.
ROWS or RANGE, its an optional filter and default is RANGE BETWEEN
UNBOUNDED PRECEDING AND CURRENT ROW.
We
have many window functions in SQL:
Window
Function |
Description |
ROW_NUMBER |
It
will give consecutive numbers to the specified partition starting
from 1 |
RANK |
It
will give consecutive numbers to DISTINCT values for the specified
partition starting from 1 and it will skip the numbers in between.
You can clearly understand this with the below example |
DENSE_RANK |
It
will give consecutive numbers to DISTINCT values for the specified
partition starting from 1 and it won't skip the numbers in
between. |
NTILE(N) |
It
will divide the given partition into 'N' number of parts |
FIRST_VALE |
It
will give the first value in given partitioning |
LAST_VALUE |
It
will give the last value in given partitioning |
LEAD |
It
will give the next value in given partition and NULL value if no
next value is present |
LAG |
It
will give the previous value in given partition and NULL value if
no previous value is present |
Let's
create a table to understand these functions:
CREATE
TABLE
dbo.DistrictSales
(
SalesID
INT
NOT
NULL
IDENTITY(1,
1),
SalesGroup
NVARCHAR(30)
NOT
NULL,
DistrictName
NVARCHAR(30)
NOT
NULL,
SalesAmount
INT
NOT
NULL
);
GO
INSERT
INTO
dbo.DistrictSales
(SalesGroup,
DistrictName,
SalesAmount)
VALUES
('Telangana',
'Karimnagar',
22000),
('Telangana',
'Hyderabad',
32000),
('Telangana',
'Warangal',
28000),
('Telangana',
'Nalgonda',
18000),
('Telangana',
'RangaReddy',
28000),
('Telangana',
'Medak',
21000),
('Telangana',
'Khammam',
18000),
('AndhraPradesh',
'Guntur',
19000),
('AndhraPradesh',
'Vizag',
22000),
('AndhraPradesh',
'Vijaywada',
18000),
('AndhraPradesh',
'Srikakulam',
16000),
('AndhraPradesh',
'Amaravathi',
16000),
('AndhraPradesh',
'Ananthpur',
32000),
('TamilNadu',
'Kanyakumari',
17000),
('TamilNadu',
'Chennai',
19000),
('TamilNadu',
'Coimbatore',
22000);
GO
SELECT
SalesGroup
,DistrictName
,SalesAmount
,ROW_NUMBER()
OVER(PARTITION
BY
SalesGroup
ORDER
BY
SalesAmount
DESC)
AS
RowNUmber
,RANK()
OVER(PARTITION
BY
SalesGroup
ORDER
BY
SalesAmount
DESC)
AS
BasicRank
,DENSE_RANK()
OVER(PARTITION
BY
SalesGroup
ORDER
BY
SalesAmount
DESC)
AS
DenseRank
,NTILE(3)
OVER(PARTITION
BY
SalesGroup
ORDER
BY
SalesAmount)
AS
NTil
,FIRST_VALUE(SalesAmount)
OVER(PARTITION
BY
SalesGroup
ORDER
BY
SalesAmount
DESC)
AS
FirstValue
,LAST_VALUE(SalesAmount)
OVER(PARTITION
BY
SalesGroup
ORDER
BY
SalesAmount
DESC
ROWS
BETWEEN
UNBOUNDED
PRECEDING
AND
UNBOUNDED
FOLLOWING)
AS
LastValue
,LEAD(SalesAmount,
1)
OVER(PARTITION
BY
SalesGroup
ORDER
BY
SalesAmount
DESC)
AS
NextValue
,LAG(SalesAmount,
1)
OVER(PARTITION
BY
SalesGroup
ORDER
BY
SalesAmount
DESC)
AS
PrevValue
FROM
dbo.DistrictSales
ORDER
BY
SalesGroup,
SalesAmount
DESC
If
we see the above screen shot,
RowNumber
gave sequence number to the given partition.
Rank
gave sequence numbers to the given partition but the rank will be
repeated if the same number is repeated in the partition and next
number will be skipped. For example same number is repeated twice
then it will give same rank to them and will skip next two numbers
while giving the rank to the next distinct value. You can see the
highlighted section for this scenario.
DenseRank
is same as RANK but won't skip the number in between.
Ntile
divides the given partition into specified number of groups.
Remaining
functions we can easily understand by seeing the result set.
If
you observe the query you will identify some difference in LAST_VALUE
syntax compared to other window functions.
SELECT
SalesGroup
,DistrictName
,SalesAmount
,LAST_VALUE(SalesAmount)
OVER(PARTITION
BY
SalesGroup
ORDER
BY
SalesAmount
DESC)
AS
LastValue
,LAST_VALUE(SalesAmount)
OVER(PARTITION
BY
SalesGroup
ORDER
BY
SalesAmount
DESC
ROWS
BETWEEN
UNBOUNDED
PRECEDING
AND
UNBOUNDED
FOLLOWING)
AS
ActualLastValue
FROM
dbo.DistrictSales
ORDER
BY
SalesGroup,
SalesAmount
DESC
Result:
If
you see the result set, in LastValue column data is changing every
time where as in ActualLastValue column it is not for the specified
partition.
If
you remember the syntax, the default value for the partition is RANGE
BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. As per this
LastValue will be calculated from the current row to previous rows
available in the partition that is the reason you are seeing
different values in LastValue column but not in ActualLastValue
column.
For
ActualLastValue column we are specifying ROWS
BETWEEN
UNBOUNDED
PRECEDING
AND
UNBOUNDED
FOLLOWING
that means it will consider all the values
available in that partition unlike considering the values available
till the current row in that partition.
By using Window funcitons we can calculate cumilative sum/count as well as incremental sum/count till the current row.
SELECT
SalesGroup
,SalesAmount
,COUNT(SalesAmount)
OVER(PARTITION
BY
SalesGroup)
AS
CumilativeCount
,COUNT(SalesAmount)
OVER(PARTITION
BY
SalesGroup
ORDER
BY
SalesAmount
DESC)
AS
CountTillCurrentRow_Default
,COUNT(SalesAmount)
OVER(PARTITION
BY
SalesGroup
ORDER
BY
SalesAmount
DESC
ROWS
BETWEEN
UNBOUNDED
PRECEDING
AND
CURRENT
ROW)
AS
CountTillCurrentRow
,SUM(SalesAmount)
OVER(PARTITION
BY
SalesGroup)
AS
CumilativeSum
,SUM(SalesAmount)
OVER(PARTITION
BY
SalesGroup
ORDER
BY
SalesAmount
DESC)
AS
SumTIllCurrentRow_Default
,SUM(SalesAmount)
OVER(PARTITION
BY
SalesGroup
ORDER
BY
SalesAmount
DESC
ROWS
BETWEEN
UNBOUNDED
PRECEDING
AND
CURRENT
ROW)
AS
SumTIllCurrentRow
FROM
dbo.DistrictSales
Result:
Labels: analytical funcitons in sql, denserank, first value in sql, lag in sql, last value in sql, lead in sql, rank, rownumner, Window functions in sql
0 Comments:
Post a Comment
Subscribe to Post Comments [Atom]
<< Home