SQL: Window Functions

by Mark Nielsen
Copyright July 2021


The purpose of this document is to show how to do window functions. It is also to stop people from asking this in interviews as it is very lame and really just a set of functions in MySQL. Are we expected to remember all functions? Yes according to lame people who make lame tests. And yes because some interviews will always have those people. There is so much with MySQL this is only small part. The reason why it is lame is non-DBAs love to ask about these functions because they don't understand its lame and they want to make themselves feel superior. People look at things to ask about MySQL on the internet because they know nothing usually and want to feel superior. So I hope someone can get use of this.

A lot of the explanations are too complicated. Its really simple.


  1. Links
  2. What do window functions do
  3. The MySQL setup.
  4. Window Functioning made with normal (any) group command
  5. New Window Functions

Links



What window functions do.

Window functioning is when you take any group function and display the same result in more than one line. Normally in a valid query the results are shown one result per line. This is true for previous groups functions and what they call Window Functions. I think the name Window Functions is bad.

With window functions, it really just allows you to show the results of the same grouped result on multiple lines. For example, I want to show the entire sum of money for each state, and everybody in the state. I want to show one person per line and the repeating sum of money the person is on that state. Thus, the group function is calculated and displayed for each person in the state. Normally the sum the money for a state us displayed on one line. The state and then the sum. But with window functions you can apply the result of the sum to more than one line --- For each person in the state. An example is shown below plus more window functions.

Why can this be bad? It might be bad because it has to calculate the sums, and then redo the results with the sum in multiple lines. This might cause it so be slower. I think they optimize this, but I am sure in some cases it can slow things down.


MySQL setup

Issue these queries in MySQL prompt

  create database if not exists temp;
  use temp;
  drop database if exists MY_STUPID_DATABASE;

  create database if not exists MY_STUPID_DATABASE;
  use MY_STUPID_DATABASE;
  
  create table income (person varchar(64), state varchar(64), amount int);
  insert into income values ('mark', 'CA', 10), ('john', 'CA', 20), ('heidi','CA', 30),
  ('Melisa', 'MA', 100), ('Faith', 'MA', 200),('Queen', 'MA', 300),
  ('Collin', 'IN', 51), ('Tristan', 'OH', 61),('John', 'MA', 71);
  


Window Functioning made with normal (any) group command

Any group command can be made into function like window functions. You have to use partitions.

For example:

	SELECT person, state, SUM(amount) OVER (PARTITION BY state) as state_income
        FROM income;

	SELECT person, state, count(amount) OVER (PARTITION BY state) as state_no_of_entries
	FROM income;
      


New Window functions

  SELECT person, state, amount,
  CUME_DIST()   OVER w AS 'CUME_DIST',
  DENSE_RANK()   OVER w AS 'DENSE_RANK',
  PERCENT_RANK()   OVER w AS 'PERCENT_RANK',
  RANK()   OVER w AS 'RANK',
  ROW_NUMBER()   OVER w AS 'ROW_NUMBER'
  from income
WINDOW w AS (ORDER BY amount)
;

  SELECT person, state, amount,
  FIRST_VALUE(amount)   OVER w AS 'FIRST_VALUE',
  LAST_VALUE(amount)   OVER w AS 'LAST_VALUE',
  LAG(amount)        OVER w AS 'lag',
  LEAD(amount)       OVER w AS 'lead',
  NTH_VALUE(amount, 2)   OVER w AS 'NTH_VALUE',
  NTILE(4)   OVER w AS 'NTILE'
  
  from income
  WINDOW w AS (PARTITION BY state ORDER BY amount)
  ;

  SELECT person, state, amount,
  CUME_DIST()   OVER  (w ORDER BY amount) AS 'CUME_DIST',
  FIRST_VALUE(amount)   OVER w AS 'FIRST_VALUE'
  from income
  WINDOW w AS (PARTITION BY state)
  ;
  
  SELECT person, state, amount,
  CUME_DIST()   OVER  w AS 'CUME_DIST',
  FIRST_VALUE(amount)   OVER w2 AS 'FIRST_VALUE'
  from income
  WINDOW w AS (order BY state),
         w2 AS (PARTITION BY state);