A lot of the explanations are too complicated. Its really simple.
With window functions, it really just allows you to show the results of the same grouped result on multiple lines. For example, I want to show the entire sum of money for each state, and everybody in the state. I want to show one person per line and the repeating sum of money the person is on that state. Thus, the group function is calculated and displayed for each person in the state. Normally the sum the money for a state us displayed on one line. The state and then the sum. But with window functions you can apply the result of the sum to more than one line --- For each person in the state. An example is shown below plus more window functions.
Why can this be bad? It might be bad because it has to calculate the sums, and then
redo the results with the sum in multiple lines. This might cause it so be slower.
I think they optimize this, but I am sure in some cases it can slow things down.
create database if not exists temp; use temp; drop database if exists MY_STUPID_DATABASE; create database if not exists MY_STUPID_DATABASE; use MY_STUPID_DATABASE; create table income (person varchar(64), state varchar(64), amount int); insert into income values ('mark', 'CA', 10), ('john', 'CA', 20), ('heidi','CA', 30), ('Melisa', 'MA', 100), ('Faith', 'MA', 200),('Queen', 'MA', 300), ('Collin', 'IN', 51), ('Tristan', 'OH', 61),('John', 'MA', 71);
For example:
SELECT person, state, SUM(amount) OVER (PARTITION BY state) as state_income FROM income; SELECT person, state, count(amount) OVER (PARTITION BY state) as state_no_of_entries FROM income;
SELECT person, state, amount, CUME_DIST() OVER w AS 'CUME_DIST', DENSE_RANK() OVER w AS 'DENSE_RANK', PERCENT_RANK() OVER w AS 'PERCENT_RANK', RANK() OVER w AS 'RANK', ROW_NUMBER() OVER w AS 'ROW_NUMBER' from income WINDOW w AS (ORDER BY amount) ; SELECT person, state, amount, FIRST_VALUE(amount) OVER w AS 'FIRST_VALUE', LAST_VALUE(amount) OVER w AS 'LAST_VALUE', LAG(amount) OVER w AS 'lag', LEAD(amount) OVER w AS 'lead', NTH_VALUE(amount, 2) OVER w AS 'NTH_VALUE', NTILE(4) OVER w AS 'NTILE' from income WINDOW w AS (PARTITION BY state ORDER BY amount) ; SELECT person, state, amount, CUME_DIST() OVER (w ORDER BY amount) AS 'CUME_DIST', FIRST_VALUE(amount) OVER w AS 'FIRST_VALUE' from income WINDOW w AS (PARTITION BY state) ; SELECT person, state, amount, CUME_DIST() OVER w AS 'CUME_DIST', FIRST_VALUE(amount) OVER w2 AS 'FIRST_VALUE' from income WINDOW w AS (order BY state), w2 AS (PARTITION BY state);