NTILE() Alternative for Buckets with a Fixed Number of Records
NTILE(x)
is a window function that distributes rows into a specified number of approximately equal buckets. Each row is assigned a bucket number representing the group to which the row belongs. The input into the function is a positive integer which represents the number of buckets into which the rows are evenly divided. Eg: NTILE(100)
will create 100 buckets, with each bucket containing roughly a similar number of rows.
If we want to split a table into buckets comprising an exact number records, then we would need to know exactly how many groups are required to fulfil that equation. For example, if we know the table size is 1,000 rows and we want buckets of exactly 100, then we can deduce that overall we will require 10 buckets in total.
However, what if we do not know the final table size? What if we just want buckets of 100 rows to be selected from an unknown table size? Then we cannot use NTILE() to assign groups, apart from taking an educated guess as to how many buckets there ought to be to evenly distribute rows into groups of 100.
So an alternative to NTILE() is the below mathematical function which will operate similarly.
FLOOR((ROW_NUMBER()OVER(ORDER BY date)-1) /100) +1 AS Bucket
In this function, the output results in assigning rows into buckets of exactly 100 rows. However the eventual total number of buckets is unknown (at least until after the query executes).
Thus, this will output X buckets, each with 100 rows. Contrast this to NITLE which will output 100 buckets, each with X rows.