Window function: returns the cumulative distribution of values within a window partition, i.e. the fraction of rows that are at or below the current row.
cume_dist(x = "missing")

# S4 method for missing
cume_dist()
x: empty. Should be used with no argument.
N = total number of rows in the partition
cume_dist(x) = number of values before (and including) x / N
This is equivalent to the CUME_DIST function in SQL.
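As an illustration of the definition above, here is a minimal base R sketch that computes the same quantity for a single partition, without Spark. The vector `hp` and the variable names are made up for this example and are not part of SparkR; the sketch assumes "values before (and including) x" means values less than or equal to x under an ascending ordering, so tied rows share one result.

```r
# Hypothetical data: one window partition, ordered by this column.
hp <- c(110, 110, 93, 175, 105)

# N = total number of rows in the partition.
n <- length(hp)

# For each row, count the values less than or equal to it, then divide by N.
cume <- sapply(hp, function(v) sum(hp <= v)) / n

# rank() with ties.method = "max" gives the same counts, so this is an
# equivalent way to write the calculation.
cume_rank <- rank(hp, ties.method = "max") / n

print(cume)
```

The two tied rows (110) both receive 4/5, since four of the five values are less than or equal to 110.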
Other window_funcs: dense_rank, lag, lead, ntile, percent_rank, rank, row_number
# NOT RUN {
df <- createDataFrame(mtcars)
ws <- orderBy(windowPartitionBy("am"), "hp")
out <- select(df, over(cume_dist(), ws), df$hp, df$am)
# }