SQL SERVER中PERCENTILE_CONT和PERCENTILE_DISC
PERCENTILE_CONT和PERCENTILE_DISC
看下面一组SQL语句:
WITH testas( select N'LeeWhoeeUniversity' as name,10 as score UNION ALL select N'LeeWhoeeUniversity',20 UNION ALL select N'LeeWhoeeUniversity',30 UNION ALL select N'LeeWhoeeUniversity',40 UNION ALL select N'LeeWhoeeUniversity',50 UNION ALL select N'DePaul',60 UNION ALL select N'DePaul',70 UNION ALL select N'DePaul',80 UNION ALL select N'DePaul',90 UNION ALL select N'DePaul',100)select name,score,PERCENT_RANK() over(partition by name order by score) as per_rnk,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY score) over(partition by name) as percont0_5,PERCENTILE_CONT(0.6) WITHIN GROUP (ORDER BY score) over(partition by name) as percont0_6,PERCENTILE_CONT(0.7) WITHIN GROUP (ORDER BY score) over(partition by name) as percont0_7,PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY score) over(partition by name) as percont0_75,PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY score) over(partition by name) as perdist0_5,PERCENTILE_DISC(0.6) WITHIN GROUP (ORDER BY score) over(partition by name) as perdist0_6,PERCENTILE_DISC(0.7) WITHIN GROUP (ORDER BY score) over(partition by name) as perdist0_7,PERCENTILE_DISC(0.75) WITHIN GROUP (ORDER BY score) over(partition by name) as perdist0_75from test
运行结果:
name score per_rnk percont0_5 percont0_6 percont0_7 percont0_75 perdist0_5 perdist0_6 perdist0_7 perdist0_75
LeeWhoeeUniversity 10 0 30 34 38 40 30 30 40 40
LeeWhoeeUniversity 20 0.25 30 34 38 40 30 30 40 40
LeeWhoeeUniversity 30 0.5 30 34 38 40 30 30 40 40
LeeWhoeeUniversity 40 0.75 30 34 38 40 30 30 40 40
LeeWhoeeUniversity 50 1 30 34 38 40 30 30 40 40
DePaul 60 0 80 84 88 90 80 80 90 90
DePaul 70 0.25 80 84 88 90 80 80 90 90
DePaul 80 0.5 80 84 88 90 80 80 90 90
DePaul 90 0.75 80 84 88 90 80 80 90 90
DePaul 100 1 80 84 88 90 80 80 90 90
简单理解,PERCENT_RANK前面介绍过,求出score百分比后,PERCENTILE_CONT和PERCENTILE_DISC就是根据百分比求出对应的score。但不同的是,某一百分比没有对应的score时,PERCENTILE_CONT会根据百分比的偏差计算出一个新值,此值可能并不存在于score中。PERCENTILE_DISC得出的是偏向于最近百分比对应的score值,因此此值肯定存在于score中。
如PERCENTILE_CONT(0.6)对应的值为34,偏差值我猜测是这样计算出的:百分比75%和50%PERCENTILE_CONT计算出的值分别是40和30.那么等式:(40-30)/(0.75-0.5)=新偏差值/0.6-0.5。PERCENTILE_CONT(0.6)在50%上的偏差值=4。所以,PERCENTILE_CONT(0.6)对应PERCENTILE_CONT(0.5)+4=34。
再看PERCENTILE_DISC(0.6),直接将最近的0.5计算出的,PERCENTILE_DISC值取过来了。