聚合函数
描述
聚合函数对跨行值进行操作,以执行数学计算,例如求和、平均值、计数、最小值/最大值、标准差和估计,以及一些非数学操作。
语法
aggregate_function(input1 [, input2, ...]) FILTER (WHERE boolean_expression)
参数
-
聚合函数
请参阅内置聚合函数文档,以获取 Spark 聚合函数的完整列表。
-
布尔表达式
指定任何计算结果为布尔类型值的表达式。两个或多个表达式可以使用逻辑运算符(AND、OR)组合在一起。
示例
请参阅内置聚合函数文档,以获取 Spark 聚合函数的所有示例。
有序集聚合函数
这些聚合函数与其它聚合函数使用不同的语法,以便指定一个表达式(通常是列名)来对值进行排序。
语法
{ PERCENTILE_CONT | PERCENTILE_DISC }(percentile) WITHIN GROUP (ORDER BY { order_by_expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [ , ... ] }) FILTER (WHERE boolean_expression)
参数
-
百分位数
您要查找的值的百分位数。百分位数必须是介于 0.0 和 1.0 之间的常数。
-
order_by_expression
在聚合值之前用于对其进行排序的表达式(通常是列名)。
-
布尔表达式
指定任何计算结果为布尔类型值的表达式。两个或多个表达式可以使用逻辑运算符(AND、OR)组合在一起。
示例
CREATE OR REPLACE TEMPORARY VIEW basic_pays AS SELECT * FROM VALUES
('Diane Murphy','Accounting',8435),
('Mary Patterson','Accounting',9998),
('Jeff Firrelli','Accounting',8992),
('William Patterson','Accounting',8870),
('Gerard Bondur','Accounting',11472),
('Anthony Bow','Accounting',6627),
('Leslie Jennings','IT',8113),
('Leslie Thompson','IT',5186),
('Julie Firrelli','Sales',9181),
('Steve Patterson','Sales',9441),
('Foon Yue Tseng','Sales',6660),
('George Vanauf','Sales',10563),
('Loui Bondur','SCM',10449),
('Gerard Hernandez','SCM',6949),
('Pamela Castillo','SCM',11303),
('Larry Bott','SCM',11798),
('Barry Jones','SCM',10586)
AS basic_pays(employee_name, department, salary);
SELECT * FROM basic_pays;
+-----------------+----------+------+
| employee_name|department|salary|
+-----------------+----------+------+
| Anthony Bow|Accounting| 6627|
| Barry Jones| SCM| 10586|
| Diane Murphy|Accounting| 8435|
| Foon Yue Tseng| Sales| 6660|
| George Vanauf| Sales| 10563|
| Gerard Bondur|Accounting| 11472|
| Gerard Hernandez| SCM| 6949|
| Jeff Firrelli|Accounting| 8992|
| Julie Firrelli| Sales| 9181|
| Larry Bott| SCM| 11798|
| Leslie Jennings| IT| 8113|
| Leslie Thompson| IT| 5186|
| Loui Bondur| SCM| 10449|
| Mary Patterson|Accounting| 9998|
| Pamela Castillo| SCM| 11303|
| Steve Patterson| Sales| 9441|
|William Patterson|Accounting| 8870|
+-----------------+----------+------+
SELECT
department,
percentile_cont(0.25) WITHIN GROUP (ORDER BY salary) AS pc1,
percentile_cont(0.25) WITHIN GROUP (ORDER BY salary) FILTER (WHERE employee_name LIKE '%Bo%') AS pc2,
percentile_cont(0.25) WITHIN GROUP (ORDER BY salary DESC) AS pc3,
percentile_cont(0.25) WITHIN GROUP (ORDER BY salary DESC) FILTER (WHERE employee_name LIKE '%Bo%') AS pc4,
percentile_disc(0.25) WITHIN GROUP (ORDER BY salary) AS pd1,
percentile_disc(0.25) WITHIN GROUP (ORDER BY salary) FILTER (WHERE employee_name LIKE '%Bo%') AS pd2,
percentile_disc(0.25) WITHIN GROUP (ORDER BY salary DESC) AS pd3,
percentile_disc(0.25) WITHIN GROUP (ORDER BY salary DESC) FILTER (WHERE employee_name LIKE '%Bo%') AS pd4
FROM basic_pays
GROUP BY department
ORDER BY department;
+----------+-------+--------+-------+--------+-----+-----+-----+-----+
|department| pc1| pc2| pc3| pc4| pd1| pd2| pd3| pd4|
+----------+-------+--------+-------+--------+-----+-----+-----+-----+
|Accounting|8543.75| 7838.25| 9746.5|10260.75| 8435| 6627| 9998|11472|
| IT|5917.75| NULL|7381.25| NULL| 5186| NULL| 8113| NULL|
| Sales|8550.75| NULL| 9721.5| NULL| 6660| NULL|10563| NULL|
| SCM|10449.0|10786.25|11303.0|11460.75|10449|10449|11303|11798|
+----------+-------+--------+-------+--------+-----+-----+-----+-----+