集合操作符

描述

集合操作符用于将两个输入关系合并为一个关系。 Spark SQL 支持三种类型的集合操作符

请注意,输入关系必须具有相同数量的列,并且对应列的数据类型必须兼容。

EXCEPT

EXCEPTEXCEPT ALL 返回在一个关系中找到但不在另一个关系中找到的行。 EXCEPT(或者,EXCEPT DISTINCT)仅采用不同的行,而 EXCEPT ALL 不会从结果行中删除重复项。请注意,MINUSEXCEPT 的别名。

语法

[ ( ] relation [ ) ] EXCEPT | MINUS [ ALL | DISTINCT ] [ ( ] relation [ ) ]

示例

-- Use number1 and number2 tables to demonstrate set operators in this page.
SELECT * FROM number1;
+---+
|  c|
+---+
|  3|
|  1|
|  2|
|  2|
|  3|
|  4|
+---+
  
SELECT * FROM number2;
+---+
|  c|
+---+
|  5|
|  1|
|  2|
|  2|
+---+

SELECT c FROM number1 EXCEPT SELECT c FROM number2;
+---+
|  c|
+---+
|  3|
|  4|
+---+

SELECT c FROM number1 MINUS SELECT c FROM number2;
+---+
|  c|
+---+
|  3|
|  4|
+---+

SELECT c FROM number1 EXCEPT ALL (SELECT c FROM number2);
+---+
|  c|
+---+
|  3|
|  3|
|  4|
+---+

SELECT c FROM number1 MINUS ALL (SELECT c FROM number2);
+---+
|  c|
+---+
|  3|
|  3|
|  4|
+---+

INTERSECT

INTERSECTINTERSECT ALL 返回在两个关系中都找到的行。 INTERSECT(或者,INTERSECT DISTINCT)仅采用不同的行,而 INTERSECT ALL 不会从结果行中删除重复项。

语法

[ ( ] relation [ ) ] INTERSECT [ ALL | DISTINCT ] [ ( ] relation [ ) ]

示例

(SELECT c FROM number1) INTERSECT (SELECT c FROM number2);
+---+
|  c|
+---+
|  1|
|  2|
+---+

(SELECT c FROM number1) INTERSECT DISTINCT (SELECT c FROM number2);
+---+
|  c|
+---+
|  1|
|  2|
+---+

(SELECT c FROM number1) INTERSECT ALL (SELECT c FROM number2);
+---+
|  c|
+---+
|  1|
|  2|
|  2|
+---+

UNION

UNIONUNION ALL 返回在任一关系中找到的行。 UNION(或者,UNION DISTINCT)仅采用不同的行,而 UNION ALL 不会从结果行中删除重复项。

语法

[ ( ] relation [ ) ] UNION [ ALL | DISTINCT ] [ ( ] relation [ ) ]

示例

(SELECT c FROM number1) UNION (SELECT c FROM number2);
+---+
|  c|
+---+
|  1|
|  3|
|  5|
|  4|
|  2|
+---+

(SELECT c FROM number1) UNION DISTINCT (SELECT c FROM number2);
+---+
|  c|
+---+
|  1|
|  3|
|  5|
|  4|
|  2|
+---+

SELECT c FROM number1 UNION ALL (SELECT c FROM number2);
+---+
|  c|
+---+
|  3|
|  1|
|  2|
|  2|
|  3|
|  4|
|  5|
|  1|
|  2|
|  2|
+---+