What is the difference between UNION and UNIONALL?

UNION
The UNION command is used to select related information from
two tables, much like the JOIN command. However, when using
the UNION command all selected columns need to be of the
same data type. With UNION, only distinct values are selected.

UNION ALL
The UNION ALL command is equal to the UNION command, except
that UNION ALL selects all values.

The difference between Union and Union all is that Union all
will not eliminate duplicate rows, instead it just pulls all
rows from all tables fitting your query specifics and
combines them into a table.

A UNION statement effectively does a SELECT DISTINCT on the
results set. If you know that all the records returned are
unique from your union, use UNION ALL instead, it gives
faster results.

Example:
Table 1 : First,Second,Third,Fourth,Fifth
Table 2 : First,Second,Fifth,Sixth

Result Set:
UNION: First,Second,Third,Fourth,Fifth,Sixth (This will
remove duplicate values)
UNION ALL:
First,First,Second,Second,Third,Fourth,Fifth,Fifth,Sixth,Sixth
(This will repeat values)
 
................................................................................
More than a year ago I had written article SQL SERVER – Union vs. Union All – Which is better for performance? I have got many request to update this article. It is not fair to update already written article so I am rewriting it again with additional information.
UNION
The UNION command is used to select related information from two tables, much like the JOIN command. However, when using the UNION command all selected columns need to be of the same data type. With UNION, only distinct values are selected.
UNION ALL
The UNION ALL command is equal to the UNION command, except that UNION ALL selects all values.
The difference between Union and Union all is that Union all will not eliminate duplicate rows, instead it just pulls all rows from all tables fitting your query specifics and combines them into a table.
A UNION statement effectively does a SELECT DISTINCT on the results set. If you know that all the records returned are unique from your union, use UNION ALL instead, it gives faster results.
Run following script in SQL Server Management Studio to see the result between UNION ALL and UNION. 
/* Declare First Table */
DECLARE @Table1 TABLE (ColDetail VARCHAR(10))
INSERT INTO @Table1
SELECT 'First'
UNION ALL
SELECT 'Second'
UNION ALL
SELECT 'Third'
UNION ALL
SELECT 'Fourth'
UNION ALL
SELECT 'Fifth'
/* Declare Second Table */
DECLARE @Table2 TABLE (ColDetail VARCHAR(10))
INSERT INTO @Table2
SELECT 'First'
UNION ALL
SELECT 'Third'
UNION ALL
SELECT 'Fifth'
/* Check the data using SELECT */
SELECT *
FROM @Table1
SELECT *
FROM @Table2
/* UNION ALL */
SELECT *
FROM @Table1
UNION ALL
SELECT *
FROM @Table2
/* UNION */
SELECT *
FROM @Table1
UNION
SELECT
*
FROM @Table2
GO
In our example we have two tables: @Table1 and @Table2.
Now let us run UNION ALL and UNION together and see the resultset as well as Execution Plan compared to complete set of query. You can always turn on actual execution plan using CTRL+M.
We can see from the resultset of UNION ALL that it returns everything from both the table but from UNION it is very clear that only DISTINCT rows from both the table is only retrieved.
Additionally, when comparing the execution plan of UNION ALL and UNION it is also quite clear that UNION ALL is way less expensive than UNION as it does not have DISTINCT SORT operation.
Let me know what do you think about this article. If you have any suggestion for improvement please let me know and I will update articles according to that.

0 comments:

Post a Comment