Union (SQL)

From Wikipedia, the free encyclopedia

In SQL the UNION operator combines the results of two SQL queries into a single table of all matching rows. The two queries must have matching fields and data types in order to join them. Any duplicate records are automatically removed unless UNION ALL is used.

UNION can be useful in data warehouse applications where tables aren't perfectly normalized [1]. A simple example would be a database having tables sales2005 and sales2006 that have identical structures but are separated because of performance considerations. A UNION query could combine results from both tables.

Contents

[edit] Examples

sales2005
+--------+--------+
| person | amount |
+--------+--------+
| Joe    |   1000 |
| Bob    |   5000 |
+--------+--------+

sales2006
+--------+--------+
| person | amount |
+--------+--------+
| Joe    |   2000 |
| Zach   |  35000 |
+--------+--------+

SELECT * FROM sales2005
UNION
SELECT * FROM sales2006

result:
+--------+--------+
| person | amount |
+--------+--------+
| Joe    |   1000 |
| Bob    |   5000 |
| Joe    |   2000 |
| Zach   |  35000 |
+--------+--------+

Note that UNION ALL would also give the same result that UNION does in the above example because all the rows of data are unique.

However, suppose instead that in sales2006 Joe's amount is the same as from 2005. Then it's apparent that the UNION operation only gives unique rows:

sales2005
+--------+--------+
| person | amount |
+--------+--------+
| Joe    |   1000 |
| Bob    |   5000 |
+--------+--------+

sales2006
+--------+--------+
| person | amount |
+--------+--------+
| Joe    |   1000 |
| Zach   |  35000 |
+--------+--------+

SELECT * FROM sales2005
UNION
SELECT * FROM sales2006

result:
+--------+--------+
| person | amount |
+--------+--------+
| Joe    |   1000 |
| Bob    |   5000 |
| Zach   |  35000 |
+--------+--------+

Here's the result with UNION ALL:

sales2005
+--------+--------+
| person | amount |
+--------+--------+
| Joe    |   1000 |
| Bob    |   5000 |
+--------+--------+

sales2006
+--------+--------+
| person | amount |
+--------+--------+
| Joe    |   1000 |
| Zach   |  35000 |
+--------+--------+

SELECT * FROM sales2005
UNION ALL
SELECT * FROM sales2006

result:
+--------+--------+
| person | amount |
+--------+--------+
| Joe    |   1000 |
| Bob    |   5000 |
| Joe    |   1000 |
| Zach   |  35000 |
+--------+--------+

The discussion of full outer joins also has an example that uses UNION.

[edit] See also

Topics in database management systems (DBMS) ( view talk edit )

Concepts
Database | Database model | Relational database | Relational model | Relational algebra | Primary key - Foreign key - Surrogate key - Superkey
Database normalization | Referential integrity | Relational DBMS | Distributed DBMS | ACID

Objects
Trigger | View | Table | Cursor | Log | Transaction | Index | Stored procedure | Partition

Topics in SQL
Select | Insert | Update | Merge | Delete | Join | Union | Create | Drop
Comparison of syntax

Implementations of database management systems

Types of implementations
Relational | Flat file | Deductive | Dimensional | Hierarchical | Object oriented | Temporal

Products
Caché | db4o | dBASE | Firebird | Helix database | DB2 | Informix | Ingres | InterBase | Microsoft SQL Server | MySQL | OpenLink Virtuoso | Oracle | PostgreSQL | SQLite | Sybase IQ | Sybase | Teradata | Visual FoxPro | Comparison - relational | Comparison - object-relational

Components
Query language | Query optimizer | Query plan | ODBC | JDBC
Lists
List of object-oriented database management systems
List of relational database management systems


[edit] References

  1. ^ "a UNION ALL views technique for managing maintenance and performance in your large data warehouse environment ... This UNION ALL technique has saved many of my clients with issues related to time-sensitive database designs. These databases usually have an extremely volatile current timeframe, month, or day portion and the older data is rarely updated. Using different container DASD allocations, tablespaces, tables, and index definitions, the settings can be tuned for the specific performance considerations for these different volatility levels and update frequency situations." Terabyte Data Warehouse Table Design Choices - Part 2 (URL accessed on July 25, 2006)

[edit] External links