Null (SQL)

From Wikipedia, the free encyclopedia

In SQL:2003 NULL is a special marker used to indicate that a data value is unknown. NULL is also an SQL reserved keyword used to identify the NULL special marker.

In SQL database management systems that follow the SQL standard, columns in tables can be designated as nullable or non-nullable through the use of SQL NOT NULL constraints. Columns that are designated nullable may contain the NULL marker in place of actual data, non-nullable columns cannot.

NULL is untyped in SQL. Therefore, it is sometimes mandatory to explicitly cast NULLs so that the DBMS can evaluate results to the intended data type. For example, if overloaded functions are supported, the DBMS cannot resolve to the correct function without knowing the data types of all paramaters.

Contents

[edit] Comparisons and Three-Valued Logic

NULL is not considered a value because it lacks an inherent property of values: NULLs cannot be compared with other values of the same data type. NULL is an untyped marker indicating the absence of a value, and not an actual data value. Comparisons of actual data values with NULL never result in True or False, but always in a third logical result: Unknown. Comparison of a NULL with any another NULL also results in Unknown, as shown below:

 10 = NULL       -- Results in Unknown
 
 NULL <> 'Hello' -- Results in Unknown
 
 NULL > NULL     -- Results in Unknown

Because SQL implements a third logical result (Unknown), SQL implementations must provide for a specialized three-valued logic (3VL). SQL three-valued logic is shown in the table below:

p q p AND q p OR q
True True True True
True False False True
True Unknown Unknown True
False True False True
False False False False
False Unknown False Unknown
Unknown True Unknown True
Unknown False False Unknown
Unknown Unknown Unknown Unknown
p NOT p
True False
False True
Unknown Unknown

To test if the values in a column are NULL or are not NULL, the IS NULL and IS NOT NULL operators must be used in SQL comparison predicates. The syntax looks like the following:

 column IS NULL
 
 column IS NOT NULL

Because of this, special care must be taken when joining tables on columns that are declared nullable.

[edit] Math and String Concatenation Operators

Because NULL is not a data value, but a marker for an unknown or not-existent value, using mathematical operators on NULL results in NULL. For instance, all of the following examples result in NULL:

 NULL + NULL        -- Result is NULL
 
 10 * NULL          -- Result is NULL
 
 0 / NULL           -- Result is NULL

String concatenation operations, common in SQL, also result in NULL when one of the operands is NULL. The following example demonstrates the result of using NULL with the SQL || string concatenation operator:

 'Fish ' || NULL || 'Chips'   -- Result is NULL

(Note: some DBMS systems use the + operator for string concatenation instead of || or CONCAT. This is not conforming to the SQL standard, however.)

[edit] Aggregate Functions and NULL

SQL defines aggregate functions to simplify server-side aggregate calculations on data. Almost all aggregate functions perform a NULL elimination step, so that NULL values are not included in the final result of the calculation. This implicit NULL elimination, however, can have an impact on aggregate function results. The only aggregate function that does not implicitly eliminate NULL is the COUNT(*) function.

[edit] Grouping and Sorting NULL

Because SQL:2003 defines all NULL markers as being unequal to one another, a special definition was required in order to group NULLs together when performing certain operations. SQL defines any two values that are equal to one another, or any two NULLs, as not distinct. The definition of not distinct allows SQL to group and sort NULLs when the GROUP BY clause (and other keywords that perform grouping) are used.

The SQL standard does not explicitly define a default sort order for NULLs. Instead, on conforming systems, the user can specify for each element in the order by list whether NULLs are to be sorted before or after all values using the NULLS FIRST or NULLS LAST clauses, respectively. Not all DBMS vendors implement this functionality, however. Vendors who do not implement this functionality may specify different treatments for NULL sorting in the DBMS.

[edit] NULL-Handling Functions

SQL defines two functions to explicitly handle NULLs. The COALESCE function accepts a list of parameters, returning the first non-NULL value from the list:

 COALESCE(value1, value2, value3, ...)

COALESCE is defined as shorthand for the following SQL CASE expression:

 CASE WHEN value1 IS NOT NULL THEN value1
      WHEN value2 IS NOT NULL THEN value2
      WHEN value3 IS NOT NULL THEN value3
      END

Some SQL DBMS's implement vendor-specific functions similar to COALESCE. Microsoft SQL Server, for instance, implements an ISNULL function which is similar in functionality to COALESCE.

The NULLIF function accepts two parameters. If the first parameter is equal to the second parameter, NULLIF returns NULL. Othewise, the value of the first parameter is returned.

 NULLIF(value1, value2)

Thus, NULLIF is an abbrevation for the following CASE expression:

 CASE WHEN value1 = value2 THEN NULL ELSE value1 END

[edit] Controversy

In The Relational Model for Database Management: Version 2, Dr. E.F. Codd suggested that SQL NULL was flawed and should be replaced by two different NULL-type markers. The markers he proposed were to stand for "Missing but Applicable" and "Missing but Inapplicable", known as A-values and I-values, respectively. Codd's recommendation, if accepted, would have required the implementation of a four-valued logic in SQL. Others have suggested adding additional NULL-type markers to Codd's recommendation to include even more reasons that a data value might be "Missing", increasing the complexity of SQL.

At various times, proposals have been put forth to implement multiple user-defined NULL markers in SQL. Because of the complexity of the NULL-handling and logic systems required to support these NULL markers, none of these proposals have gained widespread acceptance.

Some Relational Management experts like the authors of The Third Manifesto, Chris Date and Hugh Darwen, have suggested that SQL NULL is inherently flawed and should be eliminated altogether. These experts often point to inconsistencies and flaws in the implementation of SQL NULL-handling (particularly in aggregate functions) as proof that the entire concept of NULL is flawed and should be removed from the Relational Model. Others, like author Fabian Pascal, have stated "How the function calculation should treat missing values is not governed by the relational model."

[edit] See also

[edit] External links

In other languages