Higher normal forms

Contents:
1. Multivalued dependencies
2. Fourth normal form
3. Fifth normal form

A brief history of normal forms:

FormInventorYear
First Normal FormCodd??
Second Normal FormCodd1972
Third Normal FormCodd1972
Boyce-Codd Normal FormHeath
Boyce & Codd
1971
1974
Fourth Normal FormFagin1977
Fifth Normal FormFagin1979

These are summarized in A simple guide to five normal forms in relational database theory, a nice 1983 ACM article by William Kent.

1. Multivalued dependencies

Definition: A1 A2AnB1 B2Bm is an MVD (a multivalued dependency) for a relation R if whenever two tuples s and t in R agree on A1, A2, … An, there must exist a tuple u in R that:

Example 1:

dealers table
namemakeproduct
SmithFordcar
SmithFordtruck
SmithToyotacar
SmithToyotatruck
JonesToyotacar
JonesToyotatruck
JonesVWcar

The idea here is that a dealer can sell several makes of vehicles, but for each make the dealer deals, the dealer must sell all products in the line. Thus, Jones, having chosen to work with Toyota, must sell both its trucks and its cars. However, we're imagining that VW doesn't produce trucks, so VW only requires Jones to sell their cars to work with VW.

Notice that name ⇒ make is not an MVD: If it were, then we could take the next-to-last row as t and the last row as s (since they agree on name) and conclude that there must be a row that agrees with both on name (Jones), with s on make (VW) and with t on product (truck). However, the tuple (Jones, VW, truck) is not in the table, so name ⇒ make is not an MVD.

However, make ⇒ product is an MVD. Consider the rows agreeing on, say, Toyota, for their make: If you take the set of all values for product in these rows {car, truck} and the set of all values for name in these rows {Smith, Jones}, then combinations of something from one set and something from the other set are included in the Toyota rows. This is required by being a MVD.

Example 2:

posts(name, email, subject, body) where somebody posting a message may list multiple e-mail addresses, in which case there will be a row added for each e-mail address, with the same subject and body.

In this case, name ⇒ email is an MVD.

Notes:

2. Fourth normal form

A relation is in 4NF (fourth normal form) if every MVD has a left side that is a superkey.

While we won't look at an algorithm for breaking a non-4NF table into 4NF tables, we can look at what we would do to solve the problem for our two examples.

Example 1:

We need just two tables.

dealer_makes table
namemake
SmithFord
SmithToyota
JonesToyota
JonesVW
    
make_products table
makeproduct
Fordcar
Fordtruck
Toyotacar
Toyotatruck
VWcar

Example 1:

We need two tables:

addresses(nameemail)
posts(namesubjectbody)

3. Fifth normal form

Going back to our automobile dealers, suppose agents could also specialize in some vehicle types.

dealers table
namemakeproduct
SmithFordcar
SmithFordtruck
SmithToyotacar
SmithToyotatruck
JonesToyotacar
JonesVWcar

(This is the same as Example 2 above, except we deleted (Jones, Toyota, truck) since we're imagining that Jones specializes only in cars.)

This has no MVDs, so it is in fourth normal form. Nonetheless, the table contains some redundancy: For example, it lists that Smith deals with both cars and trucks twice, and it also lists that Smith deals for Ford and Toyota twice. For this reason, it is not in fifth normal form.

(You might think that that name ⇒ make was an MVD based on what appears in the above table. But if Jones were to add minivans as a specialty, and if VW were to make minivans but Toyota were not, then (Jones, VW, minivan) would go into the table. We could then look at the Jones-Toyota-car and Jones-VW-minivan rows and conclude there should also be a Jones-Toyota-minivan row, which there is not.)

We won't give a formal definition of fifth normal form. But in this case, we can split the table into three smaller tables to reduce the redundancy — and because we can perform this split, it's not in fifth normal form.

dealer_products table
nameproduct
Smithcar
Smithtruck
Jonescar
    
dealer_makes table
namemake
SmithFord
SmithToyota
JonesToyota
JonesVW
    
make_products table
makeproduct
Fordcar
Fordtruck
Toyotacar
Toyotatruck
VWcar