First Normal Form
As a reminder, normalization is the process of applying specific rules, called normal forms, to your database. These rules must be applied in order, starting with the First Normal Form.
For a database to be in First Normal Form (1NF), every table in that database must have the following two properties:
Table 3.3 shows how part of a table for storing books might look. The Authors column violates the first of the 1NF standards by storing multiple values. If you were to fix it by turning in into Table 3.4, it would still not be 1NF. That structure has repeating columns of associated data (Author1, Author2, and so on).
Table 3.3. This table is not 1NF compliant because the Authors column is not atomic.Non-1NF Table |
---|
Column | Example |
---|
ID | 1 | Title | C++ Programming | Authors | Larry Ullman and Andreas Signer |
Table 3.4. This table is still not 1NF compliant because of the repeating author columns.Still Non-1NF Table |
---|
Column | Example |
---|
ID | 1 | Title | C++ Programming | Author1 | Larry Ullman | Author2 | Andreas Signer |
I'll begin the normalization process by checking the existing structure for 1NF compliance.
If you do a search online, you can find the formal rules of database normalization. They may be the best example of totally obtuse computer-ese you'll ever encounter. Because the rules are so densely written, the question becomes how to translate these into a comprehensible format. The answer is: with a bit of poetic license.
What I'm getting at is this: if you check out five sources that describe the normal forms in layman's terms, you'll see five slightly different explanations. You'll even find the line blurred between where one normal form ends and another begins. This is perfectly fine. What's important is that the spirit of normalization and the end result will be the same, regardless of how these complex rules are simplified.
|
To make a database 1NF compliant:
1. | Identify any field that contains multiple pieces of information.
Looking back at Table 3.2, two columns are not 1NF compliant: Client Information and Expense Category & Description. You might think that the two date fields also fail to be atomic (they each contain a day, a month, and a year), but subdividing into separate day, month, and year columns may be taking things too far (see the "Overruling Normalization" sidebar at the end of the chapter).
As for the second aspect of the 1NF rule, there aren't any reasons for concern with the current structure. Again, you might think that the two date columns break this rule, but while they store the same type of data, the two columns aren't repetitions of each other.
| 2. | Break up any fields found in Step 1 into separate fields (Table 3.5).
Table 3.5. After running through the 1NF rules, I've separated two fieldsClient Information and Expense Category & Description, compare with Table 3.2into atomic subfields.Accounting Database, 1NF |
---|
Item | Key |
---|
Invoice Number | Primary (PK) | Invoice Date | n/a | Invoice Amount | n/a | Invoice Description | n/a | Date Invoice Paid | n/a | Client Name | n/a | Client Street Address | n/a | Client City | n/a | Client State | n/a | Client Zip | n/a | Client Phone | n/a | Expense Amount | n/a | Expense Category | n/a | Expense Description | n/a | Expense Date | n/a |
To fix this problem, I'll separate Client Information into Client Name, Client Street Address, Client City, Client State, Client Zip, and Client Phone. Next, I'll turn Expense Category & Description into Expense Category and Expense Description. | 3. | Double-check that all new fields created in Step 2 pass the 1NF test.
Sometimes the changes you make will create new violations of a rule. Repeatedly inspect your data model so that it is perfectly compliant before moving on to the next normal form.
|
Tip
|