05 Jul normalization
A process used in modeling data, especially in relational databases, to minimize redundancy. Data normalization techniques can save space and improve data quality by simplifying changes to data. See Holowczak’s tutorial on how to do it, and Margaret Rouse on what it is. Relational database normalization relies on Primary and foreign keys that link tables together in one-to-one or one-to-many relationships.
First normal form (1NF) is a minimal level of normalization in which the database:
- contains any number two-dimensional tables with rows and columns named to represent the core object;
- each table’s columns contain data on the attributes of the core object and the column names usually name the attribute;
- each row or record in each table represents a unique instance of the core object or attribute and must contain at least one value different from any other row (no duplicate rows are allowed);
- All entries in any column must be of the data type (date fields must contain valid dates and amount fields must contain valid numbers).
Second normal form (2NF) is an intermediate level of normalization in which each column in a table that is not a determiner of the contents of another column, must itself be a function of the other columns in the table. For example, in a “Purchases” table with columns containing “Customer ID”, “Product”, and “Price”, the price would be a function of the customer ID and the specific product and may be derived from a separate “Prices” table.
Third normal form (3NF) is the defacto standard for transactional database modeling (reporting databases are often “denormalized”). 3NF states that all columns in a table not dependent on the primary key should be removed from the table, and placed in a separate table or eliminated. Another way of putting this is that only foreign key columns should be used to reference another table, and no other columns from the parent table should exist in the referenced table.