Skip to main content

DATABASE DESIGN & NORMALIZATION

 

Database Design and Normalization

Introduction

  • Good database design is crucial for efficient data management and retrieval
  • Normalization is a technique used to design good table structures
  • Goals of normalization:
    1. Minimize data redundancy
    2. Ensure data consistency
    3. Simplify data maintenance
    4. Avoid anomalies in data operations (insertion, deletion, update)

Problems with Unnormalized Tables

1. Repetition Anomaly

  • Definition: Unnecessary repetition of data across multiple rows
  • Example: Repeating project details (Project_id, Project_name) for each employee working on the project
  • Consequences:
    • Wastes storage space
    • Slows down data retrieval
    • Increases risk of data inconsistencies

2. Insertion Anomaly

  • Definition: Difficulty in inserting certain data without including other, potentially unavailable data
  • Example: Unable to add a new employee without assigning them to a project
  • Consequences:
    • Limits flexibility in data entry
    • May lead to incomplete or inaccurate data

3. Deletion Anomaly

  • Definition: Unintended loss of data when deleting a record
  • Example: Deleting an employee's project assignment also removes their personal details
  • Consequences:
    • Risk of losing important, unrelated data
    • Complicates data management and historical record-keeping

4. Updating Anomaly

  • Definition: Inconsistency when updating repeated data
  • Example: Updating an employee's salary requires changes in multiple rows
  • Consequences:
    • Increases risk of data inconsistencies
    • Makes data maintenance more time-consuming and error-prone

Normalization Process

Normalization involves organizing data into progressively more structured forms:

First Normal Form (1NF)

  • Objectives:
    1. Eliminate repeating groups
    2. Create separate tables for each set of related data
    3. Identify each record with a unique field (primary key)
  • Rules:
    1. Each column should contain atomic (indivisible) values
    2. Each column should have a unique name
    3. The order of data stored doesn't matter
    4. Each record should be unique
  • Example:
    • Before 1NF:
      | Emp_no | Emp_name | Project_ids | Project_names | |--------|----------|------------------|-------------------------| | E1 | John | P1, P2 | MERCURY, VENUS |
    • After 1NF:
      | Emp_no | Emp_name | Project_id | Project_name | |--------|----------|------------|--------------| | E1 | John | P1 | MERCURY | | E1 | John | P2 | VENUS |

Second Normal Form (2NF)

  • Objectives:
    1. Meet all 1NF requirements
    2. Remove partial dependencies
  • Rules:
    1. Be in 1NF
    2. All non-key attributes must depend on the entire primary key
  • Example:
    • Before 2NF:
      | Emp_no | Project_id | Emp_name | Project_name | Grade | Salary | |--------|------------|----------|--------------|-------|--------| | E1 | P1 | John | MERCURY | G3 | 50000 | | E1 | P2 | John | VENUS | G3 | 50000 |
    • After 2NF:
      Employee Table: | Emp_no | Emp_name | Grade | Salary | |--------|----------|-------|--------| | E1 | John | G3 | 50000 | Project Table: | Project_id | Project_name | |------------|--------------| | P1 | MERCURY | | P2 | VENUS | Employee_Project Table: | Emp_no | Project_id | |--------|------------| | E1 | P1 | | E1 | P2 |

Third Normal Form (3NF)

  • Objectives:
    1. Meet all 2NF requirements
    2. Remove transitive dependencies
  • Rules:
    1. Be in 2NF
    2. No non-key attribute should depend on another non-key attribute
  • Example:
    • Before 3NF:
      | Emp_no | Emp_name | Grade | Salary | |--------|----------|-------|--------| | E1 | John | G3 | 50000 |
    • After 3NF:
      Employee Table: | Emp_no | Emp_name | Grade | |--------|----------|-------| | E1 | John | G3 | Grade_Salary Table: | Grade | Salary | |-------|--------| | G3 | 50000 |

Benefits of Normalization

  1. Reduced data redundancy
  2. Improved data consistency
  3. Greater flexibility in data queries
  4. Easier data maintenance and updates
  5. Better support for data integrity constraints
  6. More accurate representation of real-world entities and relationships

Key Concepts

  • Primary Key: A unique identifier for each record in a table
  • Foreign Key: A field that refers to the primary key in another table
  • Composite Key: A primary key that consists of two or more attributes
  • Atomic Values: Values that cannot be further divided
  • Partial Dependency: When a non-key attribute depends on only part of a composite key
  • Transitive Dependency: When a non-key attribute depends on another non-key attribute

Practical Considerations

  • While full normalization (up to 3NF or beyond) is theoretically ideal, practical implementations may sometimes denormalize for performance reasons
  • Always consider the specific requirements of your application and the nature of data access patterns
  • Normalization is an iterative process; start with 1NF and progressively refine the structure

Conclusion

  • Normalization is a powerful technique for creating efficient, consistent, and flexible database designs
  • Understanding and applying normalization principles is crucial for database designers and administrators
  • Always balance theoretical ideals with practical considerations in real-world applications

Comments

Popular posts from this blog

ORACLE Express Edition: Getting Started

1. Introduction to Oracle Database 21c Express Edition (XE) - Free, lightweight version of Oracle Database - Ideal for learning and small-scale applications - Limited to 12GB of user data and uses up to 2GB of RAM 2. Installation and Setup 2.1 Installing Oracle 21c XE 1. Download Oracle 21c XE from: https://www.oracle.com/database/technologies/xe-downloads.html 2. Run the installer:    - Windows: Double-click the .exe file    - Linux: Use `rpm` or `yum` command 3. Follow the installation wizard:  Accept the license agreement Choose an installation location (default is usually fine) Set a password for the SYS, SYSTEM, and PDBADMIN accounts (write this down!) Select the option to start the database service automatically (recommended)  4. Complete the installation: Wait for the installation process to finish Note down the database connection details provided at the end The default container database (CDB) name is XE The default pluggable database (PDB) nam...

Practical 1: Getting Started with MYSQL

 Getting Started with MySQL Introduction to MySQL Definition: MySQL is an open-source relational database management system (RDBMS) Uses: Web applications, data warehousing, e-commerce, logging applications Key features: Speed, reliability, scalability, and ease of use Installing MySQL Download MySQL Community Server from official website Follow installation wizard for your operating system Set root password during installation Verify installation: mysql --version MySQL Command-line Client Accessing MySQL: mysql -u root -p Basic commands: SHOW DATABASES ; CREATE DATABASE mydb ; USE mydb ; SHOW TABLES ; MySQL Workbench Introduction: Visual tool for database design and management Key features: SQL development Data modeling Server administration Example: Creating a new connection New Connection > Enter details (hostname, username, password) PHPMyAdmin Web-based MySQL administration tool Often comes pre-installed with web hosting packag...

MYSQL Constraints

 PK - Primary Key: Uniquely identifies each record in a table. NN - Not Null: Ensures a column cannot have a NULL value. UQ - Unique: Ensures all values in a column are different. B - Binary: Stores binary byte strings. UN - Unsigned: For numeric types, allows only non-negative values. ZF - Zero Fill: Pads numeric values with zeros to the left. AI - Auto Increment: Automatically generates a unique number for new records. G - Generated Column: Value is computed from an expression. PK - Primary Key A primary key uniquely identifies each record in a table. It must contain unique values and cannot have NULL values.  Example: CREATE TABLE Students ( StudentID INT PRIMARY KEY , Name VARCHAR ( 50 ) , Age INT ) ; Here, StudentID is the primary key. NN - Not Null  This constraint ensures that a column cannot have NULL values.  Example: CREATE TABLE Employees ( EmployeeID INT PRIMARY KEY , Name VARCHAR ( 50 ) NOT NULL , Email VA...