Normal Forms of DBMS | From 1st to 5th and Boyce-Codd

Database Normal Forms Explained with Easy Examples


Published: 31 Dec 2024


Normalization Forms in DBMS Explained

Did you know that normal forms in DBMS are essential for reducing redundancy and improving data integrity? If you’ve ever struggled with messy, error-prone databases, these principles offer a solution. By organizing data efficiently, normal forms ensure a structured and reliable database, much like organizing a chaotic library.

Example of Normalization in Database with a Table

  1. Removing duplicate data from student records.
  2. Splitting the address into the street, city, and ZIP code columns.
  3. Linking orders to customers through unique IDs.
  4. Creating a separate table for product categories.
  5. Eliminating redundant supplier information.

What is the 1st Normal Form in DBMS with Example?

There are several DBMS normalization forms, such as the first normal form, the second normal form, the third normal form, the Boyce-Codd normal form, the fourth normal form, and the fifth normal form.

Normalized Table After Applying 1NF

1st Normal Form (1NF) ensures that a database table has only atomic (single) values in each column, meaning no multiple values or lists are stored in one cell. It also eliminates repeating groups, ensuring each row is unique. In simple terms, 1NF makes sure each cell in a table contains just one piece of information, and each record is distinct. This helps in organizing data properly and reduces redundancy.

Example of 1st Normal Form (1NF)

Unnormalized Table (Before 1NF)

Student IDNameCourses
1JohnMath, English
2EmilyMath
3MichaelHistory, Science
Issues
  • The “Courses” column contains multiple values (lists of courses).
  • The table has repeating groups of data.

Normalized Table (After 1NF)

Student IDName Course
1JohnMath
1JohnEnglish
2EmilyMath
3MichealHistory
3MichealScience
Result
  • Each column now has only a single value.
  • There are no repeating groups in any cell.
  • Each row contains unique information.

What is 2nd Normal Form with a Solved Example

2nd Normal Form (2NF) builds on 1NF by ensuring that all non-key columns depend on the primary key, not just the part. This means that if a table has a composite primary key (more than one column), no non-key column should rely on just a part of that key. In simple terms, 2NF removes partial dependencies, making the table more organized. It helps reduce redundancy and improves data integrity by ensuring that each piece of information is stored in the right place.

Instructor Table in 2NF with Example

Unnormalized Table (Before 2NF)

IDCourse TeacherPhone
1101Mr. Smith123-456-7890
1102Ms. Johnson234-567-8901
2101Mr. Smith123-456-7890
3103Mr. Brown345-678-9012
Issues
  • The primary key is a combination of StudentID and CourseID.
  • The Instructor and InstructorPhone depend only on CourseID, not the entire primary key.

Normalized Table (After 2NF)

Stuent IDStudent ID
1101
1102
2101
3103

Instructor Table

Course IDInstructorInstructor ID
101Mr. Smith123-456-7890
102Ms. Johnson234-567-8901
103Mr. Brown345-678-9012
Result
  • Now, Instructor and InstructorPhone depend only on CourseID and are stored in a separate table.
  • The table is in 2NF, as all non-key attributes depend on the entire primary key.

How to Normalize a Table to 3rd Normal Form

3rd Normal Form (3NF) removes transitive dependencies, which occur when a non-key column depends on another non-key column. In simple terms, if one non-key attribute relies on another non-key attribute, it should be separated into different tables. 3NF ensures that each non-key column is directly dependent only on the primary key, not on other non-key columns. This reduces redundancy and improves data organization by keeping each piece of information in its proper place.

Example of 3rd Normal Form (3NF)

Unnormalized Table (Before 3NF)

IDCourseTchrPhEmail
1101Mr. Smith123-456-7890[email protected]
1102Mr. Joshon234-567-8901[email protected]
2101Mr. Smith123-456-7890[email protected]
3103Mr. Brown345-678-9012[email protected]
Issues
  • The InstructorPhone and InstructorEmail depend on the Instructor, which is not the primary key.
  • These non-key attributes should be moved to a separate table to eliminate transitive dependency.

Normalized Tables (After 3NF)

Student IDCourse IDInstructor
1101Mr. Smith
1102Ms. Johnson
2101Mr. Smith
3103Mr. Brown
Result
  • The InstructorPhone and InstructorEmail are now stored in a separate table linked to the Instructor.
  • This eliminates transitive dependency and ensures that each non-key attribute depends only on the primary key, making the table compliant with 3NF.

Boyce-Codd Normal Form (BCNF)

Boyce-Codd Normal Form (BCNF) is a stricter version of the 3rd Normal Form (3NF). It ensures that every non-key attribute is fully dependent on the primary key. In simple terms, if a table has more than one candidate key (possible primary keys), every non-key attribute must depend on each of those keys. BCNF eliminates anomalies caused by certain types of dependencies that 3NF might not address, ensuring that the database structure is even more efficient and organized.

Issues
  • The table has two candidate keys: StudentID + CourseID and Instructor + CourseID.
  • InstructorPhone depends on the Instructor, but the Instructor is not the primary key in this case, violating BCNF.
Result
  • Now, InstructorPhone is stored in a separate table, where Instructor is the primary key, ensuring that all non-key attributes depend only on the primary key.
  • This eliminates the dependency on non-primary keys and brings the table into BCNF.

Real-Life Example of 4th Normal Form in DBMS

4th Normal Form (4NF) ensures that a table has no multi-valued dependencies. A multi-valued dependency occurs when a single attribute in a table is related to multiple values of another attribute, but the two are independent. In simple terms, a table containing two or more independent multi-valued facts about an entity should be stored in separate tables. 4NF helps eliminate redundancy and maintain data integrity by ensuring each fact is stored in its own table.

Issues
  • The table has a multi-valued dependency, where StudentID is related to multiple Courses and multiple Hobbies, but these are independent of each other.
  • This leads to redundancy, as the same StudentID is repeated for each combination of Course and Hobby.
Result
  • The StudentID is no longer repeated for each combination of Course and Hobby.
  • The independent facts (courses and hobbies) are stored in separate tables, ensuring the table is in 4NF and eliminating multi-valued dependencies.

Use of 5th Normal Form in Database Normalization

The 5th Normal Form (5NF) is focused on eliminating any remaining redundancy caused by the combination of tables. It ensures that every piece of information is stored in its simplest form, without any unnecessary dependencies. In simple terms, 5NF requires that a table cannot be decomposed into smaller tables without losing information. This form deals with complex relationships where multiple keys or combinations of attributes lead to redundancy. 5NF ensures that the database is as efficient and logical as possible by avoiding such redundancies.

Issues
  • The table contains redundant data because information about the Instructor and Textbook is repeated for each student ID and Course combination.
  • The Instructor and Textbook are independent of each other and the Course, but they are combined in the same table, leading to unnecessary repetition.
Result
  • Now, each fact (Instructor and Textbook) is stored separately and linked to the course it belongs to.
  • This removes redundancy and ensures that the table is in 5NF, where no further decomposition is possible without losing information.

What causes the need for Normalization in DBMS

Normalization is needed to address common problems that can occur when data is stored in an unorganized way:

  • Insertion Anomaly: Occurs when it’s difficult to add new data because there is missing information. For example, to add a new student, you might need to provide information about a course or instructor, even if it’s not necessary for the new student.
  • Deletion Anomaly: This happens when deleting data, which leads to the unintended loss of other important information. For example, if you delete a course that a student is enrolled in, you might also accidentally lose the instructor’s details for that course.
  • Updation Anomaly: Takes place when updating data in one place but not in others, leading to inconsistencies. For instance, if an instructor’s phone number changes but isn’t updated across all records, different records will have conflicting information.

Difference Between Data Normalization and Denormalization

NormalizationDenormalization
Reduces data redundancy.It may lead to some anomalies, but it is faster for reads.
Organizes data into smaller, related tables.It may lead to some anomalies, but is faster for reads.
Improves data integrity.Improves query performance.
Helps prevent anomalies (insertion, deletion, update).It may lead to some anomalies, but it is faster for reads.
Data is stored in its simplest form.Data is sometimes repeated for quicker access.
Ideal for complex applications with frequent updates.Ideal for read-heavy applications with few updates.

Conclusion About Database Normalization with all Forms

DBMS normalization forms are essential for creating an organized and efficient database. By applying these forms, we can reduce duplicate data, improve data integrity, and make the database easier to maintain and update. Each level of normalization helps refine the structure, ensuring data is stored logically and without unnecessary repetition. Following these principles makes it easier to manage and retrieve accurate information, making the database more reliable and effective in the long run.

FAQS – Benefits of Normalization in DBMS

Why is normalization important in DBMS?

Normalization is important because it helps organize data in a way that reduces duplication, prevents errors, and makes the database easier to update and maintain. It ensures that related data is stored in separate tables, which helps in keeping the database clean and efficient.

Can a database be over-normalized?

Yes, a database can be over-normalized. Over-normalization occurs when too many tables are created, making the database unnecessarily complex. This can slow down performance and make it harder to work with, so it’s important to balance normalization with practicality.

Do we always need to normalize a database to the highest normal form (like BCNF or 4NF)?

Not always. While it’s good to normalize a database, it doesn’t always have to be taken to the highest normal form. Sometimes, reaching a lower normal form like 3NF is enough for most practical uses. The goal is to improve the structure without making it too complicated or affecting performance.

Can a relation in 2NF still have partial dependency?

No, a relation in 2NF should not have a partial dependency. In 2NF, all non-key attributes must depend on the whole primary key, not just part of it. If there’s still partial dependency, the relation hasn’t been fully normalized to 2NF.

What does BCNF stand for?

BCNF stands for Boyce-Codd Normal Form. It’s a higher level of database normalization that removes certain types of redundancy, ensuring that every determinant in the table is a candidate key.

What is PJNF in database normalization?

PJNF stands for Projection-Join Normal Form. It’s a special form of normalization where a relation is split into smaller relations, ensuring that the data can be reconstructed by joining the tables back together without losing any information. It’s a higher-level normal form than BCNF.

What is a normal form in a database?

A normal form is a rule used to structure data in a database. It helps keep the data clean, organized, and free from unnecessary repetition.

What is the 5th normal form in DBMS?

The 5th normal form (5NF) ensures that data cannot be broken down further without losing information. It focuses on handling complex join dependencies.

What is the 6th normal form?

6NF is the most advanced normal form, used in very specialized systems. It breaks data into the smallest possible parts to manage changes over time.

What are non-key attributes?

Non-key attributes are columns that do not help identify records. They depend on the primary key to give meaning.

What is the use of normal forms?

Normal forms make your database easier to maintain, reduce duplicate data, and prevent errors. They also help with faster and more reliable updates.

What is a deadlock in a DBMS?

A deadlock happens when two or more processes wait for each other to finish, but none of them can move forward. It causes the system to freeze.




Computer Hardware Avatar

Tech Blogger & Hardware Reviewer with a passion for exploring the latest computer components. Sharing in-depth reviews, guides, and tips to help you make informed tech decisions.


Please Write Your Comments
Comments (0)
Leave your comment.
Write a comment
INSTRUCTIONS:
  • Be Respectful
  • Stay Relevant
  • Stay Positive
  • True Feedback
  • Encourage Discussion
  • Avoid Spamming
  • No Fake News
  • Don't Copy-Paste
  • No Personal Attacks
`