Sunday, July 22, 2012

Normalization Nirvana

“Everything should be made as simple as possible, but not simpler.” - Albert Einstein

Normalization was introduced by E.F. Codd, the inventor of relational database technology as a logical database design technique for hierarchical and network approaches. However, (or as expected, since the idea came from the same person) it proved to be more suitable for relational databases .Of course, today normalization plays the role of a quality check inspector on resultant entity-relationship diagrams as opposed to being a mainstream database design technique.

Normalization is a form of minimalism; it is minimizing data clutter, that is redundancy of data by organizing the fields and tables of a relational database in the most optimal fashion. Normalization is a methodology for organizing attributes into tables so that redundancy among the non-key attributes is eliminated (Gilleson). It comprises of splitting large tables in such a manner that the relationships stay intact but the repetitiveness of the data decreases.


However is normalization a compulsory act when it comes to relation database creation? Does the SQL server enforce a normalization policy?  The answer is no. However there are many benefits to normalization.

Advantages of normalization
  1. A shrunken database: By eliminating duplicate data, you will be able to reduce the overall size of the database.
  2. Fine-tuned tables: Tables with less columns translates into more rows per data page.
  3.  Efficiency: Fewer indexes per table translates into faster maintenance tasks.
As alluring as these advantages may sound there are demerits to this process of dispersion.

Disadvantages of normalization
  1. More tables, more links: As data spreads across the tables, linking them up becomes necessary.
  2. Data,but no data?: Repeated data is stored as code in the table instead of the real data. This requires that a map file must be in place inorder to map the code to the real data.
  3. Data query gone awry?:  Since the normalized model is optimized for applications as opposed to ad hoc querying, querying becomes difficult. (Raman)
Denormalization

Denormalization is a “counteract” to the outcome of normalization where redundant data is added to a normalized piece of data in order to optimize the performance of the database. Denormalization finds its application in many avenues, for example it is extensively used in  data warehouse designing.

 “Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.” - Antoine de Saint-Exupe

Reference

Gillenson, Mark L. "7." Fundamentals of Database Management Systems. Hoboken, NJ: Wiley, 2005.                                            
        N.pag. Print.
Raman, Ganesh. "What Does Normalization Have to Do with SQL Server?" Disadvantages of  
        Normalization Â. N.p., n.d. Web. 23 July 2012. <http://softwaretestinginterviewfaqs.wordpress.com/category/disadvantages-of-normalization/>.

No comments:

Post a Comment