Sunday, August 12, 2012

To Internet and Beyond...and Back

Shopping for a laptop, filing for tax returns, paying your school fees, using a GPS….

Click, click, click…

Ever wondered where your click “goes”?

Traditionally, in an offline database system, your "click" passes through application programs, such as the file browser that shows your Windows folders, which reach out for the data, retrieve it and present it back to you.

However, on the internet, when you shop at your favourite online store, you are in fact reaching out to the shop’s database. Your click, which is user-generated input, traverses a predefined path of hardware and software towards the shop’s information system, where the database sits. Of course the façade, that is, the web page in your browser, is aesthetically designed, and it completely conceals the underlying world of code and computer systems.


(Idaho Department of Commerce)(InformedBuying.com)(Lamb)(Reynolds)


Working inwards from the Internet towards the company’s information system, we reach the database, whose optimally normalized tables enable easy browsing and easy retrieval of data.

The DBMS is where the actual data resides. The data, such as the image of a dress on a mannequin, its size, color and other details, is stored in an efficient way. The good-housekeeping ideology of ‘A place for everything and everything in its place’ in databases gives us a smooth shopping experience, be it updating the quantity of a pair of shoes or editing the contents of the virtual shopping cart when we change our mind.
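The cart edits described above can be sketched with a small in-memory database. This is only an illustration: the table names `product` and `cart_item`, their columns and the sample rows are assumptions made for the example, not the schema of any real shop.

```python
import sqlite3

# In-memory database standing in for the shop's DBMS (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        size       TEXT,
        color      TEXT
    );
    CREATE TABLE cart_item (
        cart_id    INTEGER NOT NULL,
        product_id INTEGER NOT NULL REFERENCES product(product_id),
        quantity   INTEGER NOT NULL CHECK (quantity > 0),
        PRIMARY KEY (cart_id, product_id)
    );
""")

# Each product detail is stored once, in one place.
conn.execute("INSERT INTO product VALUES (1, 'Summer dress', 'M', 'blue')")
conn.execute("INSERT INTO product VALUES (2, 'Running shoes', '8', 'black')")

# The shopper adds shoes and a dress, then changes their mind.
conn.execute("INSERT INTO cart_item VALUES (100, 2, 1)")
conn.execute("INSERT INTO cart_item VALUES (100, 1, 1)")
conn.execute(
    "UPDATE cart_item SET quantity = 2 WHERE cart_id = 100 AND product_id = 2"
)
conn.execute("DELETE FROM cart_item WHERE cart_id = 100 AND product_id = 1")
conn.commit()

# What the web page would show for cart 100.
cart = conn.execute("""
    SELECT p.name, p.size, p.color, c.quantity
    FROM cart_item AS c
    JOIN product AS p ON p.product_id = c.product_id
    WHERE c.cart_id = 100
""").fetchall()
print(cart)
```

Because the product details live only in `product`, changing the cart touches nothing but the small `cart_item` rows.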

As internet browsers, the only thing we come across is the website, which has been enhanced with user experience, user interface, information architecture and marketing concepts. Backstage, however, there are layers of software, hardware, programming modules and normalized database tables.

Whether in an internet or a non-internet environment, the DBMS that handles volumes of data is an organized system that we never come across directly.

Just as the medium for the database evolved, so did the way the PC interacts with the internet and the database.

Since many different components are involved here (software, hardware, different programming languages and different types of data), specialized software called middleware is implemented to connect them.

In order for the application software running in the Web server to connect with software outside the Web server, there must be agreed upon interfaces, and indeed there are. The original such interface is called the Common Gateway Interface (CGI). Later, another such interface with certain performance advantages was developed, known as the Application Program Interface (API). These interfaces have associated software "scripts" that let them exchange data between the application in the server and the databases controlled by the database server. The connection to the databases could be made directly at this point, but again, with the prospect of different database management systems and different kinds of data involved, it made sense to create another level of standards to smooth out the differences and have one standard way of accessing the data. The most common set of such standards is called Open Database Connectivity (ODBC), which is designed as an interface to relational databases. (Gillenson)

ODBC connects to the database server, which finally connects to the database, the hub of information.
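The idea of one standard way of reaching different databases can be illustrated in Python. This is an analogy rather than ODBC itself: Python's DB-API convention plays a similar role for Python database drivers, and the built-in `sqlite3` module stands in for the shop's database server. The function name `list_products` and the `product` table are assumptions made for the sketch.

```python
import sqlite3

def list_products(connection):
    """Run one query through the generic connection/cursor interface.

    Nothing in this function names a specific DBMS: it only uses
    cursor(), execute() and fetchall(), which DB-API drivers share.
    """
    cursor = connection.cursor()
    cursor.execute("SELECT name, price FROM product ORDER BY name")
    rows = cursor.fetchall()
    cursor.close()
    return rows

# sqlite3 stands in for the database server (an assumption for the demo).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE product (name TEXT, price REAL)")
connection.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("Laptop", 899.0), ("GPS unit", 129.5)],
)

products = list_products(connection)
print(products)
```

In principle the same `list_products` function could be handed a connection from another DB-API driver, which is the kind of smoothing-out of differences the passage above attributes to ODBC.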

Everyone talks about the internet and how living without it would be unimaginable. The internet has progressed from a luxury to a necessity. But the internet wouldn’t be an enjoyable experience without an effective database management system.

In the back alleys of the internet lies an optimally normalized database table, and your “clicks” travel up to it and back.



Reference

Gillenson, Mark L. "7." Fundamentals of Database Management Systems. Hoboken, NJ: Wiley, 2005.                                   
        N.pag. Print.

"Internet Shopping -Is Your Credit Card Information Safe!" Internet Shopping.                      
      InformedBuying.com, n.d. Web. 13 Aug. 2012.
      <http://www.informedbuying.net/shopping/smart_shopping.htm>.
"Internet Marketing-Your Community's Gateway to the World." Internet Marketing: Idaho
      Department of Commerce. Idaho Department of Commerce, n.d. Web.
     <http://commerce.idaho.gov/communities/internet-marketing/>.
Lamb, Eric. "Setting Up A Linux Web Server." Made of Everything You're Not. Eric Lamb, n.d.      
    Web. <http://blog.ericlamb.net/2009/05/setting-up-a-linux-web-server/>.
Reynolds, Warren. "How You Can Avoid Closing Delays and Save Your Home Sale." 02038.
    Warren Reynolds, n.d. Web. <http://www.02038.com/wp-ontent/uploads/2009/06/middleman.jpg>.


Sunday, August 5, 2012

Distributed Database Housekeeping Rules

Christopher J. Date is known for his prominent work on relational database theory and for formulating the twelve basic principles of distributed databases. Date, like Edgar F. Codd, worked at IBM and wrote extensively about relational databases.

It appears that, inspired by his colleague's twelve rules for relational database management systems (RDBMS), often called "Codd’s 12 commandments", Date formulated twelve basic principles of distributed databases. Although no current Distributed Database Management System (DDBMS) conforms to all of them, they serve as a reference line for optimized allocation of databases and for keeping the extent of data duplication in check.


C. J. Date's Twelve Commandments For Distributed Databases 

1.     Local site independence. Each local site can act as an independent, autonomous, 
       centralized DBMS. Each site is responsible for security, concurrency control,  
       backup, and recovery.  
2.    Central site independence. No site in the network relies on a central site or any  
       other site. 
3.     Failure independence. The system is not affected by node failures. The system 
        is in continuous operation even in the case of a node failure or an expansion of   
        the network. 
4.     Location transparency. The user does not need to know the location of data in 
        order to retrieve those data. 
5.     Fragmentation transparency. Data fragmentation is transparent to the user, 
        who sees only one logical database. The user does not need to know the name of  
        the database fragments in order to retrieve them. 
6.     Replication transparency. The user sees only one logical database. The  
        DDBMS transparently selects the database fragment to access. To the user, the 
        DDBMS manages all fragments transparently. 
7.     Distributed query processing. A distributed query may be executed at several  
        different DP sites. Query optimization is performed transparently by the  
        DDBMS. 
8.     Distributed transaction processing. A transaction may update data at several  
        different sites, and the transaction is executed transparently. 
9.     Hardware independence. The system must run on any hardware platform. 
10.   Operating system independence. The system must run on any operating system  
        platform. 
11.   Network independence. The system must run on any network platform. 
12.   Database independence. The system must support any vendor's database  
        product.
(C. J. Date's twelve commandments for distributed databases)
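Rules 4 and 5 (location and fragmentation transparency) can be illustrated with a toy example. The sketch below is an assumption-laden simplification: two in-memory SQLite databases play the role of sites, and the `CustomerDirectory` class, the `make_site` helper and the sample customers are all made up for the illustration. A real DDBMS would consult a catalog and a query optimizer instead of asking every site in turn.

```python
import sqlite3

def make_site(rows):
    """Create one 'site': an independent in-memory database holding a fragment."""
    site = sqlite3.connect(":memory:")
    site.execute(
        "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)"
    )
    site.executemany("INSERT INTO customer VALUES (?, ?)", rows)
    return site

class CustomerDirectory:
    """Toy front end that hides where each customer row is stored."""

    def __init__(self, sites):
        self._sites = sites  # mapping of site name to connection

    def find(self, customer_id):
        # The caller never names a site or a fragment; each site is asked in turn.
        for site in self._sites.values():
            row = site.execute(
                "SELECT name FROM customer WHERE customer_id = ?", (customer_id,)
            ).fetchone()
            if row is not None:
                return row[0]
        return None

    def all_names(self):
        # Present the fragments as one logical table.
        names = []
        for site in self._sites.values():
            names.extend(r[0] for r in site.execute("SELECT name FROM customer"))
        return sorted(names)

directory = CustomerDirectory({
    "site_a": make_site([(1, "Smith"), (2, "Jones")]),
    "site_b": make_site([(3, "Lee")]),
})
print(directory.find(3))
print(directory.all_names())
```

The user of `directory` sees a single logical customer list, while each site remains a self-contained database in the spirit of rule 1.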

These requirements are not limited to RDBMSs but extend to the multimedia-friendly ODBMS as well. The rules have become a measure for evaluating distributed databases, and they enable the database modeller to make better modelling decisions. (C. J. Date Rules)


Reference

"C. J. Date Rules." Scribd. N.p., n.d. Web. 6 Aug. 2012.
    <http://www.scribd.com/doc/26882380/C-J-Date-Rules>.
"C. J. Date's twelve commandments for distributed databases ." C. J. Date's twelve
    commandments for distributed databases . N.p., n.d. Web. 6 Aug. 2012.
    <www.uobabylon.edu.iq/uobColeges/ad

Sunday, July 29, 2012

"Who Stole The Data from The Database?"


Accuser: Who stole the cookie from the cookie jar?
(name of a child in the circle) stole the cookie from the cookie jar.
Accused: Who, me?
Accuser/Group: Yes, you!
Accused: Not me!/Couldn't be!/Wasn't me!
Accuser/Group: Then who?

When data goes missing or is stolen, it can become a game of cat and mouse for a while, until the police and data detectives zero in on the cause, which can vary from failed antivirus software and firewalls to the age-old case of mischievous humans.

Tackling dishonest behavior is the most difficult part of data security. After all, companies can readily control unauthorized data access with passwords, magnetic stripe cards, or new-age biometric systems and "electric-eye" devices. Viruses and malicious code can be curbed by implementing antivirus software and firewalls. Data can be encrypted, and the company premises can be designed with a safe space for the computers. A company can train its employees in information security policies.

However, what about dishonest employees?

Consider the following news snippet, in which fraudsters were found withdrawing money from an ATM.

Later it was found that employees of the company were accomplices to the fraudsters.
As highlighted in the snippet above, most banks have a "Know Your Customer" (KYC) policy.

Know your Customer 

Most banks and financial institutions have put in place a policy framework to know their customers before opening any account.
KYC is a policy that comprises collecting customer details at the account-opening stage. The customer is required to submit proof of identity and proof of address. Some banks may even ask for verification by an existing account holder.

Of course, policies, like knowledge, don't benefit anyone unless action is taken to implement them.

"Know Your Customer" At The Micro Level

Hilton Hotels takes the "Know Your Customer" policy to a luxurious level.

Hilton is a leader in information technology in its industry, and one of its leading-edge database applications is its Guest Profile Manager (GPM). This is a customer relationship management (CRM) system that strives to achieve guest recognition and guest acknowledgement at all customer "touch points." These include email, contact at the hotel front desk, special channels on the in-room television, the Audix voice mail system, and post-stay surveys. For example, in the CRM spirit of developing a personalized relationship with the customer, when a guest checks in at any Hilton property, the front desk clerk receives information on their terminal that allows them to say, "Welcome back to Hilton, Mr. Smith," or "Welcome, Ms. Jones. I understand this is your first visit to this hotel (or to Hilton Hotels)." Both the front desk clerk and the housekeeping staff also get information on customer preferences and past complaints, such as wanting a room with good water pressure and not wanting a noisy room. Targeted customers such as frequent guests might find fruit baskets, bottled water, or bathrobes in their rooms. The system even prepares personalized voice-mail greetings on the guest's in-room telephone. The system uses an Informix DBMS on a Sun Microsystems platform. (Gillenson)

 IBM Informix is a product family within IBM's Information Management division that is centered on several relational database management system (RDBMS) offerings. (Wikipedia).

Thus the implementation of such sophisticated CRMs may help curb security breaches and prevent identity theft, identity fraud, money laundering, terrorist financing and other financial perils. We cannot curb the criminal behavior of human minds, but we can know our clients' behavior and use it for damage control or for providing them with customized, luxurious welcomes.

At "beck and call": Hilton Hotels' Hilton Huanying is a global special welcome programme aimed at Chinese travellers.

References

"ATM Fraudster Took Help of Employees." The Times Of India. N.p., n.d. Web. 30 July 2012.          
     <http://articles.timesofindia.indiatimes.com/2012-06-23/india/32381806_1_atm-cards-kerala-police-federal-bank>.
Gillenson, Mark L. "7." Fundamentals of Database Management Systems. Hoboken, NJ: Wiley, 2005.                                      
        N.pag. Print.
"Hilton Offers Chinese Guests a Special Welcome." « Hotel & Restaurant. N.p., n.d. Web. 30 July 2012.
        <http://www.hotelandrestaurant.co.za/tourism/hilton-offers-chinese-guests-a-special-welcome/>.
Wikipedia. Wikimedia Foundation, 31 July 2012. Web. 30 July 2012.  
        <http://en.wikipedia.org/wiki/IBM_Informix>.



Sunday, July 22, 2012

Normalization Nirvana

“Everything should be made as simple as possible, but not simpler.” - Albert Einstein

Normalization was introduced by E. F. Codd, the inventor of relational database technology, as a logical database design technique for hierarchical and network approaches. However (or as expected, since the idea came from the same person), it proved to be more suitable for relational databases. Of course, today normalization plays the role of a quality-check inspector on the resulting entity-relationship diagrams, as opposed to being a mainstream database design technique.

Normalization is a form of minimalism: it minimizes data clutter, that is, redundancy of data, by organizing the fields and tables of a relational database in the most optimal fashion. Normalization is a methodology for organizing attributes into tables so that redundancy among the non-key attributes is eliminated (Gillenson). It consists of splitting large tables in such a manner that the relationships stay intact but the repetitiveness of the data decreases.
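A minimal sketch of such a split, using Python's built-in `sqlite3` module. The tables `order_flat`, `customer` and `customer_order` and their sample rows are assumptions invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unnormalized: the customer's city is repeated on every order row.
conn.execute(
    "CREATE TABLE order_flat ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, city TEXT, item TEXT)"
)
conn.executemany("INSERT INTO order_flat VALUES (?, ?, ?, ?)", [
    (1, "Smith", "Boston", "laptop"),
    (2, "Smith", "Boston", "GPS"),
    (3, "Jones", "Denver", "shoes"),
])

# Normalized: customer facts live in one table, orders point to them by key.
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT UNIQUE,
        city        TEXT
    );
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id),
        item        TEXT
    );
    INSERT INTO customer (name, city)
        SELECT DISTINCT customer, city FROM order_flat;
    INSERT INTO customer_order (order_id, customer_id, item)
        SELECT f.order_id, c.customer_id, f.item
        FROM order_flat AS f JOIN customer AS c ON c.name = f.customer;
""")

# Count how many times a city value is stored before and after the split.
city_copies_before = conn.execute("SELECT COUNT(city) FROM order_flat").fetchone()[0]
city_copies_after = conn.execute("SELECT COUNT(city) FROM customer").fetchone()[0]
print(city_copies_before, city_copies_after)

# The relationships stay intact: a join rebuilds the original rows.
rebuilt = conn.execute("""
    SELECT o.order_id, c.name, c.city, o.item
    FROM customer_order AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
```

With only three orders the saving is tiny, but the stored city values grow with the number of customers rather than the number of orders.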


However, is normalization compulsory when creating a relational database? Does SQL Server enforce a normalization policy? The answer is no. Still, there are many benefits to normalization.

Advantages of normalization
  1. A shrunken database: By eliminating duplicate data, you can reduce the overall size of the database.
  2. Fine-tuned tables: Tables with fewer columns translate into more rows per data page.
  3. Efficiency: Fewer indexes per table translate into faster maintenance tasks.
As alluring as these advantages may sound, there are demerits to this process of dispersion.

Disadvantages of normalization
  1. More tables, more links: As data spreads across the tables, linking them up becomes necessary.
  2. Data, but no data?: Repeated data is stored as a code in the table instead of the real data. This requires a map file to be in place in order to map the code to the real data.
  3. Data query gone awry?: Since the normalized model is optimized for applications as opposed to ad hoc querying, querying becomes difficult. (Raman)

Denormalization

Denormalization counteracts the outcome of normalization: redundant data is deliberately added back to normalized data in order to optimize the performance of the database. Denormalization is applied in many areas; for example, it is used extensively in data warehouse design.
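The trade-off can be sketched as follows. The `customer`, `sale` and `sale_report` tables are hypothetical; `sale_report` is a deliberately redundant copy of the kind a reporting or warehouse table might hold.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE sale (sale_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'Smith', 'Boston'), (2, 'Jones', 'Denver');
    INSERT INTO sale VALUES (10, 1, 50.0), (11, 1, 25.0), (12, 2, 40.0);

    -- Denormalized copy for reporting: city is repeated on every sale row.
    CREATE TABLE sale_report AS
        SELECT s.sale_id, c.city, s.amount
        FROM sale AS s JOIN customer AS c ON c.customer_id = s.customer_id;
""")

# Normalized route: a join is needed to total sales per city.
joined = conn.execute("""
    SELECT c.city, SUM(s.amount)
    FROM sale AS s JOIN customer AS c ON c.customer_id = s.customer_id
    GROUP BY c.city ORDER BY c.city
""").fetchall()

# Denormalized route: the same answer from a single table, no join.
flat = conn.execute(
    "SELECT city, SUM(amount) FROM sale_report GROUP BY city ORDER BY city"
).fetchall()

print(joined)
print(flat)
```

The price of the simpler query is that `sale_report` must be refreshed whenever a customer's city changes, which is exactly the redundancy normalization set out to remove.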

“Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.” - Antoine de Saint-Exupéry

Reference

Gillenson, Mark L. "7." Fundamentals of Database Management Systems. Hoboken, NJ: Wiley, 2005.                                            
        N.pag. Print.
Raman, Ganesh. "What Does Normalization Have to Do with SQL Server?" Disadvantages of  
        Normalization. N.p., n.d. Web. 23 July 2012. <http://softwaretestinginterviewfaqs.wordpress.com/category/disadvantages-of-normalization/>.

Sunday, July 15, 2012

Lay The Tables ... For the Datum is Ready

     The story of how the relational database came to be accepted is one of “everything has its place” and “all in good time”.  The idea was first put forth in 1970 by Dr. E. F. Codd, a computer scientist, in a paper titled “A Relational Model of Data for Large Shared Data Banks”, which was published in the Communications of the ACM.

(Codd)


      The idea was brilliant but remained dormant while the information systems community doubted how it would perform in the real world. Moreover, network and hierarchical database management systems (DBMS) were at their peak, and companies were reluctant to abandon them and continued to ride upon their successes.

     But the relational database was simply one piece in the jigsaw puzzle. A decade later, personal computers (PCs) became popular, and the normalization technique, like the PC, turned out to be more compatible with relational tables than with hierarchical and network models.

     By this time, developers were looking for a DBMS that was less cumbersome and easier to handle. The table structure based upon relationships, a.k.a. the 'relational database', became popular and finally came to be accepted.

     In the relational model, entities can relate to one another in several ways, that is, one-to-one, one-to-many and many-to-many. It is an uninhibited sort of modeling structure, as opposed to the regimented data structures of the network and hierarchical models.
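The three kinds of relationship can be sketched as table definitions. The department, employee, badge and project tables below are invented for illustration; the point is only how keys express each relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);

    -- One-to-many: each employee row points at exactly one department.
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT,
        dept_id INTEGER REFERENCES department(dept_id)
    );

    -- One-to-one: the UNIQUE key allows at most one badge per employee.
    CREATE TABLE badge (
        badge_no INTEGER PRIMARY KEY,
        emp_id   INTEGER UNIQUE REFERENCES employee(emp_id)
    );

    -- Many-to-many: a junction table pairs employees with projects.
    CREATE TABLE project (project_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE assignment (
        emp_id     INTEGER REFERENCES employee(emp_id),
        project_id INTEGER REFERENCES project(project_id),
        PRIMARY KEY (emp_id, project_id)
    );

    INSERT INTO department VALUES (1, 'Sales');
    INSERT INTO employee VALUES (1, 'Smith', 1), (2, 'Jones', 1);
    INSERT INTO badge VALUES (500, 1), (501, 2);
    INSERT INTO project VALUES (7, 'Website'), (8, 'Catalog');
    INSERT INTO assignment VALUES (1, 7), (1, 8), (2, 7);
""")

# One department, many employees.
staff_in_sales = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE dept_id = 1"
).fetchone()[0]

# One employee, many projects (and project 7 has many employees).
projects_for_smith = [row[0] for row in conn.execute("""
    SELECT p.title FROM assignment AS a
    JOIN project AS p ON p.project_id = a.project_id
    WHERE a.emp_id = 1 ORDER BY p.title
""")]
print(staff_in_sales, projects_for_smith)
```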

     The following table summarizes the advantages and disadvantages of the hierarchical, network, relational and multimedia-capable object-oriented database management systems.



Type of DBMS | Advantages | Disadvantages
------------ | ---------- | -------------
Hierarchical | Data can be accessed rapidly as relationships between records are defined in advance. | Relationships between “children” entities are not permitted.
Network | More flexible as connections between different types of data are allowed. | Limit on the number of connections.
Relational | User need not traverse down a hierarchy or network to access data. The data files are relational to each other. The structure is easy to use and data entries can be modified without structure redefinition. | Data search sessions may be slower.
Object-oriented | Handles new data types such as videos and graphics. | High cost.



     However, all four have found acceptance in the present-day database programming world; the network and hierarchical models made a comeback when XML became popular in the 1990s.  Although the relational model can easily model network or hierarchical data, IT professionals choose between different DBMSs depending on the requirements of a project.

References

Codd, E. F. "A Relational Model of Data for Large Shared Data Banks." Communications of the
    ACM, June 1970. Web. 18 Jul 2012. <http://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf>.

Sunday, July 8, 2012

Pebbles to Papyrus to Punched Cards - The Ever Evolving Database


"Just wait, Gretel, until the moon rises, and then we shall see the crumbs of bread which I have strewn about, they will show us our way home again."

 Hansel and Gretel (Brothers)


A sad scene in a fairy tale, a disaster in real life. Like Hansel and Gretel, if you have data at your disposal but it is simply a crude pile, where you can’t associate one datum with the next, then you don’t have information. Disorganized data, like perishable food, is cumbersome and quickly becomes an unusable commodity, incapable of providing long-term benefits. Molding information out of data by laying out tables and building databases is an art form made possible by what we now call a Database Management System (DBMS).

       DBMSs are everywhere, from your cell phone to your GPS to satellites hovering in space. And who can forget the luxurious convenience of package tracking? Whether you are waiting for an impulse buy off the internet or a confidential document, the assurance of following your belonging as it moves from one station to another, one hand to another, from one airport to the next is, as MasterCard says, “priceless”. This modern-day miracle, which has become a necessity and a vital part of every courier service, is made possible by DBMS technology.

       There are many approaches to creating a DBMS: hierarchical, network, relational and object-oriented. The hierarchical model organizes the data in a top-to-bottom, tree-like structure. Retrieving data is quick, but maintenance is cumbersome.  IMS (Information Management System) is probably the most famous hierarchical DBMS.
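The tree-like structure of the hierarchical model can be sketched with plain Python dictionaries. The record names and both helper functions are made up for the illustration; the example is not meant to resemble how IMS actually stores data.

```python
# A tiny hierarchical "database": each record has one parent and is reached
# by walking down from the root (record names are made up for illustration).
tree = {
    "name": "Company",
    "children": [
        {"name": "Sales", "children": [
            {"name": "Smith", "children": []},
            {"name": "Jones", "children": []},
        ]},
        {"name": "Shipping", "children": [
            {"name": "Lee", "children": []},
        ]},
    ],
}

def find_by_path(node, path):
    """Follow a predefined parent-to-child path, e.g. ['Sales', 'Jones']."""
    for name in path:
        matches = [child for child in node["children"] if child["name"] == name]
        if not matches:
            return None
        node = matches[0]
    return node

def path_to(node, target, trail=()):
    """Search the whole tree when the path is not known in advance."""
    trail = trail + (node["name"],)
    if node["name"] == target:
        return list(trail)
    for child in node["children"]:
        found = path_to(child, target, trail)
        if found:
            return found
    return None

print(find_by_path(tree, ["Sales", "Jones"])["name"])
print(path_to(tree, "Lee"))
```

Access is quick when the path is known in advance (`find_by_path`), whereas moving a record to a different parent means restructuring the tree, which hints at why maintenance is the cumbersome part.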

       The network model, sometimes informally called the "star model", also adopts a navigational approach like the hierarchical model. However, since several paths are possible, it is less restrictive. Hierarchical and network are the least used DBMS approaches. The relational model uses rows and columns to store data; in other words, it employs a table structure. The tables of records "connect" with each other via common relationship values. This model is the most commonly used today. MySQL, DB2 and Oracle are examples of relational DBMSs. The object-oriented structure is most suitable for multimedia-based applications. It works well with object-oriented languages such as Java and C++, hence the name.

       Database management has changed mediums many times: from pebbles to papyrus to punched cards, from tapes to magnetic disks. Nevertheless, the pursuit of a flawless DBMS continues; one that caters to usability and the data melting pot. Richard Saul Wurman, the celebrated information architect, wrote in his book Information Anxiety, published in 1989, “A weekday edition of the New York Times contains more information than the average person was likely to come across in a lifetime in seventeenth-century England.” and “The amount of information now doubles every five years; soon it will be doubling every four...” and “Today, the English language contains roughly 500,000 usable words, five times more than the time of Shakespeare.” (Wurman)


       IT (pun intended) is no longer an information highway but a superhighway with dwindling security alleyways. In a corporate society where data has become a tangible commodity, where our information is networked directly or indirectly with the internet and fears of identity theft are on everyone's mind, DBMSs need to be developed to be ‘faster, higher and stronger’.

References
Brothers, Grimm. "Short Stories: Hansel and Gretel." East Of The
    Web. N.p., Feb 2000. Web. 8 Jul 2012.
not defined. Trail of Bread Crumbs. Graphic. TV Tropes
Wurman, Richard S. Information Anxiety. New York: Doubleday, 1989. Print.