NoSQL – The Buzz

What is NoSQL? The word is catchy, and it makes you think of something that is not SQL, that is, something without Structured Query Language. But hold on, the name is misleading. In practice, NoSQL means non-relational. Why people came up with this name is something I still need to find out. NoSQL is the term for persistence technologies that do not use a relational database. So if you are not using a relational database, you are using a NoSQL technology.

Before diving deep into what a NoSQL (non-relational) database is, we need to understand what a relational database is. Relational databases are driven by relations. By the definitions on Wiki:

  • A relation is a data structure which consists of a heading and an unordered set of tuples which share the same type.
  • A relation is a set of tuples (d1, d2, …, dj), where each element dn is a member of Dn, a data domain.

So the key characteristic of a relation is that its tuples share the same type across all records. If you look at the image below, there are attributes and tuples, and the same set of attributes is shared across all the tuples.

 NoSQL_Image1

Let’s take a scenario where we need to store information about the employees of a company you own. Below is the structure of the relation. This is the kind of table we would have, given that we need to store Id, Name, Salary and DOJ (date of joining).

NoSQL_Image2
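
The employee relation above can be sketched with Python's built-in sqlite3 module. This is only a minimal sketch: the table name `employee`, the column types and the sample rows are my assumptions for illustration and are not taken from the image.

```python
import sqlite3

# In-memory database; table name, column types and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        salary REAL NOT NULL,
        doj    TEXT NOT NULL  -- date of joining, stored as an ISO-8601 string
    )
    """
)

# Hypothetical sample data: every tuple supplies a value for every attribute.
rows = [
    (1, "Asha", 90000.0, "2012-01-15"),
    (2, "Ravi", 85000.0, "2012-03-01"),
]
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", rows)

# The heading (attribute names) is shared by every tuple in the relation.
cursor = conn.execute("SELECT * FROM employee ORDER BY id")
heading = [column[0] for column in cursor.description]
employees = cursor.fetchall()
```

Here `heading` holds the attribute names and every row in `employees` has exactly one value per attribute, which is what makes this data relational.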

As this company is run by developers like us, we pay as much as we earn, and you can see we are the kind of organization that pays well. Just kidding. So we have this many employees, and all of them share the same set of information (Id, Name, Salary and DOJ). The story is a happy one so far, and we are pretty much in control of the solution. Now the requirements change a little: you decide to support some open source projects as well, and you ask the programming community to help with their development. Since this is open source work, you are not expected to pay a salary to some of your employees, so those employees have no Salary attribute. In the same way, the set of attributes ends up being different for every employee. So you need to store information like this.

NoSQL_Image3

So now we have to persist this information. There are a couple of ways to solve this problem with a relational database. The first is to take the superset of all these attributes and create one table that has all of them. The table structure would look like this.

NoSQL_Image4
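
A rough sketch of this superset approach, again with sqlite3. The `project` column and the sample rows are hypothetical, since the exact extra attributes are not spelled out in the text; the point is only that attributes which do not apply to a row have to be stored as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee_wide (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        salary  REAL,  -- NULL for unpaid open source contributors
        doj     TEXT,  -- NULL when not applicable
        project TEXT   -- hypothetical attribute, NULL for regular staff
    )
    """
)

# Hypothetical rows: each one fills only the attributes that apply to it.
conn.executemany(
    "INSERT INTO employee_wide VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Asha", 90000.0, "2012-01-15", None),
        (2, "Ravi", None, None, "open-source-lib"),
    ],
)

# Count the rows that carry at least one empty (NULL) attribute.
null_count = conn.execute(
    "SELECT COUNT(*) FROM employee_wide WHERE salary IS NULL OR project IS NULL"
).fetchone()[0]
```

In this sketch every row ends up with at least one NULL, and each new kind of attribute would mean another mostly empty column.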

The second option is to use two tables to store the above information: one table with the common set of fields and another with Property/Value combinations. The second table would hold three fields: an EmployeeId foreign key that maps to the first table, plus Property and Value. After this redesign, your table design is going to look like below.

NoSQL_Image5
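
The two-table design can be sketched as follows. The table names `employee` and `employee_property`, the helper `load_employee` and the sample rows are assumptions made for this example. Note that all values are stored as text here, which is one of the trade-offs of a Property/Value table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE employee (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE employee_property (
        employee_id INTEGER NOT NULL REFERENCES employee(id),
        property    TEXT NOT NULL,
        value       TEXT NOT NULL,
        PRIMARY KEY (employee_id, property)
    );
    """
)

# Hypothetical data: common fields in one table, everything else as property/value rows.
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
conn.executemany(
    "INSERT INTO employee_property VALUES (?, ?, ?)",
    [
        (1, "salary", "90000"),
        (1, "doj", "2012-01-15"),
        (2, "project", "open-source-lib"),
    ],
)


def load_employee(conn, employee_id):
    """Rebuild one employee record by merging the common fields with its properties."""
    (name,) = conn.execute(
        "SELECT name FROM employee WHERE id = ?", (employee_id,)
    ).fetchone()
    record = {"id": employee_id, "name": name}
    properties = conn.execute(
        "SELECT property, value FROM employee_property WHERE employee_id = ?",
        (employee_id,),
    )
    for prop, value in properties:
        record[prop] = value
    return record


contributor = load_employee(conn, 2)
```

Reading a single employee now needs two queries (or a join) and a merge step in application code, even though the data itself is small.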

You might ask what the problem is with using the above approaches to store different types of data in a relational database. I would say it is perfectly fine, and you can live with it as long as you are able to solve your problem with a relational database. But the point I am trying to make is different. It is not about having one extra field or creating a parent-child relationship. It is about persisting data where every tuple has a different set of attributes. We have data that is non-relational in nature and we are trying to store it in an RDBMS (Relational Database Management System), but RDBMSs are not inherently designed to store non-relational data.

The term NoSQL is used for a broad set of technologies that are schema free, non-relational, BASE (not ACID), horizontally scalable and support “Web Scale”. Let us understand these features in detail.

  • Schema Free – We have seen what a predefined (or nearly predefined) schema looks like. NoSQL databases don’t have predefined schemas, so each tuple can have a different set of attributes.
  • Non-Relational – NoSQL databases are not relational in nature. In an RDBMS we generally use foreign key relationships to map one table to another. In NoSQL databases, each tuple itself holds its parent/child relations, so the concept of a foreign key does not come into play.
  • BASE (Not ACID) – We know that RDBMSs have ACID properties. ACID stands for Atomicity, Consistency, Isolation and Durability. BASE stands for Basically Available, Soft state and Eventual consistency. So BASE does not promise consistency at all times. It promises eventual consistency, and what matters most is availability. To give some detail on what we are talking about here, imagine you started a new merchandise website and deployed it on a server, and the website started getting hits.

NoSQL_Image6

Now your site becomes famous overnight and you start getting a lot of hits. The scenario would be something like this.

NoSQL_Image7

Since it was a great site, the number of hits kept increasing and one server was not able to handle the entire load. So you created a load-balanced architecture with a server farm to handle all this load. The scenario would be something like below.

NoSQL_Image8

Imagine a basic architecture where all the app servers connect to one database. The web server load is balanced, but there is only a single database server, so you also need to balance the load at the database level. The eventual architecture would be something like below.

NoSQL_Image9

So now you have multiple copies of the database to handle the load at the database level, and you would be using techniques such as sharding, partitioning, denormalizing or distributed caching. But after doing all this, you still have to make sure the database is always in a consistent state as per ACID. To keep the database consistent, availability is compromised whenever information is added, updated or deleted. This model works fine until you need the web-scale kind of application I just described.

The moment you go for a distributed system, Eric Brewer’s CAP theorem applies: of consistency, availability and partition tolerance, you have to settle for two out of three. In the case of ACID, we settle for consistency and partition tolerance. But in some cases availability is more important than consistency. Imagine a scenario where it is okay to roll out a new product (adding new merchandise to the database) to the different database copies eventually rather than in a consistent way. The product would be added to one copy of the database and then eventually propagated to the other copies. This pattern is called BASE.
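
The roll-out scenario above can be simulated with a toy in-memory model. This is only a sketch of the idea and not how any particular NoSQL product replicates data: the `Replica` and `EventuallyConsistentStore` classes, the replica names and the product are all hypothetical.

```python
from collections import deque


class Replica:
    """One copy of the product database (a plain dict in this toy model)."""

    def __init__(self, name):
        self.name = name
        self.products = {}


class EventuallyConsistentStore:
    """Writes land on one replica at once and reach the others later."""

    def __init__(self, replica_names):
        self.replicas = [Replica(name) for name in replica_names]
        self.pending = deque()  # queued (replica, key, value) updates

    def write(self, key, value):
        # Accept the write immediately on the first replica (stay available)...
        first, *others = self.replicas
        first.products[key] = value
        # ...and queue it for the remaining replicas instead of blocking on them.
        for replica in others:
            self.pending.append((replica, key, value))

    def read(self, replica_index, key):
        # A read may hit any replica, including one that is still behind.
        return self.replicas[replica_index].products.get(key)

    def propagate(self):
        # Apply every queued update; afterwards all replicas agree.
        while self.pending:
            replica, key, value = self.pending.popleft()
            replica.products[key] = value


store = EventuallyConsistentStore(["db-1", "db-2", "db-3"])
store.write("sku-42", "Blue T-shirt")
stale = store.read(1, "sku-42")  # second replica has not seen the write yet
store.propagate()
fresh = store.read(1, "sku-42")  # now it has
```

Before `propagate()` runs, a customer reading from the second copy simply does not see the new product yet, and the site stays up the whole time. That window of inconsistency is the price paid for availability.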

 

  • Web-Scale – I just explained the term web-scale in the BASE section. A web-scale application is one that is highly available, reliable, transparent, high performance, scalable, accessible, secure, usable and inexpensive.
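
To make the Schema Free and Non-Relational points from the list above concrete, here is a small sketch that uses plain Python dictionaries as stand-ins for documents in a document-style NoSQL store. The field names, the sample employees and the `find` helper are hypothetical and do not correspond to any specific product's API.

```python
import json

# Each document carries only the attributes that apply to it (schema free),
# and child data is embedded in the parent instead of linked by a foreign key.
employees = [
    {"id": 1, "name": "Asha", "salary": 90000, "doj": "2012-01-15"},
    {
        "id": 2,
        "name": "Ravi",
        "projects": [{"name": "open-source-lib", "role": "maintainer"}],
    },
]


def find(collection, **criteria):
    """Return the documents whose fields match every given key/value pair."""
    return [
        doc
        for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


# Attributes that a document does not have are simply absent, not NULL.
paid = [doc["name"] for doc in employees if "salary" in doc]

# A whole document, children included, serializes as one unit.
serialized = json.dumps(employees[1])
```

Compared with the superset table and the Property/Value table, there are no empty columns and no join: each employee is stored and read back as one self-contained record.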

 

Below are some of the options available as part of NoSQL. Please refer to http://nosql-database.org/.

NoSQL_Image10

It is not that these features of NoSQL cannot be achieved with an RDBMS. But RDBMSs are not inherently designed to support them, so RDBMS technology is a forced fit for modern interactive software systems. I am not advocating NoSQL as a silver bullet that will solve all your current problems. It is simply an additional option that you should have in your arsenal. When you are deciding on a database for your application, you can choose NoSQL if it fits your requirements.
