Tuesday, June 10, 2014

Exploring NoSQL databases. What are they and how do they work?

Most modern applications store their data in what’s known as a Relational Database Management System (RDBMS). RDBMSs have evolved significantly since they were introduced in the 1970s, and for many types of applications they are perfect for the task. They do, however, have drawbacks for certain types of applications: a relational engine is heavy, as it has to parse queries, take locks, write logs, keep track of buffer pools and spawn a lot of threads.

NoSQL databases are non-relational and distributed. They are designed for high performance through horizontal scaling, that is, by adding more servers rather than buying a bigger one.

You would use an RDBMS if you need structured data, transactions, ACID guarantees, and simple or complex aggregations.
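To make the transaction and aggregation points concrete, here is a minimal sketch using SQLite from Python's standard library. The `accounts` table, its columns and the `transfer` helper are made up for illustration; they are not from any particular application.

```python
import sqlite3

# In-memory database with a hypothetical "accounts" table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts (owner, balance) VALUES (?, ?)",
    [("alice", 100), ("bob", 50), ("alice", 25)],
)
conn.commit()


def transfer(conn, src_id, dst_id, amount):
    """Move `amount` between two accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst_id),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit is undone too.
        return False


# Account 2 holds only 50, so this transfer fails and changes nothing.
ok = transfer(conn, 2, 1, 500)

# A simple aggregation: total balance per owner.
totals = dict(conn.execute("SELECT owner, SUM(balance) FROM accounts GROUP BY owner"))
```

Because both updates run inside one transaction, the failed debit also undoes the credit that ran before it, which is the atomicity part of ACID.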

You would use NoSQL if you need high read/write throughput and have unstructured or semi-structured data. NoSQL databases are usually simpler to get started with, as you don’t need to have an architect to design a relational model up front.

Many NoSQL databases provide schema-free storage and allow indexing of individual fields for fast data retrieval. In a document database such as MongoDB, data is stored as JSON documents (Binary JSON, or BSON, to be more specific). This means that you can store arrays, and arrays inside arrays, which makes it very flexible for many types of applications.
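As a rough illustration of schema-free documents and a per-field index, here is a small in-memory sketch in Python. It is not how any particular database implements indexing; the `documents` list and the `build_index` helper are hypothetical, and the helper assumes field values are either scalars or flat lists of scalars.

```python
import json
from collections import defaultdict

# Schema-free documents: each record may carry different fields.
documents = [
    {"_id": 1, "title": "Hello NoSQL", "tags": ["nosql", "intro"]},
    {"_id": 2, "title": "Sharding basics", "author": "guest"},
    {"_id": 3, "title": "Nested data", "tags": ["json"], "grid": [[1, 2], [3, 4]]},
]


def build_index(docs, field):
    """Map each value of `field` to the ids of the documents that hold it.

    Assumes the field is a scalar or a flat list of scalars.
    """
    index = defaultdict(list)
    for doc in docs:
        if field not in doc:
            continue  # documents without the field are simply not indexed
        value = doc[field]
        # Array fields are indexed once per element, so each tag is searchable.
        keys = value if isinstance(value, list) else [value]
        for key in keys:
            index[key].append(doc["_id"])
    return index


tag_index = build_index(documents, "tags")
matching_ids = tag_index.get("nosql", [])  # lookup without scanning every document

# Arrays inside arrays survive a JSON round trip unchanged.
round_tripped = json.loads(json.dumps(documents[2]))
```

The index trades some memory and write cost for lookups that no longer need to scan the whole collection.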

Suppose we were to build a simple blogging application that used NoSQL as a backend database. We would need to create a collection (the equivalent of a table) that contains the blog information: title, date published, content, an array of meta tags, and an array of comments by anonymous users.

When loading a specific blog post, you get all of its data with one query.
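The blogging example might look something like the following sketch. The collection is modelled as a plain Python list rather than a real database, and the field names, sample values and `find_one` helper are assumptions made for illustration.

```python
import json

# A hypothetical "posts" collection, modelled here as a list of documents.
posts = [
    {
        "title": "Exploring NoSQL databases",
        "date_published": "2014-06-10",
        "content": "Sample body text for the post.",
        "meta_tags": ["nosql", "databases"],
        "comments": [
            {"name": "Anonymous", "text": "Nice overview."},
            {"name": "Anonymous", "text": "What about joins?"},
        ],
    },
]


def find_one(collection, **criteria):
    """Return the first document whose fields equal all the given criteria."""
    for doc in collection:
        if all(doc.get(field) == value for field, value in criteria.items()):
            return doc
    return None


# One lookup returns the post together with its tags and comments.
post = find_one(posts, title="Exploring NoSQL databases")
print(json.dumps(post, indent=2))
```

In a relational design the tags and comments would typically live in separate tables and be joined at read time; here they travel with the post.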

A JavaScript-like language is used to query some NoSQL databases, MongoDB among them. If you know JavaScript, the syntax is easy to pick up.

An interesting fact about JavaScript: you can now write a complete end-to-end product using JavaScript, with the web browser and/or Apache Cordova for mobile, a back-end API with Node.js, and a NoSQL database system.

What really sets NoSQL apart is the ability to distribute the data across multiple servers, known as data sharding. Imagine that you have a multi-terabyte SQL database with millions or billions of rows of data in each table. Querying this much data becomes very expensive and tricky on an RDBMS. You would need to have the right team with the right skillset and a big server or cluster of servers.

With a NoSQL solution, this is much easier. You would need to buy a bunch of commodity servers with lots of memory and do an initial configuration to shard your data. Your millions or billions of rows of data are then distributed across your physical commodity servers. For example, let’s say that you have 6 servers/virtual machines and 6 billion rows in total.
Once your data is sharded, each server will contain roughly 1 billion rows of data that it can query. There are no changes to the application: it still sees all 6 billion rows of data thanks to the underlying architecture of the NoSQL platform. If this were a single RDBMS server, it would need to keep track of all 6 billion rows of data.
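The six-server example can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any specific product routes data: the `ShardedCollection` class and `shard_for` function are hypothetical, each "server" is just a dictionary, and rows are placed by hashing the key.

```python
import hashlib

NUM_SHARDS = 6  # one per server in the example


def shard_for(key):
    """Pick a shard from a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


class ShardedCollection:
    """Looks like one collection to the application, stores rows on many shards."""

    def __init__(self):
        self.shards = [dict() for _ in range(NUM_SHARDS)]

    def insert(self, key, doc):
        self.shards[shard_for(key)][key] = doc

    def get(self, key):
        # A lookup by key touches only the shard that owns the key.
        return self.shards[shard_for(key)].get(key)

    def find(self, predicate):
        # Other queries are sent to every shard and the results are merged.
        return [
            doc
            for shard in self.shards
            for doc in shard.values()
            if predicate(doc)
        ]

    def count(self):
        return sum(len(shard) for shard in self.shards)


coll = ShardedCollection()
for n in range(6000):  # 6,000 rows standing in for 6 billion
    coll.insert(f"row-{n}", {"n": n})

rows_per_shard = [len(shard) for shard in coll.shards]
```

With a hash-based placement the rows end up spread roughly, not exactly, evenly, and the calling code only ever talks to `coll`, which mirrors the point that the application does not change.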

A NoSQL system makes it much easier to scale horizontally when dealing with large data, and it offers a quick start because the relational model does not need to be defined up front (the application logic defines the database model).

Some of the popular NoSQL database servers are MongoDB and Couchbase.

We hope you have found this week’s edition of "To The Point" by Fetbi Irsat to be helpful and informative. Look out for next week’s instalment as we continue to explore unique topics from business to the latest technology.

We want to hear your point! If you have any ideas, suggestions or any questions about our weekly blog, please contact us at: info@pointalliance.com.

Warm regards,

Point Alliance Team
