NoSQL data stores initially rallied around the notion of “Just Say No to SQL,”
a reaction to the perceived limitations of SQL-based relational database
management systems (RDBMSs). It’s not that these people hated SQL, but they
were tired of forcing square pegs into round holes by solving problems that
relational databases weren’t actually designed for. A relational database is a
very powerful tool, but for several kinds of data (for example, key-value pairs
or graphs) and a few usage patterns (like extremely large-scale storage), a
relational database just isn’t practical. And when it comes to high-volume
storage, relational databases can be very costly, both in terms of database
license costs and hardware costs. (Relational databases are engineered to work
with enterprise-grade hardware.) So, with the NoSQL movement, innovative developers
and programmers built dozens of solutions for distinctive types of thorny
data storage and processing problems. These NoSQL databases typically
provide massive scalability by way of clustering, and are often
architected to enable high throughput and low latency.
The NoSQL offerings currently available can be broken down into four
categories, on the basis of their design and purpose:
Key-value stores:
These kinds of data stores provide a mechanism to store any kind of data without having to use a schema. In contrast, in a relational database we need to define the schema (the table structure) before inserting any data into it. Because key-value stores don’t need a schema, they offer great flexibility to store data in many formats. In a key-value store, a row (or record) simply consists of a key (an identifier) and a value, which can be anything from an integer to a large binary data string. Several implementations of key-value stores are based on Amazon’s Dynamo paper. Redis and Riak are widely used key-value data stores.
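To make the key-and-value idea concrete, here is a minimal in-memory sketch. The class name `TinyKVStore` and its `put`/`get`/`delete` methods are hypothetical names chosen for illustration; they are not the API of Redis, Riak, or any Dynamo-style system, and the sketch ignores persistence and clustering entirely.

```python
from typing import Any, Dict, Optional


class TinyKVStore:
    """Hypothetical in-memory key-value store (illustration only)."""

    def __init__(self) -> None:
        # No schema: any key can map to any kind of value.
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        """Store or overwrite the value for a key."""
        self._data[key] = value

    def get(self, key: str, default: Optional[Any] = None) -> Any:
        """Return the value for a key, or `default` if the key is absent."""
        return self._data.get(key, default)

    def delete(self, key: str) -> bool:
        """Remove a key; return True if it existed."""
        if key in self._data:
            del self._data[key]
            return True
        return False


store = TinyKVStore()
store.put("visits:home", 42)                    # an integer value
store.put("avatar:alice", b"\x89PNG\r\n")       # a binary value
store.put("session:abc", {"user": "alice"})     # a structured value
```

Note that the store never inspects the values: the same instance holds an integer, a byte string, and a dictionary without any table definition.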
Column family stores:
Here we have databases in which columns are grouped into column families and stored together on disk. Strictly speaking, many of these databases aren’t column-oriented, since they’re based on Google’s BigTable paper, which stores data as a multidimensional sorted map. Examples include Cassandra and HBase.
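The "multidimensional sorted map" idea can be sketched as a map keyed by (row key, column family, column qualifier, timestamp). The class `TinyColumnFamilyTable` and its methods are hypothetical and only illustrate the data model; real systems add on-disk storage, compaction, and distribution that this sketch leaves out.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# (row_key, column_family, qualifier, timestamp)
CellKey = Tuple[str, str, str, int]


class TinyColumnFamilyTable:
    """Hypothetical BigTable-style table (illustration only)."""

    def __init__(self, families: Iterable[str]) -> None:
        # Column families are fixed up front; qualifiers inside them are free-form.
        self._families = set(families)
        self._cells: Dict[CellKey, bytes] = {}

    def put(self, row: str, family: str, qualifier: str,
            value: bytes, timestamp: int) -> None:
        if family not in self._families:
            raise KeyError(f"unknown column family: {family}")
        self._cells[(row, family, qualifier, timestamp)] = value

    def get_latest(self, row: str, family: str, qualifier: str) -> Optional[bytes]:
        """Return the value with the highest timestamp for one cell."""
        versions = [(key[3], value) for key, value in self._cells.items()
                    if key[:3] == (row, family, qualifier)]
        if not versions:
            return None
        return max(versions, key=lambda pair: pair[0])[1]

    def scan(self, start_row: str, end_row: str) -> List[Tuple[CellKey, bytes]]:
        """Return cells with start_row <= row < end_row, in sorted key order."""
        return [(key, self._cells[key]) for key in sorted(self._cells)
                if start_row <= key[0] < end_row]


table = TinyColumnFamilyTable(["info"])
table.put("row2", "info", "name", b"Bob", 1)
table.put("row1", "info", "name", b"Alice", 1)
table.put("row1", "info", "name", b"Alice B.", 2)
```

Because keys are kept in sorted order, a range scan over row keys is the natural read pattern, and older versions of a cell stay addressable by timestamp.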
Document stores:
These kinds of data stores rely on collections of similarly encoded and formatted documents to improve efficiency. Document stores allow individual documents in a collection to include only a subset of fields, so only the data that’s required is stored. For sparse data sets, in which many fields are often not populated, this can translate into significant space savings. In contrast, empty columns in relational database (RDBMS) tables do take up space. Document stores also provide schema flexibility, since only the fields that are required are stored, and new fields can be added at any time. Again, in contrast, relational databases require table structures and schemas to be defined up front before data is stored, and changing columns is a messy task that impacts the entire data set. JSON is a very popular format for document-based data stores and is widely used in MongoDB, a document-based NoSQL database.
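The sparse-field and schema-flexibility points can be illustrated with plain Python dictionaries serialized as JSON. The field names, the `FIXED_COLUMNS` list, and the `find` helper are all hypothetical; this is a sketch of the idea, not MongoDB's API.

```python
import json
from typing import Any, Dict, List

# Hypothetical fixed schema, as a relational table would declare it up front.
FIXED_COLUMNS = ["name", "email", "phone", "fax", "twitter"]

# A collection of documents: each one stores only the fields it actually has.
documents: List[Dict[str, Any]] = [
    {"name": "Alice", "email": "alice@example.com"},
    {"name": "Bob", "phone": "555-0100", "twitter": "@bob"},
]


def as_fixed_row(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Project a document onto the fixed schema; absent fields become None."""
    return {column: doc.get(column) for column in FIXED_COLUMNS}


def find(collection: List[Dict[str, Any]], **criteria: Any) -> List[Dict[str, Any]]:
    """Return documents whose fields equal all the given criteria."""
    return [doc for doc in collection
            if all(doc.get(field) == value for field, value in criteria.items())]


# A new field appears without any schema change to the existing documents.
documents.append({"name": "Carol", "languages": ["en", "fr"]})

# Documents round-trip through JSON unchanged.
encoded = json.dumps(documents)
```

Alice's document carries two fields, while the equivalent fixed-schema row carries five, three of them empty. Adding `languages` for Carol touches no other document.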
Graph stores:
Here we have databases that store graph structures: representations that show collections of objects (vertices or nodes) and their relationships (edges) with each other. These structures make graph databases extremely well suited for storing complex structures, such as the linking relationships between all known web pages. (For example, individual web pages act as nodes, and the edges connecting them act as links from one page to another.) Google, of course, is all over graph technology, and implemented a graph processing engine known as Pregel to power its PageRank algorithm. In the Hadoop ecosystem, there’s an Apache project called Giraph (based on the Pregel paper), which is a graph processing engine designed to process graphs stored in HDFS. A well-known example of a graph-based data store is Neo4j.
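The web-page example above can be sketched as an adjacency list, with pages as nodes and links as directed edges. The page names and helper functions are hypothetical, and this is neither Pregel, Giraph, nor Neo4j; it only shows the kind of traversal question a graph store is built to answer.

```python
from collections import deque
from typing import Dict, List, Set

# Nodes are pages; each list holds the pages that a page links to (edges).
links: Dict[str, List[str]] = {
    "home.html": ["about.html", "blog.html"],
    "about.html": ["home.html"],
    "blog.html": ["post1.html"],
    "post1.html": [],
}


def reachable(graph: Dict[str, List[str]], start: str) -> Set[str]:
    """Breadth-first search: every page reachable from `start` by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def inbound_counts(graph: Dict[str, List[str]]) -> Dict[str, int]:
    """Count how many links point at each page."""
    counts = {page: 0 for page in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts
```

Calling `reachable(links, "about.html")` follows links from the about page through the home page and blog to the post, which is the sort of multi-hop query that becomes a chain of joins in a relational schema.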