NoSQL data stores were originally built around the notion of “Just Say No to
SQL,” a reaction to the perceived limitations of SQL-based relational
databases (RDBMS). It is not that these people hated SQL; they were tired of
forcing square pegs into round holes by solving problems that relational
databases weren’t actually designed for. A relational database is a very
powerful tool, but for several kinds of data (for example, key-value pairs or
graphs) and a few usage patterns (such as extremely large-scale storage), a
relational database just isn’t practical. And when it comes to high-volume
storage, a relational database can be very costly, in terms of both database
license costs and hardware costs. (Relational databases are engineered to work
with enterprise-grade hardware.) So, with the NoSQL movement, innovative
developers and programmers built dozens of solutions for distinct types of
thorny data storage and processing problems. These NoSQL databases typically
provide massive scalability by way of clustering, and are often architected to
enable high throughput and low latency.
The NoSQL databases currently available can be broken down into four
categories, based on their design and purpose:
Key-value stores:
These kinds of data stores provide a mechanism to store any kind of data
without having to use a schema. In contrast, in a relational database we need
to define the schema (the table structure) before inserting any data. Because
key-value stores don’t need a schema, they offer great flexibility to store
data in many formats. In a key-value store, a row (or record) simply consists
of a key (an identifier) and a value, which can be anything from an integer to
a large binary data string. Several implementations of key-value stores are
based on Amazon’s Dynamo paper. Redis and Riak are widely popular key-value
data stores.
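
To make the key/value idea concrete, here is a minimal in-memory sketch in
Python. The class name KeyValueStore and its put/get/delete methods are
hypothetical names chosen for illustration; they are not the API of Redis,
Riak, or any other product. The sketch only shows that values of very
different shapes can live side by side without a schema.

```python
class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self):
        # A plain dict maps each key (the identifier) to an arbitrary value.
        self._data = {}

    def put(self, key, value):
        # No schema check: any value type is accepted as-is.
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        # Deleting a missing key is silently ignored.
        self._data.pop(key, None)


store = KeyValueStore()
store.put("user:42:visits", 17)                 # an integer
store.put("user:42:avatar", b"\x89PNG\r\n")     # raw binary data
store.put("user:42:profile", {"name": "Ada"})   # a nested structure
```

The store never inspects the value, which is why lookups are only possible by
key. Real key-value stores add persistence, replication, and partitioning on
top of this basic contract.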
Column family stores:
Here we have databases in which columns are grouped into column families and
stored together on disk. Strictly speaking, many of these databases aren’t
column-oriented, since they’re based on Google’s BigTable paper, which stores
data as a multidimensional sorted map. Examples include Cassandra and HBase.
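
The “multidimensional sorted map” idea can be sketched as a map from
(row key, column family, column qualifier, timestamp) to a value. The Python
below is a simplified illustration under that assumption; the class
ColumnFamilyTable and its methods are hypothetical and do not mirror the API
of BigTable, Cassandra, or any other system.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy model of a BigTable-style sorted map (hypothetical names)."""

    def __init__(self, families):
        # Column families are declared up front; qualifiers inside them are not.
        self.families = set(families)
        # row key -> family -> qualifier -> list of (timestamp, value)
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, family, qualifier, value, timestamp):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        versions = self._rows[row][family].setdefault(qualifier, [])
        versions.append((timestamp, value))
        # Keep the newest version first.
        versions.sort(key=lambda tv: tv[0], reverse=True)

    def get(self, row, family, qualifier):
        # Return the most recent value, or None if the cell does not exist.
        versions = self._rows.get(row, {}).get(family, {}).get(qualifier)
        return versions[0][1] if versions else None

    def scan(self, start_row, end_row):
        # Rows are visited in sorted key order, which makes range scans natural.
        for row in sorted(self._rows):
            if start_row <= row < end_row:
                yield row


table = ColumnFamilyTable(families=["contents", "anchor"])
table.put("com.example.www", "contents", "html", "<html>v1</html>", timestamp=1)
table.put("com.example.www", "contents", "html", "<html>v2</html>", timestamp=2)
table.put("com.example.blog", "anchor", "com.example.www", "Blog", timestamp=1)
```

Sorting here happens at read time for brevity; a real system keeps the data
sorted on disk so that a range scan over neighboring row keys is cheap.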
Document stores:
These kinds of data stores rely on collections of similarly encoded and
formatted documents to improve efficiency. Document stores allow individual
documents in a collection to include only a subset of fields, so only the data
that’s required is stored. For sparse data sets, in which many fields are often
not populated, this can translate into significant space savings. In contrast,
empty columns in relational database (RDBMS) tables do take up space. Document
stores also provide schema flexibility, since only the fields that are required
are stored, and new fields can be added at any time. Again, in contrast, table
structures and schemas in relational databases are defined up front before data
is stored, and changing columns is a messy task that impacts the entire data
set. JSON is a very popular format for document-based data stores and is widely
used in MongoDB, a document-based NoSQL database.
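
The following Python sketch illustrates the sparse-field and schema-flexibility
points using plain dictionaries serialized as JSON. The collection contents and
the find helper are made up for illustration; this is not MongoDB’s query API,
only a rough model of the idea.

```python
import json

# Each document stores only the fields it actually has.
collection = [
    {"_id": 1, "name": "Ada", "email": "ada@example.com"},
    {"_id": 2, "name": "Grace", "phone": "555-0100", "languages": ["COBOL"]},
]


def find(docs, **criteria):
    """Return documents whose fields match every given key/value pair."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


# A new field ("twitter") appears without altering any existing document.
collection.append({"_id": 3, "name": "Linus", "twitter": "@linus"})

# Documents are commonly exchanged and stored in a JSON-like encoding.
serialized = [json.dumps(doc) for doc in collection]
```

Because a missing field is simply absent rather than stored as an empty
column, the first document carries no "phone" or "twitter" entry at all.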
Graph databases:
Here we have databases that store graph structures: representations of
collections of objects (vertices or nodes) and their relationships (edges) with
each other. These structures make graph databases extremely well suited for
storing complex structures, such as the linking relationships between all known
web pages. (For example, individual web pages act as nodes, and the edges
connecting them represent links from one page to another.) Google, of course,
is all over graph technology, and implemented a graph processing engine known
as Pregel to power its PageRank algorithm. In the Hadoop ecosystem, there’s an
Apache project called Giraph (based on the Pregel paper), which is a graph
processing engine designed to process graphs stored in HDFS. A well-known
example of a graph-based data store is Neo4j.
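
As a rough illustration of the web-page example, the Python sketch below
models pages as nodes and links as directed edges in an adjacency list, then
runs a simplified textbook PageRank iteration over it. The page names, the
inbound helper, and the damping value of 0.85 are assumptions made for this
example; this is not how Pregel, Giraph, or Neo4j are implemented.

```python
# Adjacency list: each page (node) maps to the pages it links to (edges).
web = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
}


def inbound(links, page):
    """Return the pages that link to the given page, in sorted order."""
    return sorted(src for src, targets in links.items() if page in targets)


def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank along edges."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every page starts each round with the baseline "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


ranks = pagerank(web)
```

In this tiny graph "home" receives links from both other pages, so it ends up
with the highest rank. Graph engines distribute this kind of per-node
computation across many machines instead of looping in a single process.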