Feature Image Querying a NoSQL database_ A 101 Guide (1)

Querying a NoSQL Database:
A 101 Guide

By Laurent Mauer · November 9, 2022 · 6 min read

NoSQL is termed non-SQL. We won’t use SQL to write queries in No SQL. It is not relational but it gives us an organized way of storing data. The data is stored in the form of documents rather than tabular form. The best example for NoSql is Mongo DB. In SQL we will use the term key-value pairs but in Mongo DB we will use field-value pairs. Documents are stored and the group of documents is called “Collection”. The document will be in JSON format. The data is called a “Document” and the collection of documents is called a “Collection”.

NoSQL Databases are mainly categorized into four types: Key-value pair, Column-oriented, Graph-based and Document-oriented.

Every category has its unique attributes and limitations. None of the above-specified database is better to solve all the problems. Users should select the database based on their product needs.

 

Types of NoSQL Database

1. Column-based

Column-oriented databases work on columns and are based on BigTable paper by Google. Every column is treated separately. Values of single column databases are stored contiguously.

2. Graph-Based

A graph type database stores entities as well the relations amongst those entities. The entity is stored as a node with the relationship as edges. An edge gives a relationship between nodes. Every node and edge has a unique identifier.

3. Document-Oriented:

Document-Oriented NoSQL DB stores and retrieves data as a key value pair but the value part is stored as a document. The document is stored in JSON or XML formats. The value is understood by the DB and can be queried.

4. Key Value Pair Based

Data is stored in key/value pairs. It is designed in such a way to handle lots of data and heavy load.

Key-value pair storage databases store data as a hash table where each key is unique, and the value can be a JSON, BLOB(Binary Large Objects), string, etc.

How to query NoSQL database?

  1. Install the MongoDB installer.
  2. Specify a custom directory for MongoDB.
  3. After installation runs the MongoDB daemon.
  4. Connect to Mongo shell.
  5. You can start to code.

Let’s Take an example of a document in the customer collection:

{

“_id” : ObjectId(“600c1806289947de938c68ea”),

“name” : “John”,

“age” : 32,

“gender” : “male”,

“amount” : 32

}

This document is in JSON format.

Getting Started with Querying:

Example 1

Query documents that belong to a specific customer.

“Find” method is used to query documents from a MongoDB database.

We want to see the document belongs to customer John so the name field needs to be specified in the find method.

> db.customer.find( {name: “John”} ){ “_id” : 

ObjectId(“600c1806289947de938c68ea”), “name” : “John”, “age” : 32, “gender” : “male”, “amount” : 32 }

Use pretty to display in readable format.

Query

				
					> db.customer.find( {name: "John"} ).pretty()

				
			

Output: 

				
					{
 "_id" : ObjectId("600c1806289947de938c68ea"),
 "name" : "John",
 "age" : 32,
 "gender" : "male",
 "amount" : 32
}

				
			

Example 2

Query documents that belong to customers older than 40.

The condition is applied to age field using a logical operator. The “$gt” stands for “greater than” and is used as follows.

Query: 

				
					> db.customer.find( {age: {$gt:40}} ).pretty()

				
			

Output: 

				
					{
 "_id" : ObjectId("600c19d2289947de938c68ee"),
 "name" : "Jenny",
 "age" : 42,
 "gender" : "female",
 "amount" : 36
}

				
			

Example 3

Query documents that belong to female customers who are younger than 25.

This example is like a combination of the previous two examples. Both conditions must be met so we use “and” logic to combine the conditions. It can be done by writing both conditions separated by comma.

Query: 

				
					> db.customer.find( {gender: "female", age: {$lt:25}} ).pretty()

				
			

Output: 

				
					{
 "_id" : ObjectId("600c19d2289947de938c68f0"),
 "name" : "Samantha",
 "age" : 21,
 "gender" : "female",
 "amount" : 41
}{
 "_id" : ObjectId("600c19d2289947de938c68f1"),
 "name" : "Laura",
 "age" : 24,
 "gender" : "female",
 "amount" : 51
}

				
			

The “$lt” stands for “less than”.

Example 4

In this example, we will repeat the previous example in a different way. Multiple conditions can also be combined with “and” logic as below.

Query: 

				
					> db.customer.find( {$and :[ {gender: "female", age: {$lt:25}} ]} ).pretty()


				
			

The logic used for combining the conditions is indicated at the beginning. The remaining part is same as the previous example but we need to put the conditions in a list ( [ ] ).

Example 5

Query customers who are either male or younger than 25.

This example requires a compound query with “or” logic. We just need to change “$and” to “$or”.

Query:

				
					> db.customer.find( { $or: [ {gender: "male"}, {age: {$lt: 22}} ] })

				
			

Output: 

				
					{ "_id" : ObjectId("600c1806289947de938c68ea"), "name" : "John", "age" : 32, "gender" : "male", "amount" : 32 }{ "_id" : ObjectId("600c19d2289947de938c68ed"), "name" : "Martin", "age" : 28, "gender" : "male", "amount" : 49 }{ "_id" : ObjectId("600c19d2289947de938c68ef"), "name" : "Mike", "age" : 29, "gender" : "male", "amount" : 22 }{ "_id" : ObjectId("600c19d2289947de938c68f0"), "name" : "Samantha", "age" : 21, "gender" : "female", "amount" : 41 }

				
			

Example 6

MongoDB allows for aggregating values while retrieving from the database. For instance, we can calculate the total purchase amount for males and females. The aggregate method is used instead of the find method.

Query:

				
					> db.customer.aggregate([
... { $group: {_id: "$gender", total: {$sum: "$amount"} } }
... ]

				
			

Output: 

				
					{ "_id" : "female", "total" : 198 }
{ "_id" : "male", "total" : 103 }

				
			

Let’s elaborate on the syntax. We first group the documents by the gender column by selecting “$gender” as id. The next part specifies both the aggregation function which is “$sum” in our case and the column to be aggregated.

If you are familiar with Pandas, the syntax is quite similar to the groupby function.

Example 7

Let’s take the previous example one step further and add a condition. Thus, we first select documents that “match” a condition and apply aggregation.

The following query is an aggregation pipeline which first selects the customers who are older than 25 and calculates the average purchase amount for males and females.

Query:

				
					> db.customer.aggregate([
... { $match: { age: {$gt:25} } },
... { $group: { _id: "$gender", avg: {$avg: "$amount"} } }
... ])

				
			

Output: 

				
					{ "_id" : "female", "avg" : 35.33 }
{ "_id" : "male", "avg" : 34.33 }

				
			

These are basic queries in NoSql.

Conclusion

At RestApp, we’re building a Data Activation Platform for modern data teams with our large built-in library of connectors to databases, including MongoDB, data warehouses and business apps.

We have designed our next-gen data modeling editor to be intuitive and easy to use.

If you’re interested in starting with connecting all your favorite tools, check out the RestApp website or try it for free with a sample dataset.

Discover the next-gen end-to-end data pipeline platform with our built-in No Code SQL, Python and NoSQL functions. Data modeling has never been easier and safer thanks to the No Code revolution, so you can simply create your data pipelines with drag-and-drop functions and stop wasting your time by coding what can now be done in minutes! 

Play Video about Analytics Engineers - Data Pipeline Feature - #1

Discover Data modeling without code with our 14-day free trial!

Category

Share

Subscribe to our newsletter

Laurent Mauer
Laurent Mauer
Laurent is the head of engineer at RestApp. He is a multi-disciplinary engineer with experience across many industries, technologies and responsibilities. Laurent is at the heart of our data platform.

Related articles

Build better data pipelines

With RestApp, be your team’s data hero by activating insights from raw data sources.