MongoDB

What is MongoDB?
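- MongoDB is a source-available, document-oriented NoSQL database that stores data as flexible, JSON-like documents, encoded as BSON (Binary JSON).
- It is schema-less by default: documents in the same collection need not share a structure, so data models can change as the application evolves.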
- Uses collections instead of tables, and documents instead of rows.
- Scales horizontally through sharding and provides redundancy and failover through replication (replica sets).
- Ideal for big data, real-time applications, and rapidly changing data.
Tip: MongoDB = flexible, document-based, schema-less database for scalable applications.
Difference between SQL and NoSQL databases.
- SQL Databases:
  - Relational, store data in tables with rows and columns.
  - Use Structured Query Language (SQL) for queries.
  - Schema-based, require predefined structure.
  - Typically scaled vertically (by upgrading hardware); horizontal scaling is possible but usually harder.
  - Ideal for complex queries (such as joins) and ACID-compliant transactions.
- NoSQL Databases:
  - Non-relational, store data as documents, key-value pairs, graphs, or columns.
  - Queried through database-specific query languages or APIs rather than one standard language.
  - Schema-less or schema-flexible structure.
  - Horizontally scalable (scale by adding servers).
  - Ideal for big data, real-time apps, and unstructured data.
Tip: SQL = structured and relational, NoSQL = flexible, scalable, non-relational.
Why is MongoDB schema-less?
- MongoDB is schema-less because each document in a collection can have a different structure.
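- By default the database enforces no fixed schema, so fields can be added, removed, or changed per document at any time. Optional validation rules (for example, $jsonSchema) can be attached to a collection when stricter structure is needed.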
- This allows flexibility for applications where data evolves over time or is unstructured or semi-structured.
- Developers can store complex nested data without redesigning the schema.
Tip: Schema-less = flexible document structure, no fixed table design needed.
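A rough illustration of what "different structure per document" means in practice. This sketch does not talk to MongoDB: plain Python dicts stand in for documents and a list stands in for a collection. The sample data and the `find` helper are made up for this example and only mimic a simple equality query.

```python
# Plain dicts stand in for documents; a list stands in for a collection.
# All names and values here are hypothetical sample data.
users = [
    {"_id": 1, "name": "John Doe", "age": 30},
    {"_id": 2, "name": "Jane Roe", "email": "jane@example.com"},
    {"_id": 3, "name": "Sam Poe", "address": {"city": "Pune"}, "tags": ["admin"]},
]


def find(collection, **criteria):
    """Return documents whose top-level fields equal every given criterion."""
    return [
        doc
        for doc in collection
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


# Documents that lack a field are simply skipped; nothing has to be migrated.
with_email = [doc["_id"] for doc in users if "email" in doc]
thirty_year_olds = find(users, age=30)

print(with_email)
print(thirty_year_olds)
```

The trade-off is that the application, not the database, has to cope with missing or differently shaped fields.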
What is BSON?
- BSON (Binary JSON) is the binary representation of JSON-like documents used by MongoDB to store data.
- Supports more data types than JSON, including Date, Binary, and ObjectId.
- Fields are typed and length-prefixed, which enables fast encoding, decoding, and traversal for storage and network transfer.
- Allows MongoDB to efficiently store and query complex, hierarchical data.
Tip: BSON = binary format of JSON used by MongoDB for efficient storage and richer data types.
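To make the binary layout concrete, here is a minimal sketch that encodes a small document by hand, following the published BSON layout: a 4-byte little-endian total length, a sequence of typed elements, and a trailing null byte. Only strings and 32-bit integers are handled. The function names are hypothetical, and real applications would use a driver's BSON library instead.

```python
import struct


def encode_element(key, value):
    """Encode one field as: type byte, null-terminated name, then the value."""
    name = key.encode("utf-8") + b"\x00"
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # Type 0x02 (string): int32 byte length (including the null), then bytes.
        return b"\x02" + name + struct.pack("<i", len(data)) + data
    if isinstance(value, int) and not isinstance(value, bool):
        # Type 0x10 (int32): raises struct.error if the value does not fit.
        return b"\x10" + name + struct.pack("<i", value)
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def encode_document(doc):
    """Encode a flat dict as a BSON document (strings and int32 only)."""
    body = b"".join(encode_element(k, v) for k, v in doc.items()) + b"\x00"
    # The length prefix counts itself (4 bytes) plus the body.
    return struct.pack("<i", len(body) + 4) + body


encoded = encode_document({"name": "John Doe", "age": 30})
print(len(encoded), encoded.hex())
```

Because every string carries its length and every document carries its total size, a reader can skip over values without scanning them character by character.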
Difference between JSON and BSON.
- JSON (JavaScript Object Notation):
  - Text-based, human-readable format for storing and exchanging data.
  - Supports a small set of types: strings, numbers, booleans, null, arrays, and objects.
  - Must be parsed as text, which makes it slower for database storage and scanning.
- BSON (Binary JSON):
  - Binary-encoded format used by MongoDB for storage and network transfer.
  - Supports additional data types such as Date, Binary, ObjectId, and distinct 32-bit and 64-bit integers.
  - Faster to traverse and parse and optimized for database operations, though not always smaller than the equivalent JSON.
Tip: JSON = readable text format, BSON = binary, efficient, supports more types.
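The "more types" point can be seen with dates. JSON has no date type, so Python's standard `json` module refuses to serialize a `datetime`, while BSON defines a dedicated UTC datetime element (type byte 0x09, a 64-bit count of milliseconds since the Unix epoch). The sketch below builds that single element by hand; the field name `created` and the sample date are assumptions for illustration.

```python
import json
import struct
from datetime import datetime, timezone

created = datetime(2022, 4, 4, 16, 30, tzinfo=timezone.utc)

# JSON: no native date type, so the standard encoder raises TypeError.
try:
    json.dumps({"created": created})
    json_accepts_datetime = True
except TypeError:
    json_accepts_datetime = False

# BSON: type 0x09, null-terminated field name, int64 milliseconds since epoch.
millis = int(created.timestamp() * 1000)
bson_date_element = b"\x09" + b"created\x00" + struct.pack("<q", millis)

print(json_accepts_datetime, bson_date_element.hex())
```

With JSON, the application has to agree on a string convention for dates; with BSON, the type travels with the value.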
What is a collection?
- A collection in MongoDB is a group of related documents, similar to a table in SQL.
- Collections do not enforce a schema by default, so documents can have different structures.
- Collections are stored within a database.
- Examples: users, orders, products.
Tip: Collection = container for documents, like a table in relational databases.
What is a document?
- A document in MongoDB is a single record stored in a collection.
- It is a JSON-like object, stored as BSON, made up of field-value pairs.
- Documents can store nested data and arrays, allowing flexible and complex structures.
- Example:
```
{
  "_id": ObjectId("624b1f2e7c1a4f5a6d3e8b9c"),
  "name": "John Doe",
  "age": 30,
  "email": "john@example.com"
}
```
Tip: Document = a flexible, JSON-like record inside a MongoDB collection.
What is ObjectId?
- ObjectId is the default unique identifier for a MongoDB document, stored in the _id field.
- It is a 12-byte BSON value containing:
  - 4 bytes: timestamp
  - 5 bytes: random value generated once per process
  - 3 bytes: incrementing counter, initialized to a random value
- Makes collisions extremely unlikely across distributed systems without a central authority.
- Automatically generated if you don’t provide an _id when inserting a document.
Tip: ObjectId = unique ID for each document, generated automatically by MongoDB.
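The 12-byte layout above can be checked by splitting an ObjectId's hex string into its three parts. This is a standard-library sketch, not a driver API; the helper name `split_object_id` is hypothetical, and the input is the sample `_id` from the document example earlier in these notes.

```python
from datetime import datetime, timezone


def split_object_id(hex_string):
    """Split a 24-character ObjectId hex string into its three components."""
    raw = bytes.fromhex(hex_string)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes (24 hex characters)")
    seconds = int.from_bytes(raw[0:4], "big")  # big-endian Unix timestamp
    return {
        "created": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "random": raw[4:9].hex(),                       # 5-byte per-process value
        "counter": int.from_bytes(raw[9:12], "big"),    # 3-byte counter
    }


parts = split_object_id("624b1f2e7c1a4f5a6d3e8b9c")
print(parts["created"].isoformat(), parts["random"], parts["counter"])
```

Since the timestamp occupies the leading bytes, comparing ObjectIds byte by byte orders them by creation second first.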
How is ObjectId generated?
- ObjectId in MongoDB is a 12-byte unique identifier automatically generated for each document.
- The 12 bytes are composed of:
  1. 4 bytes: Timestamp of creation (seconds since the Unix epoch)
  2. 5 bytes: Random value generated once per machine and process
  3. 3 bytes: Incrementing counter, initialized to a random value and advanced for each ObjectId generated by that process
- This combination makes ObjectIds practically unique across machines and lets you sort documents roughly by creation time (to one-second resolution).
Tip: ObjectId = timestamp + per-process random value + counter → practically unique ID.
Advantages of using MongoDB.
- Schema-less: Flexible document structure allows easy changes to data without altering the database schema.
- Scalability: Supports horizontal scaling via sharding for handling large datasets.
- High Performance: Fast reads and writes through indexes and an in-memory cache of frequently accessed data (the working set).
- Rich Data Model: Supports nested documents and arrays, ideal for complex data.
- Replication & High Availability: Replica sets provide automatic failover and redundancy.
- Developer Productivity: Easy to use with JSON-like documents, aligning well with modern application development.
- Aggregation Framework: Powerful tools for data analysis and transformations.
Tip: MongoDB = flexible, scalable, high-performance, and developer-friendly NoSQL database.
Limitations of MongoDB.
What is indexing and why is it important?
Types of indexes in MongoDB.
What is a compound index?
What is a unique index?
Explain normalization vs denormalization.
What is embedding?
What is referencing?
When should you use embedding vs referencing?
What is a TTL index?