MongoDB Optimization

When queries get too complex or too slow, it is often a sign that something about your data model does not support your use case.
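For example, a quick first check on a slow query is to look at its execution plan. The sketch below is a minimal illustration using PyMongo, assuming a local deployment and a hypothetical orders collection; a COLLSCAN in the winning plan usually means the query is not supported by an index:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
orders = client.shop.orders  # hypothetical database/collection

# explain() returns the planner's view of the query; a "COLLSCAN" stage
# means a full collection scan, i.e. no index supports this predicate.
plan = orders.find({"customer_id": 42}).explain()
print(plan["queryPlanner"]["winningPlan"])

# If it is a COLLSCAN, an index on the queried field is the usual first fix:
orders.create_index("customer_id")
```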
 
⚠️ This is pretty disorganized, and will not be organized in the immediate future.
 
 

Related Documents & Notes

  • 💽 MongoDB Official Documentation
  • 📦 Schema Design
  • 🗂️ MongoDB Indexes
  • 📚 Choosing a shard key for MongoDB
  • 🏎️ Optimize for write performance with Cosmos DB using the MongoDB API
  • ♻️ Read Concern - Write Concern - Read Preference - What does it mean?
 
 

Guarantees associated with consistency levels

Strong consistency

Strong consistency offers a linearizability guarantee: concurrent operations behave as if they executed one at a time, in a single global order. Reads are guaranteed to return the most recent committed version of an item, and a client never sees an uncommitted or partial write.
The following graphic illustrates strong consistency with musical notes. After the data is written to the "West US 2" region, reads from other regions return the most recent value:
[Image: strong consistency illustrated with musical notes]
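MongoDB's closest analogue, sketched below under the assumption of a replica set (collection and field names are hypothetical), is the "linearizable" read concern paired with a "majority" write concern:

```python
from pymongo import MongoClient
from pymongo.read_concern import ReadConcern
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
accounts = client.bank.get_collection(
    "accounts",
    write_concern=WriteConcern("majority"),   # write is durable on a majority
    read_concern=ReadConcern("linearizable"), # read reflects all acknowledged writes
)

accounts.update_one({"_id": 1}, {"$inc": {"balance": 100}}, upsert=True)
doc = accounts.find_one({"_id": 1})  # never returns an uncommitted or stale value
```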

Bounded staleness consistency

Description
In bounded staleness consistency, the reads are guaranteed to honor the consistent-prefix guarantee. The reads might lag behind writes by at most "K" versions (that is, "updates") of an item or by "T" time interval, whichever is reached first. In other words, when you choose bounded staleness, the "staleness" can be configured in two ways:
  • The number of versions (K) of the item
  • The time interval (T) reads might lag behind the writes
For a single-region account, the minimum value of K and T is 10 write operations or 5 seconds. For multi-region accounts, the minimum value of K and T is 100,000 write operations or 300 seconds.
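The "whichever is reached first" rule can be pictured with a small illustrative sketch (not an SDK API; the default bounds shown are the multi-region minimums quoted above):

```python
def within_staleness_window(version_lag: int, seconds_lag: float,
                            k_versions: int = 100_000,
                            t_seconds: float = 300.0) -> bool:
    """A read satisfies bounded staleness only while BOTH bounds hold;
    the first bound reached (K versions or T seconds) is the one that
    triggers throttling so replication can catch up."""
    return version_lag < k_versions and seconds_lag < t_seconds

within_staleness_window(50_000, 120.0)   # True: inside both bounds
within_staleness_window(100_000, 120.0)  # False: K reached first
```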
Bounded staleness offers a total global order outside of the "staleness window." When a client performs read operations within a region that accepts writes, the guarantees provided by bounded staleness consistency are identical to those provided by strong consistency. As the staleness window is approached for either time or updates, whichever is closer, the service throttles new writes to allow replication to catch up and honor the consistency guarantee.
Inside the staleness window, Bounded Staleness provides the following consistency guarantees:
  • Consistency for clients in the same region for an account with single write region = Strong
  • Consistency for clients in different regions for an account with single write region = Consistent Prefix
  • Consistency for clients writing to a single region for an account with multiple write regions = Consistent Prefix
  • Consistency for clients writing to different regions for an account with multiple write regions = Eventual
Bounded staleness is frequently chosen by globally distributed applications that expect low write latencies but require a total global order guarantee. It is a good fit for applications featuring group collaboration and sharing, stock tickers, and publish-subscribe/queueing. The following graphic illustrates bounded staleness consistency with musical notes. After the data is written to the "West US 2" region, the "East US 2" and "Australia East" regions read the written value based on the configured maximum lag time or maximum number of operations:
[Image: bounded staleness consistency illustrated with musical notes]

Session consistency

In session consistency, within a single client session, reads are guaranteed to honor the consistent-prefix, monotonic reads, monotonic writes, read-your-writes, and write-follows-reads guarantees. This assumes a single "writer" session, or sharing of the session token among multiple writers.
Clients outside of the session performing writes will see the following guarantees:
  • Consistency for clients in the same region for an account with single write region = Consistent Prefix
  • Consistency for clients in different regions for an account with single write region = Consistent Prefix
  • Consistency for clients writing to a single region for an account with multiple write regions = Consistent Prefix
  • Consistency for clients writing to multiple regions for an account with multiple write regions = Eventual
  • Consistency for clients using the Azure Cosmos DB integrated cache = Eventual
Session consistency is the most widely used consistency level for both single-region and globally distributed applications. It provides write latencies, availability, and read throughput comparable to those of eventual consistency, while also providing the consistency guarantees needed by applications written to operate in the context of a user. The following graphic illustrates session consistency with musical notes. The "West US 2 writer" and the "West US 2 reader" use the same session (Session A), so they both read the same data at the same time, whereas the "Australia East" region uses "Session B," so it receives data later, but in the same order as the writes.
[Image: session consistency illustrated with musical notes]
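In MongoDB drivers, the closest analogue is a causally consistent session, which provides read-your-writes and monotonic reads within the session. A hedged sketch with PyMongo (the replica set and all names are assumptions):

```python
from pymongo import MongoClient, ReadPreference

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
profiles = client.app.profiles  # hypothetical collection

with client.start_session(causal_consistency=True) as session:
    profiles.update_one({"_id": "user-1"}, {"$set": {"theme": "dark"}},
                        upsert=True, session=session)
    # A read in the same session observes the write above, even when it is
    # routed to a (possibly lagging) secondary.
    doc = profiles.with_options(
        read_preference=ReadPreference.SECONDARY_PREFERRED
    ).find_one({"_id": "user-1"}, session=session)
```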

Consistent prefix consistency

With the consistent prefix option, reads return some prefix of all updates, with no gaps. The consistent prefix level guarantees that reads never see out-of-order writes.
If writes were performed in the order A, B, C, then a client sees either A; A,B; or A,B,C, but never out-of-order permutations like A,C or B,A,C. Consistent prefix provides write latencies, availability, and read throughput comparable to those of eventual consistency, but also provides the ordering guarantees that suit scenarios where order is important.
Below are the consistency guarantees for Consistent Prefix:
  • Consistency for clients in the same region for an account with single write region = Consistent Prefix
  • Consistency for clients in different regions for an account with single write region = Consistent Prefix
  • Consistency for clients writing to a single region for an account with multiple write regions = Consistent Prefix
  • Consistency for clients writing to multiple regions for an account with multiple write regions = Eventual
The following graphic illustrates consistent prefix consistency with musical notes. In all regions, reads never see out-of-order writes:
[Image: consistent prefix consistency illustrated with musical notes]
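The guarantee can be stated as a tiny illustrative check (not an SDK API): any sequence of reads must equal some prefix of the global write order.

```python
def is_consistent_prefix(observed: list, write_order: list) -> bool:
    # Reads may lag, but what is seen is always an unbroken prefix of the writes.
    return observed == write_order[:len(observed)]

writes = ["A", "B", "C"]
assert is_consistent_prefix(["A", "B"], writes)           # allowed
assert not is_consistent_prefix(["A", "C"], writes)       # gap: never seen
assert not is_consistent_prefix(["B", "A", "C"], writes)  # out of order: never seen
```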

Eventual consistency

In eventual consistency, there is no ordering guarantee for reads. In the absence of any further writes, the replicas eventually converge. Eventual consistency is the weakest form of consistency because a client may read values older than the ones it read before. It is ideal where the application does not require ordering guarantees, such as counts of retweets, likes, or non-threaded comments. The following graphic illustrates eventual consistency with musical notes.
[Image: eventual consistency illustrated with musical notes]
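In MongoDB-driver terms, a rough analogue (sketched below; the deployment and all names are assumptions) is reading with the "local" read concern from a secondary: low latency, possibly stale, and no ordering guarantee across reads.

```python
from pymongo import MongoClient, ReadPreference
from pymongo.read_concern import ReadConcern

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
likes = client.social.get_collection(
    "likes",
    read_preference=ReadPreference.SECONDARY_PREFERRED,  # any nearby replica
    read_concern=ReadConcern("local"),                   # no durability wait
)
count = likes.count_documents({"post_id": "p1"})  # may briefly lag the primary
```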
 

Multi-region Availability Concerns

Multi-region accounts behave differently during an outage depending on their configuration, as summarized below.

Read region outage
  • Availability impact: All clients automatically redirect reads to other regions. There is no read or write availability loss for any configuration, except for two-region accounts with strong consistency, which lose write availability until the service is restored or, if service-managed failover is enabled, the region is marked as failed and a failover occurs.
  • Durability impact: No data loss.
  • What to do: During the outage, ensure that there are enough provisioned RUs in the remaining regions to support read traffic. When the outage is over, re-adjust provisioned RUs as appropriate.

Write region outage (single write region)
  • Availability impact: Clients redirect reads to other regions. Without service-managed failover, clients experience write availability loss until write availability is restored automatically when the outage ends. With service-managed failover, clients experience write availability loss until the service fails over to a new write region selected according to your preferences.
  • Durability impact: If the strong consistency level is not selected, some data may not have been replicated to the remaining active regions; how much depends on the selected consistency level, as described in this section. If the affected region suffers permanent data loss, unreplicated data may be lost.
  • What to do: During the outage, ensure that there are enough provisioned RUs in the remaining regions to support read traffic. Do not trigger a manual failover during the outage, as it will not succeed. When the outage is over, re-adjust provisioned RUs as appropriate. Accounts using the SQL API may also recover non-replicated data in the failed region from the conflicts feed.

Any regional outage (multiple write regions)
  • Availability impact: Possibility of temporary write availability loss, analogous to a single write region with service-managed failover. A failover of the conflict-resolution region may also cause a loss of write availability if a high number of conflicting writes happen at the time of the outage.
  • Durability impact: Recently updated data in the failed region may be unavailable in the remaining active regions, depending on the selected consistency level. If the affected region suffers permanent data loss, unreplicated data may be lost.
  • What to do: During the outage, ensure that there are enough provisioned RUs in the remaining regions to support the additional traffic. When the outage is over, re-adjust provisioned RUs as appropriate. Where possible, Cosmos DB automatically recovers non-replicated data in the failed region using the configured conflict resolution method for SQL API accounts, and Last Write Wins for accounts using other APIs.
 

Brewer's CAP Theorem

Eric Brewer proposed a conjecture that for any shared data system, you can have at most two of consistency, availability, and partition tolerance. This assertion has since been proven, and Brewer's proposal is known as Brewer's CAP theorem, where CAP stands for Consistency, Availability, and Partition tolerance. Partition tolerance means that the system continues to work unless there is a total network failure; the inaccessibility of a few nodes does not impact the system. Let's examine each aspect of CAP.

Consistency

Consistency in this discussion means that everyone sees the same view of the data if it is replicated in a distributed system. This can be enforced by forcing the algorithms to wait until all participating nodes acknowledge their actions (e.g., two-phase commit). Guaranteeing this impacts availability. Alternatively, if we want to offer availability, we need to ensure that all live nodes can receive updates, which means giving up partition tolerance.
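A minimal sketch of the two-phase commit idea mentioned above (the Participant interface is hypothetical, not a real library): the coordinator blocks until every node votes, which is exactly where availability is sacrificed for consistency.

```python
class Participant:
    """Hypothetical node interface, for illustration only."""
    def prepare(self, txn_id: str) -> bool:  # vote yes/no on committing
        raise NotImplementedError
    def commit(self, txn_id: str) -> None:
        raise NotImplementedError
    def abort(self, txn_id: str) -> None:
        raise NotImplementedError

def two_phase_commit(txn_id: str, participants: list[Participant]) -> bool:
    # Phase 1: wait for ALL votes -- an unreachable node stalls everyone.
    if all(p.prepare(txn_id) for p in participants):
        for p in participants:
            p.commit(txn_id)  # Phase 2: apply everywhere
        return True
    for p in participants:
        p.abort(txn_id)       # any "no" (or timeout) aborts everywhere
    return False
```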

Availability

Availability refers to the system being highly available. Since commodity-built individual systems are not highly available, we achieve availability through redundancy, which means replication. If one system is down, a request can be fulfilled by another. In an environment with multiple systems connected on a network we have to be concerned about network partitioning. If we have partition tolerance, then we lose consistency: some systems are disconnected from the network segment where updates are being issued. Conversely, to keep consistency, we have to ensure that the network remains fully connected so that all live nodes can get updates. This means giving up on partition tolerance.

Partition Tolerance

Partition tolerance means that the system performs correctly even if the network gets segmented. This can be enforced by using a non-distributed system (in which case partitioning is meaningless) or by forcing the algorithms to wait until network partitioning no longer exists (e.g., two-phase commit). Guaranteeing this impacts availability. Alternatively, the system can continue running, but partitioned nodes will not participate in the computation (e.g., commits, updates) and will hence have different values of data, impacting consistency. Giving up on consistency allows us to use optimistic concurrency control techniques as well as leases instead of locks. Examples of this are web caches and the Domain Name System (DNS).
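A common form of the optimistic concurrency control mentioned above is a per-document version counter; a hedged sketch with PyMongo (collection and field names are hypothetical):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
docs = client.app.documents  # hypothetical collection

doc = docs.find_one({"_id": "doc-1"})  # assume the document exists
result = docs.update_one(
    # The filter only matches if nobody changed the document since we read it.
    {"_id": "doc-1", "version": doc["version"]},
    {"$set": {"body": "new text"}, "$inc": {"version": 1}},
)
if result.modified_count == 0:
    # Conflict: another writer won the race -- re-read and retry,
    # rather than having held a lock the whole time.
    pass
```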

BASE: Giving up on ACID

Availability and partition tolerance are not part of the ACID guarantees of a transaction, so we may be willing to give those up to preserve database integrity. However, that may not be the best choice in all environments since it limits a system's ability to scale and be highly available. In fact, in a lot of environments, availability and partition tolerance are more important than consistency (so what if you get stale data?).
In order to guarantee ACID behavior in transactions, objects (e.g., parts of the database) have to be locked so that everyone sees consistent data, which means other entities have to wait until that data is consistent and unlocked. Locking works well on a small scale but is difficult to do efficiently at huge scale. Instead, it is attractive to consider using cached data. The risk is that we violate the "C" and "I" in ACID (Consistent & Isolated): two separate transactions might see different views of the same data. An example might be that you just purchased the last copy of a book on Amazon.com but I still see one copy remaining.
An alternative to the strict requirements of ACID is BASE, which stands for Basic Availability, Soft-state, Eventual consistency. Instead of requiring consistency after every transaction, it is enough for a database to eventually be in a consistent state. In these environments, accessing stale data is acceptable. This leniency makes it easy to cache copies of data throughout multiple nodes, never have to lock access to all those copies for any extensive time (e.g., a transaction operating on data will not lock all copies of that data), and update that data asynchronously (eventually). With a BASE model, extremely high scalability is obtainable through caching (replication), no central point of congestion, and no need for excessive messages to coordinate activity and access to data.

PACELC Theorem

Daniel Abadi, author of the PACELC Theorem, describes PACELC in his original paper as:
if there is a partition (P), how does the system trade off availability and consistency (A and C); else (E), when the system is running normally in the absence of partitions, how does the system trade off latency (L) and consistency (C)?
PACELC extends CAP to normal conditions (i.e., when the system has no partitions), where the tradeoff is between latency and consistency. Systems that allow lower latency with relaxed consistency are classified as EL, while systems that accept higher latency for linearizable consistency are classified as EC. In Abadi's original classification, for example, Dynamo-style stores such as Cassandra and Riak are PA/EL, while fully ACID systems such as VoltDB/H-Store are PC/EC.

Practical Implications

PACELC gives distributed database designers a more complete framework to reason about the essential tradeoffs and therefore avoid building a more limiting system than necessary in practice.

Comparisons

Note that both CAP and PACELC theorems help us better understand single-row linearizability guarantees in distributed databases. Multi-row operations come under the purview of ACID guarantees (with the Isolation level playing a big role in how to handle conflicting operations) that are not covered by CAP and PACELC.

The Eight Fallacies of Distributed Computing

Essentially everyone, when they first build a distributed application, makes the following eight assumptions. All prove to be false in the long run and all cause big trouble and painful learning experiences.
  1. The network is reliable
  2. Latency is zero
  3. Bandwidth is infinite
  4. The network is secure
  5. Topology doesn't change
  6. There is one administrator
  7. Transport cost is zero
  8. The network is homogeneous