Checking the integrity of data
Probabilistic, memory-efficient data structure for approximating the content of a set
Can tell if a key does not appear in the DB
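A minimal Bloom filter sketch (class and parameter names are mine; real implementations derive the bit-array size and hash count from a target false-positive rate):

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: k hash functions over a fixed-size bit array."""

    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = [False] * size

    def _positions(self, key):
        # Derive k positions by salting the key with the hash index.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # False means "definitely not in the set"; True means "maybe".
        return all(self.bits[pos] for pos in self._positions(key))
```

Note the asymmetry: a negative answer is definitive, a positive answer may be a false positive.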
Causal dependency: one event causing another
Happened-before relationship
Not only operations that happen at the same time but also operations made without knowing about each other
Example:
- Concurrent to-do list operations with a current "Buy milk" item
- User 1 deletes it
- User 2 doesn't have an internet connection, modifies it into "Buy soy milk", and then is connected again => this modification may have been made one hour after user 1's deletion
Special kind of hashing such that when a resize occurs, only k/n keys need to be remapped on average (k: number of keys, n: number of nodes)
Solutions:
- Ring consistent hash with virtual nodes to improve the distribution
- Jump consistent hash: faster but nodes must be numbered sequentially (e.g., if we have 3 servers foo, bar, and baz => we can't decide to remove bar)
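The jump consistent hash is short enough to sketch in full; this follows the Lamping & Veach algorithm, treating the key as an unsigned 64-bit integer:

```python
def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit key to a bucket in [0, num_buckets).

    When num_buckets grows from n to n+1, a key either keeps its
    bucket or jumps to the new bucket n, so only ~1/(n+1) of the
    keys move.
    """
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # Linear congruential step over the 64-bit key.
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * (1 << 31) / ((key >> 33) + 1))
    return b
```

This also shows why nodes must be numbered sequentially: buckets are plain integers 0..n-1, so we can only add or remove the last one, never an arbitrary middle node.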
May decrease:
- Availability
- Performance
- Scalability
Read heavy:
- Leverage replication
- Leverage denormalization
Write heavy:
- Leverage partition (usually)
- Leverage normalization
- Delayed
- Dropped
- Duplicated
- Out-of-order
Event log:
- Consumers are free to select the point of the log they want to consume messages from, which is not necessarily the head
- Log is immutable, messages cannot be removed by consumers (removed by a GC running periodically)
Impossible to achieve
However, we can achieve exactly-once processing using deduplication or by requiring consumers to be idempotent
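A sketch of the dedup approach (names are mine; a production version would persist the seen-IDs set atomically with the processing result):

```python
class DedupConsumer:
    """Exactly-once *processing* on top of at-least-once delivery,
    by tracking already-processed message IDs."""

    def __init__(self, handler):
        self.handler = handler
        self.seen = set()  # must be durable in a real system

    def consume(self, message_id, payload):
        if message_id in self.seen:
            return False  # duplicate delivery: skip
        self.handler(payload)
        self.seen.add(message_id)
        return True
```

The delivery is still at-least-once; only the side effect happens once.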
In an asynchronous distributed system, there's no consensus algorithm that can satisfy:
- Agreement
- Validity
- Termination
- And fault tolerance
Encode geographic coordinates into a short string called a cell with varying resolutions
The more letters in the string, the more precise the location
Main use case:
- Proximity searches in O(1)
Map data of arbitrary size to fixed-size values
Examples:
- MD5: 16 bytes
- SHA256: 32 bytes
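Python's `hashlib` illustrates the fixed-size outputs:

```python
import hashlib

data = b"some input of arbitrary size"
md5 = hashlib.md5(data).digest()
sha256 = hashlib.sha256(data).digest()
# Digest size is fixed regardless of input size: 16 and 32 bytes.
```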
Distributed filesystem:
- Fault tolerant
- Scalable
- Optimised for batch operations
Architecture:
- Single master (maintains the filesystem metadata, informs clients about which servers store a specific part of a file)
- Multiple data nodes
Leverage:
- Partitioning: each file is partitioned into multiple chunks => performance
- Replication => availability
Read: the client communicates with the master node to identify the servers containing the relevant chunks
Write: chain replication
- Decompose stateful and stateless parts of a system => makes scaling easier
- Partitioning => fault isolation
Used to approximate cardinality of a set
Optimization for space over perfect accuracy
Coin flip game: flip a coin; if heads, flip again; if tails, stop
If a player reaches n flips, it means that on average, he tried 2^(n+1) times
For an ID, we will count how many consecutive 0 (head) bits on the left
Example: 001110 => 2
Hence, on average we should have seen 2^(2+1) visitors
Requirement: visitor IDs must be uniformly distributed => either because the IDs are randomly generated, or by hashing them (if the ID is auto-incremented, for example)
Required memory: log(log(m)) with m the number of unique visitors
Problem with this algo: it depends on luck. For example, if user 00000001 connects every day => the system will always approximate 2^8 visitors
Distribute to multiple counters and aggregate the results (possible because each counter is very small)
If we want 4 counters, we distribute the ID based on the first 2 bits
Result: 4 * 2^((n1 + n2 + n3 + n4) / 4) (averaging the exponents, then scaling by the number of counters)
Problem: the mean is highly impacted by large outliers
Solution: use the harmonic mean instead
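A toy sketch of the whole scheme (route on the first 2 bits, track the max leading-zero count per counter, combine with a harmonic mean). Names are mine, and the bias-correction constant of real HyperLogLog is omitted, so treat it as illustrative only:

```python
import hashlib

def leading_zeros(h: int, bits: int) -> int:
    """Count consecutive 0 bits on the left of a `bits`-wide value."""
    for i in range(bits):
        if (h >> (bits - 1 - i)) & 1:
            return i
    return bits

def estimate_uniques(ids, counters=4):
    """Approximate set cardinality with 4 tiny counters."""
    maxima = [0] * counters
    for visitor_id in ids:
        # Hash to make the distribution uniform (e.g., auto-incremented IDs).
        h = int.from_bytes(
            hashlib.sha256(str(visitor_id).encode()).digest()[:4], "big")
        bucket = h >> 30                 # first 2 bits pick one of 4 counters
        rest = h & ((1 << 30) - 1)       # remaining 30 bits
        maxima[bucket] = max(maxima[bucket], leading_zeros(rest, 30))
    # Harmonic mean of the per-counter estimates 2^(n_i + 1),
    # scaled by the number of counters.
    harmonic = counters / sum(2 ** -(n + 1) for n in maxima)
    return counters * harmonic
```

Only the four small maxima are stored, hence the log(log(m)) memory footprint.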
If executed more than once it has the same effect as if it was executed once
- Read 1MB sequentially from memory: 250 µs
- Round trip within the same datacenter: 0.5 ms
- Read 1MB sequentially from SSD: 1 ms
- Disk seek: 10 ms
- Read 1MB sequentially from disk: 20 ms
- Send round trip packet over continents: ~100 ms
Lock with an expiry timeout after which the lock is automatically released
May lead to situations where two nodes believe they hold the lock (for example, when the expiry signal hasn't been caught yet by the first node because of a GC or CPU throttling)
Can be solved using a fencing token
Not efficient
A more efficient option is to randomly pick two servers and route the request to the least-loaded one of the two
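A sketch of the two-random-choices strategy (function name is mine):

```python
import random

def pick_server(servers, load, rng=random):
    """Power of two choices: sample two distinct servers at random
    and route the request to the less loaded of the two."""
    a, b = rng.sample(servers, 2)
    return a if load[a] <= load[b] else b
```

Only two load values are consulted per request, yet heavily loaded servers are avoided almost as well as with a full scan.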
Something good will eventually occur
Example: leader is elected, eventual consistency
Route requests across a pool of servers
Action to reduce the load on something
Example: when the CPU utilization reaches a threshold, the server can start returning errors
A special form of load shedding is selective client throttling, where an application assigns different quotas to each of its clients
Performance optimization to put several pieces of data in the same place
Append-only, totally ordered sequence of messages
Each message is:
- Appended at the end of the log
- Assigned a unique sequential index
Example: Kafka
Throw away duplicate keys in the log and keep only the most recent update for each key
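Compaction can be sketched as a single pass keeping the latest offset per key (function name is mine):

```python
def compact(log):
    """Keep only the most recent (key, value) entry per key,
    ordered by the offset of each key's last write."""
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)  # later writes overwrite earlier ones
    ordered = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (offset, value) in ordered]
```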
Reduce flexibility
If the application needs to support new data access patterns efficiently, it might be hard to do so given the system's data has been partitioned in a specific way
Example: attempting to query by a secondary attribute that is not the partitioning key might require accessing all the nodes of the system
Programming model for processing large amounts of data in bulk across many machines:
- Map: processes a set of key/value pairs and produces as output another set of intermediate key/value pairs.
- Reduce: receives all the values for each key and returns a single value, essentially merging all the values according to some logic
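The canonical word-count example, sketched as plain functions (the shuffle step is folded into the reduce for brevity):

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit intermediate (word, 1) pairs for each document."""
    for doc in documents:
        for word in doc.split():
            yield word, 1

def reduce_phase(pairs):
    """Group all values by key, then merge them (here: sum)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: sum(values) for key, values in grouped.items()}
```

In a real framework, the map tasks run on many machines and the grouped values for each key are shipped to a reducer.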
Pros:
- Organizational (each team dictates its own release schedule, etc.)
- Codebase is easier to digest
- Strong boundaries
- Independent scaling
- Independent data model
Cons:
- Eventual consistency
- Remote calls
- Harder to operate (more complex)
- 32: 80 k
- 64: 5 billion
- 128: 3e+18 (1 billion hashes generated every second for 100 years)
Orchestration: single central system responsible for coordinating the execution
Choreography: no need for a central coordinator, each system is aware of the previous and the next
Used to update a DB and publish an event in a transactional fashion
Within a transaction, persist the change in the DB (insert, update, or delete) and insert a new row into an event table at the same time
A worker then checks the event table, publishes an event, and deletes the row (at-least-once guarantee)
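A sketch with SQLite standing in for the application DB (schema, table, and function names are mine):

```python
import sqlite3

# In-memory DB standing in for the application database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
db.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, event TEXT)")

# The business write and the event insert share one transaction.
with db:
    db.execute("INSERT INTO orders (item) VALUES (?)", ("book",))
    db.execute("INSERT INTO outbox (event) VALUES (?)",
               ("order_created:book",))

def relay(publish):
    """Worker: publish pending events, then delete them.
    Crashing between publish and delete re-publishes the event
    later, hence the at-least-once guarantee."""
    for event_id, event in db.execute("SELECT id, event FROM outbox"):
        publish(event)
        db.execute("DELETE FROM outbox WHERE id = ?", (event_id,))
    db.commit()
```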
No collisions; only possible if we know the keys up front
Given k elements, the hashing function maps each element to a distinct int between 0 and k-1
Tree data structure where each internal node has exactly four children: NE, NW, SE, SW
Main use case:
- Improve geospatial caching (e.g., 1km in an urban area isn't the same as 1km outside cities)
Source: https://engblog.yext.com/post/geolocation-caching
Mechanism that rejects a request when a specific quota is exceeded
Bucket with a pre-defined token capacity; tokens are put back in the bucket periodically
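A token bucket sketch (names are mine; the clock is injected so the refill logic is testable):

```python
import time

class TokenBucket:
    """Holds at most `capacity` tokens, refilled at `rate` tokens/second.
    Each allowed request consumes one token."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill proportionally to the elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # quota exceeded: reject the request
```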
Uses a FIFO queue. When a request arrives, checks if the queue is full:
- If yes: request is dropped
- If not: added to the queue => Requests pulled from the queue at regular intervals
Move data or services from one node to another in order to spread the load fairly
Architectural style where the server exposes a set of resources
All communications must be stateless and cacheable
Usually relies on HTTP, though this isn't mandatory
REST (architectural style):
- Universality
- Standardization (status code, ETag, If-Match, etc.)
gRPC (RPC framework):
- Contract
- Binary protocol (faster, less bandwidth) // We could use HTTP/2 without gRPC and leverage binary protocols but it would require more efforts
- Bidirectional
Something bad will never happen
Example: at most one leader is elected at a time
Distributed transaction composed of a set of local transactions
Each transaction has a corresponding compensation action to undo its changes
Usually, a Saga is implemented with an orchestrator that manages the execution of the transactions and handles the compensations if needed
System's ability to cope with increased load
Hard limit (e.g., device maximum throughput)
Reduce coordination and contention so that every request can be processed independently by a single node or group of nodes
Increase availability, performance, and scalability
Holds the authoritative version of the data
Network partition => nodes unable to communicate with each other => multiple nodes believing they are the leader
As a node is unaware that another node is still functioning, it can lead to data corruption or data loss
The rate of work performed
Total order: a binary relation that can be used to compare any 2 elements of a set with each other
Partial order: a binary relation that can be used to compare only some of the elements of a set with each other
Total ordering in distributed systems is rarely mandatory
128-bit number
Collision probability: after generating 1 billion UUID every second for ~100 years, the probability of creating a single duplicate reaches 50%
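The figure can be sanity-checked with the birthday approximation (assuming version-4 UUIDs, which carry 122 random bits; the result is on the order of 50%, not exactly 50%):

```python
import math

random_bits = 122                      # random bits in a v4 UUID
seconds = 100 * 365.25 * 24 * 3600     # ~100 years
n = 1e9 * seconds                      # ~3.2e18 UUIDs generated

# Birthday approximation: p ≈ 1 - exp(-n^2 / 2^(bits+1))
p = 1 - math.exp(-(n * n) / 2 ** (random_bits + 1))
print(p)
```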
Validation: process of analyzing the parts of the system and building mental models that reflect the interaction of those parts
Example: validate the quality of water by inspecting all the pipes and infrastructure to capture, clean and deliver water
Verification: process of analyzing output at a system boundary
Example: verify the quality of water by testing the water (output) coming from a sink
Algorithm that generates partial ordering of events and detects causality violation
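A minimal vector clock sketch (names are mine): one counter per node, element-wise max on merge, and a comparison that distinguishes "happened before" from "concurrent":

```python
class VectorClock:
    """One logical counter per node."""

    def __init__(self, node):
        self.node = node
        self.clock = {}

    def tick(self):
        """Local event: increment this node's counter."""
        self.clock[self.node] = self.clock.get(self.node, 0) + 1

    def merge(self, other):
        """On message receipt: element-wise max, then tick."""
        for node, count in other.clock.items():
            self.clock[node] = max(self.clock.get(node, 0), count)
        self.tick()

def happened_before(a, b):
    """True if every counter of `a` is <= `b`'s and the clocks differ."""
    leq = all(v <= b.clock.get(k, 0) for k, v in a.clock.items())
    return leq and a.clock != b.clock

def concurrent(a, b):
    """Neither happened before the other: causally unrelated events."""
    return not happened_before(a, b) and not happened_before(b, a)
```

Two concurrent clocks are exactly the causality violation the algorithm detects, e.g. the "Buy milk" deletion and the offline edit.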
Reduce temporal coupling (sender and receiver need not be connected at the same time) => processes execute at independent rates, without blocking the sender
Suitable if the interaction pattern isn't request/response with the client blocking until it receives the response