Cloud Native Software Architecture
A personal cheatsheet/reference for cloud native software architecture.
architecture
how the components of a system are assembled and organized so that the system meets its required quality attributes.
- Key Questions
- Organization Considerations
- Quality Attributes (*ities and friends)
- Patterns
- Topics / Concepts / Terms
- Resources
Key Questions
- who are the users?
- what devices and form factors will be used?
- what is the context of their usage?
- scale and growth?
- who are the main actors in the system (domain objects - e.g. orders, products, etc.)?
- data classifications (PII)?
- data types and sizes (relational records, documents, media files, etc.)?
- what is the time frame for delivery?
- is there an existing product / SaaS / open-source project / etc. that provides the solution, or a portion / some components of it?
- capacity estimation & constraints?
- functional requirements?
- non-functional requirements (latency, consistency, availability, throughput, etc.)?
- what is explicitly out of scope?
- organization and team structure?
see System Design: DoorDash — a prepared food delivery service for good reference
Organization Considerations
- engineering (application & platform)
- operations (application & platform)
Quality Attributes (*ities and friends)
- reliability - ability to continue to operate under predefined conditions
- availability - ratio of the time the system is available to the total time it is expected to be operating
- scalability - ability of the system to handle load increases without decreasing performance
- efficiency
- performance
- security
- cost
- interoperability
- correctness
- maintainability
- readability
- extensibility
- testability
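The availability ratio above can be turned into a quick calculation. This is a minimal sketch; the function names and the choice of hours and minutes as units are assumptions made for illustration.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Fraction of the total time during which the system was available."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total time must be positive")
    return uptime_hours / total


def allowed_downtime_minutes(target: float, period_hours: float = 365 * 24) -> float:
    """Downtime budget, in minutes, for an availability target over a period."""
    return (1.0 - target) * period_hours * 60


if __name__ == "__main__":
    # 99 hours up, 1 hour down
    print(f"{availability(99, 1):.2%}")
    # "three nines" over a 365-day year
    print(f"{allowed_downtime_minutes(0.999):.1f} minutes/year")
```

Expressing the target as a downtime budget is often easier to reason about than the raw percentage when discussing requirements.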
Patterns
modern cloud native architecture patterns as of July 2020
event-sourcing
Capture all changes to an application state as a sequence of events.
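As a minimal sketch of the idea, the state below is never stored directly; it is rebuilt by replaying an append-only log. The bank-balance domain and the names `Event`, `apply`, and `replay` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrawn" (hypothetical event types)
    amount: int


def apply(balance: int, event: Event) -> int:
    """Pure function: current state + one event -> next state."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events: list[Event], initial: int = 0) -> int:
    """Rebuild the state by folding over the event log in order."""
    balance = initial
    for event in events:
        balance = apply(balance, event)
    return balance


# The log is append-only: changes are recorded, never overwritten.
log: list[Event] = []
log.append(Event("deposited", 100))
log.append(Event("withdrawn", 30))
log.append(Event("deposited", 5))
```

Replaying a prefix of the log, for example `replay(log[:2])`, gives the state as it was at that point in time.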
Core Design Decisions
- Domain Entities and Events
- a popular method for identifying them is Event Storming
- Event Content
- each event stores delta state
- each event stores full state
- with full state, idempotency is easy to achieve: applying a duplicate event yields the same result
- Total Ordering (ordered stream of events - ledger)
- ensure all events are processed in order. This is needed to preserve causal relationships.
- e.g. ordering matters for two messages related to the same entity
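The duplicate-handling and ordering decisions above can be sketched with a consumer that tracks a sequence number per entity. This is an illustrative sketch only: it assumes delta-state events, assumes the producer assigns a gap-free per-entity `seq` starting at 1, and enforces ordering per entity rather than across the whole stream. The names `Event` and `Projection` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    entity_id: str   # e.g. an order id
    seq: int         # per-entity sequence number, starting at 1 (assumption)
    delta: int       # change to apply (a delta-state event)


@dataclass
class Projection:
    totals: dict[str, int] = field(default_factory=dict)
    last_seq: dict[str, int] = field(default_factory=dict)
    pending: dict[str, dict[int, Event]] = field(default_factory=dict)

    def handle(self, event: Event) -> None:
        last = self.last_seq.get(event.entity_id, 0)
        if event.seq <= last:
            return  # duplicate delivery: already applied, so ignore it
        # Buffer the event, then apply every event that is next in sequence.
        buffered = self.pending.setdefault(event.entity_id, {})
        buffered[event.seq] = event
        while last + 1 in buffered:
            nxt = buffered.pop(last + 1)
            self.totals[nxt.entity_id] = self.totals.get(nxt.entity_id, 0) + nxt.delta
            last += 1
        self.last_seq[event.entity_id] = last
```

With delta events the consumer has to detect duplicates itself, as done here with `last_seq`; with full-state events, re-applying a duplicate simply overwrites the state with the same value.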
Resources
- Scaling Event Sourcing for Netflix Downloads, Episode 1
- Scaling Event Sourcing for Netflix Downloads, Episode 2
- InfoQ | Scaling Event Sourcing for Netflix Downloads | Video + Presentation - shows in detail how they implemented event sourcing backed by Cassandra
- martinfowler.com | Event Sourcing
- Pattern: Event sourcing
- EventBridge Storming — How to build state-of-the-art Event-Driven Serverless Architectures - approach to defining the Events, Boundaries and Entities in your business domain
- Decomposing the Monolith with Event Storming
Hexagonal
also known as the ports and adapters architecture. decouples the core domain logic from any specific storage, database, protocol, etc.
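A minimal sketch of the decoupling, assuming a hypothetical order domain: `OrderRepository` is the port, `InMemoryOrderRepository` is one adapter, and `OrderService` is the core logic. None of these names come from a specific framework.

```python
from typing import Optional, Protocol


class OrderRepository(Protocol):
    """Port: what the core domain needs from storage."""

    def save(self, order_id: str, total: int) -> None: ...
    def get(self, order_id: str) -> Optional[int]: ...


class InMemoryOrderRepository:
    """Adapter: one concrete implementation of the port."""

    def __init__(self) -> None:
        self._rows: dict[str, int] = {}

    def save(self, order_id: str, total: int) -> None:
        self._rows[order_id] = total

    def get(self, order_id: str) -> Optional[int]:
        return self._rows.get(order_id)


class OrderService:
    """Core domain logic: depends only on the port, never on an adapter."""

    def __init__(self, repo: OrderRepository) -> None:
        self._repo = repo

    def place_order(self, order_id: str, prices: list[int]) -> int:
        if not prices:
            raise ValueError("an order needs at least one item")
        total = sum(prices)
        self._repo.save(order_id, total)
        return total
```

Swapping the in-memory adapter for a database-backed one would not require any change to `OrderService`, which is also what makes the core logic easy to test.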
Resources
Topics / Concepts / Terms
Database
- CAP theorem
- Consistency: Every read receives the most recent write or an error
- Availability: Every request receives a (non-error) response, without the guarantee that it contains the most recent write
- Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes
- Serializability
- Snapshot isolation
- Multiversion concurrency control
- Things I Wished More Developers Knew About Databases
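The CAP trade-off listed above can be illustrated with a toy two-replica store. This is a deliberately simplified sketch, not a real replication protocol: the class names, the `mode` flag, and the `partitioned` switch are all hypothetical.

```python
class Replica:
    def __init__(self) -> None:
        self.value = None


class TwoNodeStore:
    """Toy store with replicas A and B; `mode` is "CP" or "AP"."""

    def __init__(self, mode: str) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False  # True when A and B cannot communicate

    def write_to_a(self, value: str) -> None:
        if self.partitioned and self.mode == "CP":
            # Cannot replicate to B, so refuse the write rather than diverge.
            raise RuntimeError("unavailable during partition")
        self.a.value = value
        if not self.partitioned:
            self.b.value = value  # synchronous replication while connected

    def read_from_b(self):
        if self.partitioned and self.mode == "CP":
            # B cannot confirm it has the most recent write: return an error.
            raise RuntimeError("unavailable during partition")
        return self.b.value  # AP: always answers, but may be stale
```

During a partition the "AP" store keeps answering with a possibly stale value, while the "CP" store returns an error instead, which matches the Consistency and Availability definitions above.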
Shuffle Sharding
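limits / isolates tenants in a multi-tenant system so they don’t negatively impact other tenants. it is a method of assigning each tenant to its own small subset (shard) of the available resources, so that a misbehaving tenant only affects the tenants that happen to share its resources.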
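One way to sketch the tenant-to-resource assignment is to derive a small, stable subset of workers from the tenant id. The function name `shard_for`, the worker names, and the hash-seeded selection are assumptions for illustration, not a description of any particular production implementation.

```python
import hashlib
import random


def shard_for(tenant_id: str, workers: list[str], shard_size: int) -> list[str]:
    """Deterministically pick `shard_size` workers for a tenant."""
    if not 0 < shard_size <= len(workers):
        raise ValueError("shard_size must be between 1 and len(workers)")
    # Seed a private RNG from a stable hash so the same tenant always gets
    # the same shard (Python's built-in hash() is salted per process).
    seed = int.from_bytes(hashlib.sha256(tenant_id.encode()).digest()[:8], "big")
    return sorted(random.Random(seed).sample(workers, shard_size))


if __name__ == "__main__":
    workers = [f"worker-{i}" for i in range(8)]
    for tenant in ("tenant-a", "tenant-b", "tenant-c"):
        print(tenant, shard_for(tenant, workers, shard_size=2))
```

With 8 workers and a shard size of 2 there are 28 possible shards, so two tenants rarely share both of their workers, and a tenant that overloads its shard leaves most other tenants with at least one healthy worker.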
Resources
Constant Work
- overprovision resources to the point where the system still operates correctly even if an availability zone (AZ) becomes unavailable
- if an AZ becomes unavailable, no new resources need to be provisioned, only a quick re-routing of traffic. you are essentially always operating the infrastructure in failure mode (active-active)
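The overprovisioning described above can be sized with simple arithmetic: if the surviving AZs must carry the full peak load after one AZ is lost, each AZ needs peak / (N - 1) capacity. The sketch below assumes load is spread evenly across AZs and that only a single AZ fails at a time; the function names are hypothetical.

```python
import math


def instances_per_az(peak_instances: int, az_count: int) -> int:
    """Instances to run in each AZ so that the remaining AZs can carry
    the peak load after losing any single AZ."""
    if az_count < 2:
        raise ValueError("need at least two AZs to survive losing one")
    return math.ceil(peak_instances / (az_count - 1))


def overprovision_factor(az_count: int) -> float:
    """Total provisioned capacity relative to peak load."""
    if az_count < 2:
        raise ValueError("need at least two AZs to survive losing one")
    return az_count / (az_count - 1)


if __name__ == "__main__":
    # A peak load that needs 12 instances, spread over 3 AZs
    print(instances_per_az(12, 3), "instances per AZ")
    print(f"{overprovision_factor(3):.0%} of peak capacity provisioned")
```

The extra capacity shrinks as the number of AZs grows: two AZs require 200% of peak, three require 150%.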
Resources
Canary
A canary release is a technique to reduce the risk from deploying a new version of software into production. A new version of software, referred to as the canary, is deployed to a small subset of users alongside the stable running version. Traffic is split between these two versions such that a portion of incoming requests are diverted to the canary. This approach can quickly uncover any problems with the new version without impacting the majority of users.
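The traffic split can be sketched as a deterministic, percentage-based router. This is an illustrative sketch only: in practice the split is usually configured in a load balancer or service mesh, and the function name `route` and the use of a user id as the routing key are assumptions.

```python
import hashlib


def route(request_key: str, canary_percent: int) -> str:
    """Return "canary" or "stable" for a request.

    Hashing a stable key (e.g. a user id) keeps each user on one version.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_key.encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return "canary" if bucket < canary_percent else "stable"


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(10_000)]
    canary_count = sum(route(u, 10) == "canary" for u in users)
    print(f"{canary_count / len(users):.1%} of users routed to the canary")
```

Raising `canary_percent` step by step while watching error rates gives a gradual rollout, and setting it back to 0 routes all traffic to the stable version.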
Resources
Resources
Books (oreilly.com)
Clean Architecture: A Craftsman’s Guide to Software Structure and Design, First Edition
Domain-Driven Design: Tackling Complexity in the Heart of Software
Design Patterns: Elements of Reusable Object-Oriented Software
Designing Distributed Control Systems: A Pattern Language Approach (Wiley Software Patterns Series)