Serverless Architecture: Key Service Considerations

A serverless architecture is typically composed of many services. This section covers the key considerations and configuration options for the AWS services most commonly used in serverless architectures.

Relevant Patterns
Lambda
SNS
SQS
Kinesis
EventBridge
DynamoDB
Step Functions
API Gateway
CloudFront
Route53
Global Accelerator
WAF

Relevant Patterns

common cloud native patterns to consider for serverless architectures at scale

event sourcing
circuit breaker - trip the circuit to prevent overloading downstream systems
load shedding - drop or defer work to prevent backlog buildup
handle poison messages - a record that always fails blocks kinesis and dynamodb stream consumers from progressing past it
avoid distributed transactions. e.g. a lambda that sends a job to SQS and also stores its status in dynamodb makes two writes that can partially fail. break it up: lambda puts job status in dynamodb -> dynamodb stream -> lambda sends job to SQS
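
A minimal sketch of the circuit breaker pattern listed above, assuming a simple consecutive-failure threshold and a fixed cool-down. The class name, parameters, and defaults are hypothetical, not taken from any AWS library. State is held in memory, so in Lambda each execution environment would track its own circuit unless the state is moved to a shared store.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the circuit is open and the call is skipped."""


class CircuitBreaker:
    """Hypothetical in-memory breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold=3, reset_after_seconds=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_seconds = reset_after_seconds
        self.clock = clock  # injectable so the cool-down can be tested
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_seconds:
                # Still cooling down: protect the downstream system.
                raise CircuitOpenError("downstream call skipped")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # trip (or re-trip) the circuit
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Callers wrap each downstream request in `breaker.call(...)` and treat `CircuitOpenError` as a signal to fail fast or fall back.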

Lambda

synchronous vs asynchronous vs poll based (for poll based sources lambda polls and then invokes the function synchronously) - impacts automatic retries, stuck messages due to poison messages, etc.
- see Understanding the Different Ways to Invoke Lambda Functions
if lambda is strictly a glue passthrough for API Gateway to call a backend AWS service, look to use API Gateway Service Proxies to remove lambda. simpler/cheaper/etc.
memory - also determines how much CPU the function gets
DLQ - for failed async invokes
lambda destinations (only for async invokes)
reserved concurrency - concurrency set aside for a specific function (also acts as its maximum). e.g. fn X can always run 10 lambda invokes concurrently
provisioned concurrency - pre-warmed lambda instances / no cold starts. good for latency sensitive needs
- can optionally use auto scaling to adjust it based on metrics and/or a schedule.
- will spill over to on-demand scaling (lambda default)
- Provisioned Concurrency comes out of your regional concurrency limit
concurrent executions (throttles) - 1000 per account per region by default
timeout - 15min max
- set code timeouts based on remaining invocation time provided in context
burst concurrency - initial burst of 500 - 3000 depending on region
scaling after the initial burst - 500 new instances / min
poll based options (kinesis, dynamodb, SQS)
- on-failure destination (SNS or SQS)
- retry attempts
- max age of record - use to implement load shedding (prioritize newer messages)
- split batch on error
- concurrent batches per shard
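
A minimal sketch of setting code timeouts from the remaining invocation time, as noted above. `context.get_remaining_time_in_millis()` is the method on the Lambda Python context object. The helper name, the safety margin, the cap, and the `FakeContext` stand-in for local testing are assumptions.

```python
class FakeContext:
    """Hypothetical stand-in for the Lambda context, for local testing only."""

    def __init__(self, remaining_ms):
        self._remaining_ms = remaining_ms

    def get_remaining_time_in_millis(self):
        return self._remaining_ms


def downstream_timeout_seconds(context, safety_margin_ms=500, cap_seconds=10.0):
    """Timeout budget for a downstream call, in seconds.

    Leaves a safety margin so the function can still log and return
    an error before Lambda terminates the invocation.
    """
    remaining_ms = context.get_remaining_time_in_millis()
    budget = max(remaining_ms - safety_margin_ms, 0) / 1000.0
    return min(budget, cap_seconds)


def handler(event, context):
    timeout = downstream_timeout_seconds(context)
    # Pass `timeout` to the HTTP client or SDK call made here.
    return {"timeout_used": timeout}
```

The cap keeps a single slow dependency from consuming the whole invocation even when plenty of time remains.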

SNS

fan out to address scale
KMS to encrypt payloads

SQS

batch size - by default the batch fails as a unit
visibility timeout - set to at least 6x lambda timeout
message retention period
delivery delay - max 15min
types - standard vs FIFO
- standard - at least once delivery. consumers need to be idempotent
alarm on queue depth
KMS to encrypt payloads
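
A minimal sketch of an idempotent consumer for at least once delivery, assuming the standard SQS-to-Lambda event shape (`Records`, each with `messageId` and `body`). The function name and the in-memory `seen_ids` set are hypothetical. A real deployment would need a durable store shared across instances (for example a conditional write to a DynamoDB table), and a business key may be a better dedupe key than `messageId` when producers can send the same job twice.

```python
import json


def process_batch(event, seen_ids, handle):
    """Process each SQS record at most once per message id.

    seen_ids: set-like store of already-processed ids (assumption:
              in memory here, durable in a real system).
    handle:   callable that performs the side effect for one message.
    Returns the number of records actually handled.
    """
    processed = 0
    for record in event["Records"]:
        message_id = record["messageId"]
        if message_id in seen_ids:
            continue  # duplicate delivery: skip the side effect
        handle(json.loads(record["body"]))
        # Mark as seen only after the side effect succeeded, so a
        # failure leaves the message eligible for retry.
        seen_ids.add(message_id)
        processed += 1
    return processed
```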

Kinesis

partition key - choose wisely: order is guaranteed only per shard, and the partition key determines which shard a record lands on
poison messages (retried until success or expiry - can cause backlog)
KMS to encrypt payloads
enhanced fan-out via AWS::Kinesis::StreamConsumer. each consumer gets 2 MiB per second for every shard you subscribe to. can subscribe a max of 5 consumers per stream.
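
A minimal sketch of how a partition key maps to a shard. Kinesis hashes the partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains it. The function names are hypothetical, and the even split of the hash space across shards is an assumption that holds for a freshly created stream but not necessarily after resharding.

```python
import hashlib

HASH_SPACE = 2 ** 128  # size of the MD5 hash key space


def hash_key(partition_key: str) -> int:
    """128-bit integer hash of the partition key."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_index(partition_key: str, shard_count: int) -> int:
    """Index of the shard the key lands on, assuming evenly split ranges."""
    range_size = HASH_SPACE // shard_count
    return min(hash_key(partition_key) // range_size, shard_count - 1)


def distribution(keys, shard_count):
    """Count how many keys land on each shard; useful for spotting hot shards."""
    counts = [0] * shard_count
    for key in keys:
        counts[shard_index(key, shard_count)] += 1
    return counts
```

Running `distribution` over a sample of real keys gives a quick view of skew: a low-cardinality key such as a tenant id concentrates traffic on a few shards, while every record with the same key stays ordered on one shard.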

EventBridge

put events - 2400 requests per second per region
invocation quota - 4500 requests per second per region (invocation is an event matching a rule and being sent on to the rule’s targets)

DynamoDB

global tables - for resilient active-active architectures
throttles
streams - 24hr data retention. poison messages (retry until success - can cause backlog)
partition key - choose a key that distributes data evenly among partitions to minimize hot partitions
TTL - can the data be expired and removed automatically?
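
A minimal sketch of preparing an item for TTL. DynamoDB TTL expects a Number attribute holding the expiry time as a Unix epoch timestamp in seconds. The attribute name `expires_at`, the key format, and the helper names are assumptions. The attribute must also be enabled as the TTL attribute on the table.

```python
import time

SECONDS_PER_DAY = 86400


def ttl_epoch_seconds(days, now=None):
    """Expiry timestamp `days` from now, in whole epoch seconds."""
    now = time.time() if now is None else now
    return int(now) + days * SECONDS_PER_DAY


def build_item(job_id, status, retain_days=7, now=None):
    """Hypothetical job-status item carrying a TTL attribute."""
    return {
        "pk": f"JOB#{job_id}",
        "status": status,
        # Hypothetical attribute name; must match the table's TTL setting.
        "expires_at": ttl_epoch_seconds(retain_days, now),
    }
```

Expired items are deleted in the background rather than at the exact expiry time, so queries may still need to filter on `expires_at`.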

Step Functions

Standard Workflows vs Express Workflows
saga pattern for rollback
parallel and map states - look for opportunities to run tasks in parallel
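
A minimal sketch of the saga pattern for rollback, written as plain Python rather than a state machine definition. Each step pairs an action with a compensating action, and a failure undoes the completed steps in reverse order. The function name and step shape are hypothetical. In Step Functions the same idea would typically be expressed with a Catch on each task that routes to compensation states.

```python
def run_saga(steps):
    """Run (action, compensate) pairs in order.

    On the first failing action, run the compensations of all
    previously completed steps in reverse order and return False.
    Returns True when every action succeeds.
    """
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()  # compensations should themselves be idempotent
            return False
        completed.append(compensate)
    return True
```

A usage example with three hypothetical steps, where the last one fails:

```python
log = []

def fail():
    raise RuntimeError("shipping unavailable")

ok = run_saga([
    (lambda: log.append("reserve"), lambda: log.append("release")),
    (lambda: log.append("charge"), lambda: log.append("refund")),
    (fail, lambda: log.append("cancel-shipment")),
])
```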

API Gateway

REST API vs HTTP API (HTTP API is cheaper)
caching - fixed hourly cost based on cache size / not pay per use
throttles
timeout - 29s
auth - cognito, JWT, IAM (aws sigv4), custom lambda auth
OpenAPI specs for payload validation
service proxies - no need for lambda glue in middle
custom domains
websockets

CloudFront

origin access identity - forces traffic through CloudFront and removes direct access to the S3 website domain URL
signed URLs or cookies
lambda@edge - headers only requests, rewrite URLs, server-side rendering (SSR), auth, etc.
cache invalidations
non-GET HTTP method support - must explicitly turn on support for PUT, POST, PATCH, etc.
WAF in front

Route53

Geoproximity routing for global solutions serving multiple regions

Global Accelerator

uses the AWS global network to optimize the path from your users to your applications. AWS cites traffic performance improvements of as much as 60%

WAF

can put in front of API Gateway or CloudFront
API Gateway and WAF provide overlapping functionality. determine which service is appropriate for each control.