How to choose the best event source for pub/sub messaging with AWS Lambda

Publish-subscribe (pub/sub) messaging is a versatile pattern that enables loose coupling between message publishers and consumers. AWS offers several options for implementing pub/sub with AWS Lambda, including Amazon SNS, Kinesis streams, and DynamoDB streams as event sources. In this article, we'll take an in-depth look at these options and provide guidance on how to choose the best event source for your use case.

The pub/sub messaging pattern

In the pub/sub pattern, publishers send messages to a topic or channel without knowing who or how many subscribers there are. Subscribers express interest in one or more topics and only receive messages for topics they've subscribed to. The pub/sub model allows for greater scalability and flexibility than traditional point-to-point messaging.

Some common use cases for pub/sub include:

  • Notifying multiple consumers about events like changes in data or system state
  • Enabling parallel asynchronous processing of messages
  • Implementing event-driven architectures and serverless workflows

AWS Lambda is a natural fit for consuming messages in a pub/sub system. Lambda functions can be triggered by messages from SNS topics, Kinesis streams, and DynamoDB streams. The Lambda service takes care of scaling function instances up and down to match the rate of incoming messages.

Overview of AWS event sources for pub/sub with Lambda

There are three main options for event sources that support pub/sub messaging with Lambda:

  1. Amazon Simple Notification Service (SNS)
  2. Amazon Kinesis Data Streams
  3. Amazon DynamoDB Streams

Let's look at each of these in more detail.

Amazon SNS as an event source

Amazon SNS is a fully managed pub/sub messaging service. With SNS, you create a topic and control access to it using AWS Identity and Access Management (IAM). Publishers send messages to the SNS topic, and SNS delivers the message to all subscribers.

Lambda functions can subscribe to SNS topics. Whenever a message is sent to the subscribed topic, Lambda will invoke your function, passing in the message payload as a parameter. SNS is a push-based system, so Lambda is invoked as soon as a message is available.
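To make the invocation concrete, here is a minimal sketch of a Python handler for an SNS-triggered function. It assumes the standard SNS event shape (a "Records" list whose entries carry an "Sns" object) and that publishers usually send JSON bodies; the sample message content is hypothetical.

```python
import json


def handler(event, context):
    """Collect the message bodies from an SNS-triggered invocation."""
    messages = []
    for record in event["Records"]:
        sns = record["Sns"]
        body = sns["Message"]
        # Assumption: publishers usually send JSON; keep the raw string otherwise.
        try:
            body = json.loads(body)
        except json.JSONDecodeError:
            pass
        messages.append(
            {"id": sns["MessageId"], "subject": sns.get("Subject"), "body": body}
        )
    return messages


def make_sns_event(message_id, subject, message):
    """Build a one-record event shaped like an SNS delivery (for local testing)."""
    return {
        "Records": [
            {
                "EventSource": "aws:sns",
                "Sns": {"MessageId": message_id, "Subject": subject, "Message": message},
            }
        ]
    }
```

Calling handler(make_sns_event("msg-1", "order", '{"order_id": 42}'), None) locally exercises the same code path that SNS would trigger.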

Some characteristics of using SNS as an event source for Lambda:

  • Automatic triggering of Lambda for each message
  • Highly scalable – supports high throughput
  • Parallel execution – each message invokes a separate instance of the Lambda function
  • Configurable retry policy for failures (Lambda retries, dead-letter queues)
  • Message filtering – a subscription filter policy lets SNS deliver only the messages whose attributes match, so the function is not invoked for the rest
  • Fan-out architecture – deliver the same message to multiple subscribers
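The filtering mentioned in the list above is driven by a filter policy attached to the subscription. The sketch below shows what such a policy might look like and imitates only its simplest case (exact string matching) in plain Python. SNS evaluates the real policy on its side and supports more operators than this; the attribute names and values are hypothetical.

```python
import json

# Hypothetical policy: deliver only order events that originate in the EU or US.
# Keys are message attribute names; each list holds the accepted values.
filter_policy = {
    "event_type": ["order_placed", "order_cancelled"],
    "region": ["eu", "us"],
}


def matches(policy, attributes):
    """Simplified exact-match check.

    Every key in the policy must be present in the message attributes (AND),
    and its value must be one of the accepted values (OR within the list).
    """
    return all(attributes.get(key) in allowed for key, allowed in policy.items())


# The JSON document that would be set as the subscription's filter policy.
policy_document = json.dumps(filter_policy)
```

Because the filtering happens before delivery, messages that do not match never reach the function and incur no invocation.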

Pros of SNS:

  • Simple setup and configuration
  • Fully serverless and managed
  • Supports wide variety of subscriber types (Lambda, SQS, HTTP/S, email, SMS)
  • Built-in message filtering

Cons of SNS:

  • No message ordering guarantees on standard topics
  • Limited to maximum message size of 256 KB
  • Requires careful management of IAM permissions

SNS is a good choice when you need to fan out messages to multiple destinations and don't need strict message ordering. It's commonly used for notifications, webhooks, and simple workflows where retrying failed messages is sufficient.

Amazon Kinesis as an event source

Kinesis Data Streams is a scalable real-time data streaming service. Data producers send records to a Kinesis stream, and consumers process the records in real time. Kinesis streams retain data for 24 hours by default, and retention can be extended up to 365 days at extra cost, allowing for replay and reprocessing.

Lambda can be configured to poll a Kinesis stream and invoke a function to process batches of records. Lambda polls each shard separately and, by default, runs one invocation per shard at a time, so the number of shards determines the read throughput and caps Lambda parallelism.
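As a rough illustration of batch processing, the handler below decodes the records Lambda passes in from a Kinesis stream. It assumes the standard event shape, in which each record's payload arrives base64-encoded under "kinesis" → "data", and it assumes producers write JSON; the helper make_record is a hypothetical convenience for local testing.

```python
import base64
import json


def handler(event, context):
    """Decode every record in a Kinesis batch, preserving batch order."""
    decoded = []
    for record in event["Records"]:
        kinesis = record["kinesis"]
        # Kinesis payloads are delivered base64-encoded.
        payload = base64.b64decode(kinesis["data"])
        decoded.append(
            {
                "partition_key": kinesis["partitionKey"],
                "sequence_number": kinesis["sequenceNumber"],
                "data": json.loads(payload),  # assumption: producers send JSON
            }
        )
    return decoded


def make_record(partition_key, sequence_number, data):
    """Build a record shaped like a Kinesis event record (for local testing)."""
    encoded = base64.b64encode(json.dumps(data).encode("utf-8")).decode("ascii")
    return {
        "eventSource": "aws:kinesis",
        "kinesis": {
            "partitionKey": partition_key,
            "sequenceNumber": sequence_number,
            "data": encoded,
        },
    }
```

Records that share a partition key land on the same shard, which is what makes the per-shard ordering useful in practice.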

Some characteristics of using Kinesis as an event source for Lambda:

  • Polling-based – Lambda polls the stream for new records
  • Configurable batch size (1 – 10,000) and batch window
  • In-order processing per shard
  • Reads are throttled by number of shards
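  • Lambda concurrency follows the number of shards (one concurrent batch per shard by default; the parallelization factor setting can raise this to 10), not the message volume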
  • Records may be processed more than once in event of failure
  • Data retention allows for replaying records
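One way to limit reprocessing after a failure is to tell Lambda which record failed, rather than failing the whole batch. The sketch below assumes the event source mapping has the ReportBatchItemFailures response type enabled; the process function and its validation rule are hypothetical stand-ins for real business logic.

```python
import base64
import json


def process(payload):
    """Hypothetical business logic: reject payloads without an 'id' field."""
    if "id" not in payload:
        raise ValueError("missing id")


def handler(event, context):
    """Report the first failing record so Lambda retries from that point."""
    failures = []
    for record in event["Records"]:
        try:
            process(json.loads(base64.b64decode(record["kinesis"]["data"])))
        except Exception:
            # Stop at the first failure: later records in the shard must not
            # be processed ahead of it if ordering is to be preserved.
            failures.append({"itemIdentifier": record["kinesis"]["sequenceNumber"]})
            break
    return {"batchItemFailures": failures}


def make_record(sequence_number, data):
    """Build a minimal Kinesis-style record (for local testing)."""
    encoded = base64.b64encode(json.dumps(data).encode("utf-8")).decode("ascii")
    return {"kinesis": {"sequenceNumber": sequence_number, "data": encoded}}
```

An empty batchItemFailures list signals that the whole batch succeeded. Records before the reported sequence number are not retried, which reduces duplicate processing but does not eliminate it.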

Pros of Kinesis:

  • Supports high throughput and large messages (up to 1 MB)
  • Ordered records within a shard
  • Long data retention (24 hours by default, extendable up to 365 days) allows replayability
  • Dedicated throughput per shard
  • Scales to handle high-velocity data

Cons of Kinesis:

  • More complex to set up and manage than SNS
  • Have to provision and scale shards, which affects cost
  • Polling has some latency vs push model
  • Repeated processing of the same records is possible
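Because the same record can arrive more than once, consumers are usually written to be idempotent. The following is a minimal sketch of that idea using an in-memory set of processed record IDs. That set is an assumption made for brevity: it does not survive across function instances, so a real function would need a durable store. The names make_idempotent and record_id are hypothetical.

```python
def make_idempotent(process, seen=None):
    """Wrap `process` so that each record ID is handled at most once.

    `seen` stands in for a durable deduplication store; the in-memory set
    used here is only suitable for illustration and local testing.
    """
    seen = set() if seen is None else seen

    def wrapper(record_id, payload):
        if record_id in seen:
            return False  # duplicate delivery, skip it
        process(payload)
        # Mark as seen only after success, so failures can be retried.
        seen.add(record_id)
        return True

    return wrapper
```

For Kinesis records, the sequence number combined with the shard ID is a natural candidate for record_id.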

Kinesis is a good fit for use cases that require high throughput, ordered messaging, or the ability to replay records. Examples include real-time analytics, IoT data ingestion, clickstream processing, and complex event processing.

DynamoDB streams as an event source

DynamoDB is a NoSQL database that supports triggers via DynamoDB streams. Whenever data is modified in a DynamoDB table, the changes can be captured and streamed to Lambda.

With DynamoDB streams, Lambda polls the stream for changes and can be configured to process batches of records. As with Kinesis, Lambda reads each shard separately and processes its records in order.

Some unique aspects of DynamoDB streams:

  • Tightly integrated with DynamoDB as the source
  • Captures create, update, and delete events
  • Automatic scaling of shards based on table traffic
  • No charge for stream reads made by Lambda triggers; other consumers pay per stream read request

From a development perspective, DynamoDB streams have some limitations:

  • Each stream only contains events for one table
  • Records describe low-level events, requiring translation to domain events
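  • 24-hour data retention, less than Kinesis
  • Fewer consumer options: Lambda triggers are the usual route, though the stream can also be read through the DynamoDB Streams API or the Kinesis adapter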
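The translation from low-level records to domain events mentioned above might look like the sketch below. It assumes a hypothetical orders table keyed by order_id with a string status attribute, and a stream configured to include both old and new images. The domain event names are invented for illustration.

```python
def to_domain_event(record):
    """Map one DynamoDB stream record to a domain event, or None if irrelevant."""
    name = record["eventName"]  # INSERT, MODIFY, or REMOVE
    change = record["dynamodb"]
    # Stream records use DynamoDB's typed attribute format, e.g. {"S": "abc"}.
    order_id = change["Keys"]["order_id"]["S"]

    if name == "INSERT":
        return {"type": "OrderCreated", "order_id": order_id}
    if name == "REMOVE":
        return {"type": "OrderDeleted", "order_id": order_id}

    # MODIFY: only a status change counts as a domain event here.
    old = change.get("OldImage", {}).get("status", {}).get("S")
    new = change.get("NewImage", {}).get("status", {}).get("S")
    if old != new:
        return {"type": "OrderStatusChanged", "order_id": order_id, "from": old, "to": new}
    return None


def handler(event, context):
    """Translate a batch of stream records, dropping changes nobody cares about."""
    events = (to_domain_event(record) for record in event["Records"])
    return [e for e in events if e is not None]
```

Keeping the translation in one function means downstream consumers see OrderStatusChanged rather than having to diff table images themselves.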

Pros of DynamoDB streams:

  • Easy to set up triggers for DynamoDB tables
  • Automatic scaling and pricing along with table
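  • In-order change records per item, each appearing exactly once in the stream (delivery to Lambda is still at-least-once, so handlers should tolerate retries)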

Cons of DynamoDB streams:

  • Limited to DynamoDB as the source
  • Short data retention, no replay after 24 hours
  • Records are low-level DynamoDB events

DynamoDB streams are a good choice when you primarily need to react to changes in a DynamoDB table. Common use cases include notifications, data aggregation, and updating denormalized copies of data.

Cost comparison

The cost of SNS, Kinesis and DynamoDB streams depends on your usage in terms of message volume, size, and velocity. Some high-level characteristics:

  • SNS has no upfront costs; you pay per million requests plus data transfer out
  • Kinesis has a per-shard hourly cost plus a PUT payload unit cost
  • DynamoDB streams add little cost of their own when read by Lambda triggers; the bill is driven mainly by the table's read/write throughput plus data transfer out

To get a rough idea, here are some monthly cost projections at different usage levels, assuming 1KB messages:

                     1 msg/sec    1,000 msg/sec
SNS                  $0.50        $43
Kinesis (1 shard)    $11          $32
DynamoDB streams     $0           $1000+
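To rerun this kind of projection for your own workload, the arithmetic can be sketched as below. The function names are hypothetical and every price is a parameter: no AWS rates are hard-coded, so look up current pricing for your region before trusting any result.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month
HOURS_PER_MONTH = 30 * 24


def monthly_messages(rate_per_second):
    """Number of messages in a month at a constant rate."""
    return rate_per_second * SECONDS_PER_MONTH


def per_request_cost(messages, price_per_million):
    """Pay-per-request model (the shape of SNS pricing)."""
    return messages / 1_000_000 * price_per_million


def per_shard_cost(shards, shard_hour_price, messages, price_per_million_units,
                   hours=HOURS_PER_MONTH):
    """Provisioned model (the shape of Kinesis pricing): fixed shard-hours
    plus a charge per million payload units. Assumes one unit per message."""
    fixed = shards * shard_hour_price * hours
    variable = messages / 1_000_000 * price_per_million_units
    return fixed + variable
```

At 1 msg/sec, monthly_messages returns 2,592,000 messages; at 1,000 msg/sec, about 2.6 billion. Plugging in current rates shows where the fixed shard cost is overtaken by per-request charges.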

Keep in mind these are simplified estimates, and AWS pricing changes over time, so check the current price lists before relying on them. In reality, messages are likely to be larger and traffic may be bursty. You'll need to provision some extra capacity for Kinesis and DynamoDB streams to handle peaks.

The key takeaway is that while Kinesis has a higher upfront cost due to per-shard pricing, the per-message cost is lower than SNS at high volumes. DynamoDB streams costs can vary widely depending on table throughput.

Using Lambda as a message broker

Beyond SNS, Kinesis and DynamoDB streams, you can use Lambda functions themselves as brokers to propagate messages between services. The "lambda-fanout" project from AWS Labs demonstrates this pattern.

With lambda-fanout, a Lambda function consumes events from a source like Kinesis and forwards them to destinations like SQS queues or other Lambda functions. This allows propagating messages across accounts and regions where direct integration may not be possible.
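The core of the broker pattern can be sketched in a few lines. This is not the lambda-fanout project's code, only an illustration of the idea: the function fan_out and the destination callables are hypothetical, and in a real function each callable would wrap a call to the target service.

```python
def fan_out(records, destinations):
    """Forward every record to every destination.

    `destinations` maps a name to a callable that sends one record.
    Failures are collected rather than raised, so one unavailable target
    does not block delivery to the others.
    """
    failures = []
    for record in records:
        for name, send in destinations.items():
            try:
                send(record)
            except Exception as exc:
                failures.append((name, record, str(exc)))
    return failures
```

How to handle the returned failures (retry, dead-letter, or fail the batch) is exactly the extra error-handling burden this pattern introduces.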

While a valid approach for some niche use cases, using Lambda as a broker adds complexity and cost. You need to factor in additional messaging latency, error handling, and the operational burden of deploying and monitoring the fanout function. Generally it's best to use Lambda as a broker only when native integration is not possible.

Choosing the right event source

With the different options available, how do you choose the right event source for your pub/sub use case? Here are some of the key factors to consider:

  • Scalability and throughput needs
  • Ordering and exactly-once processing requirements
  • Message size
  • Latency sensitivity
  • Durability and retention
  • Ease of use and management
  • Cost at your usage level

Here are some common scenarios and the event source that might fit best:

Scenario                                     Event source        Rationale
Sending notifications to many subscribers    SNS                 Simple pub/sub model, supports multiple subscriber types
Real-time stream processing                  Kinesis             High throughput, supports ordering, long retention
Triggering on DynamoDB changes               DynamoDB streams    Tightly integrated with table, ordered change records per item
Routing messages across regions              Lambda as broker    Enables forwarding where direct integration is not possible

In reality, many systems will utilize multiple event sources. You might use SNS for notification events, Kinesis for streaming data, and DynamoDB streams for database triggers. The key is to understand the tradeoffs and choose the right tool for each job.

Best practices for Lambda-based pub/sub

Whichever event source you use, there are some best practices to keep in mind for building reliable and scalable pub/sub systems with Lambda:

  • Manage Lambda permissions with IAM roles and resource policies
  • Control concurrency with reserved concurrency limits
  • Avoid long-running tasks that may exceed Lambda timeouts
  • Use dead-letter queues for messages that can't be processed
  • Monitor performance with Lambda metrics and logging
  • Load test to validate scalability and throughput
  • Deploy and update functions with CI/CD pipelines
  • Consider VPC and security configurations for private resources
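As one example of guarding against timeouts, a handler can check the remaining execution time and stop before Lambda terminates it. The sketch below uses the context object's get_remaining_time_in_millis method. The record "id" field, the 5-second safety margin, and the FakeContext class used for local testing are all assumptions.

```python
def handler(event, context, safety_margin_ms=5_000):
    """Process records until the remaining time drops below the safety margin."""
    records = event["Records"]
    processed = []
    for record in records:
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            break  # stop cleanly instead of being cut off mid-record
        processed.append(record["id"])  # placeholder for the real work
    deferred = [record["id"] for record in records[len(processed):]]
    return {"processed": processed, "deferred": deferred}


class FakeContext:
    """Stand-in for the Lambda context that replays fixed time readings."""

    def __init__(self, readings_ms):
        self._readings = iter(readings_ms)

    def get_remaining_time_in_millis(self):
        return next(self._readings)
```

What to do with the deferred records depends on the event source: for a stream they can be reported as failed so they are retried, while for SNS they would need to be re-published or queued.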

Conclusion

AWS provides a spectrum of options for implementing pub/sub messaging with Lambda, each with its own strengths and tradeoffs. SNS is a simple and versatile managed pub/sub service, Kinesis enables high-throughput streaming use cases, and DynamoDB streams unlock database triggers. In some cases, you can even use Lambda as a message broker.

The right choice depends on your requirements for scalability, ordering, durability, and cost. By understanding the capabilities and tradeoffs of each option, you can design an event-driven architecture that is scalable, flexible, and fits your use case. As always with serverless, make sure to follow Lambda best practices around permissions, concurrency, monitoring, and testing for maximum reliability.

Pub/sub messaging is a powerful pattern for building loosely coupled, event-driven systems. By leveraging the managed event sources for Lambda, you can implement pub/sub in a scalable and resilient way without having to manage the underlying messaging infrastructure. Choose the right event source, follow best practices, and let Lambda do the heavy lifting of processing your messages.
