What are DynamoDB Streams? How do they work?
A DynamoDB Stream can be described as a stream of observed changes to your data. Once enabled, whenever you perform a write operation on the DynamoDB table, such as a put, update, or delete, a corresponding event containing information like which record was changed and what was changed will be saved to the Stream.
Characteristics of DynamoDB Stream
- events are stored for up to 24 hours
- ordered, the sequence of events in the stream reflects the actual sequence of operations in the table
- near-real time, events are available in the stream within less than a second of the write operation
- deduplicated, each modification corresponds to exactly one record within the stream
Anatomy of DynamoDB Stream
A Stream consists of Shards. Each Shard is a group of Records, where each record corresponds to a single data modification in the table related to that stream.
Shards are created and deleted automatically by AWS. A shard can also split into multiple shards, and this, too, happens without any action on our part.
Moreover, when creating a stream you have a few options for which data should be pushed to the stream. Options include:
- OLD_IMAGE: stream records will contain the item as it appeared before it was modified
- NEW_IMAGE: stream records will contain the item as it appears after it was modified
- NEW_AND_OLD_IMAGES: stream records will contain both pre- and post-change snapshots
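As an illustration, a stream record delivered with NEW_AND_OLD_IMAGES looks roughly like the sketch below. The field values here are hypothetical; the shape follows the DynamoDB Streams record format, where attribute values are wrapped in type descriptors such as "S" (string) and "N" (number).

```python
# A sketch of a single stream record as a consumer receives it
# (e.g. one entry of a Lambda event's "Records" list).
record = {
    "eventName": "MODIFY",  # INSERT, MODIFY, or REMOVE
    "dynamodb": {
        "Keys": {"id": {"S": "user-123"}},
        # Present with OLD_IMAGE or NEW_AND_OLD_IMAGES:
        "OldImage": {"id": {"S": "user-123"}, "score": {"N": "10"}},
        # Present with NEW_IMAGE or NEW_AND_OLD_IMAGES:
        "NewImage": {"id": {"S": "user-123"}, "score": {"N": "15"}},
        "StreamViewType": "NEW_AND_OLD_IMAGES",
    },
}

# With both images available, a consumer can diff the two snapshots:
old_score = int(record["dynamodb"]["OldImage"]["score"]["N"])
new_score = int(record["dynamodb"]["NewImage"]["score"]["N"])
print(new_score - old_score)  # → 5
```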
DynamoDB Lambda Trigger
DynamoDB Streams work particularly well with AWS Lambda. Lambda scales to the amount of data pushed through the stream, and functions are invoked only when there are records to process.
In the Serverless Framework, to subscribe your Lambda function to a DynamoDB stream, you can use the following syntax:
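A minimal sketch of such a subscription in serverless.yml is shown below. The function name, handler path, and the `MyTable` resource reference are placeholders for your own definitions.

```yaml
functions:
  processStream:
    handler: handler.process
    events:
      - stream:
          type: dynamodb
          arn:
            Fn::GetAtt: [MyTable, StreamArn]
          batchSize: 100
          startingPosition: LATEST
```

Here `batchSize` controls how many stream records are delivered per invocation, and `startingPosition: LATEST` makes the function process only records written after the subscription is created.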
DynamoDB Streams Example Use Cases
DynamoDB Streams are great if you want to decouple your application's core business logic from effects that should happen afterward. Your base code can stay minimal while you "plug in" more Lambda functions reacting to changes as your software evolves. This enables not only separation of concerns but also better security and reduces the impact of possible bugs.
Data replication
Even though cross-region data replication can be solved with DynamoDB Global Tables, you may still want to replicate your data to a DynamoDB table in the same region, or push it to RDS or S3. DynamoDB Streams are perfect for that.
Middlewares
DynamoDB Streams are also useful for writing "middlewares". You can easily decouple business logic from asynchronous validation or side-effects. One example of such a case is content moderation. Once a message or image is added to a table, the DynamoDB Stream passes that record to a Lambda function, which validates it against AWS Artificial Intelligence services such as Amazon Rekognition or Amazon Comprehend.
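A moderation consumer for such a stream might be sketched as below. The `looks_offensive` helper is a hypothetical placeholder standing in for a call to a service like Amazon Comprehend or Rekognition, and the sample event shape is illustrative.

```python
def looks_offensive(text):
    # Placeholder check; a real implementation would call an AWS AI
    # service (e.g. Comprehend sentiment/toxicity analysis) instead.
    banned = {"spam", "scam"}
    return any(word in text.lower() for word in banned)

def handler(event, context):
    """Invoked by the DynamoDB stream; flags newly inserted messages."""
    flagged = []
    for record in event["Records"]:
        if record["eventName"] != "INSERT":
            continue
        new_image = record["dynamodb"].get("NewImage", {})
        text = new_image.get("message", {}).get("S", "")
        if looks_offensive(text):
            flagged.append(record["dynamodb"]["Keys"]["id"]["S"])
    return {"flagged": flagged}

# Hypothetical stream event with one newly inserted message:
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"id": {"S": "msg-1"}},
                "NewImage": {"id": {"S": "msg-1"},
                             "message": {"S": "This is a scam offer"}},
            },
        }
    ]
}
print(handler(sample_event, None))  # → {'flagged': ['msg-1']}
```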
Search indexing
Sometimes the data must also be replicated to other sources, like Elasticsearch, where it can be indexed to make it searchable. DynamoDB Streams allow that too.
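Replication and indexing consumers typically first convert the stream's DynamoDB-typed attributes into plain values. A minimal sketch of that conversion is below; it handles only the S, N, and BOOL type descriptors, whereas boto3's `TypeDeserializer` covers the full set.

```python
def deserialize(image):
    """Convert a DynamoDB-typed image into a plain dict (S/N/BOOL only)."""
    out = {}
    for name, typed in image.items():
        if "S" in typed:
            out[name] = typed["S"]
        elif "N" in typed:
            n = typed["N"]  # DynamoDB transports numbers as strings
            out[name] = float(n) if "." in n else int(n)
        elif "BOOL" in typed:
            out[name] = typed["BOOL"]
    return out

# Hypothetical NewImage from a stream record:
new_image = {"id": {"S": "doc-1"}, "views": {"N": "42"}, "public": {"BOOL": True}}
doc = deserialize(new_image)
print(doc)  # → {'id': 'doc-1', 'views': 42, 'public': True}
# `doc` is now ready to be sent to an Elasticsearch index, an RDS row,
# or an S3 object by the rest of the consumer.
```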
Notifications and sending e-mails
Similarly to the previous example, once a message is saved to the DynamoDB table, a Lambda function subscribed to that stream invokes Amazon Pinpoint or SES to notify recipients about it.
Frequently Asked Questions
Do DynamoDB streams have limits?
Yes. Head to the DynamoDB Limits page for details.
How much do DynamoDB streams cost?
DynamoDB Streams are billed on a "Read Request Units" basis. To learn more about them, head to our DynamoDB Pricing calculator.
How can I view DynamoDB stream metrics?
DynamoDB Stream metrics can be viewed in two places:
- In the AWS DynamoDB Console, in the Metrics tab
- Using Amazon CloudWatch
© 2020 Dynobase