DynamoDB Batch Write, Update & Delete [How-To Guide]
Written by Lakindu Hewawasam
Published on April 17th, 2022
Most modern applications are data-driven. As a result, a vast amount of data is persisted, read, and updated in every application. For users building applications on AWS, DynamoDB is a popular choice because it scales when needed without affecting performance.
However, when developers do not follow DynamoDB's recommended access patterns, they add latency that ultimately slows down the system. A common problem is performing an individual operation for each data item rather than grouping the operations, which adds a network round trip for every item.
To address this, DynamoDB provides batch operations that perform write, delete, and read requests for a group of items across multiple tables in parallel. This significantly boosts performance when dealing with large volumes of data and helps you stay within single-digit millisecond latency.
Batch Operations in DynamoDB
DynamoDB Batch Write
What is DynamoDB Batch Write?
A bulk (batch) write in DynamoDB allows you to write multiple items into multiple tables in a single API call. It uses the BatchWriteItem operation to group multiple write requests into one API call, which reduces the number of network calls and, in turn, improves application performance and reduces latency. However, it is essential to note that DynamoDB does not allow you to use condition expressions (such as "attribute_not_exists()") on items when executing bulk writes.
How to perform a Batch Write in DynamoDB?
You can use the batchWrite method available in the Document Client to perform a batch write. For example, consider the snippet shown below.
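A minimal sketch of such a call, assuming the AWS SDK for JavaScript v2 Document Client. The table name `users`, the `id` key, and the item attributes are hypothetical and chosen only for illustration.

```javascript
// Hypothetical table name, used only for illustration.
const TABLE_NAME = "users";

const params = {
  RequestItems: {
    // Each table name maps to an array of write requests.
    [TABLE_NAME]: [
      { PutRequest: { Item: { id: "user#1", name: "Alice" } } },
      { PutRequest: { Item: { id: "user#2", name: "Bob" } } },
      { PutRequest: { Item: { id: "user#3", name: "Carol" } } },
    ],
  },
};

// `documentClient` is assumed to be an AWS.DynamoDB.DocumentClient instance (SDK v2).
// All three puts travel in a single BatchWriteItem call.
async function writeBatch(documentClient) {
  const result = await documentClient.batchWrite(params).promise();
  // Anything DynamoDB could not write is returned here so it can be retried.
  return result.UnprocessedItems;
}
```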
The snippet above uses the batch write method of the Document Client. The Document Client invokes the BatchWriteItem operation and writes the three data items to a single table in one batch operation.
If you wish to write multiple data items to multiple tables, you can add more tables to RequestItems and specify Put operations for the data items of each table. This is shown below.
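A sketch of a multi-table request, again assuming the SDK v2 Document Client. The table names `users` and `orders` and their attributes are hypothetical.

```javascript
// Hypothetical tables and items: one put request per table.
const multiTableParams = {
  RequestItems: {
    users: [{ PutRequest: { Item: { id: "user#1", name: "Alice" } } }],
    orders: [{ PutRequest: { Item: { id: "order#1", total: 42 } } }],
  },
};

// `documentClient` is assumed to be an AWS.DynamoDB.DocumentClient instance (SDK v2).
async function writeToTwoTables(documentClient) {
  return documentClient.batchWrite(multiTableParams).promise();
}
```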
The snippet shows two items in a batch for two different tables.
It is important to note that you cannot have duplicate items (items with the same keys) in your Request.
Performance Evaluation of DynamoDB Batch Write
A bulk write is faster than individual put requests. This is because PutItem makes one API call per item, whereas a bulk write bundles multiple put requests into a single API call. As a result, it improves speed and decreases latency.
Additionally, a single bulk write request can contain up to 25 items of up to 400KB each, with a maximum total request size of 16MB (whichever limit is reached first).
If the request exceeds the table's provisioned throughput, some or all of the writes get throttled, which adds latency because the affected items have to be retried.
BatchWrite is highly effective for writing extensive records across multiple tables with higher throughputs.
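Because of the 25-item limit, larger sets of items have to be split across several requests. The sketch below shows one way to do that; the helper names `chunk` and `toPutBatches` are hypothetical, and the code does not check the 16MB request size limit.

```javascript
const MAX_BATCH_SIZE = 25; // BatchWriteItem item limit per call

// Splits an array into consecutive groups of at most `size` elements.
function chunk(items, size = MAX_BATCH_SIZE) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Builds one batchWrite params object per group of up to 25 items.
function toPutBatches(tableName, items) {
  return chunk(items).map((group) => ({
    RequestItems: {
      [tableName]: group.map((Item) => ({ PutRequest: { Item } })),
    },
  }));
}

// Usage: 60 hypothetical items become three requests.
const items = Array.from({ length: 60 }, (_, i) => ({ id: `user#${i}` }));
const batches = toPutBatches("users", items);
console.log(batches.map((b) => b.RequestItems.users.length)); // [ 25, 25, 10 ]
```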
DynamoDB Batch Write Best Practices
The batchWrite method returns any items it could not process in the UnprocessedItems field of the response. To ensure that all the items get inserted, AWS recommends you perform bulk writes in a loop, retrying the unprocessed items with an exponential backoff algorithm.
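One possible shape for such a loop is sketched below, assuming the SDK v2 Document Client. The function names, the base delay, the cap, and the attempt limit are illustrative assumptions rather than AWS-prescribed values.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Delay doubles with each attempt and is capped at maxDelayMs.
function backoffDelay(attempt, baseDelayMs = 50, maxDelayMs = 5000) {
  return Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
}

// `documentClient` is assumed to be an AWS.DynamoDB.DocumentClient instance (SDK v2).
// `requestItems` has the same shape as the RequestItems field of batchWrite.
async function batchWriteWithRetry(documentClient, requestItems, maxAttempts = 8) {
  let pending = requestItems;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const { UnprocessedItems = {} } = await documentClient
      .batchWrite({ RequestItems: pending })
      .promise();
    // An empty map means every request in the batch was applied.
    if (Object.keys(UnprocessedItems).length === 0) return;
    // UnprocessedItems has the same shape as RequestItems, so it can be resent as-is.
    pending = UnprocessedItems;
    await sleep(backoffDelay(attempt));
  }
  throw new Error("Unprocessed items remain after all retry attempts");
}
```

Adding random jitter to the delay is a common refinement when many clients retry at the same time.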
DynamoDB Batch Delete
What is DynamoDB Batch Delete?
A bulk (batch) delete in DynamoDB allows you to delete multiple items from multiple tables using one SDK call. Grouping these requests into one SDK call boosts speed and application performance. But it comes at a price: DynamoDB does not allow you to specify condition expressions for the delete requests in a bulk delete.
How to perform a Batch Delete in DynamoDB?
The bulk delete uses the same BatchWriteItem operation as bulk write, but instead of specifying Put Requests, we specify Delete Requests. For example, consider the snippet below.
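A minimal sketch of a bulk delete on two tables, assuming the SDK v2 Document Client. The table names and the `id` partition key are hypothetical.

```javascript
// Hypothetical tables, each assumed to use `id` as its partition key.
const deleteParams = {
  RequestItems: {
    users: [
      { DeleteRequest: { Key: { id: "user#1" } } },
      { DeleteRequest: { Key: { id: "user#2" } } },
    ],
    orders: [{ DeleteRequest: { Key: { id: "order#1" } } }],
  },
};

// `documentClient` is assumed to be an AWS.DynamoDB.DocumentClient instance (SDK v2).
async function deleteBatch(documentClient) {
  const result = await documentClient.batchWrite(deleteParams).promise();
  // Deletes that were not applied are returned for a retry.
  return result.UnprocessedItems;
}
```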
The snippet above shows a bulk delete occurring on two tables. RequestItems is a map keyed by table name, where each value is an array of requests that configures the bulk operations for that table.
Performance Evaluation of DynamoDB Batch Delete
Since bulk delete uses the same BatchWriteItem operation, it has similar performance characteristics to bulk writes. For example, a bulk delete supports a maximum of 25 items per request (400KB per item) or a maximum request size of 16MB (whichever limit is reached first). If the request exceeds these thresholds, DynamoDB rejects it.
Within these limits, a bulk delete is faster than deleting the same group of items individually.
DynamoDB Batch Delete Best Practices
Because bulk deletes call the same BatchWriteItem operation, you should perform them iteratively and retry any unprocessed items using an exponential backoff algorithm.
DynamoDB Batch Get
What is DynamoDB Batch Get?
In addition to bulk writes and bulk deletes, DynamoDB allows you to fetch bulk data from multiple tables via a single API call. This makes it much faster to read items from several tables than performing individual requests for each table.
DynamoDB uses the BatchGetItem operation to perform a bulk get. In addition, it uses "eventually consistent reads" by default to provide faster response times.
How to perform a Batch Get in DynamoDB?
It is easy to perform bulk gets in DynamoDB. You use the batchGet method of the Document Client and pass in the Request Items to fetch. This is shown below.
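A minimal sketch of a bulk get from two tables, assuming the SDK v2 Document Client. The table names and the `id` partition key are hypothetical.

```javascript
// Hypothetical tables, each assumed to use `id` as its partition key.
const getParams = {
  RequestItems: {
    users: { Keys: [{ id: "user#1" }, { id: "user#2" }] },
    orders: { Keys: [{ id: "order#1" }] },
  },
};

// `documentClient` is assumed to be an AWS.DynamoDB.DocumentClient instance (SDK v2).
async function getBatch(documentClient) {
  const result = await documentClient.batchGet(getParams).promise();
  // Responses maps each table name to the array of items that were found.
  return result.Responses;
}
```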
The snippet above shows groups of data being fetched from two tables. Each table accepts a parameter named Keys. It lists the primary key of every item to fetch: the partition key alone, or the partition key and sort key for tables with a composite primary key.
It is important to note that DynamoDB rejects the bulk get operation if you've specified duplicate keys to be fetched in a single table.
Performance Evaluation of DynamoDB Batch Get
A DynamoDB "bulk get" can fetch up to 100 items per request, with a maximum response size of 16MB. If the table's provisioned throughput is exceeded while fetching the data, DynamoDB does not fail the whole request as long as at least one item could be read. Instead, it returns a partial result and reports the rest as unprocessed keys.
You can then request the unprocessed keys iteratively to fetch the remaining data.
Therefore, a bulk get performs better than individual queries for multiple tables.
DynamoDB Batch Get Best Practices
A "bulk get" is faster than performing individual queries per table, and you can improve it further by optimizing the operation. When you perform a "bulk get," make sure to:
- Use AttributesToGet: After specifying the Keys to fetch, you can use the AttributesToGet parameter to fetch only the required attributes.
- Use Eventually Consistent Reads: You may choose to use this based on your use case. "Eventually Consistent Reads" might not reflect the latest change (only if the change was made a few milliseconds ago), but it can always be brought up to date by re-fetching. However, it results in faster fetch times from the tables, reducing latency, thus improving performance.
- Execute the operation in a loop: AWS recommends iteratively executing batch operations with exponential backoff to complete the batch operations, ensuring that no unprocessed data remains.
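The three practices above can be combined as in the following sketch, which assumes the SDK v2 Document Client. The helper names, table name, attribute names, delay values, and attempt limit are all hypothetical.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Appends the items of one batchGet response to the accumulated results.
function mergeResponses(target, responses = {}) {
  for (const [table, items] of Object.entries(responses)) {
    target[table] = (target[table] || []).concat(items);
  }
  return target;
}

// `documentClient` is assumed to be an AWS.DynamoDB.DocumentClient instance (SDK v2).
async function batchGetAll(documentClient, requestItems, maxAttempts = 8) {
  const results = {};
  let pending = requestItems;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const { Responses, UnprocessedKeys = {} } = await documentClient
      .batchGet({ RequestItems: pending })
      .promise();
    mergeResponses(results, Responses);
    if (Object.keys(UnprocessedKeys).length === 0) return results;
    // UnprocessedKeys has the same shape as RequestItems, so it can be resent as-is.
    pending = UnprocessedKeys;
    await sleep(Math.min(5000, 50 * 2 ** attempt)); // exponential backoff
  }
  throw new Error("Unprocessed keys remain after all retry attempts");
}

// Hypothetical request: fetch only two attributes, with eventually consistent reads.
const requestItems = {
  users: {
    Keys: [{ id: "user#1" }, { id: "user#2" }],
    AttributesToGet: ["id", "name"],
    ConsistentRead: false,
  },
};
```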
DynamoDB Batch Update
What is DynamoDB Batch Update?
A bulk (batch) update refers to updating multiple rows belonging to a single table. However, DynamoDB does not provide a native batch operation for updates.
Still, there is a way to perform bulk updates in DynamoDB.
How to perform a Batch Update in DynamoDB?
Performing a bulk update in DynamoDB is a two-part process.
- Fetch the items that you wish to update.
- Perform an Update operation on each item.
For example, we will update all the rows that have a hash key starting with "user" by introducing a new attribute named "visibility" with the value "private".
Let us look at how we can perform this bulk update in DynamoDB.
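A sketch of this two-step approach, assuming the SDK v2 Document Client and a table whose primary key is a single partition key named `id`. The function names are hypothetical.

```javascript
// Builds the params for setting visibility = "private" on one item.
function buildUpdateParams(tableName, key) {
  return {
    TableName: tableName,
    Key: key,
    UpdateExpression: "SET #visibility = :visibility",
    // Name placeholders avoid any clash with DynamoDB reserved words.
    ExpressionAttributeNames: { "#visibility": "visibility" },
    ExpressionAttributeValues: { ":visibility": "private" },
  };
}

// `documentClient` is assumed to be an AWS.DynamoDB.DocumentClient instance (SDK v2).
async function bulkUpdateVisibility(documentClient, tableName) {
  let lastKey;
  do {
    const scanParams = {
      TableName: tableName,
      FilterExpression: "begins_with(#id, :prefix)",
      ExpressionAttributeNames: { "#id": "id" },
      ExpressionAttributeValues: { ":prefix": "user" },
    };
    // Continue from the previous page when the scan is paginated.
    if (lastKey) scanParams.ExclusiveStartKey = lastKey;
    const page = await documentClient.scan(scanParams).promise();

    // Step 2: update each matching item, one request per item.
    for (const item of page.Items) {
      await documentClient
        .update(buildUpdateParams(tableName, { id: item.id }))
        .promise();
    }
    lastKey = page.LastEvaluatedKey;
  } while (lastKey);
}
```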
The snippet above shows a bulk update executed with DynamoDB. Initially, the table gets scanned to identify the items having "id" starting with "user." Afterward, each item is updated individually in a loop with private visibility.
This shows the implementation of bulk updates using DynamoDB.
Performance Evaluation of DynamoDB Batch Update
We've used an iterative approach to perform bulk updates in DynamoDB. Unfortunately, it comes with its performance pitfalls.
First, you retrieve the data and then update it. When retrieving the data, using a Scan rather than a Query will increase latency.
Secondly, all the items are updated one after another, meaning that each update makes a separate network call, thus adding latency to the overall operation.
Lastly, if your bulk update exceeds the table's throughput capacity (by default, a table can be allocated up to 40,000 WCU, depending on the Region), your requests get throttled and slow the bulk update down.
DynamoDB Batch Update Best Practices
To improve the performance of bulk updates, consider using these best practices.
- Query a GSI: When fetching items to update, obtain the data by querying a GSI with only the required attributes.
- Use a Transaction: DynamoDB provides the TransactWriteItems operation to group update requests into a single all-or-nothing transaction. You still need the keys of the items to update, but the updates are sent in one request rather than one network call per item, thus improving performance. To find out its implementation, refer to the guides provided by Dynobase.
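As a rough illustration of the transaction approach, the sketch below groups several updates into one transactWrite call of the SDK v2 Document Client. The function names and the `id` key are hypothetical, and DynamoDB limits how many items a single transaction can contain, so larger sets still need to be split.

```javascript
// Builds transactWrite params that set visibility = "private" on every key.
function buildTransactParams(tableName, keys) {
  return {
    TransactItems: keys.map((Key) => ({
      Update: {
        TableName: tableName,
        Key,
        UpdateExpression: "SET #visibility = :visibility",
        ExpressionAttributeNames: { "#visibility": "visibility" },
        ExpressionAttributeValues: { ":visibility": "private" },
      },
    })),
  };
}

// `documentClient` is assumed to be an AWS.DynamoDB.DocumentClient instance (SDK v2).
// Either every update in the transaction is applied, or none of them is.
async function updateInTransaction(documentClient, tableName, keys) {
  return documentClient
    .transactWrite(buildTransactParams(tableName, keys))
    .promise();
}
```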
But there's one more thing.
It's important to note that bulk operations in DynamoDB do not behave as transactions. This means that when you perform a batch write and encounter an error, DynamoDB will not roll back the items that were already written.
Even so, bulk operations reduce execution time and improve performance.
Using batch operations in your application helps improve performance and decrease latency, which helps reduce costs and serve customers better.
Thank you for reading.
How many writes can DynamoDB handle?
DynamoDB can handle up to 1,000 WCU (Write Capacity Units) per second on a single partition, and a table has a default throughput quota of 40,000 WCU (depending on the Region). Since one WCU covers one write per second for an item of up to 1KB, this amounts to roughly 1,000 writes per second (about 1MB) on a partition and up to 40,000 writes per second (about 40MB) on the table.