1. Use AWS.DynamoDB.DocumentClient instead of AWS.DynamoDB
When writing JS/TS code for DynamoDB, use AWS.DynamoDB.DocumentClient instead of the low-level AWS.DynamoDB client. It’s much simpler to use and converts responses to native JS types.
2. Store Blobs in S3 instead of DynamoDB
While DynamoDB allows storing Blobs inside tables (e.g. pictures), it's a much better practice to upload actual assets to S3 and store only links to them inside the database.
3. Use Promises instead of Callbacks
Instead of passing callbacks to SDK calls, end them with .promise(). This makes the library return a Promise which you can .then() or await, and your code becomes much more elegant.
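A minimal sketch of the promise style, assuming the AWS SDK for JavaScript v2, where request objects expose .promise(). To stay runnable without the SDK installed, the helper accepts any object with a matching get(...).promise() shape, which a DocumentClient instance is intended to satisfy. The names GetClient and fetchUser, the table name Users, and the key attribute id are hypothetical.

```typescript
// Minimal structural type for the part of the client this sketch uses.
// Assumption: SDK v2 style, where get(params) returns a request with .promise().
interface GetClient {
  get(params: { TableName: string; Key: Record<string, unknown> }): {
    promise(): Promise<{ Item?: Record<string, unknown> }>;
  };
}

// Hypothetical helper: reads one item and returns it, or undefined if it is missing.
async function fetchUser(
  client: GetClient,
  id: string
): Promise<Record<string, unknown> | undefined> {
  // No callback: await the promise returned by .promise().
  const result = await client.get({ TableName: "Users", Key: { id } }).promise();
  return result.Item;
}
```

Because the client is passed in, the same function can be exercised in tests with a small fake object instead of a real table.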
4. Use BatchGetItem API for querying multiple tables
Use the BatchGetItem API to fetch up to 100 items, identified by primary key, from multiple DynamoDB tables in a single call instead of wrapping a series of calls in Promise.all. It’s much faster, but a single response can only return up to 16MB of data; keys that did not fit are returned as UnprocessedKeys and should be retried.
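As an illustration, the request for a multi-table batch read could be assembled like this. It is a sketch that only builds the parameter object; buildBatchGet and the table names in the usage note are hypothetical, and the 100-key guard mirrors the limit mentioned in this tip.

```typescript
type Key = Record<string, string | number>;

interface BatchGetParams {
  RequestItems: Record<string, { Keys: Key[] }>;
}

const MAX_KEYS_PER_BATCH_GET = 100;

// Hypothetical helper: builds one BatchGetItem-style request spanning several tables.
function buildBatchGet(keysByTable: Record<string, Key[]>): BatchGetParams {
  const tables = Object.keys(keysByTable);
  const total = tables.reduce((n, table) => n + keysByTable[table].length, 0);
  if (total > MAX_KEYS_PER_BATCH_GET) {
    throw new Error(`At most ${MAX_KEYS_PER_BATCH_GET} keys per request, got ${total}`);
  }
  const RequestItems: BatchGetParams["RequestItems"] = {};
  for (const table of tables) {
    // Skip tables with no keys so the request contains no empty key lists.
    if (keysByTable[table].length > 0) {
      RequestItems[table] = { Keys: keysByTable[table] };
    }
  }
  return { RequestItems };
}
```

For example, buildBatchGet({ Users: [{ id: "u1" }], Orders: [{ id: "o1" }] }) produces one request covering both tables, which could then be passed to the DocumentClient's batchGet method.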
5. Use ‘AttributesToGet’ to make API responses faster
To make your API responses faster, add the AttributesToGet parameter (or its newer equivalent, ProjectionExpression) to get calls. This returns less data from the DynamoDB table and potentially reduces the overhead of data transport and un/marshalling.
6. Use 'convertEmptyValues: true' in DocumentClient
In the JS SDK, pass convertEmptyValues: true to the DocumentClient constructor to prevent silly validation exceptions: empty values (zero-length strings, sets, and binary buffers) are automatically converted to NULL types when persisting to DynamoDB.
7. Use DynamoDB.batchWriteItem for batch writes
Use DynamoDB.batchWriteItem calls to write up to 16MB of data or do up to 25 writes to multiple tables with a single API call. This reduces the overhead of establishing HTTP connections.
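Since a single call accepts at most 25 writes, larger sets of items have to be split first. The sketch below shows one way to do that; chunk and buildBatchWrites are hypothetical helper names, and the output uses the PutRequest shape of the batch write API.

```typescript
const MAX_WRITES_PER_BATCH = 25;

// Splits an array into consecutive groups of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Hypothetical helper: one batch-write request per group of up to 25 puts.
function buildBatchWrites(tableName: string, items: Record<string, unknown>[]) {
  return chunk(items, MAX_WRITES_PER_BATCH).map((group) => ({
    RequestItems: {
      [tableName]: group.map((Item) => ({ PutRequest: { Item } })),
    },
  }));
}
```

Each resulting object could be sent with the DocumentClient's batchWrite method; any UnprocessedItems in the response should be retried.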
8. Use IAM policies for security and enforcing best practices
You can use IAM policies to enforce best practices and restrict your developers and services from doing expensive Scan operations on DynamoDB tables.
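One possible shape for such a policy, written as a TypeScript object so it can be serialized to JSON. This is a sketch: the statement ID, region, account ID, and table name are placeholders, while dynamodb:Scan is the IAM action that governs Scan calls.

```typescript
// Illustrative IAM policy document that denies Scan on a single table.
const denyScanPolicy = {
  Version: "2012-10-17",
  Statement: [
    {
      Sid: "DenyTableScans", // hypothetical statement ID
      Effect: "Deny",
      Action: ["dynamodb:Scan"],
      // Placeholder ARN: replace region, account ID, and table name.
      Resource: "arn:aws:dynamodb:us-east-1:123456789012:table/ExampleTable",
    },
  ],
};

// JSON form, ready to paste into a policy editor or an infrastructure template.
const denyScanPolicyJson = JSON.stringify(denyScanPolicy, null, 2);
```

An explicit Deny takes precedence over any Allow, so attaching this to a role blocks scans on that table even if a broader policy grants them.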
9. Use DynamoDB Streams for data post-processing
Once data gets saved to the table, a Lambda function subscribed to the stream can validate values, enrich information, or aggregate metrics. This decouples your core business logic from side effects.
10. Use 'FilterExpressions'
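Use FilterExpressions to refine and narrow your Query and Scan results on non-indexed fields. Keep in mind that filters are applied after the items are read, so they shrink the response but not the read capacity consumed.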
11. Use 'Parallel Scan' to scan through big datasets
If you need to scan through a big dataset fast, use "Parallel Scan", which divides your table into N segments and lets multiple processes scan separate parts concurrently.
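The segmentation is controlled by the Segment and TotalSegments parameters of Scan. A small sketch of how the per-worker parameters could be generated; buildSegmentScans is a hypothetical helper name.

```typescript
interface ScanSegmentParams {
  TableName: string;
  Segment: number; // zero-based index of this worker's segment
  TotalSegments: number; // N, the same value for every worker
}

// Hypothetical helper: one set of Scan parameters per segment.
function buildSegmentScans(tableName: string, totalSegments: number): ScanSegmentParams[] {
  if (!Number.isInteger(totalSegments) || totalSegments < 1) {
    throw new Error("totalSegments must be a positive integer");
  }
  const scans: ScanSegmentParams[] = [];
  for (let segment = 0; segment < totalSegments; segment++) {
    scans.push({ TableName: tableName, Segment: segment, TotalSegments: totalSegments });
  }
  return scans;
}
```

Each worker would run its own scan loop with one of these objects, paginating via LastEvaluatedKey and ExclusiveStartKey as with a regular Scan.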
12. Set 'AWS_NODEJS_CONNECTION_REUSE_ENABLED' to 1
To reduce the overhead of connecting to DynamoDB, make sure the environment variable AWS_NODEJS_CONNECTION_REUSE_ENABLED is set to 1 so that the SDK reuses connections by default.
13. Use 'TransactWriteItems' to update multiple records atomically
If you need to update multiple records atomically, use the TransactWriteItems function to write up to 100 items across tables in a single all-or-nothing request.
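To make this concrete, here is a sketch of a transaction that moves an amount between two items. The helper buildTransfer, the key attribute id, and the balance attribute are hypothetical; the structure (TransactItems containing Update actions) follows the transactional write API, and native JS values are used as the DocumentClient expects.

```typescript
// Hypothetical helper: parameters for an atomic transfer between two items.
function buildTransfer(table: string, fromId: string, toId: string, amount: number) {
  if (fromId === toId) {
    // A transaction cannot contain two operations on the same item.
    throw new Error("fromId and toId must differ");
  }
  return {
    TransactItems: [
      {
        Update: {
          TableName: table,
          Key: { id: fromId },
          UpdateExpression: "SET #bal = #bal - :amt",
          // The whole transaction fails if the sender cannot cover the amount.
          ConditionExpression: "#bal >= :amt",
          ExpressionAttributeNames: { "#bal": "balance" },
          ExpressionAttributeValues: { ":amt": amount },
        },
      },
      {
        Update: {
          TableName: table,
          Key: { id: toId },
          UpdateExpression: "SET #bal = #bal + :amt",
          ExpressionAttributeNames: { "#bal": "balance" },
          ExpressionAttributeValues: { ":amt": amount },
        },
      },
    ],
  };
}
```

The result could be passed to the DocumentClient's transactWrite method; either both updates are applied or neither is.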
14. Use Contributor Insights from Day 1
Use Contributor Insights from day one to identify the most accessed items and the most throttled keys, which might cause performance problems.
15. Use VPC endpoints to make your connections more secure
Use VPC endpoints when using DynamoDB from a private subnet to make your connections more secure and remove the need for a public IP.
16. Always use 'ExpressionAttributeNames' in your queries
Because DynamoDB has over 500 reserved keywords, always use ExpressionAttributeNames to avoid errors like:
ValidationException - Invalid FilterExpression: Attribute name is a reserved keyword; reserved keyword: XXX
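One way to make this a habit is to never write attribute names into expressions by hand. The sketch below generates placeholders for a projection; buildProjection and the #a0, #a1 naming scheme are illustrative choices, not part of the SDK.

```typescript
// Hypothetical helper: replaces every attribute name with a "#aN" placeholder.
function buildProjection(attributes: string[]) {
  const names: Record<string, string> = {};
  const placeholders = attributes.map((attribute, index) => {
    const placeholder = `#a${index}`;
    names[placeholder] = attribute;
    return placeholder;
  });
  return {
    ProjectionExpression: placeholders.join(", "),
    ExpressionAttributeNames: names,
  };
}
```

For instance, buildProjection(["name", "status"]) yields the expression "#a0, #a1" with both reserved words mapped safely in ExpressionAttributeNames; the result can be spread into the parameters of a get or query call.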
17. Use caching for frequently accessed items
Invest in caching early. Even the smallest caches can slice your DynamoDB bill by up to 80%. It works especially well for frequently accessed items.
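Even a tiny in-process cache can be a starting point. The following is a sketch of a read-through cache with a time-to-live; TtlCache is a hypothetical class, the loader stands in for whatever DynamoDB read you would otherwise perform, and the clock is injectable so the expiry logic can be tested.

```typescript
// Minimal read-through cache: entries expire ttlMs after they were loaded.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value if still fresh; otherwise calls load() and stores the result.
  async getOrLoad(key: string, load: () => Promise<V>): Promise<V> {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) {
      return hit.value;
    }
    const value = await load();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}
```

This version lives in a single process and never evicts by size, so it suits a small set of hot items; a shared cache is a better fit once several instances serve the same data.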
18. Use On-Demand to identify traffic patterns
If you are unsure of your traffic patterns, use On-Demand capacity mode so that your DynamoDB table scales with the number of read and write requests.
19. For Billing, start with On-Demand, then switch to Provisioned
Use On-Demand capacity mode to identify your traffic patterns. Once discovered, switch to provisioned mode with autoscaling enabled to save money.
20. Use 'DynamoDB Global Tables' for latency crucial applications
If latency to the end-user is crucial for your application, use DynamoDB Global Tables, which automatically replicate data across multiple regions. This way your data is closer to the end-user. For the compute part, use Lambda@Edge functions.
21. Use 'createdAt' and 'updatedAt' attributes
Add createdAt and updatedAt attributes to each item. Moreover, instead of removing records from the table, simply add a deletedAt attribute. It will not only make your delete operations reversible but also enable some auditing.
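A sketch of helpers that apply these attributes consistently. The function names are hypothetical, and ISO 8601 strings are an assumed format for the timestamps; the helpers return new objects rather than mutating their input.

```typescript
type Item = Record<string, unknown>;

// Stamps a new item with identical createdAt and updatedAt values.
function stampCreate<T extends Item>(item: T, now: Date = new Date()) {
  const iso = now.toISOString();
  return { ...item, createdAt: iso, updatedAt: iso };
}

// Refreshes updatedAt while leaving createdAt untouched.
function stampUpdate<T extends Item>(item: T, now: Date = new Date()) {
  return { ...item, updatedAt: now.toISOString() };
}

// Soft delete: marks the item instead of removing it from the table.
function softDelete<T extends Item>(item: T, now: Date = new Date()) {
  const iso = now.toISOString();
  return { ...item, updatedAt: iso, deletedAt: iso };
}

// Read paths should treat items carrying deletedAt as absent.
function isDeleted(item: Item): boolean {
  return typeof item["deletedAt"] === "string";
}
```

Undoing a soft delete then amounts to removing the deletedAt attribute, which is what makes the operation reversible.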
22. Aggregate, a lot
Instead of periodically running expensive queries, e.g. for analytics purposes, use DynamoDB Streams connected to a Lambda function. It will update the result of an aggregation just in time whenever data changes.
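The core of such a function can be a small pure reducer over the stream records. The sketch below counts inserted items per value of a hypothetical status attribute; the types are a trimmed-down, locally declared approximation of the stream event, in which attribute values arrive in the low-level format (for example { S: "..." }).

```typescript
// Trimmed-down approximation of a DynamoDB Streams event as delivered to Lambda.
interface StreamRecord {
  eventName?: "INSERT" | "MODIFY" | "REMOVE";
  dynamodb?: { NewImage?: Record<string, { S?: string; N?: string }> };
}
interface StreamEvent {
  Records: StreamRecord[];
}

// Counts newly inserted items per value of the (hypothetical) "status" attribute.
function countInsertsByStatus(event: StreamEvent): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const record of event.Records) {
    if (record.eventName !== "INSERT") continue;
    const status = record.dynamodb?.NewImage?.status?.S;
    if (status === undefined) continue;
    counts[status] = (counts[status] ?? 0) + 1;
  }
  return counts;
}
```

A handler could then apply each count to an aggregate item with an ADD update expression, so the totals stay current without rescanning the table.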
23. Leverage write sharding
To avoid hot partitions and spread the load more evenly across them, make sure your partition keys have high cardinality. You can achieve that by adding a random number to the end of the partition key values.
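A sketch of the random-suffix approach. The helper names and the # separator are illustrative; the random source is injectable only so the behaviour can be tested deterministically.

```typescript
// Appends a random shard number in [0, shardCount) to the base partition key.
function shardedKey(
  baseKey: string,
  shardCount: number,
  random: () => number = Math.random
): string {
  const shard = Math.floor(random() * shardCount);
  return `${baseKey}#${shard}`;
}

// Lists every possible shard key, needed to read back all items for a base key.
function allShardKeys(baseKey: string, shardCount: number): string[] {
  const keys: string[] = [];
  for (let shard = 0; shard < shardCount; shard++) {
    keys.push(`${baseKey}#${shard}`);
  }
  return keys;
}
```

The trade-off is on the read side: fetching everything for one base key means querying each key from allShardKeys and merging the results.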
24. Large attributes compression
If storing large blobs outside of DynamoDB (e.g. in S3) is not an option because of increased latency, use a common compression algorithm like GZIP before saving them to DDB.
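In Node.js this can be done with the built-in zlib module. A minimal sketch, assuming the large attribute is text and is stored as a binary attribute; compressAttribute and decompressAttribute are hypothetical helper names.

```typescript
import { gzipSync, gunzipSync } from "zlib";

// Compresses a text attribute into a Buffer suitable for a binary attribute.
function compressAttribute(text: string): Buffer {
  return gzipSync(Buffer.from(text, "utf8"));
}

// Restores the original text from the stored binary value.
function decompressAttribute(data: Uint8Array): string {
  return gunzipSync(data).toString("utf8");
}
```

How much this saves depends on the data: repetitive text such as JSON or HTML tends to compress well, while already-compressed formats such as JPEG gain little.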
25. Date formats
Use epoch date format (in seconds) if you want to support the DynamoDB TTL feature and all the filters. If you don't need TTL, you can also use ISO 8601 format. It works for filters like "between" too.
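Both formats are easy to produce from a JavaScript Date. The sketch below uses hypothetical helper names; the ISO comparison relies on the fact that UTC timestamps in the fixed format produced by toISOString() sort lexicographically in chronological order.

```typescript
// Epoch time in whole seconds, the unit the TTL feature expects.
function toEpochSeconds(date: Date): number {
  return Math.floor(date.getTime() / 1000);
}

// Expiry timestamp a given number of days after `from`.
function ttlFrom(from: Date, daysToLive: number): number {
  return toEpochSeconds(from) + daysToLive * 24 * 60 * 60;
}

// String comparison mirrors what a "between" condition does on ISO 8601 values.
function isoBetween(value: string, start: string, end: string): boolean {
  return value >= start && value <= end;
}
```

Note that JavaScript's Date.now() returns milliseconds, so dividing by 1000 is the step that is easy to forget when writing a TTL attribute.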
26. Use DynamoDB Local
If you need to mock your DynamoDB table(s) for local development or tests on your machine, you can use Localstack or DynamoDB Local to run DynamoDB as a Docker container without connectivity to the cloud. It supports most of the DynamoDB APIs.
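Pointing the SDK at a local instance usually comes down to overriding the endpoint. A sketch, assuming DynamoDB Local is listening on its default port 8000; clientOptions is a hypothetical helper and the region value is arbitrary.

```typescript
interface ClientOptions {
  region: string;
  endpoint?: string;
}

// Hypothetical helper: client options for either the local container or the real service.
function clientOptions(useLocal: boolean): ClientOptions {
  if (useLocal) {
    // Assumption: DynamoDB Local running on its default port.
    return { region: "us-east-1", endpoint: "http://localhost:8000" };
  }
  return { region: "us-east-1" };
}
```

The returned object could be passed to the client constructor, for example new AWS.DynamoDB.DocumentClient(clientOptions(true)), so the same code path runs against the container in tests and against AWS in production.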
27. Remember to back up
Before going live in production, remember to enable point-in-time recovery, so there's an option to roll back your table in case of an error.
© 2020 Dynobase