DynamoDB Scan vs Query - Everything You Need To Know
Written by Rafal Wilinski
Published on May 15th, 2020
Difference Between Query and Scan in DynamoDB
Query and Scan are two operations available in the DynamoDB SDK and CLI for fetching a collection of items. They might seem to serve a similar purpose, but the difference between them is vital. Scan reads through the whole table looking for items that match the criteria, while Query performs a direct lookup against a single partition, selected by the partition (hash) key of the table or of a secondary index.
You will also see the difference in speed. A Query usually returns results within 100 ms, whereas a Scan over a large table might take hours to find the relevant piece of data.
When You Should Use Scan vs Query in DynamoDB
Generally speaking, you should favor Query over Scan. When that is not possible (for example, when you're looking for a piece of data with a key that is unknown to you), and if it's a frequently used pattern, consider adding a GSI (Global Secondary Index) to index that attribute and enable Query. As a last resort, use Scan.
Scan is also useful when you need to retrieve all the table data. However, be mindful of the cost implications and performance overhead associated with scanning large tables.
DynamoDB Scan vs Query - Syntax Differences
Request parameters for Query and Scan are almost identical. The main difference is the KeyConditionExpression parameter, which is required in the Query operation. It specifies the condition that the key values of the items to be retrieved must satisfy. The condition must perform an equality check on the partition key value. Optionally, it can also constrain the range/sort key with comparison operators such as =, <, <=, >, >=, BETWEEN, or the begins_with function.
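The syntax difference can be sketched by building the request parameters for both operations. This is a minimal sketch in the low-level client format; the attribute names `customerId` and `orderDate` and the helper names are hypothetical and stand in for a table's real key schema.

```python
from typing import Any, Dict, Optional


def build_query_params(
    table_name: str,
    partition_value: str,
    sort_prefix: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a low-level DynamoDB Query call."""
    # Hypothetical key attribute names; replace with the table's real keys.
    names = {"#pk": "customerId"}
    values = {":pk": {"S": partition_value}}
    # The partition key must always be compared with "=".
    condition = "#pk = :pk"
    if sort_prefix is not None:
        # The sort key condition is optional and may use other operators.
        names["#sk"] = "orderDate"
        values[":prefix"] = {"S": sort_prefix}
        condition += " AND begins_with(#sk, :prefix)"
    return {
        "TableName": table_name,
        "KeyConditionExpression": condition,
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }


def build_scan_params(table_name: str) -> Dict[str, Any]:
    """Scan takes no key condition, so it reads every item in the table."""
    return {"TableName": table_name}
```

With boto3 installed and credentials configured, these dictionaries could be passed as `boto3.client("dynamodb").query(**params)` and `boto3.client("dynamodb").scan(**params)` respectively.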
For the rest of the parameters, it's pretty much the same. In both cases, FilterExpression can be used to narrow down the results. However, it's important to note that FilterExpression is applied after the data is retrieved, which means it does not reduce the read capacity units consumed by the operation.
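One way to see this effect is to compare the `Count` and `ScannedCount` fields that Query and Scan responses carry: the first counts items left after filtering, the second counts items read before it. The helper name and the sample response below are made up for illustration.

```python
from typing import Any, Dict


def filter_selectivity(response: Dict[str, Any]) -> float:
    """Fraction of the items read that survived the FilterExpression."""
    scanned = response["ScannedCount"]  # items evaluated before filtering
    returned = response["Count"]        # items remaining after filtering
    return returned / scanned if scanned else 0.0


# Made-up response for illustration; these are not real measurements.
sample_response = {"Items": [], "Count": 12, "ScannedCount": 4000}
print(f"{filter_selectivity(sample_response):.2%} of the items read were returned")
```

A very low ratio suggests that most of the consumed read capacity went to items the filter threw away, which is usually a hint that a key condition or an index would serve the access pattern better.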
Best Practices for Using DynamoDB Query and Scan
When designing your DynamoDB tables, always consider your access patterns. If you find yourself frequently needing to scan the table, it might be a sign that your table design could be improved. Using indexes effectively can help you avoid the need for scans. Additionally, consider using projections to limit the amount of data returned by your queries and scans. A ProjectionExpression trims the response payload, and a narrow index projection reduces how much data is stored in and read from the index, which can help reduce costs and improve performance.
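As a small sketch of limiting returned attributes, the helper below builds Scan parameters with a ProjectionExpression. The helper name and the attribute names in the usage line are hypothetical; the `#a0`-style placeholders are used because attribute names such as `name` or `status` collide with DynamoDB reserved words.

```python
from typing import Any, Dict, List


def build_projected_scan(table_name: str, attributes: List[str]) -> Dict[str, Any]:
    """Build Scan parameters that return only the listed attributes."""
    # Placeholders (#a0, #a1, ...) avoid clashes with reserved words.
    names = {f"#a{i}": attr for i, attr in enumerate(attributes)}
    return {
        "TableName": table_name,
        "ProjectionExpression": ", ".join(names),
        "ExpressionAttributeNames": names,
    }


# Hypothetical table and attribute names.
print(build_projected_scan("Orders", ["customerId", "status"]))
```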
Frequently Asked Questions
When getting data from DynamoDB, when should you use Scan and when Query?
You should favor Query over Scan. When that is not possible (for example, when you're looking for a piece of data with a key that is unknown to you), and if it's a frequently used pattern, consider adding a GSI to index that attribute and enable Query. As a last resort, use Scan.
Is DynamoDB query cheaper than scan?
Generally speaking, yes, because it reads only the data in the desired partition. Scan goes through the whole table and is billed on the data scanned rather than the data returned, so its cost can be much higher.
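The cost gap can be estimated with simple arithmetic. The sketch below assumes the standard metering rule for Query and Scan: the size of the data read is rounded up to the next 4 KB, each 4 KB costs one read capacity unit for a strongly consistent read and half a unit for an eventually consistent one. It ignores per-page rounding and minimum charges, and the sizes are illustrative, not measurements.

```python
import math

BLOCK_BYTES = 4 * 1024  # reads are metered in 4 KB increments


def read_capacity_units(bytes_read: int, strongly_consistent: bool = False) -> float:
    """Approximate RCUs for a Query or Scan that read `bytes_read` bytes."""
    blocks = math.ceil(bytes_read / BLOCK_BYTES)
    # An eventually consistent read costs half of a strongly consistent one.
    return blocks * (1.0 if strongly_consistent else 0.5)


# Illustrative sizes only.
partition_bytes = 40 * 1024       # Query touching 40 KB within one partition
table_bytes = 1024 * 1024 * 1024  # Scan walking a 1 GiB table

print("Query:", read_capacity_units(partition_bytes))
print("Scan: ", read_capacity_units(table_bytes))
```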
What is the DynamoDB scan vs query performance difference?
If you need to access data identified by known keys, Query is much faster because it goes directly to the partition that holds them, while Scan has to read through the whole table.
I need to use DynamoDB scan. Can I make it faster?
You can consider using DynamoDB Parallel Scan. It splits the table into logical segments, selected with the Segment and TotalSegments parameters, so that multiple threads or workers can scan different parts of your table simultaneously.
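A parallel scan can be sketched with a thread pool in which each worker reads one segment. The function names are hypothetical, and `client` is assumed to be any object exposing a `scan(**kwargs)` method with the same shape as the boto3 low-level DynamoDB client.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List


def scan_segment(
    client: Any, table_name: str, segment: int, total_segments: int
) -> List[Dict[str, Any]]:
    """Read every page of a single segment of the table."""
    items: List[Dict[str, Any]] = []
    kwargs: Dict[str, Any] = {
        "TableName": table_name,
        "Segment": segment,
        "TotalSegments": total_segments,
    }
    while True:
        page = client.scan(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return items
        # Continue this segment from where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key


def parallel_scan(
    client: Any, table_name: str, total_segments: int = 4
) -> List[Dict[str, Any]]:
    """Scan all segments concurrently, one worker thread per segment."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, client, table_name, s, total_segments)
            for s in range(total_segments)
        ]
        return [item for future in futures for item in future.result()]
```

A call such as `parallel_scan(boto3.client("dynamodb"), "Orders", total_segments=8)` would be one way to use it. A parallel scan finishes sooner but still reads the whole table, so it consumes the same total read capacity, only in a shorter time window.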
Why is DynamoDB scan not returning any results?
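It is possible that the Scan you're running is simply not returning any results matching your FilterExpression on the first page. Each page covers at most 1 MB of data read before the filter is applied, so the items you are looking for might be on the next pages. Keep requesting pages, passing the returned LastEvaluatedKey as ExclusiveStartKey, until the response no longer contains a LastEvaluatedKey.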
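The pagination loop can be sketched as a generator that keeps following `LastEvaluatedKey`. The function name is hypothetical, and `client` is assumed to be any object with a `scan(**kwargs)` method shaped like the boto3 low-level DynamoDB client.

```python
from typing import Any, Dict, Iterator


def scan_matching_items(client: Any, **scan_kwargs: Any) -> Iterator[Dict[str, Any]]:
    """Yield items from every page of a Scan, not only the first one."""
    kwargs = dict(scan_kwargs)
    while True:
        page = client.scan(**kwargs)
        # A page can be empty even though later pages contain matches.
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            break  # no more pages to read
        kwargs["ExclusiveStartKey"] = last_key
```

boto3 also ships a built-in paginator, `client.get_paginator("scan")`, which wraps the same loop.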