
DynamoDB Scan - What It Is & Why You Should (Almost) Never Use It

Written by Rafal Wilinski

What Is DynamoDB Scan?

Scan is one of the three ways of getting data out of DynamoDB (alongside Get and Query), and it is the most brutal one because it reads everything. A Scan operation goes through the whole table and returns a collection of items and their attributes. The result can be narrowed down with a FilterExpression, which is applied after the items have been read.
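As a minimal sketch of a filtered Scan, assuming Python with the boto3 low-level client: the helper names, the table name and the attribute used in the usage note are hypothetical and chosen only for illustration. The client is passed in as a parameter, so the functions work with anything that exposes a compatible scan method.

```python
from typing import Any, Dict, List


def build_scan_request(table_name: str, attribute: str, value: str) -> Dict[str, Any]:
    """Build keyword arguments for a Scan that keeps items where attribute == value."""
    return {
        "TableName": table_name,
        # Placeholders (#attr, :val) avoid clashes with DynamoDB reserved words.
        "FilterExpression": "#attr = :val",
        "ExpressionAttributeNames": {"#attr": attribute},
        # Low-level client values are typed; "S" marks a string.
        "ExpressionAttributeValues": {":val": {"S": value}},
    }


def scan_once(client: Any, table_name: str, attribute: str, value: str) -> List[Dict[str, Any]]:
    """Run a single Scan call and return the items of that one page only."""
    response = client.scan(**build_scan_request(table_name, attribute, value))
    return response.get("Items", [])
```

With boto3 installed and credentials configured, this could be called as scan_once(boto3.client("dynamodb"), "Orders", "status", "SHIPPED"). The filter only trims the response; the whole page is still read from the table.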

To run a Scan operation using the AWS CLI, use the aws dynamodb scan command and pass the table name with the --table-name option.

Should I use DynamoDB Scans?

Generally speaking, no. Scans are expensive, slow, and against best practices. To fetch one item by key, use the Get operation; to fetch a collection of items, use Query.
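To make the alternatives concrete, here is a hedged sketch of both calls using the boto3 low-level client from Python. It assumes a hypothetical Orders table with customer_id as the partition key and order_id as the sort key; none of these names come from the article.

```python
from typing import Any, Dict, List, Optional

TABLE_NAME = "Orders"  # hypothetical table name


def get_order(client: Any, customer_id: str, order_id: str) -> Optional[Dict[str, Any]]:
    """Fetch exactly one item by its full primary key (GetItem)."""
    response = client.get_item(
        TableName=TABLE_NAME,
        Key={
            "customer_id": {"S": customer_id},  # assumed partition key
            "order_id": {"S": order_id},        # assumed sort key
        },
    )
    # "Item" is absent from the response when no item matches the key.
    return response.get("Item")


def list_orders(client: Any, customer_id: str) -> List[Dict[str, Any]]:
    """Fetch one page of items sharing a partition key (Query)."""
    response = client.query(
        TableName=TABLE_NAME,
        KeyConditionExpression="customer_id = :cid",
        ExpressionAttributeValues={":cid": {"S": customer_id}},
    )
    return response.get("Items", [])
```

Both calls read only the items addressed by the key, which is why they stay fast and cheap as the table grows.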

When To Use DynamoDB Scan

But sometimes using scans is inevitable; you only need to use them sparingly and with knowledge of the consequences. Here are use cases where scans might make sense:

  • Getting all the items from the table because you want to remove or migrate them
  • If your table is really small (< 10 MB)

DynamoDB Scan Examples

After reading the above content, if you feel that the scan query still makes sense for your use-case, then we've got you covered. Here are different methods and scan query code snippets you can copy-paste.

DynamoDB Pagination

Similar to the Query operation, a single Scan call can return up to 1 MB of data. If the table contains more records than a single Scan can return, the API returns a LastEvaluatedKey value, which tells you where the next Scan operation should start. Pass that value as the ExclusiveStartKey parameter of the subsequent call, and repeat until the response no longer contains a LastEvaluatedKey.
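The pagination loop can be sketched as follows, assuming Python and a boto3 low-level client passed in by the caller. The function name scan_all_items is hypothetical.

```python
from typing import Any, Dict, Iterator


def scan_all_items(client: Any, table_name: str) -> Iterator[Dict[str, Any]]:
    """Yield every item in the table, following LastEvaluatedKey page by page."""
    kwargs: Dict[str, Any] = {"TableName": table_name}
    while True:
        page = client.scan(**kwargs)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            # No LastEvaluatedKey means the scan reached the end of the table.
            break
        # Resume the next call right after the last item that was evaluated.
        kwargs["ExclusiveStartKey"] = last_key
```

Because it is a generator, items are handed to the caller one page at a time and the whole table never has to sit in memory. With boto3 this could be used as: for item in scan_all_items(boto3.client("dynamodb"), "Orders").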

How fast is DynamoDB scan?

DynamoDB Scan is not a fast operation. Because it goes through the whole table to look for the data, the time it takes grows linearly with the size of the table, i.e. it has O(n) complexity. If you need to fetch data fast, use Query or Get operations instead.

What is the DynamoDB scan cost?

DynamoDB Scan cost depends on the amount of data it scans, not the amount of data it returns. Even if you narrow down the results returned by the API using FilterExpressions, you'll be billed for the amount of data it went through to find the relevant results.
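As a rough worked example, the read units consumed by a Scan can be estimated from the bytes it reads. This sketch assumes the standard DynamoDB sizing rule: one read unit covers a strongly consistent read of up to 4 KB, and an eventually consistent read (the Scan default) costs half of that. The function name is hypothetical, and the result is an estimate, not a bill.

```python
import math

READ_UNIT_BYTES = 4 * 1024  # one read unit covers up to 4 KB


def estimate_scan_read_units(bytes_scanned: int, strongly_consistent: bool = False) -> float:
    """Estimate read units for a Scan from the data it scans, not the data it returns."""
    blocks = math.ceil(bytes_scanned / READ_UNIT_BYTES)
    # Eventually consistent reads cost half as much as strongly consistent ones.
    return blocks * (1.0 if strongly_consistent else 0.5)


# One full 1 MB page with the default eventually consistent reads:
print(estimate_scan_read_units(1024 * 1024))  # 128.0
```

Under these assumptions, a full 1 MB page costs 128 read units whether the filter keeps every item or none of them.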

The exact cost of the operation depends on the table's Capacity Mode; you can estimate it using our free pricing calculator.

Parallel Scan in DynamoDB

Scans are, generally speaking, slow. To make the process faster, you can use a feature called "Parallel Scan", which divides the whole DynamoDB table into Segments. A separate thread/worker then processes each Segment, so N workers can work simultaneously to go through the whole keyspace faster.

Creating a Parallel Scan is quite easy. Each of your workers, when issuing a Scan request, should include two additional parameters:

  • Segment - The zero-based index of the segment to be scanned by this particular worker (from 0 to TotalSegments - 1)
  • TotalSegments - The total number of segments, which is also the number of workers/threads scanning in parallel
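A minimal sketch of a parallel scan, assuming Python, a thread pool with one thread per segment, and a boto3 low-level client shared between threads (boto3 documents low-level clients as generally thread safe). The function names are hypothetical. Each worker passes its own zero-based Segment index together with the same TotalSegments value, and paginates its segment independently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List


def scan_segment(client: Any, table_name: str, segment: int, total_segments: int) -> List[Dict[str, Any]]:
    """Scan one segment to completion, following LastEvaluatedKey within it."""
    kwargs: Dict[str, Any] = {
        "TableName": table_name,
        "Segment": segment,               # zero-based index of this worker's segment
        "TotalSegments": total_segments,  # same value for every worker
    }
    items: List[Dict[str, Any]] = []
    while True:
        page = client.scan(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        kwargs["ExclusiveStartKey"] = last_key


def parallel_scan(client: Any, table_name: str, total_segments: int) -> List[Dict[str, Any]]:
    """Scan all segments concurrently and return the combined items."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, client, table_name, segment, total_segments)
            for segment in range(total_segments)
        ]
        results: List[Dict[str, Any]] = []
        for future in futures:
            results.extend(future.result())
    return results
```

This version collects everything in memory, which is only reasonable for modest tables; for larger ones each worker would more likely process or write out its items as it goes.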


But be careful with Parallel Scans: they can drain your provisioned read capacity pretty quickly, incurring high costs and degrading the performance of your table.


© 2020 Dynobase
