AWS Cloud Blog & News | StratusGrid

How to Scan AWS DynamoDB Tables with AWS SDK for Rust

Written by Trevor Sullivan | Mar 8, 2024 2:32:14 PM

In an earlier article, we discussed how you can query data efficiently from AWS DynamoDB partitions. The query operation uses fewer capacity units than a table scan because each query targets a single partition instead of reading the whole table. However, in some cases, you may need to retrieve items from a DynamoDB table without knowing the partition key values. Queries won’t allow you to do this, but table scans can.

Using table scans is generally inadvisable, due to the impact that they can have on the DynamoDB table’s performance and cost. However, if you have a specific use case for scanning, don’t be afraid to leverage them when appropriate.

 

Project Setup

As with any AWS SDK project, you’ll need to ensure that you have an async main function, and use the tokio::main macro to designate tokio as the async runtime for your entire application.

#[tokio::main]
async fn main() { }

You’ll also need to load your AWS configuration, including credentials, from your environment and construct a DynamoDB client struct. Add the following code to your main function body.

use aws_sdk_dynamodb as ddb;

let sdk_cfg = aws_config::load_from_env().await;
let ddb_client = ddb::Client::new(&sdk_cfg);

Now you can use the ddb_client variable to call various DynamoDB APIs.
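The snippets above assume the SDK crates are already declared as dependencies. A minimal Cargo.toml fragment might look like the one below. The version requirements and feature flags are assumptions, not something this article specifies, so check them against the crate releases you actually use. In the 1.x releases of aws-config, load_from_env() expects the behavior-version-latest feature to be enabled; the alternative is to call aws_config::load_defaults() with an explicit BehaviorVersion.

```toml
[dependencies]
# Shared configuration loader (credentials, region). The feature flag is an
# assumption based on the 1.x releases; adjust it for your version.
aws-config = { version = "1", features = ["behavior-version-latest"] }
# DynamoDB service client.
aws-sdk-dynamodb = "1"
# Async runtime; "macros" provides #[tokio::main].
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
```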

Running DynamoDB Table Scans

You can use the scan() method on the DynamoDB client struct to invoke a scan operation against a table. Use the table_name() method to specify the name of the DynamoDB table that you’re scanning.

let scan_result = ddb_client.scan()
    .table_name("trevor-products")
    .send().await;

The scan_result variable holds the Result<T, E> value produced by awaiting the scan future. If the scan operation succeeded, you can call the unwrap() function on the Result to access the ScanOutput struct. Keep in mind that unwrap() panics if the request failed, so production code should handle the Err variant instead.

The ScanOutput struct has an items field that provides access to all of the DynamoDB items that were retrieved from the table. The items field is an Option<T> type, so you’ll need to unwrap() that to access the Vec of HashMaps, where each HashMap represents a single item from the DynamoDB table.

You can iterate over the Vec of HashMaps and do something with each item, like printing one of its attributes out to the terminal. Since each item is a HashMap, we can use the get() method to retrieve a specific attribute value from the item. The attribute values are wrapped by the AttributeValue enum type in the AWS SDK for Rust, so you’ll need to “decode” the underlying attribute value using the appropriate method, such as as_s() for DynamoDB string values. Both steps can fail: get() returns None when the attribute is missing, and as_s() returns an error when the value is not a string. That is why the example unwraps twice.

let scan_output = scan_result.unwrap();

for item in scan_output.items.unwrap() {
    println!("{0}", item.get("p_category").unwrap().as_s().unwrap());
}
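The chained unwrap() calls above panic as soon as one item lacks the attribute or stores it as a different type. The following is a minimal sketch of a non-panicking lookup. To stay self-contained it uses a stand-in enum named AttrValue, which is an assumption for illustration and not the SDK's AttributeValue type; only the shape of as_s() is mirrored. The category() helper is likewise hypothetical.

```rust
use std::collections::HashMap;

// Stand-in for the SDK's AttributeValue enum, reduced to two variants so the
// sketch compiles on its own. This is NOT the real SDK type.
#[derive(Debug, PartialEq)]
enum AttrValue {
    S(String), // DynamoDB string
    N(String), // DynamoDB number, carried as text
}

impl AttrValue {
    // Mirrors the shape of as_s(): Ok with the string, or Err with the
    // original value when the variant is something else.
    fn as_s(&self) -> Result<&String, &AttrValue> {
        match self {
            AttrValue::S(s) => Ok(s),
            other => Err(other),
        }
    }
}

// Hypothetical helper: returns the item's p_category, or None when the
// attribute is missing or is not a string.
fn category(item: &HashMap<String, AttrValue>) -> Option<&str> {
    item.get("p_category")?.as_s().ok().map(String::as_str)
}
```

Inside the loop, a caller could then match on the Option returned by category(&item) and skip or log items that return None, rather than aborting the whole scan.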

Paginating DynamoDB Table Scan Results

When you retrieve items from DynamoDB using the scan() operation, you can retrieve a maximum of 1 MB of data per API call. If you need to retrieve more items, you can paginate through the table continuously.

To paginate over the table, you will need to access the last_evaluated_key field from the ScanOutput struct, and use that as input to the next scan() operation. This essentially tells DynamoDB to pick up from where it left off, on the previous request. Next time you call scan(), make sure you call the exclusive_start_key() method to specify the last_evaluated_key.

Because we are dynamically adding both the partition and sort key attributes as the “start key,” we need to separate the send() operation from the ScanFluentBuilder. Note that, despite its name, the scan_result variable holds the request builder at this point, not a response. It needs to be mutable, since each call to exclusive_start_key() consumes the builder and returns a new one that we assign back to the variable. We iterate over the entries in last_evaluated_key from the first scan() call, and add each attribute name and value to the new scan() request. Finally, we run the send() method and await the future.

let mut scan_result = ddb_client.scan()
    .table_name("trevor-products");

for key in scan_output.last_evaluated_key.unwrap() {
    scan_result = scan_result.exclusive_start_key(key.0, key.1);
}

let scan_result_2 = scan_result.send().await;

Now the scan_result_2 variable contains the next set of items from the table, which you can iterate over just as we did for the first batch of items.

To finish iterating through the entire table, you will need to repeat this process until the ScanOutput struct has a None value for the last_evaluated_key field. When the last_evaluated_key is None, you have reached the end of the table items. Generally, if you’re attempting to process the entire table, then you’ll want to put this code into a loop construct. This will enable you to iteratively call scan() until you’ve finished processing all items, without duplicating code.
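The loop described above can be sketched independently of the SDK. In this minimal example the Page struct, the scan_all() function, and the fake_fetch() table are all hypothetical names invented for illustration, and the fetch step is synchronous for brevity. With the real client, the fetch step would be one awaited scan() call that passes the previous last_evaluated_key as the start key.

```rust
// One page of results: the items plus the key to resume from, if any.
struct Page<T, K> {
    items: Vec<T>,
    last_evaluated_key: Option<K>,
}

// Calls `fetch_page` repeatedly, feeding each page's last_evaluated_key back
// in as the next start key, and stops when a page reports no key.
fn scan_all<T, K>(mut fetch_page: impl FnMut(Option<K>) -> Page<T, K>) -> Vec<T> {
    let mut all_items = Vec::new();
    let mut start_key: Option<K> = None;
    loop {
        let page = fetch_page(start_key.take());
        all_items.extend(page.items);
        match page.last_evaluated_key {
            Some(key) => start_key = Some(key),
            None => break, // end of table reached
        }
    }
    all_items
}

// Fake five-item "table" served two items at a time, standing in for DynamoDB.
const TABLE: [u32; 5] = [10, 20, 30, 40, 50];
const PAGE_SIZE: usize = 2;

fn fake_fetch(start_key: Option<usize>) -> Page<u32, usize> {
    let start = start_key.unwrap_or(0);
    let end = (start + PAGE_SIZE).min(TABLE.len());
    Page {
        items: TABLE[start..end].to_vec(),
        last_evaluated_key: if end < TABLE.len() { Some(end) } else { None },
    }
}
```

Calling scan_all(fake_fetch) walks the fake table in three requests. The stop condition is the missing key, not an empty page, which matters once filters are involved.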

Filtering DynamoDB Items

A DynamoDB scan reads every item from the underlying table storage; however, you can pass the items through a filter before they are returned to your application. Because the filter runs after the items have been read, it does not reduce the read capacity that the scan consumes. It does reduce network bandwidth utilization, which can improve the performance of your application.
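To make the read-then-filter order concrete, here is a small model of a single scan page. It is only a sketch: PageStats and scan_page() are hypothetical names, the table is a plain slice of prices, and the page limit is counted in items, not bytes. The two fields are named after the scanned_count and count values that the scan response reports.

```rust
// Hypothetical summary of one scan page in this model.
#[derive(Debug, PartialEq)]
struct PageStats {
    scanned_count: usize, // items read from storage
    count: usize,         // items that passed the filter and were returned
}

// Reads up to `limit` prices starting at `start`, then applies the filter.
// The filter runs after the read, so it cannot shrink scanned_count.
fn scan_page(table: &[u32], start: usize, limit: usize, min_price: u32) -> PageStats {
    let end = (start + limit).min(table.len());
    let read = &table[start.min(end)..end];
    let returned = read.iter().filter(|price| **price > min_price).count();
    PageStats {
        scanned_count: read.len(),
        count: returned,
    }
}
```

In this model a page can return zero items even though more of the table remains, so pagination should continue until there is no last evaluated key, regardless of how many items a page returned.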

To specify a filter, call the filter_expression() function on the ScanFluentBuilder, before you send() the request. This function accepts a simple String value, which is the filter you want to apply to each item. Refer to the DynamoDB documentation for guidance on building filters.

For example, let’s say we want to retrieve all products with a price greater than $4. The filter expression you’d want to use would look something like this.

#attr1 > :val1

Although this syntax may look cryptic, these are merely placeholders. To specify the price attribute name, you use the expression_attribute_names() function. To specify the value to compare the attribute value to, you use the expression_attribute_values() function.

Here’s what the Rust code would look like.

let scan_result = ddb_client.scan()
    .table_name("trevor-products")
    .filter_expression("#attr1 > :val1")
    .expression_attribute_names("#attr1", "price")
    .expression_attribute_values(":val1", ddb::types::AttributeValue::N("4".to_string()))
    .send().await;

If you want to use a compound filter, with multiple criteria, you can specify the expression attribute names and values multiple times. Because we’re filtering items after retrieving them from underlying table storage, you can do partial matches on the partition key attribute value. Let’s add another criterion to the filter, to check whether the p_category attribute contains the substring “kitc”.

#attr1 > :val1 and contains(#attr2, :val2)

Here’s the Rust code that would implement this filter. Notice that we are using one of the supported comparison functions in DynamoDB, to check for the existence of a substring in an attribute value.

let scan_result = ddb_client.scan()
    .table_name("trevor-products")
    .filter_expression("#attr1 > :val1 and contains(#attr2, :val2)")
    .expression_attribute_names("#attr1", "price")
    .expression_attribute_names("#attr2", "p_category")
    .expression_attribute_values(":val1", ddb::types::AttributeValue::N("4".to_string()))
    .expression_attribute_values(":val2", ddb::types::AttributeValue::S("kitc".to_string()))
    .send().await;
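If the placeholders make the expression hard to read, it may help to see the same condition written as an ordinary Rust predicate. This is only a local model of what the filter is meant to match, not how DynamoDB evaluates it. AttrValue is a stand-in enum and not the SDK type, product() and matches_filter() are hypothetical helpers, numbers are parsed as f64 for simplicity, and a missing or wrongly typed attribute is treated as a non-match.

```rust
use std::collections::HashMap;

// Stand-in for the SDK's AttributeValue enum; NOT the real SDK type.
#[derive(Debug)]
enum AttrValue {
    S(String), // DynamoDB string
    N(String), // DynamoDB number, carried as text
}

// Hypothetical helper that builds an item with the two attributes used here.
fn product(price: &str, category: &str) -> HashMap<String, AttrValue> {
    HashMap::from([
        ("price".to_string(), AttrValue::N(price.to_string())),
        ("p_category".to_string(), AttrValue::S(category.to_string())),
    ])
}

// Local model of: #attr1 > :val1 and contains(#attr2, :val2)
fn matches_filter(item: &HashMap<String, AttrValue>) -> bool {
    // #attr1 is "price" and :val1 is the number 4.
    let price_ok = match item.get("price") {
        Some(AttrValue::N(text)) => text.parse::<f64>().map_or(false, |p| p > 4.0),
        _ => false,
    };
    // #attr2 is "p_category" and :val2 is the string "kitc".
    let category_ok = match item.get("p_category") {
        Some(AttrValue::S(text)) => text.contains("kitc"),
        _ => false,
    };
    price_ok && category_ok
}
```

For example, matches_filter(&product("5", "kitchen")) is true, while a price of exactly 4 or a category of "garden" does not match. The substring check in this model is case-sensitive.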

Feel free to explore the other comparison functions available in the AWS DynamoDB filter syntax, and experiment with them to learn how they work.

Scanning AWS DynamoDB Tables with AWS SDK for Rust

As you can see, running scan operations against DynamoDB tables is relatively straightforward. Unlike query operations, you don’t have to worry about specifying a partition as input to your scan operation. Rather, you simply iterate through every single item in the table.

Your code will become a bit more complex when you paginate through scan results. After you practice writing this code a few times and read through the documentation, it should make sense to you. Remember, these concepts apply to the Amazon DynamoDB service, no matter which language SDK you’re using.

Knowing how to filter items from DynamoDB is also an important concept for you to understand, as a developer. This helps to optimize the network performance of your applications.

Make sure you spend some hands-on time with the AWS SDK for Rust and the AWS DynamoDB service. The more you practice and work through any errors, the more experienced you’ll be when you start building or supporting production services around it.


