Async Generators in Use: DynamoDB Pagination

Async Iteration and Async Generators: Recap

Asynchronous iteration was added to the ECMAScript standard in its 2018 edition (TypeScript has supported it since version 2.3). In layman’s terms, it means iterating over a collection where you have to wait for each item to become available:

// Compare:

const collection = [1, 2, 3];

for (const item of collection) {
  console.log(item);
}

// and

const promises = [Promise.resolve(1), Promise.resolve(2), Promise.resolve(3)];

// for await is only valid inside an async function or at the top level of an ES module.
for await (const item of promises) {
  console.log(item);
}

Just as regular, synchronous iteration works with generators, asynchronous iteration works with asynchronous generators. In both cases you iterate over the values yielded by a generator:

// Compare:

function* gen() {
  yield 1;
  yield 2;
  yield 3;
}

for (const item of gen()) {
  console.log(item);
}

// and

async function* asyncGen() {
  const one = await getOne();
  yield one;
  const two = await getTwo();
  yield two;
  const three = await getThree();
  yield three;
}

for await (const item of asyncGen()) {
  console.log(item);
}
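
To make the mechanics a little more concrete, here is a rough sketch of what a for-await-of loop does with an async generator: it repeatedly calls next() and awaits the result. The delayed helper is a hypothetical stand-in for getOne, getTwo and getThree, and the sketch leaves out details such as the cleanup the real loop performs on early exit.

```javascript
// Hypothetical stand-in for getOne/getTwo/getThree: resolves with a value after a short delay.
const delayed = (value, ms = 10) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

async function* asyncGen() {
  yield await delayed(1);
  yield await delayed(2);
  yield await delayed(3);
}

// Roughly what `for await (const item of asyncGen())` does behind the scenes.
async function collectManually() {
  const iterator = asyncGen()[Symbol.asyncIterator]();
  const items = [];

  while (true) {
    // next() returns a promise that resolves to { value, done }.
    const { value, done } = await iterator.next();
    if (done) break;
    items.push(value);
  }

  return items;
}

collectManually().then((items) => console.log(items)); // logs [ 1, 2, 3 ]
```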

You can read more on the topic in this article by the ever scrupulous Dr. Axel Rauschmayer.

Asynchronous generators seem like a neat idea, but perhaps not something the average developer will use often. Indeed, it took a couple of years for me to encounter a nice real-world application for async generators, and now I’m happy to tell you all about it. (Credit goes to my colleague Peter Smith for the original idea.)

Async Generators: A Real Use Case

When you query a DynamoDB table, the result set is paginated if the amount of data exceeds 1 MB (or if the query's Limit is reached first). It looks a little bit like this:

const queryResult = await dynamoClient.query(params).promise();

// If queryResult.LastEvaluatedKey is present, the query was paginated.
// queryResult.Items contains a page of results, but not the entire result set.
// To load the next page, we need to make another query, passing LastEvaluatedKey
// as the start key in the params for that query.
params.ExclusiveStartKey = queryResult.LastEvaluatedKey;
const nextQueryResult = await dynamoClient.query(params).promise();

// ...Repeat until the latest result no longer has a LastEvaluatedKey.

In a real application you would do this in a loop. You might also want to create a helper function with this logic so it can be reused for different types of queries throughout the application. A simple approach would be to combine results from all pages into an array:

async function getPaginatedResults(dynamoClient, params) {
  // Work on a copy so the caller's params object is not mutated.
  const queryParams = { ...params };
  const results = [];

  do {
    const queryResult = await dynamoClient.query(queryParams).promise();

    // An undefined LastEvaluatedKey means this was the last page.
    queryParams.ExclusiveStartKey = queryResult.LastEvaluatedKey;

    results.push(...queryResult.Items);
  } while (queryParams.ExclusiveStartKey);

  return results;
}

const allItems = await getPaginatedResults(dynamoClient, someQueryParams);

Depending on the context, this may be perfectly reasonable. What if, however, you want to further process each page of results as soon as it’s available, without waiting for the rest to come in? The simplest implementation might accept a callback with the processing logic:

async function forEachPage(dynamoClient, params, callback) {
  // Work on a copy so the caller's params object is not mutated.
  const queryParams = { ...params };

  do {
    const queryResult = await dynamoClient.query(queryParams).promise();

    // An undefined LastEvaluatedKey means this was the last page.
    queryParams.ExclusiveStartKey = queryResult.LastEvaluatedKey;

    await callback(queryResult.Items);
  } while (queryParams.ExclusiveStartKey);
}

await forEachPage(dynamoClient, someQueryParams, async (pageOfItems) => {
  // do something with the page of items
});

This is workable, but callbacks can be clunky to use. For example, you may need to make the callback return false to indicate that the loop should stop. What if, instead of this foreach-style iteration, you want to move to a for...of style? Enter asynchronous generators.

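async function* getPaginatedResults(dynamoClient, params) {
  // Work on a copy: if the consumer stops early, the caller's params
  // must not be left holding a stale ExclusiveStartKey.
  const queryParams = { ...params };

  do {
    const queryResult = await dynamoClient.query(queryParams).promise();

    // An undefined LastEvaluatedKey means this was the last page.
    queryParams.ExclusiveStartKey = queryResult.LastEvaluatedKey;

    yield queryResult.Items;
  } while (queryParams.ExclusiveStartKey);
}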

for await (const pageOfItems of getPaginatedResults(dynamoClient, someQueryParams)) {
  // do something with the page of items
}

Every time a new page of items is loaded, the async generator yields it back into the for-await-of loop. Neat.
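
One practical benefit over the callback version is that stopping early needs no special convention: leaving the loop is enough. The sketch below illustrates this under some assumptions. createFakeClient is a hypothetical in-memory stand-in for a DynamoDB client (it only mimics the query(...).promise() shape used above), findFirstPageContaining is an illustrative helper, and getPaginatedResults is a variant that copies params before modifying them.

```javascript
// Hypothetical in-memory stand-in for a DynamoDB client, for illustration only.
// It serves three pages and counts how many queries were made.
function createFakeClient() {
  const pages = [[1, 2], [3, 4], [5, 6]];
  const client = {
    queryCount: 0,
    query(params) {
      const index = params.ExclusiveStartKey?.page ?? 0;
      client.queryCount += 1;
      const result = { Items: pages[index] };
      if (index + 1 < pages.length) {
        result.LastEvaluatedKey = { page: index + 1 };
      }
      return { promise: async () => result };
    },
  };
  return client;
}

async function* getPaginatedResults(dynamoClient, params) {
  const queryParams = { ...params }; // copy, so the caller's object is untouched
  do {
    const queryResult = await dynamoClient.query(queryParams).promise();
    queryParams.ExclusiveStartKey = queryResult.LastEvaluatedKey;
    yield queryResult.Items;
  } while (queryParams.ExclusiveStartKey);
}

// Illustrative helper: returns the first page that contains the wanted item.
async function findFirstPageContaining(dynamoClient, params, wanted) {
  for await (const pageOfItems of getPaginatedResults(dynamoClient, params)) {
    if (pageOfItems.includes(wanted)) {
      // Leaving the loop also finishes the generator, so no further queries run.
      return pageOfItems;
    }
  }
  return undefined;
}

const client = createFakeClient();
findFirstPageContaining(client, {}, 3).then((page) => {
  console.log(page, client.queryCount); // logs [ 3, 4 ] 2
});
```

Only two of the three pages are requested here, because the generator is never resumed after the loop exits.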

That last example highlights one of the key aspects of generators (both sync and async). If you look at the for-await-of loop, we only invoke getPaginatedResults once, right at the start of the loop. And at that moment, it’s not known how many pages we’ll get. However, we can still conveniently run a for loop over this “eventually known” collection, as if it were a plain old array.
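
If you would rather loop over individual items than pages, one possible extension is a second async generator that flattens the pages. This is only a sketch: flattenPages and collectItems are illustrative names, and fakePages is a hypothetical stand-in for getPaginatedResults so that the example runs without a database.

```javascript
// Hypothetical page source standing in for getPaginatedResults.
async function* fakePages() {
  yield ["a", "b"];
  yield ["c"];
  yield ["d", "e"];
}

// Flattens any async iterable of pages into a stream of individual items.
async function* flattenPages(pages) {
  for await (const pageOfItems of pages) {
    // yield* delegates to the array, yielding its items one at a time.
    yield* pageOfItems;
  }
}

async function collectItems() {
  const items = [];
  for await (const item of flattenPages(fakePages())) {
    items.push(item);
  }
  return items;
}

collectItems().then((items) => console.log(items)); // logs [ 'a', 'b', 'c', 'd', 'e' ]
```

Pages are still fetched lazily: the next page is only requested once the items of the current one have been consumed.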

Conclusion

I hope this practical example helps illustrate the usefulness of asynchronous generators. Perhaps now you can more easily spot places in your own code where they might be handy.