DynamoDB Export to S3 and Query with Athena

example exporting dynamodb table to S3 and then querying via athena

Files

Running

sam deploy --guided
# note: seed data is generated as part of deploy via cfn custom resource `Custom::SeedData`
# which triggers a lambda which populates the dynamodb table

# update `STACK_NAME` variable in ./main.sh
# run export table to s3 script
./main.sh

aws dynamodb list-exports # in progress
{
    "ExportSummaries": [
        {
            "ExportArn": "arn:aws:dynamodb:us-east-1:529276214230:table/dynamodb-export-to-s3-and-query-with-athena-playground-v2-MyTable-1WGSJ3W2WJWPK/export/01605130432834-8a918b87",
            "ExportStatus": "IN_PROGRESS"
        }
    ]
}
aws dynamodb list-exports # completed
{
    "ExportSummaries": [
        {
            "ExportArn": "arn:aws:dynamodb:us-east-1:529276214230:table/dynamodb-export-to-s3-and-query-with-athena-playground-v2-MyTable-1WGSJ3W2WJWPK/export/01605130432834-8a918b87",
            "ExportStatus": "COMPLETED"
        }
    ]
}

Example export file contents (line delimited json items)

gzcat example-export/demo_prefix/AWSDynamoDB/01605130432834-8a918b87/data/vajn3deidy4svdja3fgej2ynay.json.gz

{"Item":{"pk":{"S":"pk001-1605121017583"},"sk":{"S":"sk001"}}}
{"Item":{"pk":{"S":"pk002-1605121017583"},"sk":{"S":"sk002"}}}

Access Exported Data in S3 via Athena

-- Create external table in athena pointing to exported S3 data
CREATE EXTERNAL TABLE IF NOT EXISTS ddb_exported_table (
  Item struct <pk:struct<S:string>,
               sk:struct<S:string>>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://dynamodb-export-to-s3-and-query-with-exportbucket-103kukqwmab7n/demo_prefix/AWSDynamoDB/01605130432834-8a918b87/data/'
TBLPROPERTIES ( 'has_encrypted_data'='true');

-- issues SELECT query
SELECT
    Item.pk.S as pk,
    Item.sk.S as sk
FROM ddb_exported_table

Screenshot from Athena Console

Resources