AWS Web Content

App URL: https://brianpfeil.com/apps/awswebcontent/build/index.html


  • Fetches all AWS directory API metadata (architecture diagrams, products, blog posts, Builders' Library articles, etc.)
  • Based on public data fetched from https://aws.amazon.com/api/dirs/items/search?item.directoryId=${directoryId}&item.locale=en_US
  • index.js - entry point
  • data/ - directory items
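
A minimal sketch of how the search URL above might be built and fetched for one directory. The names `buildSearchUrl` and `fetchDirectory` are hypothetical and not taken from index.js; the `size` parameter is borrowed from the one-liner later in this README, and the global `fetch` assumes Node 18 or newer.

```javascript
// Hypothetical helpers, not the actual index.js implementation.
const API_BASE = "https://aws.amazon.com/api/dirs/items/search";

// Build the search URL for a single directory (e.g. "whats-new").
function buildSearchUrl(directoryId, { locale = "en_US", size } = {}) {
  const url = new URL(API_BASE);
  url.searchParams.set("item.directoryId", directoryId);
  url.searchParams.set("item.locale", locale);
  if (size !== undefined) url.searchParams.set("size", String(size));
  return url.toString();
}

// Fetch one response for a directory. `fetchImpl` is injectable so the
// function can be exercised without network access.
async function fetchDirectory(directoryId, options = {}, fetchImpl = fetch) {
  const response = await fetchImpl(buildSearchUrl(directoryId, options));
  if (!response.ok) {
    throw new Error(`Request for ${directoryId} failed: ${response.status}`);
  }
  return response.json(); // expected shape: { items, metadata, fieldTypes }
}
```

Calling `fetchDirectory("whats-new", { size: 500 })` would request up to 500 items for that directory in one response, assuming the API honors `size` the same way for every directory.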

Updating AWS Web Content Data

The `fetch-new-items` script downloads new items in each directory and updates the following files:

  • data/${directoryId}.flat.json
  • data/${directoryId}.display.json

npm run fetch-new-items
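
One way the update step could merge freshly downloaded items into an existing `data/${directoryId}.flat.json` array is sketched below. It assumes the flat file is a JSON array of entries shaped like the example item under Data Shapes, keyed by `item.id`; `mergeNewItems` is a hypothetical name and the real script may work differently.

```javascript
// Hypothetical merge step: append fetched entries whose item.id is not
// already present in the existing array. Existing entries are kept as-is.
function mergeNewItems(existing, fetched) {
  const knownIds = new Set(existing.map((entry) => entry.item.id));
  const added = fetched.filter((entry) => !knownIds.has(entry.item.id));
  return { merged: existing.concat(added), addedCount: added.length };
}

// Usage with two tiny hand-written entries:
const existing = [{ item: { id: "whitepapers#a" }, tags: [] }];
const fetched = [
  { item: { id: "whitepapers#a" }, tags: [] },
  { item: { id: "whitepapers#b" }, tags: [] },
];
console.log(mergeNewItems(existing, fetched).addedCount); // 1
```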

Frontend

under construction

# development
cd frontend
npm start

# publishing - copies to `personal-website`
cd frontend
npm run publish

Data Shapes

general response shape

{
  "items": [],
  "metadata": {
    "count": 0,
    "totalHits": 299
  },
  "fieldTypes": {
    "updateDate": "Date",
    "imageSrcUrl": "URL",
    "featureFlag": "Text",
    "description": "LongText",
    "sortDate": "Date",
    "docTitle": "Text",
    "primaryURL": "URL",
    "datePublished": "Date",
    "publishedText": "Text",
    "footerInfoSubtext": "Text",
    "subHeadline": "Text",
    "enableShare": "Boolean",
    "category": "Text",
    "contentType": "Text"
  }
}
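
The `fieldTypes` map can be used to convert the string values in an item's `additionalFields` into richer types. The following is only a sketch: `coerceFields` is a hypothetical helper, and the `"1"`/`"0"` boolean encoding is an assumption based on the single `"enableShare": "1"` value in the example item in this section.

```javascript
// Hypothetical helper: convert additionalFields values using fieldTypes.
function coerceFields(additionalFields, fieldTypes) {
  const result = {};
  for (const [name, value] of Object.entries(additionalFields)) {
    switch (fieldTypes[name]) {
      case "Date":
        result[name] = new Date(value); // e.g. "2018-12-05"
        break;
      case "Boolean":
        // Assumption: booleans arrive as "1" (true) or "0" (false).
        result[name] = value === "1";
        break;
      default:
        // Text, LongText, URL, and unlisted fields are left as strings.
        result[name] = value;
    }
  }
  return result;
}
```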

example item (items[0])

    {
      "item": {
        "id": "whitepapers#image-moderation-chatbot",
        "locale": "en_US",
        "directoryId": "whitepapers",
        "name": "image-moderation-chatbot",
        "author": "julicoll",
        "createdBy": "julicoll",
        "lastUpdatedBy": "julicoll",
        "numImpressions": 0,
        "score": 0,
        "dateCreated": "2019-06-25T17:21:57+0000",
        "dateUpdated": "2021-07-29T17:01:46+0000",
        "additionalFields": {
          "datePublished": "2018-12-05",
          "publishedText": "December 2018",
          "description": "Shows you how to build a serverless chatbot on AWS that monitors your chat channels and removes images containing suggestive or explicit content.<p><a href=\"https://github.com/awslabs/lambda-refarch-imagemoderationchatbot?did=wp_card&trk=wp_card\" target=\"_blank\" rel=\"noopener\">Code</a></p><p class=\"m-subheadline\">Media Services | Serverless</p>",
          "docTitle": "Image Moderation Chatbot",
          "sortDate": "2018-12-05",
          "enableShare": "1",
          "contentType": "Reference Architecture Diagram",
          "primaryURL": "https://github.com/awslabs/lambda-refarch-imagemoderationchatbot?did=wp_card&trk=wp_card"
        }
      },
      "tags": [
        {
          "id": "GLOBAL#content-type#reference-arch-diagram",
          "locale": "en_US",
          "tagNamespaceId": "GLOBAL#content-type",
          "name": "Reference Architecture Diagram",
          "description": "Reference Architecture Diagram",
          "createdBy": "jenbar",
          "lastUpdatedBy": "jenbar",
          "dateCreated": "2020-04-29T05:19:31+0000",
          "dateUpdated": "2022-02-03T03:31:09+0000"
        },
        {
          "id": "GLOBAL#methodology#serverless",
          "locale": "en_US",
          "tagNamespaceId": "GLOBAL#methodology",
          "name": "Serverless",
          "description": "Serverless",
          "createdBy": "jenbar",
          "lastUpdatedBy": "jenbar",
          "dateCreated": "2020-06-05T07:06:34+0000",
          "dateUpdated": "2022-02-03T03:32:11+0000"
        },
        {
          "id": "GLOBAL#tech-category#media-services",
          "locale": "en_US",
          "tagNamespaceId": "GLOBAL#tech-category",
          "name": "Media Services",
          "description": "Media Services",
          "createdBy": "jarfaa",
          "lastUpdatedBy": "jenbar",
          "dateCreated": "2020-07-17T03:06:10+0000",
          "dateUpdated": "2022-02-03T03:35:28+0000"
        }
      ]
    }
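
A sketch of how an entry like the one above might be reduced to a smaller record for the frontend (the `create-data-for-frontend` step under Scratch). The output field names and the choice of fields are assumptions; the actual `display.json` layout is not documented here, and `toDisplayItem` is a hypothetical name.

```javascript
// Hypothetical reduction of one API entry to a compact display record.
function toDisplayItem(entry) {
  const { item, tags = [] } = entry;
  const fields = item.additionalFields || {};
  return {
    id: item.id,
    title: fields.docTitle || item.name, // fall back to the slug-like name
    url: fields.primaryURL || null,
    date: fields.sortDate || item.dateCreated,
    tags: tags.map((tag) => tag.name),
  };
}
```

Applied to the example entry, this would yield the title "Image Moderation Chatbot" and the tags "Reference Architecture Diagram", "Serverless", and "Media Services".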

TODO

Scratch

# download data from api
node index.js download

# download a specified directory
node index.js download --directoryId "whats-new"

# flatten nested item arrays into single array of items
node index.js flatten-data 

# generate data for frontend which contains only a subset of the full data for each item
node index.js create-data-for-frontend

# index data into lunr indexes
node index.js index

# search against an index
node index.js search --directoryId "whats-new" --query "name:*lambda*"

# upload data and indexes to S3
BUCKET_NAME="aws-web-content-s3bucket-1rmdk8t0ols56"
BUCKET_DOMAIN_NAME="${BUCKET_NAME}.s3.amazonaws.com"
aws s3 cp data "s3://${BUCKET_NAME}/data/" --recursive
aws s3 cp index "s3://${BUCKET_NAME}/index/" --recursive


# invoke the update Lambda function
aws lambda invoke \
    --cli-binary-format raw-in-base64-out \
    --function-name aws-web-content-UpdateAwsWebContent-xzzE4avkIwN6 \
    --payload '{}' \
    tmp/response.json

One-liner to retrieve a list of all AWS products

# source: <https://gist.github.com/garystafford/37442d8fd8dde388f50856c6a2900b0d>
# One-liner to retrieve a list of all AWS products from aws.amazon.com/products sorted by product category (requires jq). Worked as of 2022-01-03. Page format tends to change a lot...
curl --silent --compressed \
'https://aws.amazon.com/api/dirs/items/search?item.directoryId=aws-products&sort_by=item.additionalFields.productCategory&sort_order=asc&size=500&item.locale=en_US' \
| jq -r '.items[].item | .additionalFields.productCategory + " | " + .additionalFields.productName' \
| sort
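
The jq filter and `sort` above can be approximated in JavaScript once the response has been parsed. This is a sketch only: `formatProducts` is a hypothetical name, it assumes every item carries both `productCategory` and `productName`, and JavaScript's default sort compares UTF-16 code units, so ordering may differ slightly from a locale-aware `sort`.

```javascript
// Rough JavaScript equivalent of the jq filter followed by `sort`.
function formatProducts(response) {
  return response.items
    .map(({ item }) => {
      const { productCategory, productName } = item.additionalFields;
      return `${productCategory} | ${productName}`;
    })
    .sort();
}
```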

Resources