Export and Import AWS DynamoDB data

A simple, straightforward way to export and import an AWS DynamoDB table's data with the AWS CLI and a few scripts.

First, export all the data from the AWS DynamoDB table:

λ aws --profile production dynamodb scan --table-name tile-event > tile-event-export.json

Convert the list of items/records (DynamoDB JSON) into individual PutRequest objects with jq:

λ cat tile-event-export.json | jq '{"Items": [.Items[] | {PutRequest: {Item: .}}]}' > tile-event-import.json

Transform the data if necessary:

λ sed 's/tile-images-prod/tile-images-pdev/g' tile-event-import.json > tile-event-import-transformed.json

Split the requests into files of 25 requests each, with jq and awk. (Note: there are some restrictions on the AWS DynamoDB batch-write-item request: a BatchWriteItem operation can contain up to 25 individual PutItem and DeleteItem requests and can write up to 16 MB of data; the maximum size of an individual item is 400 KB.)

λ cat tile-event-processed.awk
#!/usr/bin/awk -f
# Split newline-delimited PutRequest objects into batch-write-item
# request files of at most 25 items each.

NR % 25 == 1 {
    x = "tile-event-import-processed-" ++i ".json"
    print "{" > x
    print "  \"tile-event\": [" > x
}
{
    # Comma before every item except the first in each file, so a
    # partial last batch still produces valid JSON.
    if (NR % 25 != 1) print "," > x
    printf "    %s", $0 > x
}
NR % 25 == 0 {
    print "" > x
    print "  ]" > x
    print "}" > x
    close(x)
}
END {
    # Close out the last file when the item count is not a multiple of 25.
    if (NR % 25 != 0) {
        print "" > x
        print "  ]" > x
        print "}" > x
        close(x)
    }
}

πœ† jq -c '.Items[]' tile-event-import-transformed.json | ./tile-event-processed.awk

Import all 22 processed JSON files into the DynamoDB table:

λ for f in tile-event-import-processed-{1..22}.json; do \
    echo "$f"; \
    aws --profile development dynamodb batch-write-item --request-items "file://$f"; \
  done

Get and read logs from AWS CloudWatch with saw

For everyone who painfully reads logs in the AWS CloudWatch console, saw is your friend.

Get the CloudWatch log groups that start with paradise-api:

λ saw groups --profile ap-prod --prefix paradise-api
paradise-api-CloudFormationLogs-mwwmzgYOtbcB

Get the last 2 hours of paradise-api logs from CloudWatch with saw:

λ saw get --profile ap-prod --start -2h paradise-api-CloudFormationLogs-mwwmzgYOtbcB --prefix docker | jq .log | sed 's/\\\n"$//; s/^"//'

Read environment variables of a process in Linux

When trying to get the content of a /proc/PID/environ file in a more readable format, note how proc(5) describes it:

/proc/[pid]/environ
This file contains the environment for the process. The entries
are separated by null bytes ('\0'), and there may be a null byte
at the end.

A simple way is to apply xargs -0 -L1 -a to it:

  • -0 - read null-delimited input,
  • -L1 - use at most one input line per command invocation,
  • -a file - read input from the named file instead of stdin
# ps -aef
10101 3629 3589 0 Apr27 ? 00:00:00 /bin/bash bin/start
10101 3670 3629 0 Apr27 ? 00:00:00 /bin/bash bin/start-tomcat
10101 3671 3670 0 Apr27 ? 00:07:36 /usr/lib/jvm/java-11-amazon-corretto.x86_64/bin/java -Djava.util.logging.config.file=/usr/local/tomcat/conf/

# cat /proc/3629/environ
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/binHOSTNAME=27c44e8a5c7cJAVA_HOME=/usr/lib/jvm/java-11-amazon-corretto.x86_64HOME=/usr/local/tomcat

# xargs -0 -L1 -a /proc/3629/environ
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=27c44e8a5c7c
JAVA_HOME=/usr/lib/jvm/java-11-amazon-corretto.x86_64
HOME=/usr/local/tomcat
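If you want the variables in a script rather than on screen, the same null-splitting is easy to do by hand; a minimal Python sketch (the PID in the usage comment is the example one from above):

```python
def parse_environ(raw: bytes) -> dict:
    """Parse the null-delimited contents of /proc/<pid>/environ
    into a name -> value mapping."""
    entries = (e.partition(b"=") for e in raw.split(b"\0") if e)
    return {name.decode(): value.decode() for name, _, value in entries}

# Usage sketch:
# with open("/proc/3629/environ", "rb") as f:
#     env = parse_environ(f.read())
```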

AWS KMS decrypt for base64 encoded input

With AWS CLI version 2:

λ aws --version
aws-cli/2.1.17 Python/3.7.4 Darwin/20.3.0 exe/x86_64 prompt/off

Encrypt with AWS KMS key:

λ aws kms encrypt --profile personal \
--key-id e2695b79-cbe0-4c16-aa5e-b7dbf52df1f9 \
--plaintext "string-to-encrypt" \
--output text \
--query CiphertextBlob \
--cli-binary-format raw-in-base64-out
AQICAHjbJrIPgME ... lILuBSUdA==

Decrypt with the AWS KMS key (base64 -D is the macOS flag; on Linux use base64 -d):

λ echo "AQICAHjbJrIPgME ... lILuBSUdA==" | base64 -D | \
aws kms decrypt --profile personal \
--ciphertext-blob fileb:///dev/stdin \
--output text \
--query Plaintext | base64 -D
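With boto3 the same decrypt looks similar; a sketch (the profile name is the one above, and the kms call needs real AWS credentials, so it is left commented out). The point worth remembering: the CLI prints CiphertextBlob base64-encoded, while boto3 works with raw bytes, so decode first:

```python
import base64

def ciphertext_from_cli(ciphertext_b64: str) -> bytes:
    """Convert the base64 CiphertextBlob printed by the AWS CLI into
    the raw bytes that boto3's kms.decrypt expects."""
    return base64.b64decode(ciphertext_b64)

# import boto3
# kms = boto3.session.Session(profile_name="personal").client("kms")
# resp = kms.decrypt(CiphertextBlob=ciphertext_from_cli("AQICAHjbJrIPgME ... lILuBSUdA=="))
# print(resp["Plaintext"].decode())
```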


A Modern Architecture Application

RAD (Rapid Application Development) of a serverless "Notification Service" application on modern technologies, e.g. AWS CDK & SAM, AWS Step Functions, TypeScript, VS Code, OpenAPI top-down design, and Test-Driven Development, in order to rapidly build a prototype or a POC and to verify and test some technologies and approaches.

Request Handler => Step Functions (orchestration of Lambda functions; represents a single, centralized, executable business process; outsources low-level operations like retry and exception catching/handling. Another choice is SNS) => Service Providers

Having experience with Terraform, Serverless, AWS SAM … this time, based on the code-over-configuration principle, what you get is flexibility, predictability, and more control. You focus on code: you tell the tools directly what steps to complete. At the end of the day, it is a simple matter of separation of concerns and the single responsibility principle.

• VS Code for API Spec editing

• Postman API, Environment and Mock server for the QA team, then switch to the real service in the DEV/TEST environment

λ npm run openapi

• openapi-generator generates model classes; typescript-json-validator generates a JSON Schema and validator

λ openapi-generator generate -g typescript-node -i Notification\ API\ openapi.json -o Notification\ API\ generated
λ npx typescript-json-validator notificationRequest.ts NotificationRequest

• Onboard on Kong / API Manager, https://konghq.com/kong/

• CDK is based on CloudFormation, but is an abstraction layer on top of it. It generates the CloudFormation template file template.yaml:

λ cdk synth --no-staging > template.yaml

• Demo of locally running and debugging a Lambda, with a background TSC watch process

λ npm run watch

λ sam local invoke RequestNotification9F9F3C31 -e samples/api-gateway-notification-event.json
λ sam local invoke RequestNotification9F9F3C31 -e samples/api-gateway-notification-event.json -d 5858

Data validation, to make data integrity unbreachable, will take a lot of time.

ajv framework and its performance benchmark: https://github.com/ebdrup/json-schema-benchmark

• Code lint with eslint and prettier, with automatic correction

• Code commit rule enforcement

• Change code and deploy the AWS stack with CDK:

λ cdk deploy --require-approval never --profile dev-cicd

• Behavior-driven test framework Jest, https://github.com/facebook/jest, 2x/3x faster than Karma, with code coverage and easy mocking

λ npm t

• Automatically generate the application changelog and release notes:

λ npm run release:minor

• Automatically generate application documentation:

λ npm run docs

• AWS resources created by CDK

• Not a mono-repo app, in which multiple projects all live under one giant repo

• ONE AWS Layer to put all dependent NPM libs and shared code into; reduces the size of the Lambda functions and improves readability

• AWS EventBridge to trigger and send events to the Request Handler, for scheduled tasks

• Health Check, with a Service Monitoring Dashboard: verify dependencies at the endpoints and keep the Lambdas warm:

λ curl https://c81234xdae8w1a9.execute-api.ap-southeast-2.amazonaws.com/health

Cloud computing and serverless architecture put developers in the fast lane for application development. Right now, there is plenty of low-hanging fruit to pick.

As developers, we should not always think about our comfort zone; we need to think about the people who take over our work, and about the BAU team that supports the application. The codebase is not about you, but about the value your code brings to others and to the organization you work for.

Bring back MagSafe

My first published video, created with Apple Final Cut Pro, on YouTube for the official channel, titled Bring back MagSafe. It is about a solution that brings one of Apple's most innovative designs back to the MacBook Pro, iPad … and Android phones: https://www.youtube.com/watch?v=yvkJR4Y0FK0

Risk Management for CI/CD processes

Consider a full development and deployment cycle, and the potential risks involved at the different stages of CDP (CI / Continuous Integration, CD / Continuous Delivery, CDP / Continuous Deployment):

  • Code
      - Stakeholders: Individual Developer, Pair Programming Mentor, DBA, Security Team
      - Failure Points: Logic flaws, Security flaws, Code standards issues
      - Safeguards: Test Driven Development (Red/Green/Refactor), Linting tools, Testing Docker containers, Pair programming, Query analysis, Static code analysis
  • Commit
      - Stakeholders: Security Team Member for sign-off, Engineering Team Lead for sign-off
      - Failure Points: Force pushes, Merge conflicts
      - Safeguards: Master branch protections, 3-member sign-off before master merge, Commit hooks
  • Test
      - Stakeholders: Individual Developer, QA Team
      - Failure Points: Broken tests, Stale tests, False positive tests
      - Safeguards: Weekly failure-testing triage meeting to catch broken tests, Daily cron runs of the test suite against a mock prod environment
  • Deployment
      - Stakeholders: SysOps Team, Individual Developers, Support Team, Customers
      - Failure Points: Broken deployments, Dropped customer traffic
      - Safeguards: Blue/Green deployment, Traffic re-routing, Pre-deployment spare instance warmup, Communicating out to support in order to verify proper staffing levels
  • Runtime
      - Stakeholders: Security Team, SysOps Team, Engineering Teams, Support Team, Customers
      - Failure Points: High resource usage, Slow queries, Malicious actors, Provider downtime
      - Safeguards: Communicating out to support for new-feature awareness and appropriate issue categories for the component, System resource alarms for various metrics and slow-DB-query log alerts, Instant maintenance-page switchover capabilities, Status page on redundant providers, Application firewalls, Database replicas

AWS CloudWatch Metrics Example

The interface of Metrics in the AWS CloudWatch console:

(screenshot: AWS CloudWatch - Metrics)

The URL:

https://ap-southeast-2.console.aws.amazon.com/cloudwatch/home?region=ap-southeast-2#metricsV2:graph=~(metrics~(~(~'AWS*2fRoute53Resolver~'InboundQueryVolume)~(~'.~'OutboundQueryVolume))~view~'timeSeries~stacked~false~region~'ap-southeast-2~stat~'Sum~period~86400~start~'-P28D~end~'P0D);query=~'*7bAWS*2fRoute53Resolver*7d

Metrics source:

{
  "metrics": [
    [ "AWS/Route53Resolver", "InboundQueryVolume" ],
    [ ".", "OutboundQueryVolume" ]
  ],
  "view": "timeSeries",
  "stacked": false,
  "region": "ap-southeast-2",
  "stat": "Sum",
  "period": 86400,
  "title": "Test"
}
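The console's metrics source can be converted into a GetMetricData request body programmatically; a sketch (the "." namespace is console shorthand for "same as the previous line", and the m0/m1 query ids are made up):

```python
def to_metric_data_queries(source: dict) -> list:
    """Turn a CloudWatch console 'metrics source' JSON into the
    MetricDataQueries list accepted by cloudwatch:GetMetricData."""
    queries, prev_ns = [], None
    for n, (namespace, metric) in enumerate(source["metrics"]):
        if namespace == ".":  # console shorthand: repeat previous namespace
            namespace = prev_ns
        prev_ns = namespace
        queries.append({
            "Id": f"m{n}",
            "MetricStat": {
                "Metric": {"Namespace": namespace, "MetricName": metric},
                "Period": source["period"],
                "Stat": source["stat"],
            },
        })
    return queries
```

The resulting list can be passed to boto3's `cloudwatch.get_metric_data` as MetricDataQueries, together with the start/end times from the URL.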