The current configuration (through July 6 2022) hosts the production, staging, develop, and independent developer databases all on a single RDS server. While this makes it very fast and easy to replicate production data to the develop and staging databases, it does mean a shared database server. The memory and CPU resource will be utilized by not only the production customers but by our QA and dev teams whenever they access these secondary databases.
Current Copy Process
A set of tools has been built into the OmniBlocks app (FE and BE) that allow a super admin to access some system configuration tools. One of these tools is the data copy tool. This tool basically has two modes, copy live data, and copy the copy to other places (staging, develop, etc.). The tool runs a ‘create database from <other-database” pgSQL command. It is super fast but requires NOBODY be accessing the source database. In other words it terminates all active live connections.
This works fine in the evening when our current customers are not using the system, but it is not an option when we need to make data copies during daytime hours. Also, we will eventually have 24×7 customers and this will require a new methodology.
Proposed Process – Create A New RDS Instance
A better option is to create a new RDS instance using the AWS system snapshots. These snapshots make a copy of the entire RDS image every night. These can be used to start a new instance.
The .env configuration files on develop (or staging, or local dev boxes) would need to be updated to point to the new server. It also means that the latest production data is only in the main ds_prod_test (great name, TP) database, NOT the ds_stage or ds_dev databases. As such the .env file on develop, staging, or other API servers need to not only update the DB_HOST address but the DB_NAME to be ds_prod_test.
AWS Process
Login to AWS and go to the RDS service.
Look for snapshots and find the latest snapshot, likely from the System Services if restoring from the automated nightly snapshot. Could be from a manual snapshot if one was created for “more recent than 1AM” data copies.
Restore the snapshot, leaving most settings intact.
It does NOT need to be multizone nor have any maintenance updates applied.
It should also be run on the smallest instance size available (prev generations db.m3.small).
Make sure to add the main VPCPostgres security group and drop default.
Networking additional configuration, set make publicly accessible to yes.
Set backup retention to 1 day.
Name the instance something identifiable such as “develop-from-system-snapshot”.
Create the instance. It will take a bit of time.
July 7th 2022, started creating at 0934 EST… finished before 1016 EST.
Update The API Server Config
The .env file used by the REST API server needs to be updated to point to the new server (DB_HOST) and new database name (DB_NAME).
Login to the develop server using SSH to access the command line.
Connect to the Jenkins user:
sudo su - jenkins
Go to the API directory
cd /var/lib/jenkins/workspace/dev_backend
Edit the .env file (vim?) and change the DB_HOST and DB_NAME.
DB_HOST needs to match the endpoint of the newly-created RDS server (develop-from-system-snapshot.czdhjjbfynwe.us-east-1.rds.amazonaws.com)
DB_NAME needs to be changed from ds_dev (the development database) to ds_prod_test (the production database normally, but not on our cloned server).**
Restart the API service…
yarn pm2 stop all
yarn start
Other Considerations
Keeping Develop Database vs. “Cloned Production Database”
Instead of connecting the API servers to the production database (editing .env and changing DB_NAME) it CAN remain connected to the ds_dev (or ds_stage on staging) database. This is less risky, however it now requires the original DB Copy Tool noted at the start of this article.
To use this method…
DO NOT change the DB_NAME as noted in “Update The API Server Config”. Instead, only change the DB_HOST, continue with the restart the API service.
Using the front end app, login as a system admin at https://develop.omniblocks.app.
Go to the system / tools module.
Copy live data (this will only shut off access to live data on the CLONED server, assuming you’ve set DB_HOST and restarted PM2 properly first). This should take 15m or less.
When copy live data is finished use the other tool to update the develop database.
Now logging into the develop server should come up with the latest production data while connected to the ds_dev database on the clone server.
Problem Investigation
Check the log files at /var/lib/jenkins/workspace/dev_backend/logs , PM2 logs connection I/O and errors much like nginx server logs here.
Performance Notes
Sadly the db.tX instances are not longer available in the US East zone, despite being offered by AWS. That means the DB needs to be on a DB.M* instance. Current setting is db.m5.large.
Recommended to enable performance insights to see what is going on internally on the RDS requests in real-time.
- Copy live data (July 7 2022) took 13 minutes (16:43 to 16:56) on this setting
Alternative pg_dump method
A new EC2 server has been started to test some data copy tools. It is a t2.small burstable instance with PostgreSQL 11 command line tools available.
copydata.sh
This is a bash script that uses pg_dump to dump and restore a single data table between servers. The initial run copying from the Dev RDS Server ds_prod_test “aka production, but on dev server” table => ds_dev on the same Dev RDS Server was super slow.
May be worth testing swapping out the source host to the actual production server AFTER HOURS and monitoring performance vs. the RDS snapshot restore method noted above.
The server
Access with SSH key (same as EC2 middleware dev SSH key):
ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i ~/Desktop/Research\ Blocks/pem\ keys/dsprodbackkey.pem ec2-user@18.207.228.31
The script…
#!/bin/sh
#
# RDS Dev host: develop-from-system-snapshot.czdhjjbfynwe.us-east-1.rds.amazonaws.com
#
#
# Prod DB: ds_prod_test
# Dev DB: ds_dev
# Staging DB: ds_stage
#
#
echo 'Copying data from PROD SQL server => DEV SQL server'
echo
pg_dump -C -W -h develop-from-system-snapshot.czdhjjbfynwe.us-east-1.rds.amazonaws.com -U postgres ds_prod_test | psql -W -h develop-from-system-snapshot.czdhjjbfynwe.
us-east-1.rds.amazonaws.com -U postgres ds_dev
echo
exit;