Cross-Functional Architecture And Tools For Cloud-Based Operating Models
This is a full-fledged data lake house appliance that includes several systems, complete with networking, IAM roles, source data ingested into S3, EMR clusters, a Glue Crawler example, and a SageMaker Studio domain.
The embedded video on this page will ask you to cut and paste things from the text below, and will also illustrate each step being done. The architecture of the appliance created on this page is described in this other video in the architecture section of this web site.
You will need a provisioned DevBox, running either Windows or Linux, in order to orchestrate this appliance using the Agile Cloud Manager.
For convenience, we have defined a simple process for spinning up a DevBox at this link.
If you already have a working DevBox that you created for one of the other appliance examples, you can reuse that DevBox for this example if you do the following things first:
You will need an administrative user with specific permissions in order to run this example appliance.
You will also need to be working in the us-east-1 (N. Virginia) AWS region because some of the resource types are only available in us-east-1.
Begin by switching to the us-east-1 (N. Virginia) region as shown in the following screenshot.
Then open up a new CloudShell again, this time in the us-east-1 region.
Type the following command to download the template required to create the administrative user:
wget "https://github.com/AgileCloudInstitute/aws-building-blocks/blob/master/cf/lf-admin-user.yaml?raw=true" -O lf-admin-user.yaml
Then create the administrative user with the following command:
aws cloudformation create-stack --stack-name adminUser --capabilities CAPABILITY_NAMED_IAM --template-body file://lf-admin-user.yaml
Once the CLI command has run, open the AWS CloudFormation service in another browser tab and make sure to switch to the us-east-1 (N. Virginia) region. Locate the stack, which should be named adminUser, and wait for the stack status to show “CREATE_COMPLETE”.
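As an alternative to refreshing the console, the wait can also be done from CloudShell. The following is a minimal sketch, assuming the stack name adminUser and the us-east-1 region used above. The function name wait_for_stack is hypothetical; the two aws cloudformation subcommands it calls (wait stack-create-complete and describe-stacks) are standard AWS CLI commands.

```shell
#!/usr/bin/env bash
# Sketch: block until a CloudFormation stack reaches CREATE_COMPLETE,
# then print its final status. The function name is illustrative only.
wait_for_stack() {
  local stack_name="$1"
  local region="${2:-us-east-1}"

  # "wait" polls the stack and returns non-zero on failure or timeout.
  if ! aws cloudformation wait stack-create-complete \
        --stack-name "$stack_name" --region "$region"; then
    echo "Stack $stack_name did not reach CREATE_COMPLETE" >&2
    return 1
  fi

  # Print the status so it can be compared with what the console shows.
  aws cloudformation describe-stacks \
    --stack-name "$stack_name" --region "$region" \
    --query "Stacks[0].StackStatus" --output text
}

# Usage (in CloudShell): wait_for_stack adminUser us-east-1
```

If the function reports a failure, the Events tab of the stack in the CloudFormation console usually shows which resource could not be created.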
Get the AWS Secrets as follows:
You can see how this will look in the next section.
Create a file in the DevBox named keys.yaml at $USER\acm\keys\starter\keys.yaml
After you have generated the keys for the newly-created admin user in STEP TWO above, place the AWSAccessKeyId and AWSSecretKey into your $USER\acm\keys\starter\keys.yaml under secretsType: master, so that your entire keys.yaml will look as follows:
secretsType: master
AWSAccessKeyId: <ACTUAL-ID-REDACTED>
AWSSecretKey: <long-alpha-numeric-actual-key-redacted>
Create a file in the DevBox named config.yaml in the $USER\acm\keys\starter directory and add the following precise contents to the file (REMOVE THE INDENTATION FROM EACH LINE BUT KEEP EVERYTHING ELSE EXACTLY AS IT IS):
TPCDBName: tpc
DBMasterUser: tpcadmin
DBMasterPassword: BigData26!
EEKeyPair: MyKeyPair
LatestAmiId: /aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2
organization: lhf3e
networkName: name-of-vnet
sysName: name-of-system
lhFoundationStackName: lhfoundation
region: us-east-1
CFNDatabaseName: tpc
lfUsersStackName: lh-iam-users
lfgluStackName: glue-database
ClassificationVal1: Sensitive
ClassificationVal2: Non-Sensitive
GroupVal1: developer
GroupVal2: campaign
GroupVal3: analyst
emrEngineerStackName: lhengineer
ReleaseLabel: emr-6.7.0
InstanceType: m4.large
EC2KeyPair: MyKeyPair
emrEngineerUserStackName: lh-engineer-users
EMRStepUserPassword: 2!PutRealPasswordInKeysYaml
emrScientistStackName: emr-scientist
glueScientistStackName: glue-scientist
lhScienceFoundationStackName: foundation-scientist
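Before moving on, it can help to confirm that both files exist and contain the expected keys. The following is a minimal sketch for a bash shell (a Linux DevBox, or Git Bash on Windows). The function name check_yaml_keys and the KEYS_DIR variable are hypothetical, and the check only looks for top-level keys that start in column one, which matches the unindented layout described above.

```shell
#!/usr/bin/env bash
# Sketch: report any required top-level key missing from a flat YAML file.
# Usage: check_yaml_keys <file> <key> [<key> ...]
check_yaml_keys() {
  local file="$1"; shift
  local missing=0 key

  if [ ! -f "$file" ]; then
    echo "missing file: $file" >&2
    return 1
  fi

  for key in "$@"; do
    # Keys must start in column one, because the files use no indentation.
    if ! grep -q "^${key}:" "$file"; then
      echo "missing key: $key in $file" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example, with KEYS_DIR standing in for your acm keys "starter" directory:
# check_yaml_keys "$KEYS_DIR/keys.yaml" secretsType AWSAccessKeyId AWSSecretKey
# check_yaml_keys "$KEYS_DIR/config.yaml" region organization ReleaseLabel
```

The function only checks that a key is present. It does not validate the values, so a wrong password or region would still pass.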
After you have confirmed that $USER\acm\keys\starter\keys.yaml and $USER\acm\keys\starter\config.yaml have been properly created, navigate the command line to any directory into which you want the Agile Cloud Manager to place resources for the given appliance.
Check the version by running the following command:
$ acm version
1.3
The version must be at least 1.3 to successfully run the Lake House example appliance. If you have a lower version installed, you will need to upgrade to the latest version, or at least to version 1.3.
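The version comparison can be scripted rather than checked by eye. The following is a minimal sketch, assuming that acm version prints only the bare version number, as in the sample output above. The function name version_at_least is hypothetical, and the comparison relies on sort -V from GNU coreutils.

```shell
#!/usr/bin/env bash
# Sketch: succeed when version $1 is greater than or equal to version $2.
# Relies on "sort -V" (version sort) from GNU coreutils.
version_at_least() {
  local actual="$1" required="$2"
  # The lower of the two versions sorts first; if that is the
  # required one, the actual version is high enough.
  [ "$(printf '%s\n%s\n' "$required" "$actual" | sort -V | head -n 1)" = "$required" ]
}

# Example: version_at_least "$(acm version)" 1.3 || echo "Upgrade acm first" >&2
```

A version sort is used instead of a numeric comparison so that a release such as 1.10 is treated as newer than 1.3.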
Download and install all the requirements for the Lake House appliance by running the following CLI command in the new directory into which you want the Agile Cloud Manager to place the resources for the appliance:
acm setup on sourceRepo=https://github.com/AgileCloudInstitute/acm-demo-lake-formation.git
After the setup command finishes, confirm that your current working directory contains an acmAdmin subdirectory and an acmConfig subdirectory, in addition to subdirectories for the other repositories that are listed in the appliance’s setupConfig.yaml file. Now that the acm setup on sourceRepo=https://github.com/AgileCloudInstitute/acm-demo-lake-formation.git command has completed, you should be able to find setupConfig.yaml inside the acmConfig subdirectory.
For example, on a Windows DevBox, you might run the dir command and see:
C:\path\to\mydirectory>dir
Volume in drive C is Windows
Volume Serial Number is 3E5E-9650
Directory of C:\p\a\acm_dl
10/25/2023 04:48 PM <DIR> .
10/23/2023 09:09 AM <DIR> ..
10/23/2023 09:35 AM <DIR> acm-system-templates
10/23/2023 09:34 AM <DIR> acmAdmin
10/23/2023 09:34 AM <DIR> acmConfig
10/23/2023 09:35 AM <DIR> aws-building-blocks
The acmAdmin and acmConfig subdirectories will be present for any acm working directory after setup is run. The acm-system-templates and aws-building-blocks subdirectories are specific to this demo, and their names can be validated by examining the contents of the setupConfig.yaml file that you will find inside the acmConfig directory.
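On a Linux DevBox, or in Git Bash on Windows, the same check can be scripted. The following is a minimal sketch; the function name check_subdirs is hypothetical, and the four directory names in the example are the ones shown in the listing above.

```shell
#!/usr/bin/env bash
# Sketch: list which expected subdirectories are absent from a working directory.
# Usage: check_subdirs <working-dir> <name> [<name> ...]
check_subdirs() {
  local base="$1"; shift
  local missing=0 name
  for name in "$@"; do
    if [ ! -d "$base/$name" ]; then
      echo "missing subdirectory: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example, run from the acm working directory:
# check_subdirs . acmAdmin acmConfig acm-system-templates aws-building-blocks
```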
You can create the entire appliance by running “acm appliance on”, and you can destroy the entire appliance by running “acm appliance off”.
But it is better to run smaller, more narrowly-scoped commands the first time.
Narrowly-scoped commands make it easier for you to learn how to examine the logs and to troubleshoot to understand what is going on.
Therefore, run the following create commands one at a time in sequence. Wait until each command has finished running, and monitor the progress in the AWS GUI console to see everything working properly. If you encounter any errors, examine the log files.
acm foundation on systemName=lakehouse-core
acm services on systemName=lakehouse-core
acm foundation on systemName=lakehouse-engineer
acm services on systemName=lakehouse-engineer
acm foundation on systemName=lakehouse-scientist
acm services on systemName=lakehouse-scientist
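Once you are comfortable with the individual commands, the sequence above could be wrapped in a small script that stops at the first failure. The following is a minimal sketch under two assumptions that the text above does not confirm: that acm exits with a non-zero status when a command fails, and that you are using a bash shell. The function name acm_on_all and the ACM_BIN override are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: run the six "on" commands in order and stop at the first failure.
# ACM_BIN is a hypothetical override so the sequence can be previewed
# (for example ACM_BIN=echo) without touching any cloud resources.
acm_on_all() {
  local acm="${ACM_BIN:-acm}"
  local system layer
  for system in lakehouse-core lakehouse-engineer lakehouse-scientist; do
    for layer in foundation services; do
      echo ">>> $layer on systemName=$system"
      "$acm" "$layer" on "systemName=$system" || {
        echo "FAILED: $layer on systemName=$system (check the log files)" >&2
        return 1
      }
    done
  done
}

# Preview only: ACM_BIN=echo acm_on_all
```

Running the commands by hand is still the better way to learn the tool, because it leaves time to watch each stack appear in the console before the next command starts.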
Then after you get all the above “on” commands working properly, run the “off” commands one at a time as follows:
acm services off systemName=lakehouse-scientist
acm foundation off systemName=lakehouse-scientist
acm services off systemName=lakehouse-engineer
acm foundation off systemName=lakehouse-engineer
acm services off systemName=lakehouse-core
acm foundation off systemName=lakehouse-core
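Note that the “off” commands are the “on” commands in exactly reversed order, with each system's services removed before its foundation. The following is a minimal sketch that derives the teardown list from the creation list, which can help if you later adapt the sequence. The function names are hypothetical, and tac comes from GNU coreutils. The script only prints commands; it does not run them.

```shell
#!/usr/bin/env bash
# Sketch: derive the teardown sequence by reversing the creation sequence,
# mirroring the order of the "off" commands listed above.
print_on_commands() {
  local system
  for system in lakehouse-core lakehouse-engineer lakehouse-scientist; do
    echo "acm foundation on systemName=$system"
    echo "acm services on systemName=$system"
  done
}

print_off_commands() {
  # tac reverses the line order; sed flips "on" to "off".
  print_on_commands | tac | sed 's/ on systemName=/ off systemName=/'
}

# print_off_commands only prints; review the output before running any line.
```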
You can watch the appliance being created in the following two ways:
After you have gotten all of the narrowly-scoped commands working, you can create and then destroy the entire appliance by running the following two commands one after the other:
acm appliance on
acm appliance off
Confirm in the AWS GUI console that all involved stacks have been deleted before moving on. If you encounter a problem, you can diagnose the problem by examining the logs. If you need help, create a ticket at the project website and someone will respond to help you in a timely manner.
Experiment with other CLI commands after the appliance has been destroyed. The other CLI commands will enable you to create and destroy individual components of the appliance.
The documentation for the CLI commands is at this link.
You can read about the language that defines the objects on which the CLI commands work at this second link.
And you can read about operating on the object model using the CLI at this third link.
Back up keys.yaml and config.yaml someplace safe so that you can re-use them later.
Confirm that all relevant stacks have been deleted by viewing the CloudFormation stacks dashboard in the AWS GUI console for the us-east-1 region.
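The same confirmation can be made from CloudShell. The following is a minimal sketch, assuming that the stack names match the values given in config.yaml (for example lhfoundation or glue-database); the names that actually appear in your account may differ, so compare them with the console first. The function name leftover_stacks is hypothetical, while aws cloudformation describe-stacks is a standard AWS CLI command that, when called without a stack name, lists the stacks that have not been deleted.

```shell
#!/usr/bin/env bash
# Sketch: print any of the given stack names that still exist in a region.
# Returns 0 when none are left, 1 when leftovers exist, 2 if the CLI call fails.
# Usage: leftover_stacks <region> <stack-name> [<stack-name> ...]
leftover_stacks() {
  local region="$1"; shift
  local raw existing name found=0

  # Without --stack-name, describe-stacks returns only non-deleted stacks.
  raw="$(aws cloudformation describe-stacks --region "$region" \
    --query "Stacks[].StackName" --output text)" || return 2

  # Text output is tab-separated; put one stack name on each line.
  existing="$(printf '%s' "$raw" | tr '\t' '\n')"

  for name in "$@"; do
    if printf '%s\n' "$existing" | grep -qxF -- "$name"; then
      echo "$name"
      found=1
    fi
  done
  return "$found"
}

# Example (stack names assumed from config.yaml):
# leftover_stacks us-east-1 lhfoundation lh-iam-users glue-database
```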
Run acm setup off if you wish.
Confirm that anything you created has now been deleted.
After you have backed up those files to a safe location, make sure that no keys.yaml or config.yaml remains in your $USER\acm\keys\starter directory.
If you encounter any errors, or if you want to experiment, dig deeper, and potentially clean up after running “acm appliance on” and “acm appliance off”, you can try reading the instructions at this link.
Return to the list of example appliances at this link.