Field | Type | Description |
---|---|---|
id_ | string | ID of the character |
name | string | Name of the character |
description | string | Description of the character |
image | string | URL of the character's image |
This is an example cloud architecture project for a data engineer. Using Marvel's API, an initial scrape creates a file in a local directory called "dataset". This folder is monitored by a Linux service that, whenever a file is created or updated, triggers Terraform to provision the whole pipeline on AWS. In AWS, a bucket is created, and a Lambda process automatically starts that reads its contents, performs the necessary transformations, and populates a DynamoDB table.
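A minimal sketch of what that Lambda step could look like is shown below. It assumes boto3, an S3 put-event trigger, a DynamoDB table named `marvel_characters`, and a dataset file containing a JSON list of Marvel character objects; these names and the file layout are assumptions, not taken from the repository itself.

```python
import json
import urllib.parse

import boto3

s3 = boto3.client("s3")
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("marvel_characters")  # assumed table name


def handler(event, context):
    """Triggered by an S3 put event: read the dataset file, transform each
    character record and write it to DynamoDB."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        obj = s3.get_object(Bucket=bucket, Key=key)
        characters = json.loads(obj["Body"].read())  # assumed: file is a JSON list

        with table.batch_writer() as batch:
            for char in characters:
                # Keep only the fields described in the table above.
                batch.put_item(Item={
                    "id_": str(char["id"]),
                    "name": char["name"],
                    "description": char.get("description", ""),
                    "image": f"{char['thumbnail']['path']}.{char['thumbnail']['extension']}",
                })
```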
To run this project you need to follow a few steps:
- Make sure you have Terraform installed on your machine. If you do not, install it by following the instructions on the official website: https://www.terraform.io/downloads.html
- Clone the project repository on your machine.
- Go to https://developer.marvel.com/ and click "Get a Key".
- Log in with your Marvel account or create a new account.
- Fill in the form with the requested information, such as the name of the application, the name of the developer, the email and the country.
- Accept the API terms and conditions and click "Create Account".
- After creating the account, you will be redirected to the "My Account" page, where you can see your public key and your private key.
- Save these keys on your machine.
- In the cloned project directory, create a parameter file at param/param.json.
- This JSON file must contain the variables that script/insert_on_dataset.py needs, including the API keys (see the sketch after this list).
- Open the variables.tf file and fill in all the requested variables.
- Finally, run the run.sh file with the command "sh run.sh".
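As a reference for the parameter step, the sketch below shows how script/insert_on_dataset.py might load param/param.json and call the Marvel characters endpoint. The key names inside param.json and the output filename are assumptions (check what the script actually reads and writes); the `ts`/`apikey`/`hash` scheme is the authentication that the Marvel API documents for server-side calls.

```python
import hashlib
import json
import time

import requests

# Key names inside param.json are assumptions; match them to whatever
# script/insert_on_dataset.py actually expects.
with open("param/param.json") as f:
    params = json.load(f)

public_key = params["marvel_public_key"]
private_key = params["marvel_private_key"]

# Marvel requires ts, apikey and hash = md5(ts + private_key + public_key).
ts = str(int(time.time()))
auth = {
    "ts": ts,
    "apikey": public_key,
    "hash": hashlib.md5((ts + private_key + public_key).encode()).hexdigest(),
}

resp = requests.get(
    "https://gateway.marvel.com/v1/public/characters",
    params={**auth, "limit": 100},
)
resp.raise_for_status()
characters = resp.json()["data"]["results"]

# Write the scraped characters into the monitored "dataset" directory
# (the filename is an assumption).
with open("dataset/characters.json", "w") as f:
    json.dump(characters, f)
```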
After following these steps, the infrastructure will be ready and the ETL can be executed.
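To spot-check the result of the ETL, something like the following could be run; the table name is an assumption, so use whatever is defined in variables.tf.

```python
import boto3

# Assumed table name; replace with the name configured in variables.tf.
table = boto3.resource("dynamodb").Table("marvel_characters")

# Read back a handful of items to confirm the Lambda populated the table.
response = table.scan(Limit=5)
for item in response["Items"]:
    print(item["name"], "-", item["image"])
print("Items returned by this scan page:", response["Count"])
```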