
ShafiqaIqbal/SFTP-S3-Glue-Ingestion-Python


SFTP-S3-Glue-Ingestion-Python

A Glue batch ingestion job that moves files from a file server to S3. This code accompanies my article, a step-by-step guide to designing a data lake using AWS Glue, Athena, Lambda and QuickSight.

This AWS Glue Python Shell job uploads the initial batch of files to an AWS S3 bucket. It also handles change data capture (CDC): when the job runs again after the initial load, it uploads the files that have been updated since. The code uses the AWS SDK for Python (Boto3). The key features are:

  • Fetches configuration from SSM Parameter Store (file server credentials, folder path, S3 bucket)
  • Connects to the SFTP file server and uploads files that are not yet present on S3 or that have been updated since the last time the Glue job ran
  • Handles failure scenarios automatically with retries and exception handling
  • Handles multipart upload to S3 automatically when a file is larger than 100 MB
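
The features above can be illustrated with a minimal sketch. It is not the repository's actual code: the function names, the timestamp-based change check, and the use of a paramiko-style SFTP client are assumptions made for illustration. The Boto3 calls (`TransferConfig`, `upload_fileobj`) are imported lazily so the selection and retry logic can be read and run on their own.

```python
import time
from datetime import datetime, timezone

# 100 MB threshold taken from the feature list above.
MULTIPART_THRESHOLD_BYTES = 100 * 1024 * 1024


def needs_upload(remote_mtime, s3_last_modified):
    """Return True if the file is missing on S3 or is newer on the file server.

    Both arguments are timezone-aware datetimes; s3_last_modified is None
    when the object does not exist yet. Comparing timestamps is an assumed
    way of detecting changes, not necessarily what the job does.
    """
    if s3_last_modified is None:
        return True
    return remote_mtime > s3_last_modified


def select_files(remote_files, s3_objects):
    """Pick the file names to upload.

    remote_files: {name: modification time on the file server}
    s3_objects:   {key: LastModified reported by S3}
    """
    return sorted(
        name
        for name, mtime in remote_files.items()
        if needs_upload(mtime, s3_objects.get(name))
    )


def with_retries(func, attempts=3, delay_seconds=0.0):
    """Call func(), retrying on any exception up to `attempts` times."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:  # broad on purpose for this sketch
            last_error = exc
            if attempt < attempts:
                time.sleep(delay_seconds * attempt)
    raise last_error


def upload_selected(sftp, s3_client, bucket, remote_dir, names):
    """Stream each selected file from SFTP to S3.

    `sftp` is assumed to be a paramiko SFTPClient and `s3_client` a
    boto3 S3 client. With this TransferConfig, Boto3 switches to
    multipart upload for files above the threshold.
    """
    from boto3.s3.transfer import TransferConfig

    config = TransferConfig(multipart_threshold=MULTIPART_THRESHOLD_BYTES)
    for name in names:
        def _upload(name=name):
            with sftp.open(f"{remote_dir}/{name}", "rb") as remote_file:
                s3_client.upload_fileobj(remote_file, bucket, name, Config=config)
        with_retries(_upload, attempts=3, delay_seconds=2.0)


def mtime_to_utc(epoch_seconds):
    """Convert an SFTP st_mtime (epoch seconds) to an aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
```

On a first run, when `s3_objects` is empty, `select_files` returns every file, which corresponds to the initial batch load. On later runs it returns only new or updated files.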
