SQL Runner
A routine to run one or more SQL queries in Redshift.
Objective
To build a generic, lightweight application that runs synchronously in a k8s pod and executes submitted queries in Redshift.
Motivation
In many organisations, data preparation runs continuously in a Redshift cluster. Over the past couple of years, Kubernetes (k8s) has increasingly been used as the foundation for organisation-wide data platforms.
Python is a great programming language and the de facto default technology for most data engineering teams. The services they made
Steps
- Connect to S3
- Connect to Redshift
- Read the query from the S3 bucket
- Substitute the query parameters
- Split the query into individual sub-queries
- Loop over the list of sub-queries and execute each one
- Close connections
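
The parameter substitution, splitting, and execution steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`render_query`, `split_statements`, `run_statements`) are hypothetical, the `$name` placeholder syntax is an assumption, and the S3 read and Redshift connection are left out because they depend on the environment. The sketch assumes the caller passes in the query text and any Python DB-API connection.

```python
import sqlite3
from string import Template


def render_query(template_text, params):
    """Fill $name placeholders in the query text (assumed placeholder syntax).

    Raises KeyError if a placeholder has no value in params.
    Plain text substitution is only suitable for trusted parameters.
    """
    return Template(template_text).substitute(params)


def split_statements(sql_text):
    """Split on semicolons, skipping those inside single-quoted literals.

    SQL comments and dollar-quoted strings are not handled in this sketch.
    """
    statements, current, in_string = [], [], False
    for char in sql_text:
        if char == "'":
            in_string = not in_string
        if char == ";" and not in_string:
            statements.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    statements.append("".join(current).strip())
    # Drop empty fragments such as the one after a trailing semicolon.
    return [s for s in statements if s]


def run_statements(connection, statements):
    """Execute each statement in order; commit at the end, roll back on error."""
    cursor = connection.cursor()
    try:
        for statement in statements:
            cursor.execute(statement)
        connection.commit()
    except Exception:
        connection.rollback()
        raise
    finally:
        cursor.close()


if __name__ == "__main__":
    # sqlite3 stands in for Redshift here so the sketch runs without a cluster.
    query = "CREATE TABLE $table (v TEXT); INSERT INTO $table VALUES ('a;b');"
    conn = sqlite3.connect(":memory:")
    try:
        run_statements(conn, split_statements(render_query(query, {"table": "demo"})))
    finally:
        conn.close()  # the "Close connections" step
```

Keeping the three functions separate means the substitution and splitting logic can be tested without a database, and the execution loop works with whichever driver supplies the connection.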