Provisioning AWS Athena queries with Lambda and StepFunctions

Saturday, 30 May 2020, 23:20

Amazon Athena is a brilliant tool for data processing and analytics in the AWS cloud. Under the hood it uses the Presto engine to query and process data in your S3 storage using standard SQL. The concept behind it is truly simple: run SQL queries against your data in S3 and pay only for the resources consumed by the query. There is no cluster to manage, since everything is fully serverless and managed by Amazon; no new technology to learn, since you query data with SQL that your team most likely already knows; and no additional storage or fees, since you store data directly in S3. Did I mention the keyword serverless? Yes, it runs completely through the API and SDK, with no need to manage any resources on your own (and cold-start time in Athena is very low). It also integrates tightly with Glue. When we say serverless in AWS we mainly think of Lambda. And for sure, sooner or later you will want to integrate your query into some more complex workflow.
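To make the "runs completely through the API and SDK" point concrete, here is a minimal sketch of starting an Athena query from a Lambda function with boto3 and polling until it finishes. The helper name `run_athena_query`, the event keys (`sql`, `database`, `output_location`) and the polling limits are assumptions made for illustration, not part of any AWS contract; only the boto3 Athena client calls `start_query_execution` and `get_query_execution` are real API operations.

```python
import time

# Athena reports these states once a query has stopped running.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def run_athena_query(client, sql, database, output_location,
                     poll_seconds=1.0, max_polls=60):
    """Start a query and poll until it reaches a terminal state.

    `client` is expected to behave like boto3.client("athena").
    Returns a (query_execution_id, final_state) tuple.
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        # Athena writes result files to this S3 location.
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    for _ in range(max_polls):
        response = client.get_query_execution(QueryExecutionId=query_id)
        state = response["QueryExecution"]["Status"]["State"]
        if state in TERMINAL_STATES:
            return query_id, state
        time.sleep(poll_seconds)

    raise TimeoutError(f"Query {query_id} did not finish after {max_polls} polls")


def lambda_handler(event, context):
    # boto3 is imported here so the helper above can be exercised
    # with a stand-in client outside of AWS.
    import boto3

    client = boto3.client("athena")
    # The event keys below are hypothetical; adapt them to your own payload.
    query_id, state = run_athena_query(
        client,
        sql=event["sql"],
        database=event["database"],
        output_location=event["output_location"],
    )
    return {"QueryExecutionId": query_id, "State": state}
```

Polling inside a single Lambda invocation is the simplest option, but the function keeps running (and billing) while it waits, which is one reason to move the waiting into a larger workflow once queries take longer.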

