Hi there!
I’m trying to build an extract-transform-load (ETL) pipeline that involves a few S3 buckets using Serverless. Here’s the relevant config:
# serverless.yml
functions:
  extract:
    handler: handler.extract
    timeout: 60
    events:
      - schedule: cron(0 7 ? * MON-FRI *)
  transform:
    handler: handler.transform
    events:
      - s3: ${S3_RAW_BUCKET}
  load:
    handler: handler.load
    events:
      - s3: ${S3_TRANSFORMED_BUCKET}
# serverless.env.yml
vars:
stages:
  dev:
    vars:
      S3_RAW_BUCKET: "raw-dev"
      S3_TRANSFORMED_BUCKET: "transformed-dev"
    regions:
      eu-west-1:
        vars:
Hopefully this is understandable - handler.extract will extract data (from a URL) periodically and store the raw data in an S3 bucket. handler.transform will change the data and store the transformed data in a different S3 bucket. handler.load will load it into our database.
We’re running into problems with the S3 bucket names: handler.extract needs to know the name of the S3 bucket it’s dropping into. I specified it as raw-dev, but on inspection in S3 the name of the bucket is raw-dev0; as a result, I get a NoSuchBucket error from S3.
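To illustrate where the mismatch bites, here is a minimal sketch (not our actual handler) of how handler.extract might look up the bucket and write to it. It assumes the name is exposed to the function as an environment variable called S3_RAW_BUCKET and that the upload goes through boto3's put_object; the object key, the payload and the s3_client/env parameters are placeholders added for illustration.

```python
import os


def raw_bucket_name(env=os.environ):
    """Return the bucket extract should write to.

    Assumption: the name is passed in via an S3_RAW_BUCKET environment variable.
    """
    name = env.get("S3_RAW_BUCKET")
    if not name:
        raise RuntimeError("S3_RAW_BUCKET is not set")
    return name


def extract(event, context, s3_client=None, env=os.environ):
    """Hypothetical extract handler: store a raw payload in the raw bucket."""
    bucket = raw_bucket_name(env)
    if s3_client is None:
        import boto3  # imported lazily so the sketch can be tested without AWS

        s3_client = boto3.client("s3")
    # If the deployed bucket is really named "raw-dev0" while `bucket` is
    # "raw-dev", this is the call that fails with NoSuchBucket.
    s3_client.put_object(Bucket=bucket, Key="raw/latest.json", Body=b"{}")
    return bucket
```

With this shape, the function only works if the name it reads matches the name of the bucket that was actually created, which is exactly what breaks when a 0 gets appended.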
Where did the 0 come from, and is there a way to get rid of it?
Thanks for your help!