Allow configurable and dynamic s3 path #134

Closed
arihantsurana opened this issue Jun 25, 2018 · 1 comment

Comments

@arihantsurana
Contributor

Currently, the S3 loader writes all files to a single S3 directory. This makes it hard to use with Athena or other Hive-compatible query engines, because time-based partitions cannot be set up through the directory structure. It also makes the data on S3 harder to maintain and manage.
Please add an optional configuration setting for loading data into S3 under a directory structure whose placeholders are filled with values from the S3 loader's server timestamp.
For example:

configure to store in monthly chunks:
s3_path = "base_dir/enriched/good/yr={YYYY}/mo={MM}"

store in base directory:
s3_path = "some_dir/raw"

store in hourly chunks:
s3_path = "base_dir/enriched/good/date={YYYY-MM-dd}/hour={HH}"
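
To make the request concrete, here is a rough sketch of how such a template could be expanded. It is in Python purely for illustration and is not the loader's actual code. The function name `resolve_s3_path` and the token table are hypothetical. The sketch assumes the only pattern letters are the ones used in the examples above (`YYYY`, `MM`, `dd`, `HH`) and that the timestamp is the loader's server time.

```python
import re
from datetime import datetime, timezone

# Hypothetical mapping from the pattern letters in the examples above
# to Python strftime codes. Only these four tokens are handled.
TOKENS = {"YYYY": "%Y", "MM": "%m", "dd": "%d", "HH": "%H"}
_TOKEN_RE = re.compile("|".join(TOKENS))


def resolve_s3_path(template: str, ts: datetime) -> str:
    """Expand every {...} placeholder in template using the timestamp ts."""

    def fill(match: re.Match) -> str:
        # Translate e.g. "YYYY-MM-dd" into "%Y-%m-%d", then format the timestamp.
        fmt = _TOKEN_RE.sub(lambda m: TOKENS[m.group(0)], match.group(1))
        return ts.strftime(fmt)

    # A template without placeholders is returned unchanged.
    return re.sub(r"\{([^{}]+)\}", fill, template)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)  # stands in for the server timestamp
    print(resolve_s3_path("base_dir/enriched/good/date={YYYY-MM-dd}/hour={HH}", now))
```

With a layout like `date=.../hour=...`, each directory level can be registered as a partition in Athena or Hive, which is the point of the request.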
@arihantsurana
Contributor Author

Added a PR to address this: #135

@BenFradet BenFradet changed the title allow configurable and dynamic s3 path Allow configurable and dynamic s3 path Jul 4, 2018