I recently deployed the S3 loader in a new setup, part of which works like this:
Scala stream collector -> NSQ topic -> S3 Loader -> S3
This works as expected, with files being written to the S3 bucket with names like this:
2018-02-14-03:52:55.281-03:54:21.489-1579626703.lzo
The ETL process was then kicked off using EmrEtlRunner but would immediately fail on the first step with errors like this:
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: 2018-02-14-03:52:55.281-03:54:21.489-1579626703.lzo
The error was cryptic at first, but the cause became obvious: HDFS treats colons as special characters. The time format defined here (https://github.com/snowplow/snowplow-s3-loader/blob/master/src/main/scala/com.snowplowanalytics.s3/loader/NsqSourceExecutor.scala#L80) is causing the problem. The Kinesis sink uses different file name handling, so the problem doesn't exist there.
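For anyone else puzzled by the message, here is a rough sketch of why a colon in the file name would produce it. It assumes, without having verified this against the Hadoop source used by EMR, that the path parser treats everything before the first colon as a URI scheme when no slash precedes it. `ColonInFileName` and `toUri` are hypothetical names used only for this illustration; only `java.net.URI` is a real API here.

```scala
import java.net.{URI, URISyntaxException}

object ColonInFileName {
  // Assumption: the text before the first colon is taken as the URI scheme
  // when no slash comes before that colon. This mimics, but is not, Hadoop's code.
  def toUri(name: String): Either[String, URI] = {
    val colon = name.indexOf(':')
    val slash = name.indexOf('/')
    val (scheme, path) =
      if (colon != -1 && (slash == -1 || colon < slash))
        (name.substring(0, colon), name.substring(colon + 1))
      else
        (null, name)
    // java.net.URI rejects a scheme combined with a path that does not start with '/'
    try Right(new URI(scheme, null, path, null, null))
    catch { case e: URISyntaxException => Left(e.getReason) }
  }
}
```

With the file name from above, the "scheme" becomes `2018-02-14-03` and the remainder is a relative path, which `java.net.URI` refuses. A name without colons goes through as a plain relative path.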
Would be glad to open a PR since it's a simple fix, but wanted to capture it as an issue for any others with the problem. I'm currently operating with a fork that has the time format changed to HHmmssSSS instead of HH:mm:ss.SSS.
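A minimal sketch of what the changed format produces, for illustration only. It uses `java.time` so that it runs without extra dependencies, which may differ from the date library the loader actually uses. `FileNames`, `baseName`, and the `suffix` parameter are hypothetical names; the meaning of the trailing number in the real file names is not stated above.

```scala
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter

object FileNames {
  // Date part is unchanged: it contains no colons
  private val DateFormat =
    DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC)

  // Colon-free replacement for HH:mm:ss.SSS
  private val TimeFormat =
    DateTimeFormatter.ofPattern("HHmmssSSS").withZone(ZoneOffset.UTC)

  // Builds <date>-<first time>-<last time>-<suffix>, with no colons anywhere
  def baseName(first: Instant, last: Instant, suffix: Long): String =
    s"${DateFormat.format(first)}-${TimeFormat.format(first)}-${TimeFormat.format(last)}-$suffix"
}
```

For the timestamps in the example file above, this would give a base name of 2018-02-14-035255281-035421489-1579626703, which contains no characters that HDFS treats specially.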