S3 outage causes infinite retry loop #44

Closed
alexanderdean opened this issue Aug 10, 2015 · 4 comments

@alexanderdean
Member

There was a 4-hour S3 outage in us-east-1 this morning.

The Kinesis S3 app got "stuck" during the outage, and continues to be stuck in an infinite retry loop:

@4000000055c8d09012bd3bcc Aug 10, 2015 4:25:42 PM com.amazonaws.services.kinesis.metrics.impl.DefaultCWMetricsPublisher publishMetrics
@4000000055c8d09012bd4784 INFO: Successfully published 6 datums.
@4000000055c8d09527e22ea4 Aug 10, 2015 4:25:47 PM com.snowplowanalytics.snowplow.storage.kinesis.s3.S3Emitter attemptEmit$1
@4000000055c8d09527e23674 SEVERE: S3Emitter threw an unexpected exception
@4000000055c8d09527e23a5c com.amazonaws.AmazonClientException: Data read (0) has a different length than the expected (51706225)
@4000000055c8d09527e23a5c   at com.amazonaws.util.LengthCheckInputStream.checkLength(LengthCheckInputStream.java:135)
@4000000055c8d09527e23e44   at com.amazonaws.util.LengthCheckInputStream.read(LengthCheckInputStream.java:103)
@4000000055c8d09527e29434   at com.amazonaws.services.s3.internal.MD5DigestCalculatingInputStream.read(MD5DigestCalculatingInputStream.java:84)
@4000000055c8d09527e2981c   at org.apache.http.entity.InputStreamEntity.writeTo(InputStreamEntity.java:98)
@4000000055c8d09527e2981c   at com.amazonaws.http.RepeatableInputStreamRequestEntity.writeTo(RepeatableInputStreamRequestEntity.java:153)
@4000000055c8d09527e29c04   at org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:98)
@4000000055c8d09527e2d2b4   at org.apache.http.impl.client.EntityEnclosingRequestWrapper$EntityWrapper.writeTo(EntityEnclosingRequestWrapper.java:108)
@4000000055c8d09527e2d69c   at org.apache.http.impl.entity.EntitySerializer.serialize(EntitySerializer.java:122)
@4000000055c8d09527e2d69c   at org.apache.http.impl.AbstractHttpClientConnection.sendRequestEntity(AbstractHttpClientConnection.java:271)
@4000000055c8d09527e2e63c   at org.apache.http.impl.conn.ManagedClientConnectionImpl.sendRequestEntity(ManagedClientConnectionImpl.java:197)
@4000000055c8d09527e2ea24   at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:257)
@4000000055c8d09527e2ea24   at com.amazonaws.http.protocol.SdkHttpRequestExecutor.doSendRequest(SdkHttpRequestExecutor.java:47)
@4000000055c8d09527e2f1f4   at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
@4000000055c8d09527e2f1f4   at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:713)
@4000000055c8d09527e2f5dc   at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:518)
@4000000055c8d09527e2f5dc   at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
@4000000055c8d09527e2fdac   at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
@4000000055c8d09527e2fdac   at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:402)
@4000000055c8d09527e30194   at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:245)
@4000000055c8d09527e30964   at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3722)
@4000000055c8d09527e30d4c   at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1474)
@4000000055c8d09527e30d4c   at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1311)
@4000000055c8d09527e31134   at com.snowplowanalytics.snowplow.storage.kinesis.s3.S3Emitter.attemptEmit$1(S3Emitter.scala:144)
@4000000055c8d09527e3151c   at com.snowplowanalytics.snowplow.storage.kinesis.s3.S3Emitter.emit(S3Emitter.scala:166)
@4000000055c8d09527e31904   at com.amazonaws.services.kinesis.connectors.KinesisConnectorRecordProcessor.emit(KinesisConnectorRecordProcessor.java:159)
@4000000055c8d09527e31904   at com.amazonaws.services.kinesis.connectors.KinesisConnectorRecordProcessor.processRecords(KinesisConnectorRecordProcessor.java:132)
@4000000055c8d09527e320d4   at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask.call(ProcessTask.java:125)
@4000000055c8d09527e324bc   at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:48)
@4000000055c8d09527e324bc   at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:23)
@4000000055c8d09527e33074   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
@4000000055c8d09527e3345c   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
@4000000055c8d09527e3345c   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
@4000000055c8d09527e33844   at java.lang.Thread.run(Thread.java:745)
@4000000055c8d09527e33c2c
@4000000055c8d09715ea67ac Aug 10, 2015 4:25:49 PM com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker$WorkerLog info
@4000000055c8d09715ea7364 INFO: Current stream shard assignments: shardId-000000000207
@4000000055c8d09715ec53dc Aug 10, 2015 4:25:49 PM com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker$WorkerLog info
@4000000055c8d09715ec5bac INFO: Sleeping ...
@4000000055c8d09a13abf514 Aug 10, 2015 4:25:52 PM com.amazonaws.services.kinesis.metrics.impl.DefaultCWMetricsPublisher publishMetrics
@4000000055c8d09a13abfce4 INFO: Successfully published 6 datums.
@4000000055c8d09f2c20de0c Aug 10, 2015 4:25:57 PM com.snowplowanalytics.snowplow.storage.kinesis.s3.S3Emitter attemptEmit$1
@4000000055c8d09f2c20e5dc SEVERE: S3Emitter threw an unexpected exception
@4000000055c8d09f2c20e5dc com.amazonaws.AmazonClientException: Data read (0) has a different length than the expected (51706225)
@4000000055c8d09f2c20e9c4   at com.amazonaws.util.LengthCheckInputStream.checkLength(LengthCheckInputStream.java:135)
@4000000055c8d09f2c20e9c4   at com.amazonaws.util.LengthCheckInputStream.read(LengthCheckInputStream.java:103)
@4000000055c8d09f2c2137e4   at com.amazonaws.services.s3.internal.MD5DigestCalculatingInputStream.read(MD5DigestCalculatingInputStream.java:84)
@4000000055c8d09f2c2137e4   at org.apache.http.entity.InputStreamEntity.writeTo(InputStreamEntity.java:98)
@4000000055c8d09f2c213bcc   at com.amazonaws.http.RepeatableInputStreamRequestEntity.writeTo(RepeatableInputStreamRequestEntity.java:153)
@4000000055c8d09f2c213bcc   at org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:98)
@4000000055c8d09f2c21439c   at org.apache.http.impl.client.EntityEnclosingRequestWrapper$EntityWrapper.writeTo(EntityEnclosingRequestWrapper.java:108)
@4000000055c8d09f2c214784   at org.apache.http.impl.entity.EntitySerializer.serialize(EntitySerializer.java:122)
@4000000055c8d09f2c214784   at org.apache.http.impl.AbstractHttpClientConnection.sendRequestEntity(AbstractHttpClientConnection.java:271)
@4000000055c8d09f2c215724   at org.apache.http.impl.conn.ManagedClientConnectionImpl.sendRequestEntity(ManagedClientConnectionImpl.java:197)
@4000000055c8d09f2c215b0c   at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:257)
@4000000055c8d09f2c215b0c   at com.amazonaws.http.protocol.SdkHttpRequestExecutor.doSendRequest(SdkHttpRequestExecutor.java:47)
@4000000055c8d09f2c2162dc   at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
@4000000055c8d09f2c2162dc   at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:713)
@4000000055c8d09f2c2166c4   at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:518)
@4000000055c8d09f2c2166c4   at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
@4000000055c8d09f2c216e94   at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
@4000000055c8d09f2c216e94   at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:402)
@4000000055c8d09f2c21727c   at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:245)
@4000000055c8d09f2c217a4c   at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3722)
@4000000055c8d09f2c217e34   at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1474)
@4000000055c8d09f2c217e34   at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1311)
@4000000055c8d09f2c21821c   at com.snowplowanalytics.snowplow.storage.kinesis.s3.S3Emitter.attemptEmit$1(S3Emitter.scala:144)
@4000000055c8d09f2c218604   at com.snowplowanalytics.snowplow.storage.kinesis.s3.S3Emitter.emit(S3Emitter.scala:166)
@4000000055c8d09f2c2189ec   at com.amazonaws.services.kinesis.connectors.KinesisConnectorRecordProcessor.emit(KinesisConnectorRecordProcessor.java:159)
@4000000055c8d09f2c218dd4   at com.amazonaws.services.kinesis.connectors.KinesisConnectorRecordProcessor.processRecords(KinesisConnectorRecordProcessor.java:132)
@4000000055c8d09f2c2191bc   at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask.call(ProcessTask.java:125)
@4000000055c8d09f2c2195a4   at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:48)
@4000000055c8d09f2c21998c   at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:23)
@4000000055c8d09f2c21a15c   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
@4000000055c8d09f2c21a544   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
@4000000055c8d09f2c21a544   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
@4000000055c8d09f2c21a92c   at java.lang.Thread.run(Thread.java:745)
@4000000055c8d09f2c21d80c
@4000000055c8d0a414e795dc Aug 10, 2015 4:26:02 PM com.amazonaws.services.kinesis.metrics.impl.DefaultCWMetricsPublisher publishMetrics
@4000000055c8d0a414e7a194 INFO: Successfully published 6 datums.

Grepping the logs for 51706225 shows that the sink has been stuck on the same batch of data for many hours (the S3 outage was resolved hours ago).
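For illustration, a rough Python equivalent of that grep. The function name is made up for this sketch, and the three sample lines are copied from the excerpt above. Seeing one expected byte count repeated across retries suggests the sink keeps re-sending the same buffer.

```python
import re

# Sample lines copied from the log excerpt in this issue.
LOG = """\
@4000000055c8d09527e23a5c com.amazonaws.AmazonClientException: Data read (0) has a different length than the expected (51706225)
@4000000055c8d09715ea7364 INFO: Current stream shard assignments: shardId-000000000207
@4000000055c8d09f2c20e5dc com.amazonaws.AmazonClientException: Data read (0) has a different length than the expected (51706225)
"""

# Matches the length-mismatch message and captures the expected byte count.
EXPECTED = re.compile(r"different length than the expected \((\d+)\)")


def failures_by_size(lines):
    """Count length-mismatch errors per expected byte count."""
    counts = {}
    for line in lines:
        match = EXPECTED.search(line)
        if match:
            size = int(match.group(1))
            counts[size] = counts.get(size, 0) + 1
    return counts


print(failures_by_size(LOG.splitlines()))
```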

I am going to bounce the box now and fully expect the issue to resolve itself with the bounce.

@jbeemster
Member

So we need much the same logic as the Elasticsearch sink, whereby it kills itself after N failures.
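A minimal sketch of that idea in Python, not the actual sink code (which is Scala). All names here (`emit_with_limit`, `TooManyFailures`, `max_failures`) are hypothetical, and the Elasticsearch sink's real logic may differ. The point is only that a bounded failure counter turns an endless loop into a clean exit that a process supervisor can restart.

```python
import sys
import time


class TooManyFailures(Exception):
    """Raised once the emit attempt has failed max_failures times in a row."""


def emit_with_limit(attempt_emit, max_failures, backoff_seconds=0.0, sleep=time.sleep):
    """Call attempt_emit until it succeeds or has failed max_failures times."""
    failures = 0
    while True:
        try:
            return attempt_emit()
        except Exception as exc:
            failures += 1
            if failures >= max_failures:
                raise TooManyFailures(
                    f"giving up after {failures} failed attempts"
                ) from exc
            # Wait before the next attempt instead of retrying immediately.
            sleep(backoff_seconds)


def run_or_die(attempt_emit, max_failures):
    """Exit the process with a non-zero status when the limit is reached."""
    try:
        return emit_with_limit(attempt_emit, max_failures)
    except TooManyFailures as exc:
        print(f"fatal: {exc}", file=sys.stderr)
        sys.exit(1)
```

Exiting rather than looping assumes something (a supervisor, an init system) restarts the process, which is effectively what the manual bounce did here.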

@alexanderdean
Member Author

Agree! BTW the bounce has worked:

@4000000055c8d1f73594935c Aug 10, 2015 4:31:41 PM com.snowplowanalytics.snowplow.storage.kinesis.s3.S3Emitter emit
@4000000055c8d1f735949f14 INFO: Successfully serialized 44861 records out of 44861
@4000000055c8d1fa091cb9bc Aug 10, 2015 4:31:44 PM com.snowplowanalytics.snowplow.storage.kinesis.s3.S3Emitter attemptEmit$1
@4000000055c8d1fa091cc574 INFO: Successfully emitted 44861 records to S3 in s3://redacted.lzo with index redacted.lzo.index
@4000000055c8d1fd2d5a5fdc Aug 10, 2015 4:31:47 PM com.amazonaws.services.kinesis.metrics.impl.DefaultCWMetricsPublisher publishMetrics
@4000000055c8d1fd2d5a6b94 INFO: Successfully published 16 datums.
@4000000055c8d20731052414 Aug 10, 2015 4:31:57 PM com.amazonaws.services.kinesis.metrics.impl.DefaultCWMetricsPublisher publishMetrics
@4000000055c8d20731052fcc INFO: Successfully published 19 datums.

@jbeemster
Member

Now we just wait for it to catch up!

@fblundun
Contributor

I'm wondering why it didn't start catching up once S3 recovered. Could it somehow be because the AmazonS3Client object is created once when the sink initializes instead of once per attempt to send records?
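One way to picture the "created once versus once per attempt" question, as a sketch only: the "Data read (0)" message would be consistent with some object from the first attempt being reused after it was already drained, though nothing in this thread confirms that. The Python below uses `io.BytesIO` as a stand-in for a request body and a hypothetical `fake_put_object` in place of the real upload call. It is an analogy, not the AWS SDK's behavior.

```python
import io


class LengthMismatch(Exception):
    """Raised when fewer or more bytes were read than expected."""


def fake_put_object(stream, expected_length, fail=False):
    """Hypothetical stand-in for an upload: drains the stream, then checks length."""
    data = stream.read()
    if fail:
        raise IOError("simulated outage")
    if len(data) != expected_length:
        raise LengthMismatch(
            f"Data read ({len(data)}) has a different length "
            f"than the expected ({expected_length})"
        )
    return len(data)


payload = b"x" * 1024

# Case 1: one stream created up front and shared by every attempt.
shared = io.BytesIO(payload)
try:
    fake_put_object(shared, len(payload), fail=True)  # drained, then fails
except IOError:
    pass
try:
    fake_put_object(shared, len(payload))  # retry sees an empty stream
    shared_result = "ok"
except LengthMismatch as exc:
    shared_result = str(exc)

# Case 2: a fresh stream built for each attempt.
fresh_result = fake_put_object(io.BytesIO(payload), len(payload))

print(shared_result)
print(fresh_result)
```

In this sketch the shared stream fails on every retry no matter how healthy the backend is, while the per-attempt stream succeeds as soon as the simulated outage ends.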
