-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suppress all kinds of Exceptions raised by report() #1049
Suppress all kinds of Exceptions raised by report() #1049
Conversation
Looks good. We should catch all exceptions, despite it's a rare case when a reporter implementation throws a checked exception. |
Thank you for your contribution! |
Thanks for merging. Any idea when 3.2.0 will be released ? |
I'm currently merging/reviewing outstanding pull requests and bug fixes from the backlog to the 3.2.x branch. A big part of backlog was processed, but I would like to land #1047 and #1048 to 3.2.* and after that make a release. I will discuss it with @ryantenney (he has the deployment rights). Hopefully, we can make a release this week, but I can't promise it. |
Great thanks, I'll ask for a version bump in Cassandra once it's released |
If the Cassandra project is depended on it, I think it makes sense to backport the fix to the 3.1.* branch as well, so it's easier to upgrade for the Cassandra folks. |
* Support for publishing metrics of BigInteger and BigDecimal type * Update third-party.rst New reporter - metrics-munin-reporter * Add links to more third party libs * Update third-party to include metrics-circonus * Fix spelling in core.rst * #814 always close socket (cherry picked from commit 2762ce5) * Revert "Add log4j2 xml-based config support" This reverts commit 6667014. * Revert "Added log4j2 InstrumentedAppender xml support to docs" This reverts commit 26c4cd2. * Revert "Fix javadoc link" This reverts commit 28e6556. * Revert "Add CPU profiling support to the AdminServlet (fixes #927)" This reverts commit bbb52c1. * Revert "Support for publishing metrics of BigInteger and BigDecimal type" This reverts commit 7cfc23c. * Suppress all kinds of Exceptions raised by report() (#1049) (#1056) * Support the JDK's ThreadLocalRandom (#1052) Add a proxy which provides a ThreadLocalRandom implementation depending on its availability in runtime. If the JDK provides ThreadLocalRandom, use it. Otherwise, fallback to the internal implementation. * Support JDK's LongAdder (#1055) * Support the JDK's LongAdder Add an ability to use the JDK's implementation if it's available in the runtime. Use a proxy for creating LongAdders which tries to use the JDK's implementation and fallbacks to the internal one if the JDK doesn't provide any. Unfortunatelly, `LongAdder`s don't have a common interface, therefore we introduce `LongAdderAdapter` as a common interface for `LongAdder`s. It will be used in other components as an interface, but a concrete implementation of it will be created by `LongAdderProxy`. This pattern allows to us to still run code in the JRE, but enjoy the benefits of the stock `LongAdder` in modern JDKs (less bugs, better performance). * Run tests only on JDK8 We need a compile dependency on LongAdders which is not available in JDK7. * Retry DNS lookups (#1064) Do retry DNS lookups, which are just done in the constructor of InetSocketAddress. Reason might be that in some environments DNS is very dynamic and thus a name might be sometimes resolveable and sometimes not. The way this is currently implemented the InetSocketAddress is always cached, which in itself caches the return value of the last DNS lookup, even if that failed. * Backport rescale bug (#1046) * Added ExponentiallyDecayingReservoir test illustrating the rescale race condition. (cherry picked from commit d9af3c1) * Fixed the race condition in ExponentiallyDecayingReservoir's rescale method. (cherry picked from commit b022f04)
Looks like so. @ryantenney told me yesterday he is ready to cut the releases. |
Looks like this wasn't enough :/ We should replace Exception by Throwable |
Does the Cassandra reporter throw |
Yes, Cassandra throws exception on snapshot extraction time, and it does it by design(not because of bug). Currently I am trying to promote this request which solves this issue CASSANDRA-13223. I hope I will achieve success after fixes for mentioned notes. I want to raise following thought: it is better to surround by try/catch the code places like this, because if any histogram throws exception then we have no reports at all. Missing of reports can lead to epic fail when something going wrong and you are unable to know how much wrong. I think that MetricsServlet should act in similar way and surrounds by try/catch each particular metric. And that why I do not use any built-in Dropwizard reporter, because I do not want to come in situation when reports are missed because of tricky bugs(like this #1104) or absurd design solution(like Cassandra). |
There is a PR for GraphiteReporter which provides something like that (#984), I could take a look at it. I think it's not a bad idea and could improve reliability and resilience. Doing so, we avoid situations when a failure for a set of metrics prevents the whole MetricsRegistry from reporting (or worse sends bogus data). If we adopt it, it means we will not imply that all metrics implementation don't throw exceptions, but embrace failures and find a means to mitigate from them. That might be not so easy, but it looks like a step in the right direction. |
@arteam the thing is that currently the silentely ignore Throwables and Errors subclasses. IMOH it's better to log something. |
I'm still a lit bit reluctant to change |
That's the thing: I don't know since the reporter just silently gets un-scheduled forever. I'll try a patched version catching everything to see if I can find out. |
Ah, I see. Thanks for hunting the issue! |
Ok I can confirm that fixes the issue because there are a lot of cases where an AssertionError (Error, not Exception) is being throwned, for example: {code} I've sent #1128 |
With the current code exceptions that do not inherit from RuntimeException will
make this task die silently.