Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ThreadLocal Memory Leak: com.codahale.metrics.ThreadLocalRandom #742

Closed
jnehlmeier opened this issue Jan 26, 2015 · 31 comments
Closed

ThreadLocal Memory Leak: com.codahale.metrics.ThreadLocalRandom #742

jnehlmeier opened this issue Jan 26, 2015 · 31 comments

Comments

@jnehlmeier
Copy link

Using Metrics 3.1.0 there is a static ThreadLocal in com.codahale.metrics.ThreadLocalRandom that is never cleaned up.
This causes app server threads to hold references to this ThreadLocalRandom and thus the app ClassLoader can never be garbage collected.

It is a common misunderstanding that the ThreadLocalMap.Entry class prevents such leaks as it extends WeakReference. However the Entry class only weakly references the key (the ThreadLocal instance) and not the value (the com.codahale.metrics.ThreadLocalRandom instance).
So when a web app gets undeployed then the weakly referenced ThreadLocal instance might get cleaned up but the corresponding ThreadLocalMap.Entry instance is still in the map and still holds a strong reference to its value. ThreadLocalMap has some code to detect these stale entries but this code only gets triggered when calling ThreadLocal.set() or ThreadLocal.remove() ... which does not happen here.

leak

The above screenshot shows the GC root excluding WeakReference to further illustrate that the leak exists and is caused by the ThreadLocal.ThreadLocalMap.Entry.value field.

Either do not use ThreadLocal or provide a clean up mechanism that can be called in a ServletContextListener.

@joe-devries-sp
Copy link

Related: #272 #248

@ryantenney
Copy link
Contributor

I'm open to adding a clean up mechanism and ServletContextListener, but ideally we'd make it completely transparent to the user.

@jnehlmeier
Copy link
Author

I have also observed that this ThreadLocal classloader leak occurs on other persistent app server threads like Jetty's Scanner Thread. During deployment it probably touches every class once which initializes the static ThreadLocal field in ThreadLocalRandom. So without any request the deployed app can not be garbage collected anymore.

To fix all possible leaks metrics would need to:

  • Make initialization of the ThreadLocal in ThreadLocalRandom lazy so that it won't be triggered during deployment / class loading.
  • Either provide a servlet filter that can call a cleanup method to remove the ThreadLocalRandom from the current request thread or a ServletContextListener that does cleanup for all threads once the app is undeployed. However if more than one app is deployed at the same time then the ServletContextListener solution might clean threads that are currently handling requests for other apps. So I guess a servlet filter would be preferable.

Alternatively for Java7+ users it might be possible to provide a separate metrics-core library that uses java.util.concurrent.ThreadLocalRandom of Java7+. In that case the classloader leak would be gone as ThreadLocalRandom would be loaded by the system classloader.
However you might still have a slightly memory leak because continuous deployment of apps would cause new ThreadLocal instances to be stored in a thread's ThreadLocalMap. But maybe these ThreadLocal instances get cleaned up over time by the code provided in ThreadLocalMap.

@DustinChaloupka
Copy link

Where does this issue currently stand? Curious about if someone is working on it or not.

@adamdecaf
Copy link

Would a patch be accepted that copied over the updated improvements from some of the updated source? http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent/ThreadLocalRandom.java?view=markup

@DustinChaloupka
Copy link

So was doing some digging around to see if I could get something to work, but it looks like what the Java 7 of ThreadLocalRandom is using for its seeding (declared fields in class Thread namely threadLocalRandomSeed and threadLocalRandomProbe) are not a part of Thread in Java 6. So unless someone else has other ideas on patching that over, I'm not sure we can. =\

@ghost
Copy link

ghost commented May 17, 2015

Using dropwizard 3.1.0(core) with Jdk7 and jboss8. Observing threadlocal related memory issues....any fix to overcome this?

@ghost
Copy link

ghost commented May 17, 2015

How to use Java7's ThreadLocalRandom to avoid Thread local related issues?

@jnehlmeier
Copy link
Author

@nani83 checkout the 3.1 branch, delete ThreadLocalRandom from Metrics and fix imports so it uses the Java7 version.

@ghost
Copy link

ghost commented May 17, 2015

Thanks. Since, it is manual, I thought there might be another way/version which I can use. Will check this. Another query, As you were mentioning in previous post, Do we need explicit cleanup of threadlocals in filters/listener to overcome this issue completely?

@philippe-tr
Copy link

What is the status of this issue? We're having the same problem I think. When we issue the command catalina.sh stop we get the following error messages in catalina.out.

SEVERE [localhost-startStop-2] org.apache.catalina.loader.WebappClassLoader.checkThreadLocalMapForLeaks The web application [/capdServices] created a ThreadLocal with key of type [com.codahale.metrics.Striped64.ThreadHashCode] (value [com.codahale.metrics.Striped64$ThreadHashCode@xyz]) and a value of type [com.codahale.metrics.Striped64.HashCode] (value [com.codahale.metrics.Striped64$HashCode@xyz]) but failed to remove it when the web application was stopped. Threads are going to be renewed over time to try and avoid a probable memory leak.
SEVERE [localhost-startStop-2] org.apache.catalina.loader.WebappClassLoader.checkThreadLocalMapForLeaks The web application [/capdServices] created a ThreadLocal with key of type [com.codahale.metrics.ThreadLocalRandom$1] (value [com.codahale.metrics.ThreadLocalRandom$1@xyz]) and a value of type [com.codahale.metrics.ThreadLocalRandom] (value [com.codahale.metrics.ThreadLocalRandom@xyz]) but failed to remove it when the web application was stopped. Threads are going to be renewed over time to try and avoid a probable memory leak.

@adamdecaf
Copy link

Hey all, just wanted to say we've forked this repo and back ported the fix (diff) (it wont work on java6) from @DustinChaloupka (#757) over under banno/metrics.

@philippe-tr
Copy link

@adamdecaf , will you issue a pull request to @ryantenney ? Is there any plan for the code to come back into this repo?

@adamdecaf
Copy link

@philippe-tr The fix was already committed to dropwizard/master. So everyone is free to cherry-pick that off. The rest of the changes were just to get the versions correct when packaging the jar.

@philippe-tr
Copy link

Ok. Too bad it did not make it to a release packaged on maven central. I can make my own in-house release for now I suppose.

@adamdecaf
Copy link

@philippe-tr yea, I didn't want to deal with that, and we have an internal repository we use.

@horn-rimmed-glasses
Copy link

@philippe-tr Did this resolve your issue? I am experiencing the same first error message as you, even with the cherry-pick on top of 3.1.0.

SEVERE [localhost-startStop-2] org.apache.catalina.loader.WebappClassLoader.checkThreadLocalMapForLeaks The web application [/capdServices] created a ThreadLocal with key of type [com.codahale.metrics.Striped64.ThreadHashCode] (value [com.codahale.metrics.Striped64$ThreadHashCode@xyz]) and a value of type [com.codahale.metrics.Striped64.HashCode] (value [com.codahale.metrics.Striped64$HashCode@xyz]) but failed to remove it when the web application was stopped. Threads are going to be renewed over time to try and avoid a probable memory leak.

Any idea of how to resolve?

@philippe-tr
Copy link

@horn-rimmed-glasses No I did not work through this yet.

@static-max
Copy link

Any news on this?

@adamdecaf
Copy link

We're still running our fork of this library. Has there been much development on this project?

@spc16670
Copy link

+1

@vazzick
Copy link

vazzick commented Nov 17, 2016

Is this still leaking in 3.1.2?

@tadgh
Copy link

tadgh commented Nov 22, 2016

I still have this issue on latest.

@adamdecaf
Copy link

We're still on the fork I've mentioned previously on this.

@fjamal21
Copy link

fjamal21 commented Jan 7, 2017

any update on this? when can we expect a fix for this?

@gymaganis
Copy link

+1

@arteam
Copy link
Member

arteam commented Feb 1, 2017

Submitted #1052 to fix this for the 3.1 branch.

@arteam
Copy link
Member

arteam commented Feb 17, 2017

I guess it should be fixed by #1052

@arteam arteam closed this as completed Feb 17, 2017
@fbarbat
Copy link

fbarbat commented Jun 14, 2017

Just in case it is useful for somebody, this problem is still happening in 3.2.2 on Java 6. If I am correct, this is because #1052 fixes the problem for JDK7 and above. Another argument for convincing management to invest time in updating Java...

@arteam
Copy link
Member

arteam commented Jun 14, 2017

Yes. Unfortunately, it's not possible to patch the internal ThreadLocalRandom, because the JDK version uses the internal JDK8 API. I couldn't find a version with the fix in the CVS history of JSR-166 compatible with Java 6/Java 7.

@codesinthedark
Copy link

I had similar problem and I implemented a different ThreadLocal class that will enable class loaders to be GC-ed on redeploy. It is different than other similar attempts (like VicariousThreadLocal and other classes) because it is completely non-blocking (even for setting values, not only for getting).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests