A coercion-related memory leak #15424

simon-king-jena · 2013-11-15T16:35:24Z

Yet another leak (this test is with #15303):

sage: import gc
sage: K = IntegerModRing(111115)
sage: C = type(K)
sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)])
1
sage: K.has_coerce_map_from(ZZ)
True
sage: del K
sage: _ = gc.collect()
sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)]) # no leak
0
sage: a = K.get_action(ZZ, op=operator.add, self_on_left=True)
sage: del K,a
sage: _ = gc.collect()
sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)])
0
sage: K = IntegerModRing(111115)
sage: a = K.get_action(ZZ, op=operator.mul, self_on_left=True)
sage: b = ZZ.get_action(K, op=operator.mul, self_on_left=False)
sage: del K,a,b
sage: _ = gc.collect()
sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)]) # no leak
0
sage: K = IntegerModRing(111115)
sage: x = K.one()
sage: del K,x
sage: _ = gc.collect()
sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)]) # no leak
0
sage: K = IntegerModRing(111115)
sage: x = K.one()*2
sage: del K,x
sage: _ = gc.collect()
sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)]) # LEAK
1

This is astonishing. Wouldn't one think that getting (thus, caching) a coercion or getting an action should trigger a leak, or creating an element? No, we actually need to multiply two elements to find the leak.

Surprising:

sage: K = IntegerModRing(111115)
sage: K.get_action(ZZ, op=operator.mul, self_on_left=True) is None
True
sage: ZZ.get_action(K, op=operator.mul, self_on_left=True) is None
True

Shouldn't there be an action? OK, perhaps not, since coercion is used for the multiplication:

sage: cm = sage.structure.element.get_coercion_model()
sage: cm.explain(K,ZZ, op=operator.mul)
Coercion on right operand via
    Natural morphism:
      From: Integer Ring
      To:   Ring of integers modulo 111115
Arithmetic performed after coercions.
Result lives in Ring of integers modulo 111115
Ring of integers modulo 111115

But if coercion is used, then why is establishing a coercion not enough to trigger the leak? In a new session:

sage: import gc
sage: K = IntegerModRing(111115)
sage: C = type(K)
sage: phi = K.coerce_map_from(ZZ)
sage: x = phi(2)
sage: del K,phi,x
sage: _ = gc.collect()
sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)])
0

CC: @nbruin

Component: memleak

Keywords: coercion model, weak reference

Reviewer: Simon King

Issue created by migration from https://trac.sagemath.org/ticket/15424

The text was updated successfully, but these errors were encountered:

nbruin · 2013-11-16T01:08:02Z

comment:1

I tried to see where the reference can be coming from:

%cpaste
import gc
from sage.structure.coerce_dict import *
def all_referrers(c,X):
    found_IDs=set()
    R=[]
    new=gc.get_referrers(c)
    found_IDs.add(id(new))
    found_IDs.add(id(R))
    found_IDs.add(id(globals()))    
    while len(new)>0:
        r=new.pop()
        if id(r) in found_IDs or type(r) not in X:
            print "skipping",type(r)
            continue
        R.append(r)
        found_IDs.add(id(r))
        new.extend(gc.get_referrers(r))
        print "type(r)=%s len(R)=%s len(new)=%s"%(type(r),len(R),len(new))
    return R

def getR():
    K = IntegerModRing(111115)
    C = type(K)
    phi = K.coerce_map_from(ZZ)
    del phi
    #vary this line:
    x = K.one()*2
    #del K,x
    _ = gc.collect()
    X=set([list,dict,tuple,
           RingHomset_generic_with_category,
           sage.rings.finite_rings.integer_mod.Integer_to_IntegerMod,
           TripleDict, MonoDict, C])
    R=all_referrers(list(c for c in gc.get_objects() if isinstance(c,C))[0],X)
    return R
--
RingHomset_generic_with_category=type(ZZ.Hom(QQ))
R=getR()

and I found that with x=K.one()*2 there is also a TripleDict that show up. It's fairly big (44 entries) and the entries all seem to be of the form D[domain,codomain,None]=morphism from domain to codomain (or None entries). Since the morphism has a strong reference to the codomain, this would keep our ring alive [it doesn't seem like the kind of dictionary that can afford to be weak on its values]. Judging from the entries, this dictionary is a global one. I think there is a ticket somewhere that mentions the same phenomenon (also doing arithmetic), in the context of finite fields. Somewhere from the era when we started to work on #715 in earnest.

simon-king-jena · 2013-11-16T16:05:00Z

comment:2

Replying to @nbruin:

and I found that with x=K.one()*2 there is also a TripleDict that show up. It's fairly big (44 entries) and the entries all seem to be of the form D[domain,codomain,None]=morphism from domain to codomain (or None entries). Since the morphism has a strong reference to the codomain, this would keep our ring alive [it doesn't seem like the kind of dictionary that can afford to be weak on its values]. Judging from the entries, this dictionary is a global one.

Hm. Is there any global TripleDict beside the one in sage.categories.homset? With grep, I found none. But in this case, the values would not be morphisms but homsets.

simon-king-jena · 2013-11-16T16:19:14Z

comment:3

I just notice: #14711 is still in need of review. I am currently building the branch from there, to see if it fixes the problem from here.

simon-king-jena · 2013-11-16T17:11:24Z

comment:4

Replying to @simon-king-jena:

I just notice: #14711 is still in need of review. I am currently building the branch from there, to see if it fixes the problem from here.

No, it does not.

nbruin · 2013-11-16T18:14:37Z

comment:5

OK, looking at the referrers to the TripleDict I get a sage.structure.coerce.CoercionModel_cache_maps sage.structure.coerce.CoercionModel_cache_maps, so I guess it's not a "global" cache but one stored on an object that has an awfully long lifespan.

Correction: the values are either None or a pair of maps. So I guess the keys are not domain and codomain but parents of pairs of elements and the pair of maps are the ones that map into the common parent.

So my guess is that this cache is used when determining a common parent and then stores the maps needed to get to that. It seems to me we can mitigate this leak considerably if we DON'T store the maps here if we find a coercion from one into the other would do the job: instead store a symbolic "coerce_to_first" or "coerce_to_second" value in there.

In cases where there is a genuine third parent into which we're mapping, it's a little safer: then the codomain at least doesn't keep alive the keys. It's unclear to me how long we should be keeping the common overparent in that case. This cache will mean its life is bounded below by the shortest life time of the two "covered" parents.

simon-king-jena · 2013-11-16T18:15:58Z

Attachment: chain.png

A reference chain preventing an integer mod ring from garbage collection

simon-king-jena · 2013-11-16T18:18:28Z

comment:6

The attached picture shows that the offending reference chain starts in the module sage.functions.other (which I have never even heard of) and proceeds (via some __dict__ of 73 items) to CoercionModel_cache_maps, which then references a TripleDict.

simon-king-jena · 2013-11-16T18:20:49Z

comment:7

Indeed, sage.functions.other contains the line coercion_model = sage.structure.element.get_coercion_model(). But I think keeping a reference to the coercion model must be legal. Hence I need to look further.

simon-king-jena · 2013-11-16T18:30:53Z

comment:8

Ouch. I thought that in our coercion model, maps and actions are cached on the level of parents. Now I see that they additionally are cached in the coercion model!

Apparently this additional cache is not always used. For example, it is not used if you just do K.coerce_map_from(ZZ). But it seems that a coercion map is stored in the additional cache if you do a multiplication.

And then you have a strongly referenced TripleDict, a key (ZZ,K,None) and a morphism phi:ZZ->K as value. Even with #14711, phi would strongly reference K. ZZ is immortal. Hence, the callback for the item (ZZ,K,None):phi will never be called.

What shall we do about it? Isn't it the case that we are only storing maps that already are cached on the level of parents? Then it would be safe to just store a weak reference to this map, and the problem was solved.

simon-king-jena · 2013-11-16T18:49:49Z

comment:9

PS: Similarly for CoercionModel_cache_maps._action_maps.

simon-king-jena · 2013-11-16T19:29:18Z

comment:10

PPS: Which means that we need to make maps weakrefable (actions already are weakrefable).

simon-king-jena · 2013-11-16T19:55:08Z

Branch: u/SimonKing/ticket/15424

simon-king-jena · 2013-11-16T20:02:24Z

Author: Simon King

simon-king-jena · 2013-11-16T20:02:24Z

Commit: b052e5e

simon-king-jena · 2013-11-16T20:02:24Z

comment:12

I did not run the complete doctests, but I have added a new test that shows that I have fixed the problem. It is orthogonal to #14711.

As I said above: The idea is to only keep a weak reference from coercion model to coerce maps, which needs to make morphisms weakrefable, and which should be safe because the maps are cached on their dodomain anyway.

New commits:

`[b052e5e](https://github.com/sagemath/sagetrac-mirror/commit/b052e5e)`	`Coercion model should only store weak references to coerce maps`

simon-king-jena · 2013-11-16T20:02:24Z

Changed keywords from none to coercion model, weak reference

simon-king-jena · 2013-11-16T20:20:34Z

comment:13

Argh. Schemes seem to hate me.

Almost always when I implement something that fixes a problem, it turns out that some examples from sage.schemes will fail, because the elliptic curve code relies on what I intended to fix.

But this time it is extreme:

sage -t src/sage/schemes/elliptic_curves/ell_rational_field.py  # 652 doctests failed

simon-king-jena · 2013-11-16T20:20:34Z

Work Issues: Convince elliptic curves to not fail

nbruin · 2013-11-16T20:55:00Z

comment:15

Don't store weakrefs as values in TripleDict and MonoDict. That's what we have weakvalues=True for. Or are you afraid the callback is too expensive to handle?

simon-king-jena · 2013-11-16T22:08:28Z

comment:16

Replying to @nbruin:

Don't store weakrefs as values in TripleDict and MonoDict. That's what we have weakvalues=True for.

Oops, I forgot about this.

Or are you afraid the callback is too expensive to handle?

No. CoercionModel_cache_maps._coercion_maps stores tuples (namely: pairs of coercion maps, in both directions, hence, often (mor,None) or (None,mor)). We can't weakref the tuples. However, we could have a weakrefable extension type with two cdef attributes, so that we can store this instead.

simon-king-jena · 2013-11-16T22:14:10Z

comment:17

Replying to @simon-king-jena:

We can't weakref the tuples. However, we could have a weakrefable extension type with two cdef attributes, so that we can store this instead.

No, this won't work, because there would be no reference back to the pair, and thus it would be immediately garbage collected.

nbruin · 2013-11-16T22:22:47Z

comment:18

Found it: #14058 seems relevant. Especially read the sage-devel thread referenced there. We've thought quite a bit about this stuff before. Especially with the newly MUCH better lookup performance of TripleDict and MonoDict, we may be able to afford some more lookup indirection.

simon-king-jena · 2013-11-16T23:28:19Z

comment:19

Replying to @nbruin:

Found it: #14058 seems relevant. Especially read the sage-devel thread referenced there. We've thought quite a bit about this stuff before. Especially with the newly MUCH better lookup performance of TripleDict and MonoDict, we may be able to afford some more lookup indirection.

OK, then let us see if it fixes the problem.

simon-king-jena · 2013-11-16T23:51:40Z

comment:20

Yep, #14058 fixes the problem. Then I give a review to this ticket, saying that it is a duplicate and please be resolved as such.

simon-king-jena · 2013-11-16T23:51:40Z

Changed work issues from Convince elliptic curves to not fail to none

simon-king-jena · 2013-11-16T23:51:40Z

Changed author from Simon King to none

simon-king-jena · 2013-11-16T23:51:40Z

Reviewer: Simon King

jdemeyer · 2013-11-21T13:02:54Z

Changed branch from u/SimonKing/ticket/15424 to none

jdemeyer · 2013-11-21T13:02:54Z

Changed commit from b052e5e to none

simon-king-jena added this to the sage-5.13 milestone Nov 15, 2013

simon-king-jena added t: bug labels Nov 15, 2013

simon-king-jena added the s: needs review label Nov 16, 2013

simon-king-jena added s: needs work and removed s: needs review labels Nov 16, 2013

simon-king-jena added s: positive review and removed s: needs work labels Nov 16, 2013

simon-king-jena removed this from the sage-5.13 milestone Nov 16, 2013

jdemeyer added r: duplicate and removed p: major / 3 labels Nov 21, 2013

jdemeyer closed this as completed Nov 21, 2013

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

A coercion-related memory leak #15424

A coercion-related memory leak #15424

simon-king-jena commented Nov 15, 2013

nbruin commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

nbruin commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

nbruin commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

nbruin commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

jdemeyer commented Nov 21, 2013

jdemeyer commented Nov 21, 2013

A coercion-related memory leak #15424

A coercion-related memory leak #15424

Comments

simon-king-jena commented Nov 15, 2013

nbruin commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

nbruin commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

nbruin commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

nbruin commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

simon-king-jena commented Nov 16, 2013

jdemeyer commented Nov 21, 2013

jdemeyer commented Nov 21, 2013