
Make MonoDictEraser and TripleDictEraser safe against "recursion depth exceeded" #15069

Closed
simon-king-jena opened this issue Aug 20, 2013 · 12 comments

@simon-king-jena (Member)

The following happens both with Sage's MonoDict and with Python's weakref.WeakKeyDictionary:

from sage.structure.coerce_dict import MonoDict
M = MonoDict(11)

class A: pass
a = A()
prev = a

for i in range(1000):
    newA = A()
    M[prev] = newA
    prev = newA

len(M)
del a
Exception RuntimeError: 'maximum recursion depth exceeded while calling a Python object' in <sage.structure.coerce_dict.MonoDictEraser object at 0x5a13788> ignored

Replace M = MonoDict(11) with M = weakref.WeakKeyDictionary() and you get essentially the same error:

sage: del a
Exception RuntimeError: 'maximum recursion depth exceeded while calling a Python object' in <function remove at 0x5f9d578> ignored
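For reference, a self-contained pure-Python reproduction (no Sage needed) might look as follows; this is a sketch, assuming the CPython 2 interpreters of the time, which print the ignored RuntimeError to stderr:

import weakref

class A: pass

M = weakref.WeakKeyDictionary()
a = A()
prev = a
for i in range(1000):
    newA = A()
    M[prev] = newA   # each value strongly references the next key
    prev = newA

del a   # drops the only strong reference to the head of the chain;
        # each removal callback now triggers the next, nested 1000 deep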

However, a weakref.WeakValueDictionary seems safe:

sage: import weakref
sage: class A: pass
sage: M = weakref.WeakValueDictionary()
sage: a = A()
sage: prev = a
sage: for i in range(1000):
....:     newA = A()
....:     M[newA] = prev
....:     prev = newA
sage: len(M)
1000
sage: del a
sage: len(M)
0

even though the recursive deletion of dictionary items seems similar.

Aim of this ticket: make the erasers of MonoDict and TripleDict no longer call themselves recursively, so that the dreaded 'maximum recursion depth exceeded ... ignored' message finally vanishes.

CC: @nthiery @nbruin @vbraun

Component: performance

Author: Simon King

Reviewer: Volker Braun

Merged: sage-5.12.beta4

Issue created by migration from https://trac.sagemath.org/ticket/15069

@simon-king-jena (Member Author)

comment:1

Comments 97--102 at #10963 provide some ideas for how to solve the problem.

Actually, I originally thought that the solution would be easy: when the eraser of a MonoDict key is called, assign the to-be-deleted value to a temporary variable, so that it cannot be deleted while the corresponding items are removed from the MonoDict's bucket; the value can then only be deleted (causing a new invocation of the eraser) once the original call to the eraser is done and the temporary variable vanishes.
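In rough pure-Python terms, the idea was the following (a sketch only, with hypothetical helpers _find_bucket and _find_index; the real eraser walks the bucket list inline, as in the diff in comment:4 below):

def eraser(D, r):
    # Sketch of the temporary-variable trick (hypothetical helper names).
    bucket = D._find_bucket(r)     # hypothetical: the bucket containing r
    i = D._find_index(bucket, r)   # hypothetical: position of the triple
    val = bucket[i + 2]            # keep the value alive in a local...
    del bucket[i:i + 3]            # ...while the triple is removed
    D._size -= 1
    # val should only be dropped when this frame exits, so the next
    # eraser call should start after this one returns -- in theory.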

Inserting print statements shows that in unpatched Sage the eraser calls are nested, which explains the "recursion depth exceeded". And indeed, the trick with the temporary variable has the effect that the eraser calls are now neatly ordered, not nested. And now comes the punch line: the "recursion depth exceeded" error does not vanish!!!

The comments at #10963 seem to show that the error behaves differently depending on whether the class A in the example from the ticket description is a Python or a Cython class. But there is a recursion depth problem in either case.

@simon-king-jena (Member Author)

comment:2

Defining

class A:
    def __del__(self):
        print "__del__", id(self)

and keeping track of the order in which the elements are created by

L = []
a = A()
prev = a
for i in range(1000):
    L.append(id(prev))
    newA = A()
    M[prev] = newA
    prev = newA
del a

I see that the n-th element created (the first one being a) is the last-but-n-th element for which __del__ is called. This comes as a surprise. One would think that del a first makes a disappear, which triggers a call to the "eraser" callback, which makes M[a] disappear, which triggers the next "eraser" callback, and so on.
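The reversal can be checked without reading print output. Here is a sketch using a plain WeakKeyDictionary and a chain short enough to stay below the recursion limit (behaviour as observed under the CPython 2 of the time):

import weakref

deleted = []

class A:
    def __del__(self):
        deleted.append(id(self))

M = weakref.WeakKeyDictionary()
L = []
a = A()
prev = a
for i in range(100):
    L.append(id(prev))
    newA = A()
    M[prev] = newA
    prev = newA
del a

print(deleted == list(reversed(L)))   # True: deletion order is reversed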

@simon-king-jena (Member Author)

comment:3

Replying to @simon-king-jena:

I see that the n-th element created (the first one being a) is the last-but-n-th element for which __del__ is called. This comes as a surprise.

Or not? The weakref module documentation says:

Weak references to an object are cleared before the object's __del__() is called, to ensure that the weak reference callback (if any) finds the object still alive.

Hm. I understand that a.__del__() is executed only after the callback of the weak reference to a has finished. But using print statements I found that the callback for a completes before the callback for M[a] is invoked! So why is a.__del__() not executed directly after the first callback?
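The documented order is easy to check for a single object (a minimal sketch; the output matches the quoted documentation under the CPython 2 this ticket was written against):

import weakref

events = []

class B:
    def __del__(self):
        events.append('__del__')

def cb(ref):
    events.append('callback')

b = B()
r = weakref.ref(b, cb)
del b
print(events)   # ['callback', '__del__']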

@simon-king-jena (Member Author)

comment:4

It seems that it is not enough to keep a reference to the value of the key-value pair being deleted by the eraser; but if the eraser returns the value, the recursion depth error disappears. Hence, this patch

diff --git a/sage/structure/coerce_dict.pyx b/sage/structure/coerce_dict.pyx
--- a/sage/structure/coerce_dict.pyx
+++ b/sage/structure/coerce_dict.pyx
@@ -187,14 +187,17 @@
         h,offset = r.key
         cdef list bucket = <object>PyList_GET_ITEM(buckets, (<size_t>h) % PyList_GET_SIZE(buckets))
         cdef Py_ssize_t i
+        cdef object val
         for i from 0 <= i < PyList_GET_SIZE(bucket) by 3:
             if PyInt_AsSsize_t(PyList_GET_ITEM(bucket,i))==h:
                 if PyList_GET_ITEM(bucket,i+offset)==<void *>r:
+                    val = <object>PyList_GET_ITEM(bucket,i+2)
                     del bucket[i:i+3]
                     D._size -= 1
                     break
                 else:
                     break
+        return val
 

seems to work. Doing tests now...
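The same trick can be sketched in pure Python for a toy dictionary of weak references (a hypothetical class, not the actual fix, which is the Cython diff above): the callback pops the pair and returns the value, so the last strong reference is handed back to the C-level code that invoked the callback and is only dropped after the Python frame is gone.

import weakref

class EraserDemoDict(object):
    def __init__(self):
        self._data = {}   # id(weakref) -> (weakref to key, value)

    def __setitem__(self, key, value):
        r = weakref.ref(key, self._erase)
        self._data[id(r)] = (r, value)

    def _erase(self, r):
        # Returning the value defers its deallocation to the caller of
        # this callback, so eraser calls chain instead of nesting.
        return self._data.pop(id(r))[1]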

@simon-king-jena (Member Author)

Author: Simon King

@simon-king-jena (Member Author)

comment:5

Attachment: trac15069-recursion_safe_callback.patch.gz

With the patch that I have attached, the "recursion depth exceeded" problem seems to vanish. Let's see if the full test suite passes.

@simon-king-jena (Member Author)

comment:6

The tests pass for me. Needs review, then!

@vbraun (Member) commented Aug 21, 2013

comment:7

I don't think Python makes any guarantees that this is safe. The issue is precisely where the object is deallocated. As you noted, a local variable doesn't cut it, as it gets destroyed while we are still in the "remove" function frame. A return value apparently works, but that is just an implementation detail of CPython. A different Python version might very well destroy returned-but-discarded values while still in the function frame.

It's still better than before, so if you want to run with this workaround, that is fine with me.

@nthiery (Contributor) commented Aug 21, 2013

comment:8

Replying to @vbraun:

I don't think Python makes any guarantees that this is safe. The issue is precisely where the object is deallocated. As you noted, a local variable doesn't cut it, as it gets destroyed while we are still in the "remove" function frame. A return value apparently works, but that is just an implementation detail of CPython. A different Python version might very well destroy returned-but-discarded values while still in the function frame.

It's still better than before, so if you want to run with this workaround, that is fine with me.

In the worst case we will get an explicit recursion error, some objects won't get deleted, and the calculation will proceed safely, right?

If yes, then I agree it's not yet perfect but safe enough to go ahead. Thanks Simon!

I don't feel quite qualified for the fine technical details. Volker, if you give me a green light, I am happy to finish the rest of the review.

Cheers,
Nicolas

@vbraun (Member) commented Aug 21, 2013

Reviewer: Volker Braun

@vbraun (Member) commented Aug 21, 2013

comment:9

At least we'll have a doctest, so if it breaks in the future we'll be notified ;-)
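Such a doctest presumably builds the long chain from the ticket description and checks that the dictionary empties cleanly; a sketch of what it might look like (not necessarily the test that was merged):

sage: from sage.structure.coerce_dict import MonoDict
sage: M = MonoDict(11)
sage: class A: pass
sage: a = A()
sage: prev = a
sage: for i in range(1000):
....:     newA = A()
....:     M[prev] = newA
....:     prev = newA
sage: len(M)
1000
sage: del a
sage: len(M)
0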

@jdemeyer (Contributor)

Merged: sage-5.12.beta4
