Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX: Numpy pre-release accommodations #700

Merged
merged 25 commits into from
Dec 31, 2018
Merged

Conversation

effigies
Copy link
Member

@effigies effigies commented Dec 10, 2018

This has turned into something of an omnibus PR, which does the following things:

Closes #702.


Original

It looks like we've been taking advantage of numpy letting us declare a memory block writeable even when the underlying object is ostensibly immutable. This has allowed us to avoid copies for objects that we knew wouldn't be accessed through their original interfaces, but they decided this was a bug and closed this loophole in numpy/numpy#11739.

Don't know what else to do but make the copies they think we should be making, but perhaps others have more numpy arcana to share.

Fixes #697.

@matthew-brett
Copy link
Member

For volumeutils, I think this only represents a change in behavior for bz2 files on Python 2.7, because:

  • we first check if the file-like object has a readinto method, and use it if so, to read into a bytearray. This is mutable. GzipFile, io.BytesIO and standard file objects have readinto in Python 2.7 and >=3.5. BZ2File also had this method for Python >= 3.5, but not for Python 2.7. So bz2 files fall through to the next check, on Python 2.7.
  • Then we check if the file is in the SAFE_STRINGERS tuple; this only includes GzipFile and BZ2File;
  • If so, then we attempt to avoid the copy. Here's where the BZ2File objects will cause a new double copy, because we cannot set the read contents as writeable.

The neater fix, for volumeutils, would be to drop the SAFE_STRINGERS check. This will make the code simpler at the expense of making BZFile objects copy for numpies, where we can set the writeable flag.

@matthew-brett
Copy link
Member

For the trackvis reader, we could probably avoid nearly all the memory inflation by using the readinto trick that volumeutils is already using.

@matthew-brett
Copy link
Member

This suggestion would help - we could depend on bz2file on Python 2.7, and then drop the SAFE_STRINGERS logic, with no loss of memory performance on any Numpy version.

nibabel/info.py Outdated
@@ -209,4 +209,5 @@ def cmp_pkg_version(version_str, pkg_version_str=__version__):
ISRELEASE = _version_extra == ''
VERSION = __version__
PROVIDES = ["nibabel", 'nisext']
REQUIRES = ["numpy (>=%s)" % NUMPY_MIN_VERSION]
REQUIRES = ["numpy (>=%s)" % NUMPY_MIN_VERSION,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need to use the bz2file.BZ2File object instead of the the bz2.BZ2File object, in openers, for Python 2. Maybe:

import sys
if sys.version_info[0] < 3:
    from bz2file import BZ2File
else:
    from bz2 import BZ2File

and so on, maybe importing from here for other uses of BZ2File. I can't think immediately of a test for saving memory here, can you?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in to avoid an early import? pkgutil.find_loader() is a Python 2/3-safe way to see if a module can be imported without actually importing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, sorry, the bz2file package provides an alternative BZ2File implementation, it does not replace the implementation in the bz2 package. If you check, I think you'll find that bz2.BZ2File still doesn't have readinto after you've installed bz2file, but bz2file.BZ2File does have readinto (on Python 2.7).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, reparsed. Yeah, I still need to make that change. On the whole, this PR just isn't ready. To get tests working again, I still need to figure out why we can't use version selectors in install_requires. I suspect it's a distutils or setuptools problem.

```
/Users/arokem/.virtualenvs/afq/lib/python3.7/site-packages/nibabel/streamlines/trk.py:562: DeprecationWarning: The binary mode of fromstring is deprecated, as it behaves surprisingly on unicode inputs. Use frombuffer instead
  header_rec = np.fromstring(string=header_str, dtype=header_2_dtype)
```
@coveralls
Copy link

coveralls commented Dec 21, 2018

Coverage Status

Coverage decreased (-0.004%) to 91.846% when pulling ea1b0cc on effigies:fix/writable_bug into d9c864b on nipy:master.

@codecov-io
Copy link

codecov-io commented Dec 21, 2018

Codecov Report

Merging #700 into master will decrease coverage by 0.01%.
The diff coverage is 92.3%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #700      +/-   ##
==========================================
- Coverage   88.88%   88.86%   -0.02%     
==========================================
  Files          93       93              
  Lines       11449    11455       +6     
  Branches     1892     1894       +2     
==========================================
+ Hits        10176    10180       +4     
- Misses        933      934       +1     
- Partials      340      341       +1
Impacted Files Coverage Δ
nibabel/volumeutils.py 92.75% <100%> (-0.03%) ⬇️
nibabel/streamlines/trk.py 94.31% <100%> (+0.01%) ⬆️
nibabel/gifti/parse_gifti_fast.py 84.51% <100%> (ø) ⬆️
nibabel/streamlines/tck.py 98.9% <100%> (+0.01%) ⬆️
nibabel/info.py 100% <100%> (ø) ⬆️
nibabel/nifti1.py 91.22% <100%> (ø) ⬆️
nibabel/openers.py 75.8% <71.42%> (-0.67%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d9c864b...ea1b0cc. Read the comment docs.

@effigies effigies changed the title FIX: Immutable buffers should not be set writeable FIX: Numpy pre-release accommodations Dec 21, 2018
@effigies
Copy link
Member Author

@arokem I ended up needing your Opener.readinto to resolve the streamlines complaints, so I just merged your PR into this one.

(Moved here from the edited top post, since I'm not sure if adding a handle in an edit notifies people.)

@effigies
Copy link
Member Author

@matthew-brett This one is ready for review.

Copy link
Member

@matthew-brett matthew-brett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great - thanks - just a couple of comments.

header_str = f.read(header_2_dtype.itemsize)
header_rec = np.fromstring(string=header_str, dtype=header_2_dtype)

# Read the header into a bytearray.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth a comment explaining use of readinto.

nisext/sexts.py Outdated
@@ -191,7 +191,6 @@ def version_getter(pkg_name):
optional,
dependency)
return
_add_append_key(setuptools_args, 'install_requires', dependency)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's just possible someone else is using nisext. Any way to avoid the API change?

@effigies
Copy link
Member Author

@matthew-brett Any more comments?

@effigies
Copy link
Member Author

@matthew-brett Going to merge this and start the 2.3.2 process. If you have further concerns, can you open a PR with a suggested fix?

@effigies effigies merged commit a13de6e into nipy:master Dec 31, 2018
@effigies effigies deleted the fix/writable_bug branch December 31, 2018 17:38
@matthew-brett
Copy link
Member

Good call - looks fine to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Numpy pre-release breaks setting some arrays writable
5 participants