-
Notifications
You must be signed in to change notification settings - Fork 260
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Support TCK streamlines file format #486
Conversation
Codecov Report
@@ Coverage Diff @@
## master #486 +/- ##
=========================================
+ Coverage 94.05% 94.25% +0.2%
=========================================
Files 166 177 +11
Lines 22044 24273 +2229
Branches 2345 2604 +259
=========================================
+ Hits 20733 22878 +2145
- Misses 877 920 +43
- Partials 434 475 +41
Continue to review full report at Codecov.
|
@jchoude @arnaudbore @matthew-brett @FrancoisRheaultUS @samuelstjean |
Can have a quick look after ismrm deadline. |
So far:
Be really carefull with the software that you use to visualise your streamlines Everything works perfectly (Great job @MarcCote) Again, great job, very helpful ! |
Just had a very brief look into this (why did you have to pick the week before ISMRM deadline?!?). I can't verify whether your handling of the coordinate system matches the expectations of the format, but yes, the positions are expected to be given in world/scanner/real coordinates in mm units, assuming the standard NIfTI coordinate system. Assuming that's what you mean by the 'world space' ( Only one minor thing to add: we tend to set the offset to the data such that it guarantees memory alignment for 32-bit floats when accessed via memory-mapping, just in case that affects performance (a lot of CPU optimisations rely on memory alignment). That means it's set to the lowest multiple of 4 bytes that exceeds your computed offset. It doesn't look like that's what you're doing? All good otherwise, but like I said, I've only skimmed through the code... |
@arnaudbore @jdtournier: Thank you both for your time (especially with ISMRM deadline closing in!). @jdtournier: You are correct there is no memory alignment right now but I'll fix that and ping you once it is done (after ISMRM deadline). @jdtournier: Is there any convention in MRtrix to save information alongside streamlines and streamlines' coordinates? For instance, some information we might want to keep are: the mean FA (or a bunch of metrics) for each streamline, some ID/name of the bundle a streamline belongs to, or simply a different RGB color for each point of the streamlines? In Trackvis/.trk those are known as @jdtournier: Regarding the 'world space When saving a TCK file, I make sure the streamlines are in ras+mm (world/scanner/real coordinates). See https://github.com/nipy/nibabel/pull/486/files#diff-14a106e7412b371d3c333051564ccd3eR205 |
OK, good to hear wrt the alignment issue. It's not essential though, it's just that it might make a difference in terms of performance, but I have to admit that so far, I don't think I've noticed any particular issues on that front. It's probably more relevant for images. As to saving per vertex/streamline information, this is something that we discussed extensively a while back. We opted for storing this information separately in associated files, either track scalar files (for the per-vertex case, e.g. as might be produced by tcksample) or straight ASCII files (for the per-streamline case, e.g. as used for the SIFT2 weights). We figured that maintaining compatibility with the simple existing format, and the flexibility of separating the two types of information to be sufficient reasons not to complicate the format (for example, we might want to have multiple per-vertex scalar information for the same streamlines, in which case storing everything in a single file means duplicating most of the contents of the file needlessly, etc). So if you want to add support for importing per vertex/streamline scalars, that's how we handle it in MRtrix3... Just checked your definitions, and that matches my expectations. However, whether the origin of the coordinate system corresponds to the centre or corner of the voxel is unrelated to the streamlines storage - that really is about the image storage convention. On that front, we also assume that the voxel position at (0,0,0) corresponds to the centre of the voxel, that's the standard adopted in both DICOM and NIfTI, and that's how images are displayed in MRView. But by design, in MRtrix, the streamlines are stored in scanner coordinates to make them independent of whatever image was used to generate them. So I'm slightly puzzled by your question. In MRtrix, the world coordinate (0,0,0) is the origin of the world/scanner coordinate system, irrespective of whether you're taking about streamlines or images, and whatever image you might be talking about. A given image's transformation matrix is what tells you how to figure out the position of its voxel centres in the world/scanner coordinate system. But it is absolutely not the case that the "streamline coordinate (0,0,0) refers to the center of the voxel" - it refers to the isocentre of the scanner, which will rarely be aligned with any particular voxel. Just making sure we're on the same page...? |
Where are we with this one? Needs a rebase and PEP8 fixes at some point. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry - I got confused in the reading code, would you mind explaining a bit?
@@ -0,0 +1,440 @@ | |||
from __future__ import division |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add module level docstring describing format and pointing to format docs.
nibabel/streamlines/tck.py
Outdated
---------- | ||
tractogram : :class:`Tractogram` object | ||
Tractogram that will be contained in this :class:`TckFile`. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No blank space between parameter defs
nibabel/streamlines/tck.py
Outdated
with Opener(fileobj) as f: | ||
magic_number = f.fobj.readline() | ||
f.seek(-len(magic_number), os.SEEK_CUR) | ||
return magic_number.strip() == cls.MAGIC_NUMBER |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe nicer to dedent this line.
nibabel/streamlines/tck.py
Outdated
except StopIteration: | ||
# Empty tractogram | ||
header[Field.NB_STREAMLINES] = 0 | ||
# Overwrite header with updated one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this into a local function def finish_hdr():
etc, call here and at the end of the method, to avoid code duplication?
|
||
Parameters | ||
---------- | ||
fileobj : file-like object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this file must be open for writing as binary? Note in the docstring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
nibabel/streamlines/tck.py
Outdated
offset += 1 # +1, we need one more character for that new digit. | ||
|
||
fileobj.write(asbytes(str(offset) + "\n")) | ||
fileobj.write(asbytes(b"END\n")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already bytes.
nibabel/streamlines/tck.py
Outdated
raise HeaderError("Missing 'file' attribute in TCK header.") | ||
|
||
# Set endianness and _dtype attributes in the header. | ||
hdr[Field.ENDIANNESS] = '<' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about:
hdr[Field.ENDIANNESS] = '>' if hdr['datatype'].endswith('BE') else '<'
nibabel/streamlines/tck.py
Outdated
|
||
i = 0 | ||
|
||
while not eof or not np.all(np.isinf(pts)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the not np.all(np.isinf(pts))
part of this - can you explain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the TCK format, the end of file is defined as inf inf inf
. In my code, the variable eof
indicates if there are data left to read (i.e. put in the buffer) in the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment here with that information (about inf, inf, inf)?
nibabel/streamlines/tck.py
Outdated
" might be corrupted.") | ||
raise DataError(msg) | ||
|
||
# Otherwise read a bit more. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will happen if eof == True
and np.all(np.isinf(pts)) == True
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then the while loop should terminate. It means the only thing that is left in the buffer is the Inf triple which indicates the end of the TCK file. Did I miss something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I don't think you missed anything - I see now that we'll go back to the while condition, then exit the loop.
nibabel/streamlines/tck.py
Outdated
eof = len(bytes_read) == 0 | ||
|
||
# Read floats. | ||
pts = np.frombuffer(buff, dtype=dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for my confusion, but won't this code re-cast the first block of data over and over, for each iteration of the while loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only a small portion will get re-casted. It would correspond to the last sequence of points that is partially contained in buff
.
In short, the code is doing the following.
- Read a chunk of data from the file (a buffer).
- Determine how many points we've read (i.e. nb. of triples [NaN, NaN, NaN])
3.Yield points, one streamline at the time. - Flush the beginning of the buff (that is up to the last NaN triple we've detected).
- Goto step 1 (making sure we concatenate the new buffer to what is remaining of the current buffer) until we detect a Inf triple (or there is nothing to read).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah - I see - you are dropping the read part of buff at the end of the loop.
@MarcCote - where are we with this one? |
@matthew-brett sorry for the delay. I'll have more time in the following weeks :). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @MarcCote and sorry for the delay. I mainly checked the tck code, not the tests. I had other comments that I began to enter, but @matthew-brett was quicker than me :) I deleted them, I hope they don't show up as duplicate.
nibabel/streamlines/tck.py
Outdated
|
||
|
||
def create_empty_header(): | ||
''' Return an empty compliant TCK header. ''' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't those be double quotes? """
nibabel/streamlines/tck.py
Outdated
|
||
lines = [] | ||
lines.append(header[Field.MAGIC_NUMBER]) | ||
lines.append("count: {0:010}".format(header[Field.NB_STREAMLINES])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will happen if someone tries to store more than 10M streamlines in this file? Not a very frequent case for now, but some papers suggest we need to do more than that.
I know that the issue is that you want to avoid shifting the whole content of the file if you allow adding streamlines to the file. Maybe this should be adjusted in a followup PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it would need more than 10 Billions streamlines. If that is an issuewe can simply bump this number up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oups, my bad. I think we're safe for now :)
nibabel/streamlines/tck.py
Outdated
raise HeaderError("Missing 'datatype' attribute in TCK header.") | ||
|
||
if not hdr['datatype'].startswith('Float32'): | ||
msg = ("TCK only supports float32 dtype but 'dataype: {}' was" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dataype -> datatype
hdr[Field.MAGIC_NUMBER] = magic_number | ||
|
||
# Check integrity of TCK header. | ||
if 'datatype' not in hdr: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really mandatory? Might not be the best comparison, but for example your old TractConverter tool only wrote the count
field. All other fields were discarded.
I think it is useful to have it, but should you really reject streamlines created in another tool that don't have it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I could raise a warning and try loading them using Float32LE, i.e. the only dtype supported right now anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think that would be better.
nibabel/streamlines/tck.py
Outdated
" specified in the header.").format(hdr['datatype']) | ||
raise HeaderError(msg) | ||
|
||
if 'file' not in hdr: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question here: is this really mandatory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Now, raising a warning instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this one, the file
field indicates the offset where the data starts. In the case the fieldwas not provided, should I assume the data starts directly after END ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would guess that's the best way to go, yes. Use it if there, else, use that heuristic.
Thanks for the review @jchoude. @matthew-brett I've addressed you last comments and increase test coverage. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last couple of changes.
nibabel/streamlines/tck.py
Outdated
MEGABYTE = 1024 * 1024 | ||
|
||
|
||
def create_empty_header(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about my suggestion to make this a method of the TckFile class?
See: #486 (comment)
nibabel/streamlines/tck.py
Outdated
Parameters | ||
---------- | ||
fileobj : string or file-like object | ||
If string, a filename; otherwise an open file-like object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add something like a file-like object opened for binary reads, pointing to
...
What do you think of
Streamlines of the tractogram are assumed to be in *RAS+* and *mm*
space. It is also assumed that when streamlines are mapped back to
voxel space, a streamline point located at an integer coordinate
(i,j,k) is considered to be at the center of the corresponding
voxel.
This is in contrast with TRK's internal convention where it would
have referred to a corner.
2017-04-06 13:06 GMT-0400 Matthew Brett <[email protected]>
…
***@***.**** commented on this pull request.
------------------------------------------------------------------------
In nibabel/streamlines/tck.py
<#486 (comment)>:
> + FIBER_DELIMITER = np.array([[np.nan, np.nan, np.nan]], '<f4')
+ EOF_DELIMITER = np.array([[np.inf, np.inf, np.inf]], '<f4')
+
+ def __init__(self, tractogram, header=None):
+ """
+ Parameters
+ ----------
+ tractogram : :class:`Tractogram` object
+ Tractogram that will be contained in this :class:`TckFile`.
+ header : dict (optional)
+ Metadata associated to this tractogram file.
+
+ Notes
+ -----
+ Streamlines of the tractogram are assumed to be in *RAS+*
+ and *mm* space where coordinate (0,0,0) refers to the center
How about |where coordinate (i, j, k) refers to the center of the voxel|.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#486 (comment)>, or
mute the thread
<https://github.com/notifications/unsubscribe-auth/AAoSJJT-wgbrt72J7nqvGOFdnQzfN3Rbks5rtRuRgaJpZM4J7SeH>.
|
Sure, streamlines phrasing is fine. |
@matthew-brett I moved |
Move the creation of the empty streamline header into the kinda-abstract base class. Adjust the Trk file header creation accordingly.
When making the default trk header structured array, make a numpy array scalar instead of a 1D array. Adjust the code accordingly. Take opportunity to rename header to st_arr to make clear it is not the 'header' of `self.header` (which is a dict).
TractogramFile now implements create_empty_header
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments. See also MarcCote#13
nibabel/streamlines/tck.py
Outdated
Parameters | ||
---------- | ||
fileobj : string or file-like object | ||
If string, a filename; otherwise an open file-like object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason I can't see the modification in this interface, or the code on my local machine - any idea what's going on?
nibabel/streamlines/tck.py
Outdated
Parameters | ||
---------- | ||
fileobj : string or file-like object | ||
If string, a filename; otherwise an open file-like object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"open file-like object in binary mode" or similar as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what happened, but you are right,t it was missing some docstrings that weren't specifying binary mode. Should be fixed by commit 0d0ece1.
MRG: refactoring for tractogram headers
Thanks for your patience - in it goes. |
This PR adds support for MRtrix streamlines file to the Streamlines API (#391).
Eventually, in a future PR, I will add some functions that allows easy conversion between all different formats. For now, going from TRK to TCK is easy. See
https://gist.github.com/MarcCote/ea6842cc4c3950f7596fc3c8a0be0154
Converting from TCK to TRK requires additional information from a reference anatomy. See
https://gist.github.com/MarcCote/ebf0ae83a9202eb96d1dfbb949cd98af
EDIT: Added links to some Gist code for streamlines conversion.