
Adapt code to use NestedTensor #313

Open

bubas3000 opened this issue Dec 3, 2020 · 11 comments

Comments

@bubas3000

I have a model where I would love to use NestedTensor. There is a lot of padding going on, and nested tensors would save a lot of memory. The net where I would like to use them is composed of a linear layer followed by batch norm and ReLU; finally, a max operation is taken over the channels.

Forward looks like this:
    def forward(self, inputs):
        x = self.linear(inputs)
        # BatchNorm1d expects (N, C, L), hence the permute round-trip
        x = self.norm(x.permute(0, 2, 1).contiguous()).permute(0, 2, 1).contiguous()
        x = F.relu(x)
        x_max = torch.max(x, dim=1, keepdim=True)[0]
        return x_max

Is it possible to use NestedTensors? My project supports Python 3.6+ and PyTorch 0.4.1+.

Thank you in advance

@cpuhrsch
Contributor

cpuhrsch commented Dec 3, 2020

Hello @bubas3000,

Yes! This should be possible :)

What would be the shape of inputs here? I can then run your snippet and make sure all ops are implemented.

Thanks,
Christian

@bubas3000
Author

bubas3000 commented Dec 3, 2020

Hello @cpuhrsch ,

Thanks for your quick reply!

The shape of inputs is (12000, 100, 9); I will try to explain what each dimension means to be clearer.
Basically, I have a matrix of 12000 voxels (3D pixels), where each voxel has 100 points with 9 dimensions. In reality, almost every voxel has fewer than 100 points, so I wanted to have a nested tensor of 12000 voxels with a variable number of points.

The linear layer transforms the 9 dimensions to 64 (this is done pointwise).
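For concreteness, a minimal sketch of the padded versus nested representations described above (the per-voxel point counts here are hypothetical; nestedtensor.nested_tensor is the library's list-of-tensors constructor):

    import torch
    import nestedtensor

    num_voxels, max_points, dims = 12000, 100, 9
    # hypothetical per-voxel point counts; in reality they come from the data
    counts = torch.randint(1, max_points + 1, (num_voxels,))

    # padded: every voxel stored with 100 points, mostly zeros
    padded = torch.zeros(num_voxels, max_points, dims)

    # nested: one (n_i, 9) tensor per voxel, no padding
    nested = nestedtensor.nested_tensor(
        [torch.randn(int(n), dims) for n in counts])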

Thank you for your help,
Afonso

@bubas3000
Author

bubas3000 commented Dec 3, 2020

I forgot to mention that norm is BatchNorm1d.

One more question, if you allow me: how does the time performance of nested tensors compare to that of normal torch tensors?

Thank you once again,
Afonso

@cpuhrsch
Contributor

cpuhrsch commented Dec 4, 2020

Hello @bubas3000,

I wrote up a code snippet and am now working on adding the ops required to run it. For now this is without autograd support; that will follow in another PR. Here is the snippet from the PR referenced in this issue:

    import torch
    from torch import nn
    import torch.nn.functional as F
    import nestedtensor
    # ntnt is the shorthand used in this repo's tests for a NestedTensor
    # constructed with requires_grad=True
    ntnt = lambda ts: nestedtensor.nested_tensor(ts, requires_grad=True)

    linear = nn.Linear(9, 64)
    norm = nn.BatchNorm1d(64)
    # 3 voxels with 40, 50 and 90 points respectively
    x = ntnt([torch.randn(i, 9) for i in [40, 50, 90]])
    x = linear(x)
    x = norm(x.transpose(2, 1).contiguous()).transpose(2, 1).contiguous()
    x = F.relu(x)
    x_max = torch.max(x, dim=1, keepdim=True)[0]

Does this align with your goals?

Thanks,
Christian

@bubas3000
Author

Hello @cpuhrsch ,

That's what I am looking for, thank you! Is it expected to have autograd support soon, or should I try to do it "by hand"?
I will begin working on the changes I have to make to use NestedTensor.

Thank you once more,
Afonso

@cpuhrsch
Contributor

cpuhrsch commented Dec 5, 2020

Hello @bubas3000,

Autograd is already supported, but I need to double-check that all backward passes have been implemented. The forward PR was merged, so I'm doing that next.

Regarding time performance, most of these kernels are currently still implemented as for-loops. However, let me trace through the ops you're using and see if we can implement a fast-path for those shapes.
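To illustrate what that means, here is a hedged sketch (not the library's actual dispatch code) of an op implemented as a Python for-loop over a nested tensor's constituents:

    import torch

    def nested_relu(constituents):
        # `constituents` stands in for a NestedTensor's variable-shaped
        # pieces; a fast-path would replace this loop with one fused kernel
        return [torch.relu(t) for t in constituents]

    xs = [torch.randn(n, 64) for n in (40, 50, 90)]
    ys = nested_relu(xs)  # shapes preserved: (40, 64), (50, 64), (90, 64)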

As an aside, BatchNorm1d will be the least likely to match the performance of a regular torch.Tensor, because PyTorch calls into cudnn's highly optimized version of it. To support irregular shapes, BatchNorm1d here is implemented via regular math operators.
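As a hedged sketch of that idea (this mirrors the approach, not the repo's exact kernel), batch norm over variable-length constituents can be written with elementary ops:

    import torch

    def nested_batch_norm(xs, weight, bias, eps=1e-5):
        # xs: list of (points_i, C) tensors; statistics are computed
        # per channel over all points of all constituents
        flat = torch.cat(xs, dim=0)            # (sum_i points_i, C)
        mean = flat.mean(dim=0)
        var = flat.var(dim=0, unbiased=False)
        scale = weight / torch.sqrt(var + eps)
        return [(x - mean) * scale + bias for x in xs]

    C = 64
    xs = [torch.randn(n, C) for n in (40, 50, 90)]
    out = nested_batch_norm(xs, torch.ones(C), torch.zeros(C))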

Thank you,
Christian

@bubas3000
Author

Hello @cpuhrsch ,

Using for-loops will definitely hurt my time performance... I can try running without BatchNorm1d if you think it would help.

Thank you,
Afonso

@bubas3000
Author

PS: I tried to run this snippet and got the following error:

    Traceback (most recent call last):
      File "a.py", line 26, in <module>
        x_max = torch.max(x, dim=1, keepdim=True)[0]
      File "/home/afonso/anaconda3/envs/teste/lib/python3.7/site-packages/nestedtensor/nested/nested.py", line 440, in __torch_function__
        return _wrap_result(func(*impl_args, **impl_kwargs))
    RuntimeError: Internal error: NestedTensorImpl doesn't support sizes. Please file an issue on https://github.com/pytorch/nestedtensor


@cpuhrsch
Contributor

cpuhrsch commented Dec 6, 2020

Hello @bubas3000,

Are you using the most recent commit? If you're using the binaries, make sure to force a clean reinstall to get the newest ones (they get automatically rebuilt overnight). You can print the version and hash via print(nestedtensor.version.__version__). Yours is the error I got before #316 was merged.

Thanks,
Christian

@bubas3000
Author

Hi @cpuhrsch ,
I was able to run it! Thank you for your help. I ended up using a 1D tensor and a tensor of indices with scatter_max to compute the maximum; it is much faster.
However, I believe nested tensors can be very valuable to deep learning once they are fully optimized.
I would like to mention them in my thesis. Is there any paper on nested tensors, or should I cite this repository?
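For the record, a hedged reconstruction of that flattened-tensor approach (assuming scatter_max from the separate torch_scatter package; the point counts and names are illustrative):

    import torch
    from torch_scatter import scatter_max

    C = 64
    counts = [40, 50, 90]                              # points per voxel
    feats = torch.randn(sum(counts), C)                # all points, flattened
    # voxel id for every point, i.e. [0]*40 + [1]*50 + [2]*90
    index = torch.repeat_interleave(
        torch.arange(len(counts)), torch.tensor(counts))

    # per-voxel, per-channel maximum: shape (num_voxels, C)
    vox_max, _ = scatter_max(feats, index, dim=0)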

Thank you,
Afonso

@cpuhrsch
Contributor

Hello @bubas3000,

I'm happy to hear that! It's enough to cite this repository; there is no paper yet.

Would you be willing to share your solution? We can use that as a baseline for future performance improvements.

Thank you,
Christian
