Support a to_torchscript function on the LightningModule #3080
Comments
I like this proposal, and it should be quite straightforward. Mind sending a PR? 🐰
@Borda regarding the API, what if someone wanted to expose multiple TorchScript modules from their LightningModule? E.g. their LightningModule does GAN training, and they want to export the generator and discriminator separately. Should the interface allow multiple scripted modules to be returned?
I think that would be super useful in those cases! How about something like:

```python
model.to_torchscript()  # exports the whole module
model.to_torchscript(children=["discriminator", "generator"])  # returns the list of script modules
```

Or should that just be part of the override the user can do, without any arguments passed to the method?
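A minimal sketch of what that `children` argument could look like, hedged since the API is still being debated here; it assumes the LightningModule stores the submodules as attributes, and the GAN layers below are placeholders:

```python
import torch
import pytorch_lightning as pl


class GAN(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.generator = torch.nn.Linear(16, 32)     # placeholder submodule
        self.discriminator = torch.nn.Linear(32, 1)  # placeholder submodule

    def to_torchscript(self, children=None):
        # Default: script the whole module.
        if children is None:
            return torch.jit.script(self)
        # Otherwise script only the named child modules.
        return [torch.jit.script(getattr(self, name)) for name in children]
```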
So the PL module would need attributes for the discriminator and generator, no? I think this makes sense as long as we specify that. @ananthsub what about transforms, or a data pipeline the user might also need when exporting to TorchScript/ONNX? How are you thinking about that?
Yeah, the children parameter is good =)
Yes, combining with transforms is an important need. In order to serve models, we also extract the transforms from the data module and "glue" them into the forward function (assuming the transforms are themselves TorchScriptable); see the sketch below.

I think it'd be easier to define these on the pl_module outside of this function; I'm hesitant to make it take arbitrary args and add magic inside the base call here. What do you think about this proposal? I think an API like this would be extensible. We are incubating this via a Lightning callback which people can extend.
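A minimal sketch of that "glue transforms into forward" idea, assuming the transform is itself TorchScriptable; `ExportWrapper` is an illustrative name, not an existing Lightning class:

```python
import torch


class ExportWrapper(torch.nn.Module):
    """Bundles an inference transform and a model into one scriptable module."""

    def __init__(self, transform: torch.nn.Module, model: torch.nn.Module):
        super().__init__()
        self.transform = transform
        self.model = model

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Run the (TorchScriptable) transform before the model's forward.
        return self.model(self.transform(x))


# Usage: scripted = torch.jit.script(ExportWrapper(my_transform, my_model))
```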
For consistency's sake, pointing out that to_onnx() takes a file path, though I would much prefer the proposed solution of returning Dict[str, nn.Module].
There are train_transforms, val_transforms, and test_transforms in the data module. Could we assume test_transforms is always the one to TorchScript with the model? Sometimes the transform used in training/testing differs from the one used in inference, so we might need additional logic to patch the transform during export. How about adding to_torchscript() to the data module? That way the model wouldn't need to know the members of the data module, and any special patching of transforms would happen inside the data module.
@hudeven I think the datamodule should not be part of the exported model, as we are moving toward data-agnostic models...
That makes sense. I think we can go with the simplest approach, relying on just the LightningModule. It'd be great if we could also bake bundled inputs into the TorchScript export, as this would be broadly useful: https://github.com/pytorch/pytorch/blob/master/torch/utils/bundled_inputs.py#L17 (see the sketch below).
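For reference, a sketch of what baking bundled inputs into a scripted module could look like, based on the linked utility; the helper's exact behavior may vary across PyTorch versions, and `TinyModel` is a placeholder:

```python
import torch
from torch.utils import bundled_inputs


class TinyModel(torch.nn.Module):  # stand-in for a scripted LightningModule
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * 2


scripted = torch.jit.script(TinyModel())

# Attach example inputs to the scripted module itself, so downstream
# consumers can recover them via scripted.get_all_bundled_inputs().
bundled_inputs.augment_model_with_bundled_inputs(
    scripted, inputs=[(torch.zeros(1, 4),)]
)
```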
@ananthsub we also have the option of using torch.jit.trace. Should we support both in the same function, or better not?
I would prefer the parameter approach...
Tracing also requires the user to provide an example input, so the API would need to take one as well. The PyTorch JIT team encourages users to use scripting over tracing because tracing has a number of limitations, which is why I didn't want to add a new dependency here. Given that this is new, maybe we could start without tracing and then add it as a feature request if many Lightning users are interested. What do you think?
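For context, the practical difference between the two export paths in plain PyTorch: scripting needs no example input and preserves control flow, while tracing records one concrete execution:

```python
import torch


class Gated(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Data-dependent control flow: kept by script, frozen by trace.
        if x.sum() > 0:
            return x
        return -x


scripted = torch.jit.script(Gated())              # no example input needed
traced = torch.jit.trace(Gated(), torch.ones(3))  # records only one branch
```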
Yes, perfect. I was wondering the same, but script seems to be very versatile.
Let's follow the same parameter order as we already have for the ONNX export.
I've now implemented and tested the basic version as proposed by @ananthsub. @Borda an example input is not required for scripting a module, but we can give it a file_path=None argument and, if provided, save directly to file. @ananthsub what is the benefit of returning a dict instead of the script module directly? Do you plan to call this on submodules and collect them into a single dict, or let the user return multiple entries by overriding? Furthermore, it seems bundled inputs are not there yet; I couldn't get them to work with PyTorch 1.7, so I'm not sure people even know about them :)
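For readers following along, the basic version described here would look roughly like this; a sketch under the assumptions above, not the merged code:

```python
import torch


def to_torchscript(self, file_path=None):
    """Script the full LightningModule; optionally save it to file_path."""
    mode = self.training
    scripted = torch.jit.script(self.eval())  # script in eval mode
    self.train(mode)                          # restore the previous mode
    if file_path is not None:
        torch.jit.save(scripted, file_path)
    return scripted
```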
I initially proposed this to keep the output parsing simpler, but I agree the most common case would be returning the script module directly. Maybe the type signature could allow both. I suppose I'm jumping ahead with bundled inputs, then.
🚀 Feature
Support a conversion function to PyTorch JIT similar to what's available for ONNX.
Motivation
TorchScript is a way to create serializable and optimizable models from PyTorch code. Any TorchScript program can be saved from a Python process and loaded in a process where there is no Python dependency. TorchScript is a method by which users can serve PyTorch models efficiently.
Pitch
By default, we can use TorchScript to script the LightningModule. Users can override this in their own LightningModules to use tracing, or to script specific nn.Modules inside their LightningModule. This can then be extended to other Lightning utilities like model checkpointing, so we can save TorchScript- or ONNX-converted models alongside the best model checkpoints, making the path to serving even easier. (A sketch of such a checkpoint-export callback follows below.)
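One way the checkpointing extension could look; `ExportTorchScript`, its file path, and the hook usage are a hypothetical sketch, not an existing Lightning callback:

```python
import torch
import pytorch_lightning as pl


class ExportTorchScript(pl.Callback):
    """Saves a TorchScript export alongside checkpoints when training ends."""

    def __init__(self, file_path: str = "model.torchscript.pt"):
        self.file_path = file_path  # hypothetical destination for the export

    def on_fit_end(self, trainer, pl_module):
        # Script the trained module and save it next to the checkpoints.
        torch.jit.save(pl_module.to_torchscript(), self.file_path)
```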