
Add core support for decoding from Python file-like objects #564

Merged
merged 61 commits into pytorch:main from file_like on Mar 27, 2025

Conversation

@scotts (Contributor) commented Mar 14, 2025

The purpose of this PR is to provide functionality in the core API that allows users to provide a Python file-like object as the video for us to decode. Specifically, we're exposing:

def create_from_file_like(
    file_like: Union[io.RawIOBase, io.BytesIO], seek_mode: Optional[str] = None
) -> torch.Tensor:

Everything else in this PR is in direct support of what we need to do to support this new function.

On the kind of file-like object, note that we accept and test both io.RawIOBase and io.BytesIO. I'm confident we should support RawIOBase, as it provides unbuffered, byte-oriented reading. I'm less sure about BytesIO, because it provides buffered, byte-oriented reading. The current tests pass, but we're not stressing it much.
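To make the two accepted types concrete, here is a small stdlib-only sketch (no torchcodec involved) showing that an unbuffered io.RawIOBase stream and a buffered io.BytesIO expose the same read/seek interface the decoder's callbacks rely on:

```python
import io
import os
import tempfile

data = bytes(range(256)) * 4

# io.BytesIO: an in-memory, buffered byte stream.
buf = io.BytesIO(data)

# io.RawIOBase: opening with buffering=0 yields io.FileIO,
# an unbuffered RawIOBase subclass.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(data)
raw = open(path, "rb", buffering=0)
is_raw = isinstance(raw, io.RawIOBase)

# Both types provide the read/seek interface the callbacks need:
pos_raw = raw.seek(10, os.SEEK_SET)
pos_buf = buf.seek(10, os.SEEK_SET)
chunk_raw = raw.read(4)
chunk_buf = buf.read(4)

raw.close()
os.remove(path)
```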

On what we had to do to expose this new function:

  1. Implement pybind11 functions that don't go through PyTorch C++ custom ops.
  2. Split the shared libraries into three: libtorchcodec_decoderN.so, libtorchcodec_custom_opsN.so, libtorchcodec_pybind_opsN.so.
  3. Generalize how we handle AVIOContext objects in the C++ VideoDecoder.

Going over each in turn:

Why directly use pybind11?

We're already using PyTorch C++ custom ops for our interface between Python and C++, and they already have some dependence on pybind11, so why do we need to create non-custom op pybind11 functions?

From what I can tell, the custom ops were not really intended for what we want to do, which is call C++ code that will itself make callbacks back up to user-provided Python functions. That is, when the user passes us a file-like object, we want FFmpeg to call the read and seek methods on the Python file-like object for all reading and seeking. I don't think custom ops are designed for that kind of dynamic callback into Python code. (Custom ops can definitely call Python custom ops, but those need to be registered as such ahead of time. We want to call arbitrary Python functions.) I may be wrong here, and we can explore that in the future.
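As an illustration of the protocol those callbacks would exercise, here is a minimal duck-typed file-like class (the name LoggingFileLike and the call log are purely illustrative, not part of the PR) that records the read and seek calls a decoder would make:

```python
import io

class LoggingFileLike:
    """Hypothetical file-like object that logs the calls made to it."""

    def __init__(self, data: bytes):
        self._inner = io.BytesIO(data)
        self.calls = []

    def read(self, size: int) -> bytes:
        # FFmpeg's read callback would land here via the C++ glue.
        self.calls.append(("read", size))
        return self._inner.read(size)

    def seek(self, offset: int, whence: int = 0) -> int:
        # FFmpeg's seek callback would land here via the C++ glue.
        self.calls.append(("seek", offset, whence))
        return self._inner.seek(offset, whence)

f = LoggingFileLike(b"abcdefgh")
f.seek(4)
chunk = f.read(2)
```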

What I'm more confident of is that we need to store an actual reference to the Python file-like object on the C++ side. Using pybind11 directly, that's easy: we keep a py::object*. PyTorch custom ops only accept tensors as arguments. We're already smuggling pointers through tensors in the rest of the core API, but there we're going from C++ to Python and back to C++. When we store a pointer from C++ in a tensor to go back up to Python, we know it's the right thing. In this instance, we would want to smuggle a pointer from Python through a tensor to C++, but Python doesn't have pointers. We may be able to get something that works most of the time by just asking for id(file_like), but even then, I'm not sure how to reliably turn that into a py::object* on the C++ side.
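To see why the id(file_like) route would only work "most of the time", here is a CPython-specific sketch: the address round trip can be completed with ctypes, but nothing in it manages the object's lifetime. This demonstrates the fragility and is not a recommended technique:

```python
import ctypes

class FakeFileLike:
    """Hypothetical stand-in for a user's file-like object."""
    pass

obj = FakeFileLike()
addr = id(obj)  # In CPython, this happens to be the object's address.

# CPython-specific and unsafe: reinterpret the integer as a PyObject*.
recovered = ctypes.cast(addr, ctypes.py_object).value
same_object = recovered is obj

# Nothing in this round trip keeps obj alive: if the last Python
# reference died before C++ dereferenced the address, this would
# read freed memory. pybind11's py::object holds a real reference.
```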

If we just use a pybind11 function, all of this difficulty goes away. The cost is that using a file-like object is definitely not going to be compatible with torch.export and torch.compile, but I'm not sure how to make that known to the PyTorch custom ops. This warrants further investigation.

Why do we need multiple shared libraries?

I think things are simpler if we have a shared library just for the custom ops and a separate shared library just for the pybind11 ops. That then means we need a third library which holds the actual decoder logic. I was not able to get anything working until I made this division, as I am currently using importlib.util.spec_from_file_location() and importlib.util.module_from_spec() to load the pybind11 module. For the custom ops we just use torch.ops.load_library(); that function has machinery that then exposes the available functions as fields of the module.

What I'm currently doing works on Linux but fails on Mac, so something is wrong. It may be that we don't need to do the split at all.
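The importlib-based loading can be sketched with a plain Python file standing in for the compiled pybind11 module (the file name libexample_pybind_ops4.py is made up; the real library names follow the libtorchcodec_*N.so pattern above, but the loading mechanism is the same):

```python
import importlib.util
import os
import tempfile

# A module file whose name a plain `import` statement cannot resolve.
module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "libexample_pybind_ops4.py")
with open(module_path, "w") as f:
    f.write("def answer():\n    return 42\n")

# Load it dynamically, decoupling the module name from the file name.
spec = importlib.util.spec_from_file_location("example_pybind_ops", module_path)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)

result = module.answer()
```

For a real extension module, spec_from_file_location picks an ExtensionFileLoader based on the .so/.dylib suffix; the calls are otherwise identical.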

Generalization of handling AVIOContext

Custom reading and seeking is done in FFmpeg by setting up an AVIOContext object. You call avio_alloc_context(), providing a pointer to some external state along with callback functions for read, seek, and write. Then, during decoding, whenever FFmpeg needs more bytes from the file, it calls those callbacks, passing the pointer to the external state. You're responsible for managing that external state in your callbacks.

We were already using this mechanism when users provided the entire file as plain bytes. I generalized this handling into three classes:

  1. AVIOContextHolder, a base class that knows how to allocate an AVIOContext. It cannot be instantiated directly. The VideoDecoder can be constructed with an AVIOContextHolder, and it uses it appropriately.
  2. AVIOBytesContext, which holds the existing functionality we already had. It derives from AVIOContextHolder.
  3. AVIOFileLikeContext, which holds the new functionality. It also derives from AVIOContextHolder.
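The class relationships can be sketched in Python, with the caveat that the real classes are C++ and the base class's actual job is allocating the AVIOContext; the *Sketch names below are illustrative only:

```python
import io
from abc import ABC, abstractmethod

class AVIOContextHolderSketch(ABC):
    """Analogue of the C++ AVIOContextHolder base class, which in the
    real code allocates the AVIOContext. Cannot be instantiated directly."""

    @abstractmethod
    def read(self, size: int) -> bytes: ...

    @abstractmethod
    def seek(self, offset: int, whence: int = 0) -> int: ...

class AVIOBytesContextSketch(AVIOContextHolderSketch):
    """Analogue of AVIOBytesContext: the whole file given as plain bytes."""
    def __init__(self, data: bytes):
        self._buf = io.BytesIO(data)
    def read(self, size): return self._buf.read(size)
    def seek(self, offset, whence=0): return self._buf.seek(offset, whence)

class AVIOFileLikeContextSketch(AVIOContextHolderSketch):
    """Analogue of AVIOFileLikeContext: delegates to a user file-like object."""
    def __init__(self, file_like):
        self._f = file_like
    def read(self, size): return self._f.read(size)
    def seek(self, offset, whence=0): return self._f.seek(offset, whence)

# The base class cannot be instantiated directly:
try:
    AVIOContextHolderSketch()
    base_instantiable = True
except TypeError:
    base_instantiable = False

ctx = AVIOFileLikeContextSketch(io.BytesIO(b"header-bytes"))
first = ctx.read(6)
```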

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Mar 14, 2025
@scotts scotts marked this pull request as ready for review March 20, 2025 18:54
", size=",
dataContext->size);

int64_t numBytesRead = std::min(
@scotts (Contributor, Author) commented:
Note that this avoids the negative buffer problem. Before, we were doing a narrowing conversion: we cast the int64_t value to an int in order to compare it to an int. That caused a large positive value to become a negative value. It's safer to cast the int to an int64_t and then compare both values as int64_t. This is fine because std::memcpy() takes a size_t, which the int64_t safely converts to.
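The narrowing bug can be reproduced in miniature with ctypes (the specific values below are made up for illustration):

```python
import ctypes

file_size = 2**31 + 5   # an int64_t value larger than INT_MAX
buffer_len = 4096       # a plain 32-bit int in the original C++

# Old behavior: narrowing int64_t to a 32-bit int flips large
# positive values negative, so a min() comparison at 32 bits
# would pick the wrong operand.
narrowed = ctypes.c_int32(file_size).value

# Fixed behavior: widen the int to int64_t and compare at 64 bits,
# which preserves both values.
widened_min = min(file_size, ctypes.c_int64(buffer_len).value)
```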

@NicolasHug (Member) left a comment:
Thanks a lot for the great work @scotts !

Since the PR is growing quite large, I'm happy to go ahead and merge this now if you'd like, and iterate on the tests as follow-ups.

Comment on lines 39 to 43
# 2. importlib: For pybind11 modules. We load them dynamically, rather
# than using a plain import statement. A plain import statement only
# works when the module name and file name match exactly, and the
# shared library file is in the import path. Our shared libraries do
# not meet those conditions.
@NicolasHug (Member) commented:
Nit: I think we should remove this part from the sentence:

and the shared library file is in the import path

Because the so/dylib files we produce are definitely in the "import path" (they're right next to other files we happily import). My point being that the above is not one of the conditions we fail to meet.

#
# Note that we use two different methods for loading shared libraries:
#
# 1. torch.ops.load_library(): For PyTorch custom ops. Loading libraries
@NicolasHug (Member) commented:
Nit: not just custom ops, it's also for the "pure C++" decoder library

@facebook-github-bot (Contributor) commented:
@scotts has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.


@NicolasHug (Member) left a comment:
Latest changes LGTM still

@scotts scotts merged commit d830af9 into pytorch:main Mar 27, 2025
47 of 48 checks passed
@scotts scotts deleted the file_like branch March 27, 2025 16:54