copy_file_range: use FICLONERANGE when possible #12489

Closed
wants to merge 1 commit into from

motiejus
Contributor

This is a follow-up from #12476: instead of adding a new abstraction to
clone files, start with copy_file_range. They use the same mechanism
in the kernel.

Benefits of FICLONERANGE vs copy_file_range(2):

  • O(1).
  • CoW: so if files are not modified, it will not take additional space.

However, it is more restricted than copy_file_range: source and
destination must be on the same file system, and only some file systems
implement this. As of Linux 5.19 those are btrfs, cifs, nfs, ocfs2,
overlayfs and xfs[1].

Note: I removed flags from copy_file_range (they must be 0 as of writing). If we want to retain flags, this change shouldn't land as-is, and we probably need os.clone_file_range. If we add clone_file_range, should that function have fallbacks, like copy_file_range does now?
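
The approach described above can be sketched in C roughly as follows. This is a minimal illustration, not the code from this PR: the helper name `clone_or_copy_range` and the choice of errno values that trigger the last-resort path are assumptions made for the example. It tries the FICLONERANGE ioctl first, falls back to `copy_file_range(2)` with flags set to 0, and finally to a single bounded `pread`/`pwrite` round trip.

```c
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <errno.h>
#include <linux/fs.h>   /* FICLONERANGE, struct file_clone_range */
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of bytes placed at off_out
 * (possibly fewer than len, so callers should loop), or -1 with errno set. */
static ssize_t clone_or_copy_range(int fd_in, off_t off_in,
                                   int fd_out, off_t off_out, size_t len)
{
    if (len == 0)
        return 0; /* src_length == 0 would mean "clone up to end of file" */

    struct file_clone_range arg = {
        .src_fd = fd_in,
        .src_offset = (uint64_t)off_in,
        .src_length = (uint64_t)len,
        .dest_offset = (uint64_t)off_out,
    };
    /* Fast path: share extents instead of copying data. */
    if (ioctl(fd_out, FICLONERANGE, &arg) == 0)
        return (ssize_t)len;

    /* Cloning is unsupported or was rejected: let the kernel copy instead. */
    off_t in = off_in, out = off_out;
    ssize_t n = copy_file_range(fd_in, &in, fd_out, &out, len, 0);
    if (n >= 0)
        return n;
    if (errno != ENOSYS && errno != EXDEV &&
        errno != EINVAL && errno != EOPNOTSUPP)
        return -1;

    /* Last resort (assumed policy): one bounded userspace copy. */
    char buf[4096];
    size_t want = len < sizeof buf ? len : sizeof buf;
    ssize_t r = pread(fd_in, buf, want, off_in);
    if (r <= 0)
        return r;
    return pwrite(fd_out, buf, (size_t)r, off_out);
}
```

On a file system without reflink support, the fast path costs one failed ioctl per call before the fallback runs.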

[1]: https://elixir.bootlin.com/linux/v5.19/A/ident/remap_file_range
@motiejus motiejus force-pushed the copy_file_range-clone branch from aa4df54 to 9178a10 Compare August 19, 2022 08:48
.dest_offset = off_out,
};
while (true) {
const rc = system.ioctl(fd_out, linux.T.FICLONERANGE, @ptrToInt(&arg));
Contributor

Did you benchmark the impact when used on filesystems which do not support CoW clones? In that case, this adds an extra system call on every call.

As of at least Ubuntu 20.04, the default filesystem is ext4, which does not support CoW clones.

Contributor Author

I should have put this in the commit message: I am not overly concerned with the extra syscall, since coreutils cp attempts reflinking by default, so I punted on the costs.

I see some other issues with the PR, marking as draft.

Contributor Author

Thanks for pointing out the number of syscalls. We may get rid of one in #12491.

@motiejus motiejus marked this pull request as draft August 19, 2022 10:56
@motiejus
Contributor Author

The Linux kernel refuses to accept the argument and returns EINVAL for anything I give it. I will try with a C version, but not today.

@motiejus
Contributor Author

Here is the C version:

#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

int main(int argc, const char** argv) {
    if (argc != 3) {
        fprintf(stderr, "usage: %s SRC DEST\n", argv[0]);
        return 2;
    }

    /* Bail out early: fileno(NULL) below would crash. */
    FILE *f1 = fopen(argv[1], "r");
    if (f1 == NULL) { perror("fopen 1"); return 1; }
    FILE *f2 = fopen(argv[2], "w");
    if (f2 == NULL) { perror("fopen 2"); fclose(f1); return 1; }

    /* Clone the first 209 bytes of SRC to the start of DEST. */
    struct file_clone_range arg;
    arg.src_fd = fileno(f1);
    arg.src_offset = 0;
    arg.src_length = 209;
    arg.dest_offset = 0;

    int exit_code = 0;
    if (ioctl(fileno(f2), FICLONERANGE, &arg) != 0) {
        perror("ioctl_FICLONERANGE");
        exit_code = 1;
    }

    fclose(f1);
    fclose(f2);

    return exit_code;
}

Observations:

  1. When src_offset = 0, dest_offset = 0 and src_length = N, where N is the size of the source file in bytes, the call succeeds.
  2. In all other cases it fails with EINVAL.

Tried on Linux 5.15.0-40 (Ubuntu) x86_64 with XFS and btrfs.

I may dig into the kernel code separately (to report and/or fix), but for now suffice it to say that FICLONERANGE is not stable enough for use in Zig.
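
One possible explanation for the EINVAL results, offered as an assumption and not confirmed anywhere in this thread: the ioctl_ficlonerange(2) man page notes that disk filesystems generally require the offset and length arguments to be aligned to the fundamental block size. A range that ends exactly at the source's end of file appears to be the usual exception, which would be consistent with observation 1. A rough pre-check along those lines might look like the sketch below. The helper name `clone_range_is_aligned` is hypothetical, the block size would have to be obtained by the caller (for example from `fstat`), and the check simplifies things by ignoring the destination file's size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pre-check: would this range plausibly satisfy the block
 * alignment rule described in ioctl_ficlonerange(2)? This is a simplified
 * model, not a guarantee that the ioctl will succeed. */
static bool clone_range_is_aligned(uint64_t src_off, uint64_t len,
                                   uint64_t dest_off, uint64_t src_size,
                                   uint64_t block_size)
{
    if (block_size == 0)
        return false;
    /* Both starting offsets must sit on a block boundary. */
    if (src_off % block_size != 0 || dest_off % block_size != 0)
        return false;
    /* A length of 0 means "clone up to the end of the source file". */
    if (len == 0)
        return true;
    /* A range that ends exactly at end of file may have an unaligned tail. */
    if (src_off + len == src_size)
        return true;
    return len % block_size == 0;
}
```

With a 4096-byte block size, cloning all 209 bytes of a 209-byte file passes this check, while cloning only its first 100 bytes does not.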

@motiejus motiejus closed this Aug 20, 2022
@motiejus motiejus deleted the copy_file_range-clone branch August 20, 2022 16:04