support pointer subtraction #1738
Comments
yes I've used that as a workaround. However #770 said that subtraction directly should be defined. Also I saw this go past in IRC:
|
Your example doesn't work with addition either. Pointer arithmetic only works on certain types. As
You'll note a few things about this. First, you still can't directly add and subtract two pointers, only a pointer and an int. Second, when performing pointer arithmetic you're working with the size of the child type, so |
Note that you can subtract integers from (unknown length) pointers, and you can add integers to pointers. What does not work is adding pointers to pointers or subtracting pointers from pointers. I'm inclined to leave the behavior as status quo. I think that
These questions all go away if we reject this proposal. |
I am with @andrewrk, such actions should come with explicit verbosity. Move to reject -- thanks for the question! |
That's the issue I'm running into here: when I subtract two pointers, I expect the result to be counted in units of the space each object takes up in an array:
    (@ptrToInt(&item_pointer) - @ptrToInt(&base)) / @sizeOf(@typeOf(base[0]))
Are there scenarios where an item will take more than its size? This is the code where I ran into this issue: https://github.com/daurnimator/zig-timeout-wheel/blob/5fa2c57b2b24e40d5220c1b6fe995bab92b28d8c/timeout_wheel.zig#L215 |
It turns out there are. We discussed this in IRC, copying it here:
|
I think that
    const std = @import("std");
    test "oaeu" {
        std.debug.warn("u24={}\n", usize(@sizeOf(u24)));
        std.debug.warn("struct={}\n", usize(@sizeOf(struct {
            x: u24,
        })));
    }
As far as I'm aware, |
Why should a |
I'm guessing you'll see this behavior for any type that is between 2 "native" types. For instance:
On x64 that should mean u56 is the upper bound. Anyway, I agree with andrewrk that |
To make the elements aligned properly:
    const std = @import("std");
    test "[2]u24" {
        var array: [2]u24 = []u24{ 0xaabbcc, 0xddeeff };
        std.debug.warn("size={}\n", usize(@sizeOf([2]u24)));
        std.debug.warn("ptr of index 0 = 0x{x}\n", @ptrToInt(&array[0]));
        std.debug.warn("ptr of index 1 = 0x{x}\n", @ptrToInt(&array[1]));
        std.debug.warn("difference = {}\n", @ptrToInt(&array[1]) - @ptrToInt(&array[0]));
    }
The processor needs 4-byte alignment, because loads/stores/registers are not actually 24 bits, but 32 bits. If the first element of the array took up 3 bytes, then the element at index 1 would start at |
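The stride question raised here can be sidestepped when computing an element index by hand. The following is a minimal sketch, not anything proposed in this thread: `indexOf` is a hypothetical helper, and it is written in current Zig syntax, where `@intFromPtr` is the later name of the `@ptrToInt` builtin used above. Instead of dividing by `@sizeOf(T)`, it takes the stride from the compiler's own pointer + integer arithmetic, so it makes no assumption about what `@sizeOf(u24)` reports.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): index of `elem` within the
/// array that starts at `base`. Assumes `elem` points into that array
/// and that `T` is not zero-sized (the stride would be 0).
fn indexOf(comptime T: type, base: [*]const T, elem: *const T) usize {
    // Stride as defined by the language's own pointer + integer arithmetic,
    // rather than by @sizeOf(T).
    const stride = @intFromPtr(base + 1) - @intFromPtr(base);
    // Underflows (and trips a safety check in safe builds) if elem is below base.
    const byte_offset = @intFromPtr(elem) - @intFromPtr(base);
    std.debug.assert(byte_offset % stride == 0);
    return byte_offset / stride;
}

test "index of a u24 element" {
    var array = [_]u24{ 0xaabbcc, 0xddeeff, 0x112233 };
    array[1] = 0x445566; // keep `array` a genuinely mutable local
    const base: [*]const u24 = &array;
    try std.testing.expectEqual(@as(usize, 2), indexOf(u24, base, &array[2]));
}
```

Whether the stride and `@sizeOf(T)` agree for a given type depends on the compiler version; this sketch gives the same index either way.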
Also here is an alternative to @tgschultz's example, without
    const std = @import("std");
    pub fn main() void {
        const foo: [50]u32 = []u32{1} ** 50;
        const a = foo[20..].ptr;
        const b = foo[40..].ptr;
        std.debug.warn("a={}, b={}\n", a, b);
        std.debug.warn("b-a={}\n", b - a);
    } |
Pointer-pointer subtraction should be supported as the inverse function of pointer+scalar addition.
    var array: [32]u32 = undefined;
    var b: [*]u32 = array[0..].ptr; // base
    var i: usize = 5; // index
    var e: [*]u32 = b + i; // element address
    assert(e - i == b); // inverse of e = b + i, solved for b.
    assert(e - b == i); // inverse of e = b + i, solved for i
    var e2: *u32 = &array[i];
    assert(e2 - b == i); // this should also work
The use case is: subtract the pointer to the base of an array from a pointer to an element in that array to get the index of the element. There are several restrictions we can assume from this use case:
There's another use case we could consider, but I don't know if it's important: subtracting the addresses of two elements in an array when you don't know which element has the greater index. This would cause trouble with the "What if you would get a negative number" situation above. I don't think this is a problem in practice, because this use case is rare and is fraught with peril even if the language supported it in some capacity. I'd be happy to reconsider if shown some real code trying to do this. |
After reconsidering and reading your counter proposal, I agree. I think the usecase of finding array index based on an element pointer and base pointer is valid, as @daurnimator demonstrated above. Let me consider whether to allow single-item pointers at all. My intuition is that single-item pointers should be forbidden from participating in pointer addition and subtraction without a cast. I note that in the timeout wheel usecase example above, both sides of the subtraction are single-item pointers. However one could make the argument that the
Yes they are allowed and they work the same as single-item pointers. I think align qualifiers can be ignored when doing pointer subtraction. As a rule of thumb it should be true that anywhere a less-aligned pointer is needed, a more-aligned pointer is accepted. |
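The rule of thumb about alignment can be illustrated with a small sketch. This is an illustration under stated assumptions, not the proposal's semantics: `byteDistance` is a hypothetical helper, the subtraction is done on integer addresses rather than with a pointer - pointer operator, and the code uses current Zig syntax (`@intFromPtr` is the later name of `@ptrToInt`). The parameters only ask for `align(1)`, and naturally aligned `*u32` pointers are accepted.

```zig
const std = @import("std");

/// Hypothetical helper: byte distance from `base` up to `elem`.
/// Both parameters request only 1-byte alignment.
fn byteDistance(base: *align(1) const u32, elem: *align(1) const u32) usize {
    return @intFromPtr(elem) - @intFromPtr(base);
}

test "more-aligned pointers are accepted where align(1) is requested" {
    var array = [_]u32{ 1, 2, 3, 4 };
    array[0] = 10; // keep `array` a genuinely mutable local
    // &array[0] and &array[3] are naturally aligned *u32 values; they
    // coerce to the less-aligned parameter type without a cast.
    const distance = byteDistance(&array[0], &array[3]);
    try std.testing.expectEqual(@as(usize, 3 * @sizeOf(u32)), distance);
}
```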
Pointer math seems necessary for interfacing with existing C code and libraries, but would it be considered a best practice within a pure Zig project? If not, it might be worth requiring a builtin function to use it rather than making it possible via the |
I was thinking more or less the same thing, this could all be easily accomplished with a few functions in std. But just because I personally haven't had any particular need for this feature doesn't mean it wouldn't be used often enough in some domain to be part of the language. |
On what cpu(s)? 24 bit architectures exist e.g. eZ80. |
On such an architecture, a
const std = @import("std");
test "@sizeOf(u24) is the store size of a u24" {
var x: u24 = 0xaabbcc;
const ptr = @ptrCast([*]u8, &x);
ptr[@sizeOf(u24)] = 0x99;
std.debug.assert(x == 0xaabbcc);
}
I was going to cite this test as a problem, but it actually passes. LLVM generates code like this:
0000000000000010 <entry>:
10: 55 push %rbp
11: 48 89 e5 mov %rsp,%rbp
14: c6 45 fe aa movb $0xaa,-0x2(%rbp)
18: 66 c7 45 fc cc bb movw $0xbbcc,-0x4(%rbp)
1e: 48 8d 45 fc lea -0x4(%rbp),%rax
22: 48 89 45 f0 mov %rax,-0x10(%rbp)
26: 48 8b 45 f0 mov -0x10(%rbp),%rax
2a: c6 40 03 99 movb $0x99,0x3(%rax)
2e: 0f b6 4d fe movzbl -0x2(%rbp),%ecx
32: c1 e1 10 shl $0x10,%ecx
35: 0f b7 55 fc movzwl -0x4(%rbp),%edx
39: 09 ca or %ecx,%edx
3b: 89 d0 mov %edx,%eax
3d: 5d pop %rbp
3e: c3 retq
You can see that the load actually only reads 3 bytes, and the store only writes 3 bytes as well. So the size 3 only becomes a problem when it's in an array, where the actual formula is
I can't draw a conclusion yet, but I have to go, so I'm posting this comment in its current form and I will follow up later. |
True. Yet something about
Note that modern x86_64 doesn't really have any penalties for unaligned accesses. See e.g. https://stackoverflow.com/a/45116730/282536 |
Such an operation should be implemented in the stdlib. Possible implementation:
    fn getIndex(array: [*]T, element: *T) usize {
        return (@ptrToInt(element) - @ptrToInt(array)) / @sizeOf(T);
    }
Maybe even remove pointer +/- integer and implement it in the stdlib. For example:
    var array: [32]u32 = undefined;
    var b: [*]u32 = array[0..].ptr; // base
    var i: usize = 5; // index
    var e: [*]u32 = std.ptrmath.offsetForward(b, i); // element address
    assert(std.ptrmath.offsetBackward(i, e) == b); // inverse of e = b + i, solved for b.
    assert(std.ptrmath.getIndex(b, e) == i); // inverse of e = b + i, solved for i
    var e2: *u32 = &array[i];
    assert(e2 - b == i); // this should also work
Another idea is
    fn getRelativeIndex(element: *T, element2: *T) isize; |
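Only a signature is given for `getRelativeIndex`, so the body below is a guess at one possible reading, not the commenter's implementation. Assumptions: the result is the index of `element` minus the index of `element2`, both pointers refer to elements of the same array, `@sizeOf(T)` equals the array stride (which the current Zig documentation states, but which this thread shows was in question for `u24` at the time), and an explicit `comptime T: type` parameter is added. It uses current Zig syntax, where `@intFromPtr` is the later name of `@ptrToInt`.

```zig
const std = @import("std");

/// Sketch of a signed index difference: index(element) - index(element2).
/// Assumes both pointers refer to elements of the same array.
fn getRelativeIndex(comptime T: type, element: *const T, element2: *const T) isize {
    const a = @intFromPtr(element);
    const b = @intFromPtr(element2);
    const stride: usize = @sizeOf(T); // assumption: stride == @sizeOf(T)
    if (a >= b) {
        const forward: isize = @intCast((a - b) / stride);
        return forward;
    }
    // Subtract in the non-negative direction first, then negate, so the
    // unsigned subtraction never underflows.
    const backward: isize = @intCast((b - a) / stride);
    return -backward;
}

test "relative index is signed" {
    var array = [_]u32{ 0, 0, 0, 0, 0, 0 };
    array[0] = 1; // keep `array` a genuinely mutable local
    try std.testing.expectEqual(@as(isize, 3), getRelativeIndex(u32, &array[4], &array[1]));
    try std.testing.expectEqual(@as(isize, -3), getRelativeIndex(u32, &array[1], &array[4]));
}
```

Returning `isize` handles the "negative number" case discussed earlier in the thread, at the cost of halving the representable range.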
It'd still be an issue on architectures like ARM and PPC/POWER, where unaligned access can trap. Before you go and try that: you might never see an actual trap; the kernel fixes them up unless you disable that with a prctl(). They're quite literally 1,000x slower than aligned accesses, however. (PPC is interestingly quirky in that the high-end chips normally support unaligned loads and stores, whereas the low-end chips - what sits in your toaster or microwave - do not. A function of restricted die area.) |
I don't think your conclusion follows from the citation you gave. Here's a breakdown of the benchmark that it cites: https://stackoverflow.com/a/45129784/432 Alignment is talking about the virtual memory address where the data is stored. The places where alignment can be specified in Zig are variable declarations and functions - which provides a guarantee that the data will be stored at an address with at least the requested alignment - and pointers - which is a type system feature to enable code to specify minimum alignment requirements of pointers. |
I did read through that. As far as I understand it there is only a penalty when you go across cache lines. |
I am missing an overall picture of what is and is not allowed in this proposal. The C standard guarantees that subtraction of pointers is only meaningful if the result is within the array object or one past the array object: arraystart..arrayend+1. Otherwise this is UB in the C standard. For the sake of both Zig and its interaction with C, the proposal should list which kinds of operations are supported and which are not, along with what is checked and what is not. As another motivating example: Tracking indirect pointers (and the objects they may point to) or, worse, reconstruction from void pointers has a compilation time cost. |
Pointer subtraction is not only useful to efficiently find the array element index, it is also a handy way to compute offsets within a multi-level nested struct without resorting to manually adding To make pointer subtraction a useful tool for this, one might need to use a counterpart of C++'s Further notes:
|
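The nested-struct use case can be sketched with integer addresses. This is an illustration only: `Inner` and `Outer` are made-up types, `extern struct` is chosen here so that the field layout is well defined, and the code uses current Zig syntax (`@intFromPtr` is the later name of `@ptrToInt`, and `@offsetOf` is the builtin that reports a field's byte offset). It checks that subtracting two addresses gives the same result as adding the per-level offsets by hand.

```zig
const std = @import("std");

// Made-up types for illustration; extern struct gives a C-compatible,
// well-defined field layout.
const Inner = extern struct { a: u8, b: u32 };
const Outer = extern struct { tag: u16, inner: Inner };

test "offset of a nested field from two addresses" {
    var outer = Outer{ .tag = 0, .inner = .{ .a = 0, .b = 0 } };
    outer.tag = 1; // keep `outer` a genuinely mutable local

    // Offset obtained by subtracting the struct's address from the field's.
    const by_subtraction = @intFromPtr(&outer.inner.b) - @intFromPtr(&outer);

    // The same offset obtained by adding each nesting level manually.
    const by_hand: usize = @offsetOf(Outer, "inner") + @offsetOf(Inner, "b");

    try std.testing.expectEqual(by_hand, by_subtraction);
}
```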
Now that ziglang/zig#1738 is fixed.
Fails with:
Related to #770 ?