Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

getelementptr turns into add turns into or due to wrong alignment setting #63

Closed
gergoerdi opened this issue May 21, 2017 · 5 comments
Closed
Labels
A-llvm Affects the LLVM AVR backend has-reduced-testcase A small LLVM IR file exists that demonstrates the problem

Comments

@gergoerdi
Copy link
Collaborator

gergoerdi commented May 21, 2017

LLVM IR:

%"chip8_engine::machine::Machine" = type { i16, [0 x i8], i16, [0 x i8], [16 x i8], [0 x i8] }

; Function Attrs: noreturn nounwind uwtable
define void @main() unnamed_addr {
start:
  %_12.i = alloca %"core::option::Option<chip8_engine::opcodes::Op>", align 8
  %vmachine = alloca %"chip8_engine::machine::Machine", align 8
  call void @_ZN12chip8_engine7machine7Machine3new17hc6366a1c56686268E(%"chip8_engine::machine::Machine"* noalias nocapture nonnull sret dereferenceable(20) %vmachine) #6
  tail call fastcc void @_ZN9chip8_avr3spi5setup17hf04b9f9f265ec074E() #6
  tail call fastcc void @_ZN9chip8_avr10serial_ram5setup17h96855e7b777cd1ceE() #6
  %v1 = getelementptr inbounds %"chip8_engine::machine::Machine", %"chip8_engine::machine::Machine"* %vmachine, i16 0, i32 2
  br label %bb4

bb4:  
  %v6 = load i16, i16* %v1, align 2
  %v7 = call fastcc i8 @"_ZN75_$LT$chip8_avr..Board$u20$as$u20$chip8_engine..peripherals..Peripherals$GT$8read_ram17h4a6718a884d22702E"(i16 %v6) #6
  ret void
}

This dereferences the wrong address when setting %v6. If I remove the br label %bb4 and replace it with just a fallthrough, then the right address is loaded into %v6.

Wrong assembly:

main:                                   ; @main
; BB#0:                                 ; %start
	push	r28
	push	r29
	push	r16
	push	r17
	in	r28, 61
	in	r29, 62
	sbiw	r28, 28
	in	r0, 63
	cli
	out	62, r29
	out	63, r0
	out	61, r28
	movw	r16, r28
	subi	r16, 255
	sbci	r17, 255
	mov	r24, r16
	mov	r25, r17
	call	_ZN12chip8_engine7machine7Machine3new17hc6366a1c56686268E
	call	_ZN9chip8_avr3spi5setup17hf04b9f9f265ec074E
	call	_ZN9chip8_avr10serial_ram5setup17h96855e7b777cd1ceE
	mov	r30, r16
	mov	r31, r17
	ori	r30, 2
	ld	r24, Z
	ldd	r25, Z+1
	call	_ZN75_$LT$chip8_avr..Board$u20$as$u20$chip8_engine..peripherals..Peripherals$GT$8read_ram17h4a6718a884d22702E
	adiw	r28, 28
	in	r0, 63
	cli
	out	62, r29
	out	63, r0
	out	61, r28
	pop	r17
	pop	r16
	pop	r29
	pop	r28
	ret

Correct assembly, when the br label %bb4 is removed:

main:                                   ; @main
; BB#0:                                 ; %start
	push	r28
	push	r29
	in	r28, 61
	in	r29, 62
	sbiw	r28, 30
	in	r0, 63
	cli
	out	62, r29
	out	63, r0
	out	61, r28
	movw	r24, r28
	adiw	r24, 1
	call	_ZN12chip8_engine7machine7Machine3new17hc6366a1c56686268E
	call	_ZN9chip8_avr3spi5setup17hf04b9f9f265ec074E
	call	_ZN9chip8_avr10serial_ram5setup17h96855e7b777cd1ceE
	ldd	r24, Y+3
	ldd	r25, Y+4
	call	_ZN75_$LT$chip8_avr..Board$u20$as$u20$chip8_engine..peripherals..Peripherals$GT$8read_ram17h4a6718a884d22702E
	adiw	r28, 30
	in	r0, 63
	cli
	out	62, r29
	out	63, r0
	out	61, r28
	pop	r29
	pop	r28
	ret
@gergoerdi
Copy link
Collaborator Author

This seems to be caused because the load of the %Machine field gets lowered into a different instruction:

Good:

ISEL: Starting pattern match on root node: t25: i16,ch = load<LD2[%v1](dereferenceable)> t20, t35, undef:i16

  Initial Opcode index to 581
  TypeSwitch[i16] from 590 to 609
  Morphed node: t25: i16,ch = LDDWRdPtrQ<Mem:LD2[%v1](dereferenceable)> TargetFrameIndex:i16<1>, TargetConstant:i16<2>, t20

Bad:

ISEL: Starting pattern match on root node: t6: i16,ch = load<LD2[%v1](dereferenceable)> t0, t2, undef:i16

  Initial Opcode index to 581
  TypeSwitch[i16] from 590 to 609
  Match failed at index 611
  Continuing at 624
  Match failed at index 627
  Continuing at 790
  Match failed at index 795
  Continuing at 882
  TypeSwitch[i16] from 889 to 904
  Morphed node: t6: i16,ch = LDWRdPtr<Mem:LD2[%v1](dereferenceable)> t2, t0

@gergoerdi
Copy link
Collaborator Author

Is it because in the good case, t20 and t35 are from the same BB, so they are transparent enough that the memri pattern can match them, whereas in the bad case, t0 and t2 are opaque? So maybe this is the result of the interaction of two bugs:

  • The memri pattern is not smart enough to see through the definition of t0 and t2
  • Either the expansion of LDWRdPtr screws up the offsetting, or t0 and t2 are incorrectly defined to get the right value from the previous BB

@gergoerdi
Copy link
Collaborator Author

My current working theory is that the real problem is that the add from the getelementptr is turned into an or by the target-agnostic DAGCombiner, but then that or is reshuffled somewhere where it is no longer valid:

Combining: t22: i16 = add FrameIndex:i16<1>, Constant:i16<2>
 ... into: t26: i16 = or FrameIndex:i16<1>, Constant:i16<2>

but then, before we do that or, there's this:

	movw	r16, r28
	subi	r16, 255
	sbci	r17, 255
...
	mov	r30, r16
	mov	r31, r17
	ori	r30, 2

I think that subtraction (effectively, an increment by 1) causes ori r30, 2 to be a NOP.

@gergoerdi
Copy link
Collaborator Author

I've now confirmed with a debugger that ori r30, 2 is indeed a NOP.

So it seems there's some other transformation that breaks LLVM's initial assumption that add 2 is equivalent to or 2 in that case.

@gergoerdi
Copy link
Collaborator Author

I think I've tracked this down to getAlignmentInfo(AGGREGATE_ALIGN, 0) returning 8 instead of the correct result of 1. I've tried changing the LLVM IR file's datalayout declaration to a:8 but it doesn't seem to take. Does the AVR target override alignment values programmatically?

@gergoerdi gergoerdi added A-llvm Affects the LLVM AVR backend has-reduced-testcase A small LLVM IR file exists that demonstrates the problem labels May 24, 2017
gergoerdi added a commit to gergoerdi/llvm-avr that referenced this issue May 27, 2017
@gergoerdi gergoerdi changed the title Jump instead of fallthrough breaks getelementptr calculation getelementptr turns into add turns into or due to wrong alignment setting May 27, 2017
@gergoerdi gergoerdi changed the title getelementptr turns into add turns into or due to wrong alignment setting getelementptr turns into add turns into or due to wrong alignment setting May 28, 2017
gergoerdi added a commit to gergoerdi/llvm-avr that referenced this issue May 28, 2017
gergoerdi added a commit to gergoerdi/llvm-avr that referenced this issue Jun 17, 2017
gergoerdi added a commit to gergoerdi/llvm-avr that referenced this issue Jun 18, 2017
dylanmckay pushed a commit to avr-rust/llvm that referenced this issue Sep 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-llvm Affects the LLVM AVR backend has-reduced-testcase A small LLVM IR file exists that demonstrates the problem
Projects
None yet
Development

No branches or pull requests

1 participant