nac3core: do not get llvm value too eagerly for kernel invariant #253

Merged
pca006132 merged 1 commit from kernel_invariant_fix into master 2022-03-31 12:48:50 +08:00
Collaborator

Noticed this while looking into optimizing `unwrap` for `KernelInvariant` fields of `Option` type. I previously modified the code so that it gets the LLVM value too eagerly, which disables the `KernelInvariant` optimization.

```python
@nac3
class Demo:
    core: KernelInvariant[Core]
    t: KernelInvariant[int32]
    def __init__(self):
        self.core = Core()
        self.t = 123

    @kernel
    def run(self):
        print_int32(self.t)


if __name__ == "__main__":
    Demo().run()
```
previous output
```llvm
; ModuleID = 'main'
source_filename = "main"

%__main__.Demo = type { %min_artiq.Core*, i32 }
%min_artiq.Core = type { double }

@"139867665387424" = local_unnamed_addr global %__main__.Demo { %min_artiq.Core* @"139867665386944", i32 123 }
@"139867665386944" = global %min_artiq.Core { double 1.000000e-09 }
@tagptr0 = private global [2 x i8] c":n"
@"2400020124162657182" = private global { i8*, i64 } { i8* getelementptr inbounds ([2 x i8], [2 x i8]* @tagptr0, i32 0, i32 0), i64 2 }

define void @__modinit__() local_unnamed_addr personality i32 (...)* @__nac3_personality {
init:
  %argptr1.i = alloca [0 x i8*], align 8
  %load1.i = load i32, i32* getelementptr inbounds (%__main__.Demo, %__main__.Demo* @"139867665387424", i64 0, i32 1), align 8
  tail call void @print_int32(i32 %load1.i)
  %0 = bitcast [0 x i8*]* %argptr1.i to i8*
  call void @llvm.lifetime.start.p0i8(i64 0, i8* nonnull %0)
  %rpc.stack.i = tail call i8* @llvm.stacksave()
  %argptr1.sub.i = getelementptr inbounds [0 x i8*], [0 x i8*]* %argptr1.i, i64 0, i64 0
  call void @rpc_send(i32 0, { i8*, i64 }* nonnull @"2400020124162657182", i8** nonnull %argptr1.sub.i)
  call void @llvm.stackrestore(i8* %rpc.stack.i)
  %rpc_recv.i = call i32 @rpc_recv(i8* null)
  call void @llvm.lifetime.end.p0i8(i64 0, i8* nonnull %0)
  ret void
}

declare i32 @__nac3_personality(...)

declare void @print_int32(i32) local_unnamed_addr

; Function Attrs: mustprogress nofree nosync nounwind willreturn
declare i8* @llvm.stacksave() #0

declare void @rpc_send(i32, { i8*, i64 }*, i8**) local_unnamed_addr

; Function Attrs: mustprogress nofree nosync nounwind willreturn
declare void @llvm.stackrestore(i8*) #0

declare i32 @rpc_recv(i8*) local_unnamed_addr

; Function Attrs: argmemonly nofree nosync nounwind willreturn
declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #1

; Function Attrs: argmemonly nofree nosync nounwind willreturn
declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #1

attributes #0 = { mustprogress nofree nosync nounwind willreturn }
attributes #1 = { argmemonly nofree nosync nounwind willreturn }
```
output after this PR
```llvm
; ModuleID = 'main'
source_filename = "main"

%min_artiq.Core = type { double }
%__main__.Demo = type { %min_artiq.Core*, i32 }

@"140475415924160" = global %min_artiq.Core { double 1.000000e-09 }
@"140475415924640" = local_unnamed_addr global %__main__.Demo { %min_artiq.Core* @"140475415924160", i32 123 }
@tagptr0 = private global [2 x i8] c":n"
@"2400020124162657182" = private global { i8*, i64 } { i8* getelementptr inbounds ([2 x i8], [2 x i8]* @tagptr0, i32 0, i32 0), i64 2 }

declare i32 @__nac3_personality(...)

declare void @print_int32(i32) local_unnamed_addr

define void @__modinit__() local_unnamed_addr personality i32 (...)* @__nac3_personality {
init:
  %argptr1.i = alloca [0 x i8*], align 8
  tail call void @print_int32(i32 123)
  %0 = bitcast [0 x i8*]* %argptr1.i to i8*
  call void @llvm.lifetime.start.p0i8(i64 0, i8* nonnull %0)
  %rpc.stack.i = tail call i8* @llvm.stacksave()
  %argptr1.sub.i = getelementptr inbounds [0 x i8*], [0 x i8*]* %argptr1.i, i64 0, i64 0
  call void @rpc_send(i32 0, { i8*, i64 }* nonnull @"2400020124162657182", i8** nonnull %argptr1.sub.i)
  call void @llvm.stackrestore(i8* %rpc.stack.i)
  %rpc_recv.i = call i32 @rpc_recv(i8* null)
  call void @llvm.lifetime.end.p0i8(i64 0, i8* nonnull %0)
  ret void
}

; Function Attrs: mustprogress nofree nosync nounwind willreturn
declare i8* @llvm.stacksave() #0

declare void @rpc_send(i32, { i8*, i64 }*, i8**) local_unnamed_addr

; Function Attrs: mustprogress nofree nosync nounwind willreturn
declare void @llvm.stackrestore(i8*) #0

declare i32 @rpc_recv(i8*) local_unnamed_addr

; Function Attrs: argmemonly nofree nosync nounwind willreturn
declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #1

; Function Attrs: argmemonly nofree nosync nounwind willreturn
declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #1

attributes #0 = { mustprogress nofree nosync nounwind willreturn }
attributes #1 = { argmemonly nofree nosync nounwind willreturn }
```
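
The only functional difference between the two dumps is that the `load` of field `t` is replaced by the constant `123`. One way to picture why eager lowering loses that constant is the toy model below. It is not nac3core code: `Static`, `Dynamic`, `lower`, and `get_field` are hypothetical names made up for illustration, and the "IR" is plain text. It is only a sketch, assuming the compiler can fold an invariant field as long as it still holds a reference to the host Python object.

```python
from dataclasses import dataclass


@dataclass
class Static:
    """Hypothetical: a host Python object that has not been lowered to IR yet."""
    obj: object


@dataclass
class Dynamic:
    """Hypothetical: an already lowered IR value, kept here as plain text."""
    ir: str


def lower(value):
    """Force a value into IR form; the host object can no longer be inspected."""
    if isinstance(value, Dynamic):
        return value
    return Dynamic(f"@obj_{id(value.obj)}")


def get_field(value, name, invariant_fields, emitted):
    """Read a field, folding it to a constant when that is still possible."""
    if isinstance(value, Static) and name in invariant_fields:
        # Invariant field of a known host object: emit its value directly.
        return Dynamic(f"i32 {getattr(value.obj, name)}")
    # Otherwise only a pointer is known, so a runtime load must be emitted.
    # The appended text is pseudo-IR, not real LLVM syntax.
    base = lower(value)
    emitted.append(f"%load = load i32 from field '{name}' of {base.ir}")
    return Dynamic("i32 %load")


class Demo:
    def __init__(self):
        self.t = 123


demo = Demo()

# Lazy: the object stays a host reference until the field access.
lazy_code = []
lazy = get_field(Static(demo), "t", {"t"}, lazy_code)

# Eager: the object is lowered first, so the field access needs a load.
eager_code = []
eager = get_field(lower(Static(demo)), "t", {"t"}, eager_code)

print(lazy.ir, lazy_code)
print(eager.ir, eager_code)
```

In this model the lazy path yields `i32 123` with no emitted instructions, while the eager path emits one load, mirroring the difference between the two dumps.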
sb10q requested review from pca006132 2022-03-31 10:26:52 +08:00
Owner

nac3core not na3core

ychenfo force-pushed kernel_invariant_fix from 8d66426139 to 2edeb31d21 2022-03-31 10:28:47 +08:00 Compare
Author
Collaborator

> nac3core not na3core

Thanks! Force pushed to fix the typo.

ychenfo changed title from na3core: do not get llvm value too eagerly for kernel invariant to nac3core: do not get llvm value too eagerly for kernel invariant 2022-03-31 10:30:34 +08:00
pca006132 merged commit 4f66bdeda9 into master 2022-03-31 12:48:50 +08:00
Owner

Might be good to have tests for this kind of thing that look at the IR.

sb10q deleted branch kernel_invariant_fix 2022-03-31 12:50:30 +08:00
Contributor

> Might be good to have tests for this kind of thing that look at the IR.

Yes, we can check whether the IR contains `call void @print_int32(i32 123)`. We should not match the entire IR, as that would be too brittle.
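
A minimal sketch of such a check, assuming the test has the generated IR available as a string. The helper name `calls_with_constant` is hypothetical and not part of nac3; the two excerpts are copied from the dumps in the PR description. It looks only for the one call of interest, so unrelated changes such as renamed globals or reordered declarations do not break it.

```python
import re


def calls_with_constant(ir: str, callee: str, value: int) -> bool:
    """Return True if the IR calls `callee` with the literal i32 `value`."""
    pattern = rf"call void @{re.escape(callee)}\(i32 {value}\)"
    return re.search(pattern, ir) is not None


# Excerpts taken from the two dumps in the PR description.
BEFORE = """
  %load1.i = load i32, i32* getelementptr inbounds (%__main__.Demo, %__main__.Demo* @"139867665387424", i64 0, i32 1), align 8
  tail call void @print_int32(i32 %load1.i)
"""
AFTER = """
  tail call void @print_int32(i32 123)
"""

print(calls_with_constant(BEFORE, "print_int32", 123))  # False
print(calls_with_constant(AFTER, "print_int32", 123))   # True
```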

Reference: M-Labs/nac3#253