artiq/artiq/test/lit/interleaving/nonoverlapping.py

26 lines · 535 B · Python

Latest commit (2015-11-21 00:02:47 +08:00):

compiler: actually implement interleaving correctly (calls are still broken).

The previous implementation was completely wrong: it always advanced the
global timeline by the same amount as the non-interleaved basic block did.
The new implementation only advances the global timeline by the difference
between its current time and the virtual time of the branch, which requires
it to adjust the delay instructions.

Previously, the delay expression was present in the IR twice: once as the
iodelay.Expr transformation-visible form, and once as regular IR
instructions, with the latter form being passed to the delay_mu builtin and
advancing the runtime timeline. As a result of this change, this strategy is
no longer valid: we can meaningfully mutate the iodelay.Expr form but not the
IR instruction form. Thus, IR instructions are no longer generated for delay
expressions, and the LLVM lowering pass now has to lower the iodelay.Expr
objects as well.

This works OK for flat `with parallel:` expressions, but breaks down outside
of `with parallel:` or when calls are present, for the following reasons:

* Outside of `with parallel:`, delay() and delay_mu() must accept any
  expression, but iodelay.Expr's are not nearly expressive enough. So, the IR
  instruction form must actually be kept as well.

* A delay instruction is currently inserted after a call to a user-defined
  function; this delay instruction introduces a point where basic block
  reordering is possible, as well as provides delay information. However, the
  callee knows nothing about the context in which it is called, which means
  that the runtime timeline is advanced twice. So, a new terminator
  instruction must be added that combines the properties of delay and call
  instructions (and another for delay and invoke as well).
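The commit message above describes the scheduling rule that this test
exercises. As a rough illustration only — a hypothetical plain-Python model,
not ARTIQ compiler code; interleave_sim and its event/delay representation
are invented for this sketch — the rule "advance the global timeline by the
difference between its current time and the virtual time of the branch" can
be modeled like this:

# Hypothetical sketch (not ARTIQ compiler code): model each interleaved
# branch as a list of zero-duration events and delays, and merge them by
# virtual time. The global timeline only ever advances by the difference
# between a branch's virtual time and the current global time.
import heapq

def interleave_sim(branches):
    # One heap entry per live branch: (virtual time, branch id, iterator).
    heap = [(0, i, iter(b)) for i, b in enumerate(branches)]
    heapq.heapify(heap)
    global_time = 0
    while heap:
        vtime, i, rest = heapq.heappop(heap)
        # Advance by the difference, never by a fixed per-block amount.
        global_time = max(global_time, vtime)
        for kind, value in rest:
            if kind == "event":
                print(value, global_time)
            else:  # "delay": this branch's virtual time moves forward
                heapq.heappush(heap, (vtime + value, i, rest))
                break

interleave_sim([
    [("event", "A"), ("delay", 2), ("event", "B")],
    [("event", "C"), ("delay", 2), ("event", "D"), ("delay", 2), ("event", "E")],
])
# Prints: A 0, C 0, B 2, D 2, E 4 -- the same order the test below expects.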

# RUN: %python -m artiq.compiler.testbench.jit %s >%t
# RUN: OutputCheck %s --file-to-check=%t

def g():
    with interleave:
        with sequential:
            print("A", now_mu())
            delay_mu(2)
            # branch 1's virtual time is now 2
            print("B", now_mu())
        with sequential:
            print("C", now_mu())
            delay_mu(2)
            # branch 2's virtual time is now 2
            print("D", now_mu())
            delay_mu(2)
            # branch 2's virtual time is now 4
            print("E", now_mu())

# CHECK-L: A 0
# CHECK-L: C 0
# CHECK-L: B 2
# CHECK-L: D 2
# CHECK-L: E 4
g()
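
The CHECK-L values can also be verified by hand: each sequential branch
accumulates its own virtual time from its delay_mu calls, and the interleaved
output is the time-ordered merge of both branches' events. A quick check
(plain Python, not kernel code; names invented here):

# Hypothetical by-hand check of the CHECK-L values above.
branch1 = {"A": 0, "B": 0 + 2}                   # one delay_mu(2) before B
branch2 = {"C": 0, "D": 0 + 2, "E": 0 + 2 + 2}   # two delay_mu(2) calls
for name, t in sorted({**branch1, **branch2}.items(), key=lambda kv: kv[1]):
    print(name, t)  # A 0, C 0, B 2, D 2, E 4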