Commit Graph

104 Commits

Author SHA1 Message Date
whitequark
b1f371e578 llvm_ir_generator: update debug info emission for LLVM 3.8. 2016-04-02 18:05:40 +00:00
whitequark
10108e6d64 compiler: mark loaded pointers as !dereferenceable.
Also, lower the bound for test_pulse_rate_dds, since we generate
better code for it now.
2016-03-31 12:04:38 +00:00
whitequark
2c04979727 compiler: update for LLVM 3.7. 2016-03-31 11:23:47 +00:00
whitequark
8a908a7656 llvm_ir_generator: do not use 'coldcc' calling convention.
First, this calling convention doesn't actually exist in OR1K
and trying to use it in Asserts build causes an UNREACHABLE.

Second, I tried to introduce it and it does not appear to produce
any measurable benefit: not only OR1K has a ton of CSRs but also
it is quite hard, if not realistically impossible, to produce
the kind of register pressure that would be relieved by sparing
a few more CSRs for our exception raising function calls, since
temporaries don't have to be preserved before a noreturn call
and spilling over ten registers across an exceptional edge
is not something that the code we care about would do.

Third, it produces measurable drawbacks: it inflates code size
of check:* functions by adding spills. Of course, this could be
alleviated by making __artiq_raise coldcc as well, but what's
the point anyway?
2016-03-29 15:19:46 +00:00
whitequark
586022023b llvm_ir_generator: mark loads as non-null where applicable. 2016-03-29 13:34:37 +00:00
whitequark
a9489246a4 llvm_ir_generator: don't mark non-constant attribute loads as invariant.
Oops.
2016-03-29 13:03:49 +00:00
whitequark
3ed852e077 Commit missing parts of 1d8b0d46. 2016-03-28 21:44:31 +00:00
whitequark
1038f1321f compiler: allow specifying per-function "fast-math" flags.
Fixes #351.
2016-03-28 21:44:08 +00:00
whitequark
6f5332f892 compiler: allow flagging syscalls, providing information to optimizer.
This also fixes a crash in test_cache introduced in 1d8b0d46.
2016-03-28 19:56:56 +00:00
whitequark
1d8b0d46bc compiler: mark FFI functions as ModRef=Ref using TBAA metadata.
Fascinatingly, the fact that you can mark call instructions with
!tbaa metadata is completely undocumented. Regardless, it is true:
a !tbaa metadata for an "immutable" type will cause
AliasAnalysis::getModRefBehavior to return OnlyReadsMemory for that
call site.

Don't bother marking loads with TBAA yet since we already place
!load.invariant on them (which is as good as the TBAA "immutable"
flag) and after that we're limited by lack of !nonnull anyway.

Also, add TBAA analysis passes in our pipeline to actually engage it.
2016-03-28 02:52:15 +00:00
whitequark
418f0a5234 compiler: mark loads of kernel constant attributes as load invariant.
Also, enable LICM, since it can take advantage of this.
2016-03-28 00:44:32 +00:00
whitequark
ca7463a054 compiler: do not write back kernel constant attributes.
Fixes #322.
2016-03-28 00:05:57 +00:00
whitequark
e75ad3d1aa compiler: extract runtime checks into separate cold functions.
This reduces register pressure as well as function size, which
favorably affects the inliner.
2016-03-27 01:02:15 +00:00
whitequark
20ad7629ac llvm_ir_generator: generate code more amenable to LLVM's GlobalOpt.
This exposes almost all embedded methods to inlining, with massive
gains.
2016-03-26 21:35:48 +00:00
whitequark
186a564ba8 compiler: make quoted functions independent of outer environment. 2016-03-26 20:46:45 +00:00
whitequark
3ee9834197 compiler: significantly increase readability of LLVM and ARTIQ IRs. 2016-03-26 12:21:02 +00:00
whitequark
8d0566661a compiler: quote functions directly instead of going through a local. 2016-03-25 22:35:55 +00:00
whitequark
39d23793a4 transforms.llvm_ir_generator: extract class function attributes.
This should give LLVM more visibility.
2016-03-25 19:01:39 +00:00
whitequark
39599d4508 compiler: get rid of the GetConstructor opcode. 2016-03-25 19:01:39 +00:00
whitequark
f4ab507ab2 Bring back target print function. 2016-03-18 04:42:06 +00:00
whitequark
7e16da4a77 transforms.llvm_ir_generator: ignore assignments of None (fixes #309). 2016-03-01 12:26:42 +00:00
whitequark
82a8e819ac transforms.llvm_ir_generator: use private linkage instead of internal.
This reduces the size of symbol tables (internal adds an STB_LOCAL
symbol, but private doesn't).
2016-02-25 20:15:40 +00:00
whitequark
a1dd909bc4 Take alignment into account during attribute writeback (fixes #293). 2016-02-25 01:44:05 +00:00
whitequark
74a75841f6 Add rtio_log() and make print() an RPC (#206). 2016-02-15 03:56:56 +00:00
whitequark
85222783ee transforms.llvm_ir_generator: fix memoize(). 2016-01-16 00:38:57 +00:00
whitequark
bed62349d2 transforms.llvm_ir_generator: i64 doesn't need sret (fixes #228). 2016-01-15 23:58:11 +00:00
whitequark
63f7899f93 Commit missing parts of 7f914a0. 2016-01-10 20:25:58 +00:00
whitequark
1be9e7576d transforms.llvm_ir_generator: use byval for FFI calls where appropriate. 2016-01-10 18:31:35 +00:00
whitequark
d436093317 transforms.llvm_ir_generator: use sret attribute on FFI calls too. 2016-01-10 17:42:40 +00:00
whitequark
7f914a057c transforms.llvm_ir_generator: use sret attribute. 2016-01-10 17:31:59 +00:00
whitequark
edb7423a4f transforms.llvm_ir_generator: use getelementptr inbounds. 2016-01-10 16:36:04 +00:00
whitequark
f92ce291ef transforms.llvm_ir_generator: call devirtualized functions directly. 2016-01-10 15:54:46 +00:00
whitequark
c46fccdf4b transforms.llvm_ir_generator: annotate invariant loads as such. 2016-01-10 15:45:34 +00:00
whitequark
f2f1deb9db transforms.llvm_ir_generator: quote recrusive objects correctly (fixes #213). 2016-01-07 19:52:31 +00:00
whitequark
3aa5acbaf3 transforms.llvm_ir_generator: don't crash if embedding is not enabled. 2016-01-03 19:07:11 +08:00
whitequark
81427af506 Remove selective attribute writeback; always synchronize all attributes.
The reasons are:
  1. Shadow memory manipulation added ~12 instructions to TTLOut.pulse
     (without inlining), and it's already barely fast enough.
  2. More importantly, code such as self.ts[1] did not trigger
     attribute writeback, and there seems to be no easy way to fix
     that.
2016-01-03 04:57:58 +08:00
whitequark
38a99fde52 Implement selective attribute writeback using shadow memory. 2016-01-02 22:51:04 +08:00
whitequark
2e33084a5f transforms.llvm_ir_generator: implement instrumentation for attribute writeback. 2016-01-02 06:51:30 +08:00
whitequark
8aa34ee952 compiler: don't require exceptions to inherit ARTIQException. 2015-12-31 21:54:54 +08:00
whitequark
0b69e488e6 transforms.llvm_ir_generator: compare exn typeinfo using strcmp.
This is necessary to correctly catch exceptions from RPCs.
2015-12-31 19:53:28 +08:00
whitequark
787a1d34fe transforms.llvm_ir_generator: don't crash when quoting over than one list. 2015-12-30 16:22:26 +08:00
whitequark
ac3a170c8f transforms.llvm_ir_generator: don't crash when quoting empty lists. 2015-12-30 15:47:57 +08:00
whitequark
f957be4e6f transforms.llvm_ir_generator: handle loop instruction (fixes #202). 2015-12-21 21:12:17 +08:00
whitequark
4fb1de33c9 Initial invocation of a @kernel function can now return a value (fixes #197). 2015-12-19 05:26:18 +08:00
whitequark
0395efd479 compiler: give environment types in LLVM IR readable names. 2015-12-18 23:41:51 +08:00
whitequark
2d906daf7f compiler: implement 'with watchdog' support. 2015-12-10 23:11:00 +08:00
whitequark
c660028abf transforms.llvm_ir_generator: fix reraise inside EH block. 2015-12-10 23:02:32 +08:00
whitequark
22dab0c7cc compiler: clarify confusing comment (NFC). 2015-12-10 20:50:28 +08:00
whitequark
32fe4a8a0c transforms.llvm_ir_generator: don't assert on inlined functions. 2015-11-24 00:20:33 +08:00
whitequark
cb3b811fd7 compiler: maintain both the IR and iodelay forms of delay expressions.
After this commit, the delay instruction (again) does not generate
any LLVM IR: all heavy lifting is relegated to the delay and delay_mu
intrinsics. When the interleave transform needs to adjust the global
timeline, it synthesizes a delay_mu intrinsnic. This way,
the interleave transformation becomes composable, as the input and
the output IR invariants are the same.

Also, code generation is adjusted so that a basic block is split off
not only after a delay call, but also before one; otherwise, e.g.,
code immediately at the beginning of a `with parallel:` branch
would have no choice but to execute after another branch has already
advanced the timeline.

This takes care of issue #1 described in 50e7b44 and is a step
to solving issue #2.
2015-11-21 03:22:47 +08:00