Raise error when index out of range. Note that we use llvm.expect to
tell the optimizer that we expect not to raise an exception, so the
normal path performance would be better. If this assumption is violated,
the exception overhead might be slightly larger, but the percentage
increase in overhead should not be high since exception unwinding is
already pretty slow.
- No longer check if the statement will return. Instead, we check if
the current basic block is terminated, which is simpler and handles
exception/break/continue better.
- Use invoke statement when unwind is needed.
- Moved codegen for a block of statements into a separate function.
- Added `Exception` primitive type and some builtin exception types.
Note that all exception types share the same layout, and should
inherit from the base `Exception` type. There are some hacks in the
toplevel module for handling exception types, we should revisit and
fix them later.
- Added new primitive types to concrete type module, otherwise there
would be some weird type errors.
- Changed the representation of strings to CSlice<u8>, instead of
CString.
Behavior of parallel and sequential:
Each function call (indirectly, can be inside a sequential block) within a parallel
block will update the end variable to the maximum now_mu in the block.
Each function call directly inside a parallel block will reset the timeline after
execution. A parallel block within a sequential block (or not within any block) will
set the timeline to the max now_mu within the block (and the outer max now_mu will also
be updated).
Implementation: We track the start and end separately.
- If there is a start variable, it indicates that we are directly inside a
parallel block and we have to reset the timeline after every function call.
- If there is a end variable, it indicates that we are (indirectly) inside a
parallel block, and we should update the max end value.
Note: requires testing, it is difficult to inspect the output IR
Previously, we have to copy types from one unification table to another,
and make the table sendable. This requires cloning (processing) the
whole table 3 times per function call which is not efficient and uses
more memory than required when the unification table is large.
We now use a concrete type table to only copy the type we need. This
reduces the overhead as we only need to process the unification table
for once (when we do the function codegen), and reduces memory usage by
a bit (but not noticeable when the unification table is small, i.e. the
types are simple).