Introduce IRRT and implement power operator (#50) #160
No reviewers
Labels
No Milestone
No Assignees
3 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: M-Labs/nac3#160
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "power_operator"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
All the functional parts is implemented using the exponentiation by squaring algorithm adapted from GNU Scientific Library for integers (the bug mentioned in #50 is also fixed):
int32**int32 -> int32
(only non-negative exponent is allowed)double**double -> double
int64**int64 -> int64
(do we need to change this toint64**int32 -> int64
?)double**int32 -> double
(it's only in our type system the exponent isint32
- llvm only supporti16
in its intrinsic functions, so currently the implementation is that we should throw exception when the exponent is bigger thani16
)This is a WIP PR mainly because it seems not very appropriate to set it to merge into thelist_slice
branch (the irrt in that branch is also used to implemented the exponentiation by squaring algorithm). So maybe now it's for review only and later I will change the target branch to master after list_slice is merged..Does it play nice with constant propagation/folding? A common situation is shifting bits by multiplying by a power of 2, and this absolutely has to be folded. The code in the legacy compiler uses LLVM intrinsics so it potentially has an advantage there.
Oh sorry I did not consider much about constant folding for this. It seems the current implementation for
int**int
does not do very well on that. I will look more into that.you can print the IR and run it through opt to see if it can fold well.
and why can't we use the intrinsic? is it removed in newer versions?
Yes another way is to use the
double**int -> double
llvm intrinsic (llvm does not provideint**int -> int
intrinsic..). And to address this bug use some float to int conversion that can take care of integer overflow.OK. It seems that libm doesn't provide integer power function.
I think this is better for constant folding if the function is inlined. I think we can just compile some examples and inspect the IR.
We're building IRRT at the IR level precisely to allow this kind of inlining.
BTW this relatively simple PR may be a good occasion to introduce IRRT, instead of the complicated list slicing one.
Example use case:
https://github.com/m-labs/artiq/blob/nac3/artiq/coredevice/ttl.py#L516-L519
The user calls that function with a fixed frequency. We should be able to fold/inline everything so it results in a single call to
rtio_output
with constant parameters.I'm not really sure if that's possible... there's also the issue of loss of precision with floats and int64. A float64 has only 52 bits of mantissa and large int64 results will be imprecise and potentially result in very confusing behavior.
If we can implement the integer case with IRRT exponentiation-by-squaring and LLVM handles well the "constant inputs" case, then we're good.
78c0561b02
to9586910bec
WIP: implement power operator (#50)to Introduce IRRT and implement power operator (#50)Sure, the exponentiation-by-squaring method in irrt should be ok for constant folding/inlining.
I have removed list slice related part in this branch and changed the target branch to master, thanks for the suggestion!
I think I must have messed up with some settings... the inline and constant folding seems to be ok. How about this result?
output:
You can't get any better than this :)
@ -0,0 +75,4 @@
(64, 64) => IrrtSymbolTable::POWER_I64,
_ => unreachable!(),
};
let pow_fun = ctx.module.get_function(symbol).unwrap_or_else(|| load_irrt(ctx, symbol));
This creates multiple copies of IRRT with multithreaded codegen.
Why not have only one IRRT module that is simply linked in with the main thread?
@ -0,0 +8,4 @@
# define MAX(a, b) (a > b ? a : b)
# define MIN(a, b) (a > b ? b : a)
int32_t __nac3_irrt_range_slice_len(const int32_t start, const int32_t end, const int32_t step) {
Let's remove list stuff and other unrelated code from this PR, which could then be merged before the more complex list features.
@ -0,0 +30,4 @@
if f == &IrrtSymbolTable::POWER_I32 || f == &IrrtSymbolTable::POWER_I64 {
// add alwaysinline attributes to power function to help them get inlined
// alwaysinline enum = 1, see release/13.x/llvm/include/llvm/IR/Attributes.td
fun.add_attribute(AttributeLoc::Function, ctx.ctx.create_enum_attribute(1, 0));
Isn't there an easier way to do this with inkwell?
@ -0,0 +22,4 @@
);
let irrt_mod = Module::parse_bitcode_from_buffer(&bitcode_buf, ctx.ctx).unwrap();
irrt_mod.set_data_layout(&ctx.module.get_data_layout());
irrt_mod.set_triple(&ctx.module.get_triple());
Is set_data_layout and set_triple necessary?
@ -437,3 +449,3 @@
args.insert(0, FuncArg { name: "self".into(), ty: obj.0, default_value: None });
}
let params = args.iter().map(|arg| ctx.get_llvm_type(generator, arg.ty).into()).collect_vec();
let params =
Please put unrelated style changes in separate PR (or commit them directly).
@ -633,0 +662,4 @@
ctx.gen_int_ops(op, left, right)
} else if ty1 == ty2 && ctx.primitives.float == ty1 {
ctx.gen_float_ops(op, left, right)
} else if ty1 == ctx.primitives.float && ty2 == ctx.primitives.int32 {
Shouldn't we assert that
op
isPow
in this case?I've reimplemented it based on your code. Can we delete the
power_operator
branch?Thanks! Sure,
power_operator
branch can be deleted, and I will open up a new branch to handle the irrt linking in nac3standalone and nix build failure.Pull request closed