Tuple by value (#172) #174
No reviewers
Labels
No Milestone
No Assignees
3 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: M-Labs/nac3#174
Loading…
Reference in New Issue
No description provided.
Delete Branch "tuple_by_value"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Handle nac3 python tuple always by llvm struct value as #172 specified.
@ -130,2 +130,4 @@
DEF_SLICE_ASSIGN(uint32_t)
DEF_SLICE_ASSIGN(uint64_t)
int32_t __nac3_list_slice_assign_var_size(
Isn't this new functionality that should go into a separate PR/commit?
I think it is not a new functionality as this is just to make list of tuples to be handled correctly. Previously lists of tuples are just list of pointers, but now after this patch lists of tuples are lists of structs.
We really could use some unit tests...
b74731589c
toe042db0734
rebased to include tests and latest commits in master branch.
@ -130,2 +130,4 @@
DEF_SLICE_ASSIGN(uint32_t)
DEF_SLICE_ASSIGN(uint64_t)
int32_t __nac3_list_slice_assign_var_size(
Could all the other functions be replaced with this one (and without performance impact)? Seems there's a lot of code duplication currently.
Yes indeed there's a lot of code duplication now.. functionality wise all the other ones can be replaced by this, but this one will be a bit slower when step is not 1, since the original assignment in the for loop is replaced by
memcpy
called repeatedly; and since thismemcpy
is inside the function I am not sure whether they will eventually be inlined contionally when calling__nac3_list_slice_assign_var_size
with smallsize
.ah what about this? might still be a bit less efficient because of the condition but would be more compact.
Isn't it the point of
__builtin_memcpy
that the compiler can automatically replace it with an efficient implementation for small constant sizes? Please take a look.I think calling
memcpy
will already optimize it for small constant size.will be compiled as
and lowered as (RISC-V)
Sorry forgot there is a suffix__builtin_memcpy
guarantees the memcpy will be inlined, but this is probably not we want (for larger arrays we should call the fast SIMD memcpy implementation in our runtime)_inline
there... I guess for our purpose, the compiler will emit a memcpy intrinsic call and optimize for small size already.btw I am quite concerned about this one:
It is possible to have a tuple of
(u8, u8, u8, u8)
that aligns to byte boundary, and if you cast the array to an int32 it will cause alignment fault... I guess you should just let memcpy do its job. If you are worried about not getting constant optimizations, you can do something like this:Which will be optimized to
(not very beautiful but works, it is common when you do performance tuning...)
Thanks for the suggestions! Indeed using the trick to do constant optimization will be safer, and I was just a bit worried about constant optimization of the call to
__builtin_memcpy
intrinsic as previously I just call it with thesize
argument always being a variable.sorry that I do not quite understand the
(u8, u8, u8, u8)
alignment error case.. I tougly test this on host (something likel = [(True,False,True,False), (True,True,True,True), (False, False, False, False)]
) and it seems to be giving me the expected output?It may give expected result, but it is UB.
I see, thanks!
@ -132,0 +150,4 @@
const int32_t src_len = (src_end >= src_start) ? (src_end - src_start + 1) : 0;
const int32_t dest_len = (dest_end >= dest_start) ? (dest_end - dest_start + 1) : 0;
if (src_len > 0) {
__builtin_memmove(
what if
src_len > dest_len
?I think then it would be extending the list, and we would later add the check to throw an exception prior to entering this function?
Maybe mark it as a TODO.
I mean this irrt function then is supposed to be only run with valid input parameters, for checks on things like
src_len > dest_len
and throwing exceptions about them, we already haveTODO
here?e042db0734
toef4598ba99
rebased on the latest master branch, and use one list slice assignment function to handle various sizes of list element.
btw, just found some style problem in the previous code, can you fix this together in this PR?
https://git.m-labs.hk/M-Labs/nac3/src/branch/master/nac3core/src/codegen/stmt.rs#L86-L154
Please change this to a match with 3 clauses.
ef4598ba99
tod6ab73afb0
rebased as the latest master branch and fix the style problem.