Strided slicing of one-dimensional arrays (i.e. with non-trivial
steps) might have previously been working, but would have had
different semantics, as all slices were copies rather than a view
into the original data.
Fixing this in the future will require adding support for an index
stride field/tuple to our array representation (and all the
associated indexing logic).
GitHub: Fixes#1627.
This allows assert() to be used on Zynq, where abort() is not
currently implemented for kernels. Furthermore, this is arguably
the more natural implementation of assertions on all kernel targets
(i.e. where embedding into host Python is used), as it matches host
Python behavior, and the exception information actually makes it to
the user rather than leading to a ConnectionClosed error.
Since this does not implement printing of the subexpressions, I
left the old print+abort implementation as default for the time
being.
The lit/integration/instance.py diff isn't just a spurious change;
the exception-based assert implementation exposes a limitation in
the existing closure lifetime tracking algorithm (which is not
supposed to be what is tested there).
GitHub: Fixes#1539.
This generates rather more code than necessary, but has
the advantage of automatically handling incomplete
multi-dimensional subscripts which still leave arrays
behind.
LLVM will take care of optimising the loops. This was still
unnecessarily painful; implementing generics and implementing
this in ARTIQ Python looks very attractive right now.
So far, this is not exposed to the user beyond implicit conversions.
Note that all the implicit conversions, such as triggered by adding
arrays of mismatching types, or dividing integer arrays, are currently
emitted in a maximally inefficient way, where a temporary copy is first
made for the type conversion. The conversions would more sensibly be
implemented during the per-element operations to save on the extra
copies, but the current behaviour fell out of the rest of the IR
generator structure without extra changes.
Matches NumPy. Slicing a TList reallocates, this doesn't; offsetting
couldn't be handled in the IR without introducing new semantics
(the Alloc kludge; could/should be made its own IR type).
I had accidentally not included this local diff in the PR that was
merged as commit cdaf554736.
Doesn't include tests for any of the interesting edge cases yet
(nans, infinities, signed zero, int.min), some of which might be
handled differently than by the host Python. In particular, the
select condition should logically use copysign(arg, 1) for the
comparison (i.e., always clear the sign bit), but currently the
chance of anyone running FP-heavy algorithms where this would
matter on the core device is close to zero [1].
[1] Sign of your choice.