implement numpy arrays, math functions, and matrix math #149
Labels
No Milestone
No Assignees
4 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: M-Labs/nac3#149
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
same as legacy compiler
https://github.com/m-labs/artiq/issues/1485
M-Labs/artiq-zynq#64
We also have to implement the numpy array type.
implement numpy matrix math and math functionsto implement numpy arrays, math functions, and matrix mathDo we want to try to implement this as a library, or hard-code an
array[element_type, num_dims]
type into the compiler like before? The first option would involve adding enough in terms of template value parameters, function overloading, and operator overloading to implement something like a C++ array library (Eigen, …), which is probably a non-trivial undertaking. Just implementing the array type and simple numpy function broadcasting in the compiler like inartiq.compiler
definitely works, but is maybe a bit inflexible.I guess we probably have to hardcode the array type in the compiler? Not sure how should we handle
num_dims
otherwise.But for function implementations we might be able to implement it as a library.
It's easily possible to implement template value parameters if you don't need to do global type inference for them (as long as you have function templates).
But yes, it seems like adding an array type is overall less complexity and effort, given that at least the nac3artiq layer thens needs to explicitly handle the conversion from host NumPy arrays to whatever array type anyway.
With the general move towards "proper" Python type annotations in NAC3, it would be nice if we could choose a syntax in line with host NumPy, such that users can use static analysis tools, etc.
Unfortunately, this is very much still brewing in upstream NumPy, and what seems like a tentative proposal won't be finalised until PEP646 is in (Python 3.11) and has trickled down into mypy, etc.: https://github.com/numpy/numpy/issues/16544. Then again, much of that complexity comes from trying to encode the actual shape, not just its rank (which adds considerable complexity).
I guess what we could do is to represent our type as
ndarray[num_dims, elem_type]
in anticipation that this will probably be compatible with whatever rank-only syntax they choose.There is also the separate https://github.com/ramonhagenaars/nptyping.
A subscriptable
np.ndarray
has been merged into NumPy https://github.com/numpy/numpy/pull/19879, although the extranp.dtype
wrapper seems undesirable, and I'm not sure what form the shape type is supposed to take (some comments indicate that they haven't figured this out yet; perhaps this is even https://github.com/numpy/numpy/issues/16544). There is alsonumpy.typing.NDArray
, but that doesn't allow specifying a shape type at all.@sb10q Revisiting this problem, I currently have a design dilemma.
Right now, the builtin
NDArray[dtype]
is represented astype { usize, usize*, dtype* }
. This has the benefit of allowing multidimensional arrays to be represented by a flattened 1D representation, but type checking for indexing is to be solved.Consider the following example:
Since the shape information is a runtime property, it will probably be difficult (if not impossible) to tell the number of dimensions, and hence, the type of the expression after each index operation.
The alternative would be to use naive type nested, where a
(2, 3)
-shaped NDArray would be represented asNDArray[NDArray[dtype]]
, but if this is directly translated into LLVM it would introduce overhead due to the increased number of indirections.A middle-ground solution would be to decouple the type from its memory representation, e.g. use
NDArray[NDArray[dtype]]
in the type system but use a flattened 1D array in its in-memory representation. Special lowering will need to be performed for getting and/or setting individual rows/elements, but otherwise should work. The only issue is we will lose the ability to declareNDArray
types with an unknown number of dimensions, but that probably would not be necessary?Let me know what you think about this.
Did you discard requiring the number of dimensions to be known at compile-time (like in the current compiler) for some particular reason?
This struck me as the most suitable tradeoff between flexibility and runtime performance/indexing ergonomics.
There is some prior art in this domain in the form of all the linear-algebra-related libraries for C++, D and so on, for instance Eigen.
In actual scientific code using NumPy/SciPy, reshaping between arrays with different numbers of dimensions is reasonably common. While IIRC not currently supported in
artiq.compiler
, staying with NumPy's model of a flat data buffer + whatever extra indexing metadata made a lot of sense. It also helps with being able to do strided/multidimensional/… slicing in a zero-copy way.After discussing with @sb10q a few days ago, we decided to embed the NDArray number of dimensions into the type itself while keeping the in-memory representation the same.
Effectively the array would be represented as
NDArray[dtype, ndims]
in the type system, wherendims
is a const-generic argument specifying the number of dimensions. The only drawback of this approach AFAICT is some checks will need to be performed at runtime, e.g. checking the number of dimensions inshape
against the number of declaredndims
in the type parameter for array creation or reshape functions, e.g.Yep, this then matches my
artiq.compiler
implementation, and also what I had sketched out for NAC3 a few years ago.By "the same", presumably you mean
(dtype*, (size_t, size_t, …))
, where the tuple of lengths isndims
in size? That sounds good; we can always add support for explicit strides if we want to support multi-dimensional slicing in the future.Yep, this then matches my
artiq.compiler
implementation, and also what I had sketched out for NAC3 a few years ago.By "the same", presumably you mean
(dtype*, (size_t, size_t, …))
, where the tuple of lengths isndims
in size?(apologies, clicked the "Comment and Close" button by mistake)