ndstrides: [1] Introduce models #509
Added the `Model<'ctx>` abstraction. This is a thin layer of Rust type-hints over LLVM values and types. It also has mechanisms to help define new LLVM struct types with very little boilerplate. LLVM 15 opaque pointers have also been accounted for. Shortly there will be a PR that refactors nac3core with Models.
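As a rough, stdlib-only illustration of the "typed tag over an untyped value" idea (all names hypothetical, with a stand-in `RawValue` instead of inkwell's `BasicValueEnum`; not the actual API):

```rust
/// Stand-in for an untyped LLVM value (inkwell's BasicValueEnum in the real code).
#[derive(Debug, Clone, Copy, PartialEq)]
enum RawValue {
    Int(u64),
    Float(f64),
}

/// A "model" knows which raw variant it describes and can vouch for values.
trait Model: Copy {
    /// Check that an untyped value matches this model.
    fn check(&self, v: RawValue) -> bool;
}

#[derive(Debug, Clone, Copy)]
struct IntModel;
#[derive(Debug, Clone, Copy)]
struct FloatModel;

impl Model for IntModel {
    fn check(&self, v: RawValue) -> bool {
        matches!(v, RawValue::Int(_))
    }
}
impl Model for FloatModel {
    fn check(&self, v: RawValue) -> bool {
        matches!(v, RawValue::Float(_))
    }
}

/// An untyped value paired with the model that vouches for its type.
#[derive(Debug, Clone, Copy)]
#[allow(dead_code)]
struct Instance<M: Model> {
    model: M,
    value: RawValue,
}

/// Tag a raw value with a model, failing if the variant does not match.
fn check_value<M: Model>(model: M, value: RawValue) -> Result<Instance<M>, String> {
    if model.check(value) {
        Ok(Instance { model, value })
    } else {
        Err(format!("type mismatch: {value:?}"))
    }
}

fn main() {
    assert!(check_value(IntModel, RawValue::Int(42)).is_ok());
    assert!(check_value(IntModel, RawValue::Float(1.0)).is_err());
}
```

The real `Model` additionally threads the LLVM context and code generator through every operation; the sketch only shows the tagged-value idea.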
If this is undesired, I can rewrite the entire ndstrides branch to use `Proxy{Value,Type}`. When put into practice, I think a531d0127a (diff-d202bc814a4d1cf1e6c1a040946e598e69197246) might give some ideas of how it is used.

In addition to the comments provided: could we add `Into` implementations to convert `Instance` into `BasicValue` and `Model` into `BasicType`?

The `Traversal` classes for `Struct` are really confusing me. Why are they necessary? Why couldn't this be a `Vec`, using `iter` to collect information regarding each field?

@ -0,0 +11,4 @@
use super::*;
/// Trait for Rust structs identifying length values for [`Array`].
pub trait LenKind: fmt::Debug + Clone + Copy {
If this is only specific to `Array` (and not other kinds of containers), it might be better to name it `ArrayLen`.

@ -0,0 +12,4 @@
/// Trait for Rust structs identifying length values for [`Array`].
pub trait LenKind: fmt::Debug + Clone + Copy {
fn get_length(&self) -> u32;
I think you can just name it `length()`.

@ -0,0 +39,4 @@
///
/// `Len` should be of a [`LenKind`] and `Item` should be a of [`Model`].
#[derive(Debug, Clone, Copy, Default)]
pub struct Array<Len, Item> {
Why are the type variables not directly bounded by `LenKind` and `Model` respectively, rather than only bounding the `impl`s?

Bounding, say, `Item: Model<'ctx>` requires `Array` itself to also have `'ctx`, and then a `PhantomData`.

`PhantomData` might be worth it in this case, because then the entire `Array` class is bounded by the aforementioned type bounds, not just the specific `impl`s.

I am not sure if this is a good idea. I intuitively think it is less risky to keep the generics as lax as possible, even though constraining the struct seems very appealing, though I only first touched Rust 3 months ago. I was trying to foresee whether some super wacky, unfixable, because-of-the-Rust-type-system-and-its-limitations issue would crop up in the future because we are over-traiting, but honestly I can't tell.
Also, a minor drawback is that `'ctx` is now everywhere when you write `Array<'ctx, X, Y>`, and this will include `Int`, `Float`, `Struct` and `Ptr`.

Though it can prevent confusion such as "why is my `Array<Len<3>, Int32>` not a model?? Oh, I have to put `Int<Int32>`", which is quite nice.

Leaving everything as it is also does not seem to have any drawbacks, in my naive opinion.
I wouldn't be too concerned with using `PhantomData` here. I'd rather the type bounds of the struct be explicit so that users of this class do not misuse the fields.

In fact, we should also introduce a `new` function here and make `len` and `item` non-`pub`, imo.

@ -0,0 +88,4 @@
ctx: &CodeGenContext<'ctx, '_>,
i: IntValue<'ctx>,
) -> Instance<'ctx, Ptr<Item>> {
let zero = ctx.ctx.i32_type().const_zero();
Pretty sure this can be `i64`/`usize` - LLVM docs:

Change this for the other array stuff as well.

I am not sure what should be changed for lines 88-94. You mean `let zero = <something else?>`?
@ -0,0 +95,4 @@
}
/// Like `gep` but `i` is a constant.
pub fn gep_const(&self, ctx: &CodeGenContext<'ctx, '_>, i: u64) -> Instance<'ctx, Ptr<Item>> {
`i64` - see above.

I am confused why we should use `i: i64` in `gep_const`. Shouldn't it be the case that `0 <= i < self.model.0.len.length()`? Even if indexing with a negative index is technically possible.

@ -0,0 +102,4 @@
self.model.0.len.get_length()
);
let i = ctx.ctx.i32_type().const_int(i, false);
https://git.m-labs.hk/M-Labs/nac3/pulls/509/files#issuecomment-11752
@ -0,0 +11,4 @@
pub struct ModelError(pub String);
impl ModelError {
// Append a context message to the error.
nit: `///`
@ -0,0 +68,4 @@
/// Return the [`BasicType`] of this model.
#[must_use]
fn get_type<G: CodeGenerator + ?Sized>(&self, generator: &G, ctx: &'ctx Context) -> Self::Type;
I would suggest naming it `llvm_type`. Use of `get_` is discouraged in Rust AFAIK: https://rust-lang.github.io/api-guidelines/naming.html#getter-names-follow-rust-convention-c-getter
@ -0,0 +71,4 @@
fn get_type<G: CodeGenerator + ?Sized>(&self, generator: &G, ctx: &'ctx Context) -> Self::Type;
/// Get the number of bytes of the [`BasicType`] of this model.
fn sizeof<G: CodeGenerator + ?Sized>(
I would suggest following Inkwell conventions - `size_of`.

@ -0,0 +91,4 @@
///
/// Caller must make sure the type of `value` and the type of this `model` are equivalent.
#[must_use]
fn believe_value(&self, value: Self::Value) -> Instance<'ctx, Self> {
I suspect this might be a candidate for `unsafe`, as it is a requirement for the caller to enforce this contract.

@ -0,0 +152,4 @@
len: IntValue<'ctx>,
name: Option<&'ctx str>,
) -> Result<Instance<'ctx, Ptr<Self>>, String> {
// TODO: Remove ArraySliceValue
Are there any types in `Model` that distinguish between an array alloca and a plain alloca? I think it would be preferable to have something like an `ArrayPtr` to replace the type of an `ArraySliceValue`.

No. An array alloca's length is never coupled with the pointer. I was doing something like `ArraySliceValue` before, but the length value didn't have much use and often ended up being redundant.

@ -0,0 +195,4 @@
pub struct Instance<'ctx, M: Model<'ctx>> {
/// The model of this instance.
pub model: M,
/// The value of this instance.
nit: newline between fields
@ -0,0 +57,4 @@
}
#[derive(Debug, Clone, Copy, Default)]
pub struct Float<N>(pub N);
Same as `Array` - bound `N` to `FloatKind` here as well.

@ -0,0 +80,4 @@
let exp_ty = self.0.get_float_type(generator, ctx);
// TODO: Inkwell does not have get_bit_width for FloatType?
No, it does not; it has a limited set of supported fp types.
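Since the wrapped float type cannot report its own bit width, one workaround is for the Rust-side "kind" to carry the width explicitly. A tiny hypothetical sketch (names are illustrative, not the PR's API):

```rust
/// The float kinds this sketch supports; each knows its own bit width,
/// sidestepping the missing get_bit_width on the wrapped type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum FloatKind {
    F32,
    F64,
}

impl FloatKind {
    fn bit_width(self) -> u32 {
        match self {
            FloatKind::F32 => 32,
            FloatKind::F64 => 64,
        }
    }
}

fn main() {
    assert_eq!(FloatKind::F32.bit_width(), 32);
    assert_eq!(FloatKind::F64.bit_width(), 64);
}
```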
@ -0,0 +35,4 @@
/// If `my_function_name` has not been declared in `ctx.module`, once `.returning()` is called, a function
/// declaration of `my_function_name` is added to `ctx.module`, where the [`FunctionType`] is deduced from
/// the argument types and returning type.
pub struct CallFunction<'ctx, 'a, 'b, 'c, 'd, G: CodeGenerator + ?Sized> {
I think `FunctionCall` or `FuncCall` might be a better name.

Sure.
@ -0,0 +47,4 @@
}
impl<'ctx, 'a, 'b, 'c, 'd, G: CodeGenerator + ?Sized> CallFunction<'ctx, 'a, 'b, 'c, 'd, G> {
pub fn begin(generator: &'d mut G, ctx: &'b CodeGenContext<'ctx, 'a>, name: &'c str) -> Self {
I would name this `builder` instead.

Unrelated: is there any way to improve the situation with the lifetime signature of `CallFunction`? Or is this unavoidable?

I did some testing and this might work:

AFAICT this will work as long as the `CallFunction` instance is used in the same scope as it is declared, or `name`'s lifetime extends until `returning` is invoked.

@ -0,0 +72,4 @@
/// Call the function and expect the function to return a value of type of `return_model`.
#[must_use]
pub fn returning<M: Model<'ctx>>(self, name: &str, return_model: M) -> Instance<'ctx, M> {
I don't really agree with the naming of the `returning` family of functions, as it doesn't indicate to the caller that invoking `returning` actually invokes the function as well. I would suggest adding the return name and model to the `begin` function as an `Option<(&str, Option<M>)>`.

Passing `Option<(&str, Option<M>)>` to `begin` to define the return type may not be a good idea - if the function returns nothing, there is no type for `M` and Rust complains. What about two functions, one that creates a void func and another that creates a value-returning func?

If you are talking about something like `begin_void` and `begin_with_return`, sadly no. It requires the return model `M` to somehow be propagated through the `.arg().arg().arg()` chain until, say, `.finish_and_call()` consumes the `FnCall` and returns an `Instance<'ctx, M>`. This means that `FnCall` itself would have to hold `M` in its type. So in the end `M` is still there even for `begin_void`, and there is no type for `M`.

@ -0,0 +94,4 @@
}
#[derive(Debug, Clone, Copy, Default)]
pub struct Int<N>(pub N);
Same as with `Float` and `Array`.

@ -0,0 +129,4 @@
}
impl<'ctx, N: IntKind<'ctx>> Int<N> {
pub fn const_int<G: CodeGenerator + ?Sized>(
What if I want a signed constant then?

Oh, I forgot to add another interface for it.
@ -0,0 +388,4 @@
}
#[must_use]
pub fn add(&self, ctx: &CodeGenContext<'ctx, '_>, other: Self) -> Self {
Why are operators provided by some functions but not by others?
I was adding the ones I needed to implement ndarrays. I figured I should add these ops on demand.
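A stdlib-only sketch of the "ops on demand" pattern being described (hypothetical names; in the real code these methods wrap inkwell builder calls and carry the context around):

```rust
/// A value tagged with its integer width; a stand-in for a typed
/// Instance over an LLVM IntValue.
#[derive(Debug, Clone, Copy, PartialEq)]
struct TypedInt {
    bits: u32,
    value: u64,
}

impl TypedInt {
    /// `add` exists because ndarray codegen needed it; further ops
    /// (`sub`, `mul`, ...) would be added in the same shape as required.
    fn add(self, other: Self) -> Self {
        // Typed operands let us check width agreement up front.
        assert_eq!(self.bits, other.bits, "operand widths must match");
        TypedInt { bits: self.bits, value: self.value.wrapping_add(other.value) }
    }
}

fn main() {
    let a = TypedInt { bits: 32, value: 40 };
    let b = TypedInt { bits: 32, value: 2 };
    assert_eq!(a.add(b).value, 42);
}
```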
@ -0,0 +110,4 @@
pub fn offset_const(
&self,
ctx: &CodeGenContext<'ctx, '_>,
offset: u64,
Same as `Array` - `i64`.

@ -0,0 +112,4 @@
ctx: &CodeGenContext<'ctx, '_>,
offset: u64,
) -> Instance<'ctx, Ptr<Item>> {
let offset = ctx.ctx.i32_type().const_int(offset, false);
`i64_type`, `is_signed: true`.

@ -0,0 +198,4 @@
source: Self,
num_items: IntValue<'ctx>,
) {
// Force extend `num_items` and `itemsize` to `i64` so their types would match.
I would suggest extending them to `SizeT` to match the signature of `memcpy` in C.

@ -0,0 +11,4 @@
use super::*;
/// A traveral that traverses a Rust `struct` that is used to declare an LLVM's struct's field types.
pub trait FieldTraversal<'ctx> {
`MapStructFields`? Since the only operation its implementing classes do is to add fields based on some criteria, matching what `map` does to an iterator. Or `FieldMapper`.

@ -0,0 +13,4 @@
/// A traveral that traverses a Rust `struct` that is used to declare an LLVM's struct's field types.
pub trait FieldTraversal<'ctx> {
/// Output type of [`FieldTraversal::add`].
type Out<M>;
This name is non-descriptive. `Output` or `OutType` would be better.

@ -0,0 +39,4 @@
}
/// A traversal to calculate the GEP index of fields.
pub struct GepFieldTraversal {
`FieldIndexCollector`? Same for below - name those as `*Collector`?

This is a traversal that calculates the correct GEP index of each struct field. `NDArrayFields<GepFieldTraversal>::fields()` yields:

And this object is used to enable the `my_ndarray.gep(|f| f.data).set_index_const(0, my_value)` syntax.

@ -0,0 +41,4 @@
/// A traversal to calculate the GEP index of fields.
pub struct GepFieldTraversal {
/// The current GEP index.
gep_index_counter: u64,
`u32`.

@ -0,0 +100,4 @@
self.errors
.push(err.under_context(format!("field #{gep_index} '{name}'").as_str()));
}
} // Otherwise, it will be caught by Struct's `check_type`.
nit: newline
@ -0,0 +161,4 @@
/// let ndarray: Instance<'ctx, Ptr<F64NDArray>> = model.alloca(generator, ctx);
/// ```
///
/// ...and here is how you may manipulate/access `ndarray`:
Move this into the documentation for their respective functions. It is very difficult to refer to these examples when you are trying to look for how to use the APIs.
@ -0,0 +192,4 @@
/// Traverse through all fields of this [`StructKind`].
///
/// Only used internally in this module for implementing other components.
fn traverse_fields<F: FieldTraversal<'ctx>>(&self, traversal: &mut F) -> Self::Fields<F>;
`iter_fields`
@ -0,0 +198,4 @@
///
/// Only used internally in this module for implementing other components.
fn fields(&self) -> Self::Fields<GepFieldTraversal> {
self.traverse_fields(&mut GepFieldTraversal { gep_index_counter: 0 })
Consider adding a `new()` function for `GepFieldTraversal`.

@ -0,0 +207,4 @@
generator: &G,
ctx: &'ctx Context,
) -> StructType<'ctx> {
let mut traversal = TypeFieldTraversal { generator, ctx, field_types: Vec::new() };
Consider adding a `new()` function for this.

@ -0,0 +258,4 @@
};
// Check each field individually.
let mut traversal = CheckTypeFieldTraversal {
Consider adding a `new()` function for this.

No, they could have been added, but I just... didn't do that and kept using `Instance::value` for some reason. I can add them now.

I guess you mean there could have been a single visitor that collects the fields into a `Vec<dyn Model>` and does stuff with that `Vec` (like checking field types and more). The short explanation is that I have tried, but Rust lifetimes were everywhere and it was quite unholy. I settled on this pattern and it works out fine, but I do not know of a better way to design this in Rust that plays along with Rust's type system without running into lifetime hell.
What would a hypothetical lifetime-filled implementation look like?
I was trying to redo what I did that caused a mayhem of lifetimes, but I just came up with a minor refactor that allows something like `Vec<dyn Model>` to exist and removes the complexity that `FieldTraversal` brings.

The branch https://git.m-labs.hk/M-Labs/nac3/src/branch/ndstrides-1-model-lifetime has almost all of your suggestions applied, plus the redesign. See 4e714cb53b for the exact commit that introduces the new design.

Basically, I was trying to have something like `Vec<dyn Model>`. Since `Model` itself is not object-safe, I segregated the `check_type` part and the `get_type` part out into `ModelBase`, and then everything else follows. With `Vec<dyn ModelBase>`, I can now implement `StructKind::get_struct_type` and `Struct::check_type_impl` nicely.

Should I use this instead of `FieldTraversal`?

Update on the `Into` implementations question:

An `Into` for `Model` to `BasicType` is not possible because it requires extra arguments. An `Into` for `Instance` to `BasicValue` is doable, but I am not sure if it is worth the "bloat" (in my view):

Note that trying to implement this is impossible:

So I would have to implement an `Into` impl for every possible `BasicValue` variant, e.g. `Into<PointerValue<'ctx>> for Instance<'ctx, Ptr>`, `Into<IntValue<'ctx>> for Instance<'ctx, Int>`, etc.

I suggest sticking to `.value`.

changed title from "ndstrides: Introduce models" to "ndstrides #1: Introduce models"
changed title from "ndstrides #1: Introduce models" to "ndstrides: [1] Introduce models"
force-pushed from dfbbe66154 to d1c7a8ee50
Rebased `ndstrides-1-model` onto `ndstrides`; branch `ndstrides` has been `reset --hard` to `master`. Note that the IRRT changes from `ndstrides-enhance-irrt` have been merged into `master`.
force-pushed from d1c7a8ee50 to 668cad46b4
force-pushed from 668cad46b4 to 0bc98c447d
force-pushed from 0bc98c447d to 486cf68091
force-pushed from 486cf68091 to 499efa369c
force-pushed from 499efa369c to d9c7028761
force-pushed from d9c7028761 to 870a922367
After looking at the source code more closely, I think I will just refactor this entire changeset to use `Proxy{Type,Value}` instead.

The problem I have with the design of `Model` and `Instance` is that most of the functionality is provided by the `Model` class and its implementations, which IMO is not as good as operating directly on instances (think `Int(Int32).s_extend(generator, ctx, value)` vs `value.s_extend(generator, ctx)`). There is also quite a bit of functionality overlap between `Model` and `Proxy`, to the point where I doubt whether everything can really just be implemented in terms of (augmented versions of) `ProxyType` and `ProxyValue` without introducing new classes and paradigms.

My plan is to refactor `Proxy{Type,Value}` to also support LLVM primitive types, integrate any changes that will be useful (e.g. structure-related builders and traversers), create migration deprecations to move away from `Model`, and rebase the entire changeset.

Pull request closed