core: reimplement assignment type inference and codegen + other minor changes & refactor #483

lyken · 2024-08-02T15:31:11+08:00

lyken commented

2024-08-02 15:31:11 +08:00

Fixes #482; Fixes #459 partially by allowing LHS list patterns; Required by #411.

Details of this PR are as follows:

1. Treating item assignments differently.

Currently, an assignment statement <LHS> = <RHS> is type-checked like so:

Fold <LHS>
Fold <RHS>
Unify the types of <LHS> and <RHS>

However, item assignments should be treated differently; Consider the following when supporting #411:

my_array = np_zeros((3, 4))
my_array[:] = ... # RHS could be a float, or a ndarray[float, <any ndims>]
my_array[0, 0] = ... # RHS could be a float, or a ndarray[float, <any ndims>]

This PR allows these item assignments to be programmed.

Moreover, for an item assignment expression object[key] = value, the current type inferencer actually infers the type of object[key], but this is not necessary, and it actually doesn't make sense. This is because object[key] = value is the same as object.__setitem__(key, value). This PR fixes this.

2. Supporting `'[' ... ']'` targets and starred targets.

'[' ... ']' and '(' ... ')' are apparently the same thing. They are different syntaxes to define a so-called "target_list" in https://docs.python.org/3/reference/simple_stmts.html#assignment-statements.

Starred targets are now supported, but the caveat is that their values would be tuples:

(x, *ys, z) = (1, 2, 3, 4)
# ys = (2, 3); In Python, it is [2, 3]
(x, *ys, z) = (1, 2)
# ys = (); In Python, it is []

Nested item assignments are also supported, and assignment works recursively like Python:

x, [(ys[0],), z], w = 1, ((2,), 3), 4 # ok

More examples in nac3standalone/demo/src/assignment.py.

3. Minor bug fix.

This PR fixes the following error as a by-product:

# Suppose x is undefined
(x, x[1], z) = (1, 2, 3)

nac3core currently reports:

Incompatible types: tuple[typevar237[1=typevar236], typevar236, typevar238] and tuple[int32, int32, int32] at src/_test.py:15:6

With this PR:

type error at identifier `x` (cannot get type of x) at src/_test.py:13:9

This is probably due to a mistake in the Inferencer on when LHS patterns should be registered into defined_identifiers. This PR fixes it.

Other notes in this PR

Right now in this PR, for a target list assignment, its corresponding RHS type must be a tuple. Meaning:

(x, y) = (1, 2) # ok
(x, y) = [1, 2] # no. The type inferencer forces RHS to have type `TypeEnum::TTuple { ty: vec![?, ?] }`

The reason for why only tuple types are allowed is because it is easy to implement and tuples have very rich typechecker types. For lists, the implementation might be very complicated with very specialized code mixed into the type inferencer.

Other comparisons between the current `nac3core` and `nac3core` with this PR.

def f1():
    [y, z] = [1, 2, 3, 4, 5]
    # Current: Error, y (and z) are undefined
    # With this PR: Error, RHS must be a tuple with 2 elements.

def f2():
    [y, z] = (1, 2, 3, 4, 5)
    # Current: LHS (a tuple) and RHS (a list) type mismatch
    # With this PR: Error, RHS must be a tuple with 2 elements.

def f3():
    y = 99 # define
    z = 100
    [y, z] = [1, 2, 3, 4, 5]
    # Current: `nac3core` crashes because of a `unreachable!()` in codegen/stmt.rs
    # With this PR: Error, RHS must be a tuple with 2 elements.

Fixes https://git.m-labs.hk/M-Labs/nac3/issues/482; Fixes https://git.m-labs.hk/M-Labs/nac3/issues/459 partially by allowing LHS list patterns; Required by https://git.m-labs.hk/M-Labs/nac3/issues/411. Details of this PR are as follows: ### 1. Treating item assignments differently. Currently, an assignment statement `<LHS> = <RHS>` is type-checked like so: 1. Fold `<LHS>` 2. Fold `<RHS>` 3. Unify the types of `<LHS>` and `<RHS>` However, item assignments should be treated differently; Consider the following when supporting https://git.m-labs.hk/M-Labs/nac3/issues/411: ```python my_array = np_zeros((3, 4)) my_array[:] = ... # RHS could be a float, or a ndarray[float, <any ndims>] my_array[0, 0] = ... # RHS could be a float, or a ndarray[float, <any ndims>] ``` This PR allows these item assignments to be programmed. Moreover, for an item assignment expression `object[key] = value`, the current type inferencer actually infers the type of `object[key]`, but this is not necessary, and it actually doesn't make sense. This is because `object[key] = value` is the same as `object.__setitem__(key, value)`. This PR fixes this. ### 2. Supporting `'[' ... ']'` targets and starred targets. `'[' ... ']'` and `'(' ... ')'` are apparently the same thing. They are different syntaxes to define a so-called `"target_list"` in https://docs.python.org/3/reference/simple_stmts.html#assignment-statements. Starred targets are now supported, but the caveat is that their values would be tuples: ``` (x, *ys, z) = (1, 2, 3, 4) # ys = (2, 3); In Python, it is [2, 3] (x, *ys, z) = (1, 2) # ys = (); In Python, it is [] ``` Nested item assignments are also supported, and assignment works recursively like Python: ``` x, [(ys[0],), z], w = 1, ((2,), 3), 4 # ok ``` More examples in `nac3standalone/demo/src/assignment.py`. ### 3. Minor bug fix. This PR fixes the following error as a by-product: ```python # Suppose x is undefined (x, x[1], z) = (1, 2, 3) ``` `nac3core` currently reports: ```log Incompatible types: tuple[typevar237[1=typevar236], typevar236, typevar238] and tuple[int32, int32, int32] at src/_test.py:15:6 ``` With this PR: ```log type error at identifier `x` (cannot get type of x) at src/_test.py:13:9 ``` This is probably due to a mistake in the `Inferencer` on when LHS patterns should be registered into `defined_identifiers`. This PR fixes it. ### Other notes in this PR Right now in this PR, for a target list assignment, its corresponding RHS type must be a tuple. Meaning: ``` (x, y) = (1, 2) # ok (x, y) = [1, 2] # no. The type inferencer forces RHS to have type `TypeEnum::TTuple { ty: vec![?, ?] }` ``` The reason for why only tuple types are allowed is because it is easy to implement and tuples have very rich typechecker types. For lists, the implementation might be very complicated with very specialized code mixed into the type inferencer. ### Other comparisons between the current `nac3core` and `nac3core` with this PR. ``` def f1(): [y, z] = [1, 2, 3, 4, 5] # Current: Error, y (and z) are undefined # With this PR: Error, RHS must be a tuple with 2 elements. def f2(): [y, z] = (1, 2, 3, 4, 5) # Current: LHS (a tuple) and RHS (a list) type mismatch # With this PR: Error, RHS must be a tuple with 2 elements. def f3(): y = 99 # define z = 100 [y, z] = [1, 2, 3, 4, 5] # Current: `nac3core` crashes because of a `unreachable!()` in codegen/stmt.rs # With this PR: Error, RHS must be a tuple with 2 elements. ```

derppening was assigned by lyken

2024-08-02 15:31:11 +08:00

derppening was unassigned by lyken

2024-08-02 15:45:22 +08:00

lyken requested review from derppening 2024-08-02 15:45:24 +08:00

lyken added a new dependency 2024-08-02 15:53:06 +08:00

#411 Implement subscript-assignment for NDArrays

derppening reviewed 2024-08-02 16:37:26 +08:00

derppening left a comment

Nits for now, 34661ffa7f needs more time for review.

nac3core/src/lib.rs Outdated

						
				@ -18,2 +18,3 @@

				    clippy::too_many_lines,

				    clippy::wildcard_imports

				    clippy::wildcard_imports,

				    clippy::iter_nth_zero

derppening commented

2024-08-02 16:33:04 +08:00

I don't see why this is necessary. Just use .next() as one would.

I don't see why this is necessary. Just use `.next()` as one would.

lyken commented

2024-08-02 16:39:16 +08:00

iter_type_vars() sometimes uses nth(1) and nth(2), etc. Isn't it weird to suddenly use .next() instead of nth(0)? But this is not too important. I can revert that if you want.

`iter_type_vars()` sometimes uses `nth(1)` and `nth(2)`, etc. Isn't it weird to suddenly use `.next()` instead of `nth(0)`? But this is not too important. I can revert that if you want.

derppening commented

2024-08-05 10:56:55 +08:00

Clippy's justification is here.

I wouldn't fight Clippy about this.

Clippy's justification is [here](https://rust-lang.github.io/rust-clippy/master/index.html#iter_nth_zero). I wouldn't fight Clippy about this.

lyken commented

2024-08-05 10:57:48 +08:00

Ok, let's not do this.

nac3core/src/typecheck/type_inferencer/mod.rs Outdated

						
				@ -12,6 +11,7 @@ use super::{

				        RecordField, RecordKey, Type, TypeEnum, TypeVar, Unifier, VarMap,

				    },

				};

				use crate::typecheck::typedef::Mapping;

derppening commented

2024-08-02 16:35:20 +08:00

Fold this into the use block in the next line.

Fold this into the `use` block in the next line.

derppening marked this conversation as resolved

lyken force-pushed setitem-assign from dc1c4aff0e to 070655669b

2024-08-04 15:54:27 +08:00

Compare

lyken changed title from ~~core: distinguish between setitem and getitem in assignments + other minor changes & refactor~~ to WIP: core: distinguish between setitem and getitem in assignments + other minor changes & refactor

2024-08-04 16:05:47 +08:00

lyken commented

2024-08-04 16:06:59 +08:00

Marking it as WIP, I forgot to account for how annoying asterisks on LHS could be.

(x, y, *zs) = (1, 2, 3, 4, 5)

The current implementation is not flexible enough.

Marking it as WIP, I forgot to account for how annoying asterisks on LHS could be. ```python (x, y, *zs) = (1, 2, 3, 4, 5) ``` The current implementation is not flexible enough.

lyken force-pushed setitem-assign from 070655669b to 813705819e

2024-08-04 21:58:27 +08:00

Compare

lyken changed title from ~~WIP: core: distinguish between setitem and getitem in assignments + other minor changes & refactor~~ to WIP: core: reimplement assignment type inference and codegen + other minor changes & refactor

2024-08-04 21:58:44 +08:00

lyken changed title from ~~WIP: core: reimplement assignment type inference and codegen + other minor changes & refactor~~ to core: reimplement assignment type inference and codegen + other minor changes & refactor

2024-08-04 21:58:51 +08:00

lyken commented

2024-08-04 21:59:54 +08:00

Revised PR. The PR message is updated.

lyken force-pushed setitem-assign from 813705819e to 4b927fa62f

2024-08-05 11:07:40 +08:00

Compare

lyken commented

2024-08-05 11:08:17 +08:00

Amended branch to not use clippy::iter_nth_zero.

Amended branch to not use `clippy::iter_nth_zero`.

lyken added 2 commits 2024-08-05 11:20:02 +08:00