core: reimplement assignment type inference and codegen + other minor changes & refactor #483

Merged
sb10q merged 4 commits from setitem-assign into master 2024-08-05 19:31:06 +08:00
Collaborator

Fixes #482; Fixes #459 partially by allowing LHS list patterns; Required by #411.

Details of this PR are as follows:

1. Treating item assignments differently.

Currently, an assignment statement <LHS> = <RHS> is type-checked like so:

  1. Fold <LHS>
  2. Fold <RHS>
  3. Unify the types of <LHS> and <RHS>

However, item assignments should be treated differently; Consider the following when supporting #411:

my_array = np_zeros((3, 4))
my_array[:] = ... # RHS could be a float, or a ndarray[float, <any ndims>]
my_array[0, 0] = ... # RHS could be a float, or a ndarray[float, <any ndims>]

This PR allows these item assignments to be programmed.

Moreover, for an item assignment expression object[key] = value, the current type inferencer actually infers the type of object[key], but this is not necessary, and it actually doesn't make sense. This is because object[key] = value is the same as object.__setitem__(key, value). This PR fixes this.

2. Supporting '[' ... ']' targets and starred targets.

'[' ... ']' and '(' ... ')' are apparently the same thing. They are different syntaxes to define a so-called "target_list" in https://docs.python.org/3/reference/simple_stmts.html#assignment-statements.

Starred targets are now supported, but the caveat is that their values would be tuples:

(x, *ys, z) = (1, 2, 3, 4)
# ys = (2, 3); In Python, it is [2, 3]
(x, *ys, z) = (1, 2)
# ys = (); In Python, it is []

Nested item assignments are also supported, and assignment works recursively like Python:

x, [(ys[0],), z], w = 1, ((2,), 3), 4 # ok

More examples in nac3standalone/demo/src/assignment.py.

3. Minor bug fix.

This PR fixes the following error as a by-product:

# Suppose x is undefined
(x, x[1], z) = (1, 2, 3)

nac3core currently reports:

Incompatible types: tuple[typevar237[1=typevar236], typevar236, typevar238] and tuple[int32, int32, int32] at src/_test.py:15:6

With this PR:

type error at identifier `x` (cannot get type of x) at src/_test.py:13:9

This is probably due to a mistake in the Inferencer on when LHS patterns should be registered into defined_identifiers. This PR fixes it.

Other notes in this PR

Right now in this PR, for a target list assignment, its corresponding RHS type must be a tuple. Meaning:

(x, y) = (1, 2) # ok
(x, y) = [1, 2] # no. The type inferencer forces RHS to have type `TypeEnum::TTuple { ty: vec![?, ?] }`

The reason for why only tuple types are allowed is because it is easy to implement and tuples have very rich typechecker types. For lists, the implementation might be very complicated with very specialized code mixed into the type inferencer.

Other comparisons between the current nac3core and nac3core with this PR.

def f1():
    [y, z] = [1, 2, 3, 4, 5]
    # Current: Error, y (and z) are undefined
    # With this PR: Error, RHS must be a tuple with 2 elements.

def f2():
    [y, z] = (1, 2, 3, 4, 5)
    # Current: LHS (a tuple) and RHS (a list) type mismatch
    # With this PR: Error, RHS must be a tuple with 2 elements.

def f3():
    y = 99 # define
    z = 100
    [y, z] = [1, 2, 3, 4, 5]
    # Current: `nac3core` crashes because of a `unreachable!()` in codegen/stmt.rs
    # With this PR: Error, RHS must be a tuple with 2 elements.
Fixes https://git.m-labs.hk/M-Labs/nac3/issues/482; Fixes https://git.m-labs.hk/M-Labs/nac3/issues/459 partially by allowing LHS list patterns; Required by https://git.m-labs.hk/M-Labs/nac3/issues/411. Details of this PR are as follows: ### 1. Treating item assignments differently. Currently, an assignment statement `<LHS> = <RHS>` is type-checked like so: 1. Fold `<LHS>` 2. Fold `<RHS>` 3. Unify the types of `<LHS>` and `<RHS>` However, item assignments should be treated differently; Consider the following when supporting https://git.m-labs.hk/M-Labs/nac3/issues/411: ```python my_array = np_zeros((3, 4)) my_array[:] = ... # RHS could be a float, or a ndarray[float, <any ndims>] my_array[0, 0] = ... # RHS could be a float, or a ndarray[float, <any ndims>] ``` This PR allows these item assignments to be programmed. Moreover, for an item assignment expression `object[key] = value`, the current type inferencer actually infers the type of `object[key]`, but this is not necessary, and it actually doesn't make sense. This is because `object[key] = value` is the same as `object.__setitem__(key, value)`. This PR fixes this. ### 2. Supporting `'[' ... ']'` targets and starred targets. `'[' ... ']'` and `'(' ... ')'` are apparently the same thing. They are different syntaxes to define a so-called `"target_list"` in https://docs.python.org/3/reference/simple_stmts.html#assignment-statements. Starred targets are now supported, but the caveat is that their values would be tuples: ``` (x, *ys, z) = (1, 2, 3, 4) # ys = (2, 3); In Python, it is [2, 3] (x, *ys, z) = (1, 2) # ys = (); In Python, it is [] ``` Nested item assignments are also supported, and assignment works recursively like Python: ``` x, [(ys[0],), z], w = 1, ((2,), 3), 4 # ok ``` More examples in `nac3standalone/demo/src/assignment.py`. ### 3. Minor bug fix. This PR fixes the following error as a by-product: ```python # Suppose x is undefined (x, x[1], z) = (1, 2, 3) ``` `nac3core` currently reports: ```log Incompatible types: tuple[typevar237[1=typevar236], typevar236, typevar238] and tuple[int32, int32, int32] at src/_test.py:15:6 ``` With this PR: ```log type error at identifier `x` (cannot get type of x) at src/_test.py:13:9 ``` This is probably due to a mistake in the `Inferencer` on when LHS patterns should be registered into `defined_identifiers`. This PR fixes it. ### Other notes in this PR Right now in this PR, for a target list assignment, its corresponding RHS type must be a tuple. Meaning: ``` (x, y) = (1, 2) # ok (x, y) = [1, 2] # no. The type inferencer forces RHS to have type `TypeEnum::TTuple { ty: vec![?, ?] }` ``` The reason for why only tuple types are allowed is because it is easy to implement and tuples have very rich typechecker types. For lists, the implementation might be very complicated with very specialized code mixed into the type inferencer. ### Other comparisons between the current `nac3core` and `nac3core` with this PR. ``` def f1(): [y, z] = [1, 2, 3, 4, 5] # Current: Error, y (and z) are undefined # With this PR: Error, RHS must be a tuple with 2 elements. def f2(): [y, z] = (1, 2, 3, 4, 5) # Current: LHS (a tuple) and RHS (a list) type mismatch # With this PR: Error, RHS must be a tuple with 2 elements. def f3(): y = 99 # define z = 100 [y, z] = [1, 2, 3, 4, 5] # Current: `nac3core` crashes because of a `unreachable!()` in codegen/stmt.rs # With this PR: Error, RHS must be a tuple with 2 elements. ```
derppening was assigned by lyken 2024-08-02 15:31:11 +08:00
derppening was unassigned by lyken 2024-08-02 15:45:22 +08:00
lyken requested review from derppening 2024-08-02 15:45:24 +08:00
lyken added a new dependency 2024-08-02 15:53:06 +08:00
derppening reviewed 2024-08-02 16:37:26 +08:00
derppening left a comment
Collaborator

Nits for now, 34661ffa7f needs more time for review.

Nits for now, 34661ffa7f needs more time for review.
@ -18,2 +18,3 @@
clippy::too_many_lines,
clippy::wildcard_imports
clippy::wildcard_imports,
clippy::iter_nth_zero
Collaborator

I don't see why this is necessary. Just use .next() as one would.

I don't see why this is necessary. Just use `.next()` as one would.
Author
Collaborator

iter_type_vars() sometimes uses nth(1) and nth(2), etc. Isn't it weird to suddenly use .next() instead of nth(0)? But this is not too important. I can revert that if you want.

`iter_type_vars()` sometimes uses `nth(1)` and `nth(2)`, etc. Isn't it weird to suddenly use `.next()` instead of `nth(0)`? But this is not too important. I can revert that if you want.
Collaborator

Clippy's justification is here.

I wouldn't fight Clippy about this.

Clippy's justification is [here](https://rust-lang.github.io/rust-clippy/master/index.html#iter_nth_zero). I wouldn't fight Clippy about this.
Author
Collaborator

Ok, let's not do this.

Ok, let's not do this.
@ -12,6 +11,7 @@ use super::{
RecordField, RecordKey, Type, TypeEnum, TypeVar, Unifier, VarMap,
},
};
use crate::typecheck::typedef::Mapping;
Collaborator

Fold this into the use block in the next line.

Fold this into the `use` block in the next line.
derppening marked this conversation as resolved
lyken force-pushed setitem-assign from dc1c4aff0e to 070655669b 2024-08-04 15:54:27 +08:00 Compare
lyken changed title from core: distinguish between setitem and getitem in assignments + other minor changes & refactor to WIP: core: distinguish between setitem and getitem in assignments + other minor changes & refactor 2024-08-04 16:05:47 +08:00
Author
Collaborator

Marking it as WIP, I forgot to account for how annoying asterisks on LHS could be.

(x, y, *zs) = (1, 2, 3, 4, 5)

The current implementation is not flexible enough.

Marking it as WIP, I forgot to account for how annoying asterisks on LHS could be. ```python (x, y, *zs) = (1, 2, 3, 4, 5) ``` The current implementation is not flexible enough.
lyken force-pushed setitem-assign from 070655669b to 813705819e 2024-08-04 21:58:27 +08:00 Compare
lyken changed title from WIP: core: distinguish between setitem and getitem in assignments + other minor changes & refactor to WIP: core: reimplement assignment type inference and codegen + other minor changes & refactor 2024-08-04 21:58:44 +08:00
lyken changed title from WIP: core: reimplement assignment type inference and codegen + other minor changes & refactor to core: reimplement assignment type inference and codegen + other minor changes & refactor 2024-08-04 21:58:51 +08:00
Author
Collaborator

Revised PR. The PR message is updated.

Revised PR. The PR message is updated.
lyken force-pushed setitem-assign from 813705819e to 4b927fa62f 2024-08-05 11:07:40 +08:00 Compare
Author
Collaborator

Amended branch to not use clippy::iter_nth_zero.

Amended branch to not use `clippy::iter_nth_zero`.
lyken added 2 commits 2024-08-05 11:20:02 +08:00
Collaborator

No other comments as long as unit tests and standalone tests pass.

No other comments as long as unit tests and standalone tests pass.
derppening requested review from sb10q 2024-08-05 11:44:27 +08:00
sb10q merged commit 3a8c385e01 into master 2024-08-05 19:31:04 +08:00
sb10q deleted branch setitem-assign 2024-08-05 19:32:51 +08:00
Sign in to join this conversation.
No reviewers
No Milestone
No Assignees
2 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Reference: M-Labs/nac3#483
No description provided.