For loop type inference fix #128

Merged
sb10q merged 2 commits from for_loop_type_fix into master 2021-12-19 18:01:49 +08:00
Collaborator

I encounter the bug when testing my fix for len.

Now it seems that with range, the nested for loop type check will be broken:

def run() -> int32:
    for r in [range(3)]:
        for i in r:
            pass
    return 0

will give:

Cannot unify range with list[var26] at line 36 column 14

And if we mention i to be int:

def run() -> int32:
    for r in [range(3)]:
        for i in r:
            output_int(i)
    return 0

it will still give:

Cannot unify range with list[int32] at line 36 column 14

As fas as I understand it, it is due to the sequence we do the type inference: we first go to the for body and then check whether our iterator and target are of compatible types. So in this case of nested for loop, r will be a typevar in the first for loop body, and previously we wrongly assume that if r is not of type range, then it is of type list, so we get the error.

I am not entirely sure if this way would be appropriate. I thought about some other ways but does not seem to work, If we constraints r to be something like var1(range = [`range`, list[var2]]), there is no way we can later get the type of i if r turns out to be of type range in the outer for loop. So I think we need to change the order we do type inference on for loop to first constranint the type of iter and target (similar to assignment), and then type check the body... (BTW I think this might also become a related potential problem if we later add support to allow typevars to be inside typevar constraints? like how do we unify var0[var1, var2] with int?)

I also added some concrete check when I think should be missing previously, which I think might cause some typevars to leak to the codegen phase and cause panic.

I encounter the bug when testing my fix for `len`. Now it seems that with `range`, the nested for loop type check will be broken: ```python def run() -> int32: for r in [range(3)]: for i in r: pass return 0 ``` will give: ``` Cannot unify range with list[var26] at line 36 column 14 ``` And if we mention `i` to be int: ```python def run() -> int32: for r in [range(3)]: for i in r: output_int(i) return 0 ``` it will still give: ``` Cannot unify range with list[int32] at line 36 column 14 ``` --- As fas as I understand it, it is due to the sequence we do the type inference: we first go to the `for` body and then check whether our iterator and target are of compatible types. So in this case of nested for loop, `r` will be a typevar in the first for loop body, and previously we wrongly assume that if `r` is not of type `range`, then it is of type list, so we get the error. I am not entirely sure if this way would be appropriate. I thought about some other ways but does not seem to work, If we constraints `r` to be something like ```var1(range = [`range`, list[var2]])```, there is no way we can later get the type of `i` if `r` turns out to be of type `range` in the outer for loop. So I think we need to change the order we do type inference on for loop to first constranint the type of iter and target (similar to assignment), and then type check the body... (BTW I think this might also become a related potential problem if we later add support to allow typevars to be inside typevar constraints? like how do we unify `var0[var1, var2]` with `int`?) I also added some concrete check when I think should be missing previously, which I think might cause some typevars to leak to the codegen phase and cause panic.
ychenfo added 2 commits 2021-12-12 06:30:15 +08:00
sb10q requested review from pca006132 2021-12-12 10:56:34 +08:00
Poster
Collaborator

I think for some more time and the major problem seems to be the unification between the type var var1(range = [list[var2], range])(list of something or range), and the type list[some concrete type T] or range

  • when list[some concrete type T] is the case, how can var2 get update with some concrete type T?
  • when range is the case, how to get var2 to be int32?

Maybe I did not think about it in a right way... how the order of doing things matters here as the unification process is supposed to be undirected anyway? The current commits workaround it when the type of the iter of the outmost loop is known, but might fail when that type is still a type var... I am not sure whether I missed anything but it seems that it would not be a type var?

I think for some more time and the major problem seems to be the unification between the type var `var1(range = [list[var2], range])`(list of something or range), and the type `list[some concrete type T]` or `range` - when `list[some concrete type T]` is the case, how can `var2` get update with `some concrete type T`? - when `range` is the case, how to get `var2` to be `int32`? Maybe I did not think about it in a right way... how the order of doing things matters here as the unification process is supposed to be undirected anyway? The current commits workaround it when the type of the iter of the outmost loop is known, but might fail when that type is still a type var... I am not sure whether I missed anything but it seems that it would not be a type var?

Actually, the fundamental problem is due to how we treat iterators. We have an inference rule which states that for _ in expr implies that expr: list[T]. The range type is something I later hacked into the type system without much consideration and would break in some cases (by break I mean report a type error when it is correct, but not inferring an incorrect type).

A proper fix would be to change the type system to allow for iterator type, which can be range type, list type or other user defined iterators. But I think if this patch can make the above example compile, it should be fine for the majority of the cases.

Actually, the fundamental problem is due to how we treat iterators. We have an inference rule which states that `for _ in expr` implies that `expr: list[T]`. The range type is something I later hacked into the type system without much consideration and would break in some cases (by break I mean report a type error when it is correct, but not inferring an incorrect type). A proper fix would be to change the type system to allow for iterator type, which can be range type, list type or other user defined iterators. But I think if this patch can make the above example compile, it should be fine for the majority of the cases.
Poster
Collaborator

A proper fix would be to change the type system to allow for iterator type, which can be range type, list type or other user defined iterators.

Ah I get it, yes and I think for this issue (#137) we would need to add iterator type to the existing type system anyway. For now this patch should be working.

> A proper fix would be to change the type system to allow for iterator type, which can be range type, list type or other user defined iterators. Ah I get it, yes and I think for this issue (#137) we would need to add iterator type to the existing type system anyway. For now this patch should be working.
pca006132 approved these changes 2021-12-19 16:25:58 +08:00
sb10q merged commit cb450372d6 into master 2021-12-19 18:01:49 +08:00
Sign in to join this conversation.
No reviewers
No Milestone
No Assignees
2 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: M-Labs/nac3#128
There is no content yet.