Type Variable Constraints #25

Open
opened 2021-07-30 15:21:18 +08:00 by pca006132 · 9 comments
Collaborator

Consider the following code:

V = TypeVar('V', int, bool)
V1 = TypeVar('V1', list[V], int)
V2 = TypeVar('V2', list[V], int)


def foo(a: V1, b: V2):
    pass

Can foo take (list[int], list[bool])? Do we treat type variables within constraints as variables or just a list of possible types? This would affect how we do type checking for functions.

Consider the following code: ```python V = TypeVar('V', int, bool) V1 = TypeVar('V1', list[V], int) V2 = TypeVar('V2', list[V], int) def foo(a: V1, b: V2): pass ``` Can foo take `(list[int], list[bool])`? Do we treat type variables within constraints as variables or just a list of possible types? This would affect how we do type checking for functions.
pca006132 changed title from Type variable semantic to Type Variable Constraints 2021-07-30 15:22:07 +08:00
Author
Collaborator

This is forbidden in pyright but I cannot find any PEP documentation about it.

Our current implementation would accept (list[int], list[bool]).

This is forbidden in pyright but I cannot find any PEP documentation about it. Our current implementation would accept `(list[int], list[bool])`.
Collaborator

Personally I think to consider them as just a list of possible types is more natural to me.

Personally I think to consider them as just a list of possible types is more natural to me.
Collaborator

@pca006132, the meaning of TypeVar('V', int, bool) is a type V which can be either unified with int or with bool, and with nothing else. So of course (list[int], list[bool]) shoudl be rejected. Otherwise, we could not use such variables to do things like constructor overloading.

@pca006132, the meaning of `TypeVar('V', int, bool)` is a type `V` which can be either unified with `int` or with `bool`, and with nothing else. So of course `(list[int], list[bool])` shoudl be rejected. Otherwise, we could not use such variables to do things like constructor overloading.
Owner

@LPTK the issue at hand is with the function foo, which is annotated with V1 and V2 and not V.

Unless I'm mistaken, this is basically a matter of taste and I'd vote for a list of possible types as well, e.g. (list[int], list[bool]) would be accepted.

@LPTK the issue at hand is with the function ``foo``, which is annotated with ``V1`` and ``V2`` and not ``V``. <s>Unless I'm mistaken, this is basically a matter of taste and I'd vote for a list of possible types as well, e.g. ``(list[int], list[bool])`` would be accepted.</s>
Owner

@pca006132 but if it's a list of types and not constraints, why duplicate the V1 definition? You could just have foo(a: V1, b: V1) to have the same behavior if I understand it right.

@pca006132 but if it's a list of types and not constraints, why duplicate the ``V1`` definition? You could just have `` foo(a: V1, b: V1)`` to have the same behavior if I understand it right.
Collaborator

My understanding of Python's (admittedly murky) quantification semantics is that all type variables in a type signature are quantified in a forall at the top-level of the signature, in a form called "prenex polymorphism". I don't see why this should not include type variables that are reached indirectly, through type variable bounds.

So, to use a clearer Haskell-inspired syntax, I understand the signature of foo as:

forall V V1 V2.
  (V  {int, bool}, V1  {list[V], int}, V2  {list[V], int}) =>
  (a: V1, b: V2) -> ()

A semantics where foo can take (list[int], list[bool]) would require a different interpretation. Either you'd say that the quantification of V happens at the level of V1 and V2's bounds, as in exists V. (V ∈ {int, bool}) => V1 ∈ {list[V], int}, but that seems fairly complictaed. Or you could say that all variables in bound specifications are "refreshed" and are thus distinct across individual bound specifications.

Both of these seem more complicated than my initial interpretation, which is also visibly the one taken by pyright. Also note that in the alternative interpretations, if you do want to share type variables between bound specifications, you're out of luck.

My understanding of Python's (admittedly murky) quantification semantics is that all type variables in a type signature are quantified in a `forall` at the top-level of the signature, in a form called "prenex polymorphism". I don't see why this should not include type variables that are reached indirectly, through type variable bounds. So, to use a clearer Haskell-inspired syntax, I understand the signature of `foo` as: ```haskell forall V V1 V2. (V ∈ {int, bool}, V1 ∈ {list[V], int}, V2 ∈ {list[V], int}) => (a: V1, b: V2) -> () ``` A semantics where `foo` can take `(list[int], list[bool])` would require a different interpretation. Either you'd say that the quantification of `V` happens at the level of `V1` and `V2`'s bounds, as in `exists V. (V ∈ {int, bool}) => V1 ∈ {list[V], int}`, but that seems fairly complictaed. Or you could say that all variables in bound specifications are "refreshed" and are thus distinct across individual bound specifications. Both of these seem more complicated than my initial interpretation, which is also visibly the one taken by pyright. Also note that in the alternative interpretations, if you _do_ want to share type variables between bound specifications, you're out of luck.
Author
Collaborator

Or you could say that all variables in bound specifications are "refreshed" and are thus distinct across individual bound specifications.

Yes I'm thinking if we should do this.

which is also visibly the one taken by pyright.

No, pyright would just disallow the use of type variables within the bound. I'm also thinking about this, but this would not allow us to have bounds like list[A] where A is a type variable (although this may not be very useful).

Maybe we can just take the simpler approach by disallowing the use of type variables within bounds for now?

> Or you could say that all variables in bound specifications are "refreshed" and are thus distinct across individual bound specifications. Yes I'm thinking if we should do this. > which is also visibly the one taken by pyright. No, pyright would just disallow the use of type variables within the bound. I'm also thinking about this, but this would not allow us to have bounds like `list[A]` where `A` is a type variable (although this may not be very useful). Maybe we can just take the simpler approach by disallowing the use of type variables within bounds for now?
Collaborator

this would not allow us to have bounds like list[A] where A is a type variable (although this may not be very useful)

It can be pretty useful. There is no particular reason to forbid type variables in bounds. Although the use-cases are probably not very widespread, since pyright does not even support them.

Maybe we can just take the simpler approach by disallowing the use of type variables within bounds for now?

Yes, for now it's certainly fine to disallow them at first.

> this would not allow us to have bounds like list[A] where A is a type variable (although this may not be very useful) It can be pretty useful. There is no particular reason to forbid type variables in bounds. Although the use-cases are probably not very widespread, since pyright does not even support them. > Maybe we can just take the simpler approach by disallowing the use of type variables within bounds for now? Yes, for now it's certainly fine to disallow them at first.
Collaborator

For the record, here's one legitimately useful example of type variables in bounds. Suppose we want to write an append function that works on either strings or lists. Its type could be:

A = TypeVar('A')
App = TypeVar('App', String, List[A])

append(xs: App, ys: App): App

We wouldn't want that appending two lists return a list of any type! Instead, we'd want the element types of both the argument lists and the result list to be the same. So the refreshing semantics seems pretty inadequate.

For the record, here's one legitimately useful example of type variables in bounds. Suppose we want to write an `append` function that works on either strings or lists. Its type could be: ```python A = TypeVar('A') App = TypeVar('App', String, List[A]) append(xs: App, ys: App): App ``` We wouldn't want that appending two lists return a list of _any_ type! Instead, we'd want the element types of both the argument lists and the result list to be the same. So the refreshing semantics seems pretty inadequate.
Sign in to join this conversation.
No Label
No Milestone
No Assignees
4 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: M-Labs/nac3-spec#25
No description provided.