diff --git a/toy-impl/README.md b/toy-impl/README.md index 3a8ecc0..dabd3d3 100644 --- a/toy-impl/README.md +++ b/toy-impl/README.md @@ -21,8 +21,89 @@ for simplicity reasons: * method override check modulo variable renaming. * more complicated type guard +## Files +All files named `test_xxx` are used for inspecting the result of algorithms, and +can be ignored for now. + +Here is the list of files and their purpose: +* `helper.py`: mainly for the error definition. +* `inference.py`: type-check for function invocation. +* `inheritance.py`: perform method and field inheritance. +* `main.py`: main script for checking an entire python script. +* `parse_expr.py`: type-check for expressions in python AST. +* `parse_stmt.py`: type-check for statements in python AST. +* `primitives.py`: definition of primitives, operations, and built-in functions. +* `top_level.py`: gather class, function and type variable definitions from + python AST. +* `type_def.py`: python class for various types. ## Type Check Implementation +The crucial constraint and assumption in our system is that, every +(sub-)expressions must have their types fully determined, and cannot depend on +statements/expressions after them. Hence, in a function call, every arguments +are well typed. We only have to determine the substitution of type variables +present in the function type signature that makes the type agree. + +There is a tiny difference between unification and our implementation. In our +implementation, the substitution would only be applied to the type signature of +the target function call but not the variables present in the function call. +This way we don't have to make the type variables in the callee fresh before +doing unification. + +Consider the following example: + +```py +X = TypeVar('X') + +def head(a: list[X]) -> X: + return a[0] + +head([1, 2, 3]) +``` + +In this example, the expression `[1, 2, 3]` has type `list[int32]`, so the +algorithm tries to fit `(list[int32])` into `(list[X])`, giving a substitution +`X -> int32`. + +Substitution can also substitute variables into another variable. Consider the +following example: + +``` +X = TypeVar('X') +Y = TypeVar('Y', int32, int64) + +def head(a: list[X]) -> X: + return a[0] + +def sum_of_heads(a: list[Y], b: list[Y]) -> Y: + return head(a) + head(b) +``` + +In this example, `a` has type `list[Y]`, so the algorithm would give a +substitution `X -> Y` for the call `head(a)`, and similarly for `b`. +As `Y` can only range over `int32` and `int64`, in the two valuations of `Y`, +the return statement would have type +* `int32 + int32 : int32 : Y` under `Y -> int32`, and +* `int64 + int64 : int64 : Y` under `Y -> int64`. +So the function is well typed. + +Note that variables are fresh in every invocation. Consider the following +example: +``` +I = TypeVar('I', int32, list[int32]) + +def add(a: int32, b: I) -> int32: + if type(b) == int32: + return a + b + else: + for x in b: + a = add(a, x) + return a + +add(1, [1, 2, 3]) +``` +This one should type check (bug now). `I -> list[int32]` only affects 1 call, +and the recursion inside could substitute `I -> int32`. ## Variable Scoping