Updated specification #8
157
README.md
|
@ -5,23 +5,41 @@ Specification and discussions about language design.
|
||||||
A toy implementation is in [`toy-impl`](./toy-impl), requires python 3.9.
|
A toy implementation is in [`toy-impl`](./toy-impl), requires python 3.9.
|
||||||
|
|
||||||
|
|
||||||
## Referencing Python Variables
|
## Referencing Host Variables from Kernel
|
||||||
The kernel is allowed to read host Python variables, but has to specify with
|
Host variable to be accessed must be declared as `global` in the kernel
|
||||||
pca006132 marked this conversation as resolved
|
|||||||
`global` before referencing them. This is to simplify and speed-up
|
function. This is to simplify and speed-up implementation, and also warn the
|
||||||
implementation, and also warn the user about the variable being global. (prevent
|
user about the variable being global. (prevent calling the interpreter many
|
||||||
calling the interpreter many times during compilation if there are many
|
times during compilation if there are many references to host variables)
|
||||||
references to host variables)
|
|
||||||
|
|
||||||
Only primitive types and tuple/list of primitive types are allowed.
|
Kernel cannot modify host variables, this would be checked by the compiler.
|
||||||
|
Value that can be observed by the kernel would be frozen once the kernel has
|
||||||
|
been compiled, subsequence modification within the host would not affect the
|
||||||
sb10q
commented
Ah, this would cause another problem like those in #5 Ah, this would cause another problem like those in https://git.m-labs.hk/M-Labs/nac3-spec/issues/5
But deleting globals or trying to make them immutable in the interpreter sounds messy, and a global being modified in a RPC seems to be a niche case.
So I'm ok with this current proposal, the behavior simply should be documented.
pca006132
commented
Yes, I was thinking about this either, but it seems like there is no easy way around this. Yes, I was thinking about this either, but it seems like there is no easy way around this.
lriesebos
commented
Why don't we force users to type globals used in kernels as Why don't we force users to type globals used in kernels as `Invariant[]`? The semantics are the same, making it clear to the user how this global variable can be used in kernels.
sb10q
commented
It cannot be enforced in the interpreter. It cannot be enforced in the interpreter.
lriesebos
commented
I understand, it would just potentially make it more clear to the user that the global variable is immutable in the kernel. But I am fine with either! I understand, it would just potentially make it more clear to the user that the global variable is immutable in the kernel. But I am fine with either!
pca006132
commented
I think the types that we allow are basically immutable? Int, tuples, string, etc. I think the types that we allow are basically immutable? Int, tuples, string, etc.
Even if they are decorated with `Invariant[T]`, the user can just re-assign them, so hard to enforce.
sb10q
commented
For classes there are special For classes there are special ``__getattr__`` and ``__setattr__`` methods that we can use to hack the interpreter and enforce ``KernelImmutable`` restrictions.
There is no equivalent for modules/globals.
lriesebos
commented
Yes, @sb10q showed a POC for immutable instance variables in #5, and I agree this does not work for global variables. For me, the reasoning was to type global variables as Yes, @sb10q showed a POC for immutable instance variables in #5, and I agree this does not work for global variables.
For me, the reasoning was to type global variables as `KernelImmutable[]` because they are per definition kernel immutable. That would align their type annotation with instance variables. Though since we can not enforce that at runtime, globals will anyway be a special case that needs documentation. So as mentioned before, I am also fine with not typing it `KernelImmutable[]`.
|
|||||||
|
kernel.
|
||||||
|
|
||||||
The value would be substituted at compile time, subsequent modification in the
|
Only types supported in the kernel can be referenced.
|
||||||
host would not be known by the kernel.
|
|
||||||
|
|
||||||
Modification in kernel code to the global variables is not allowed.
|
Examples:
|
||||||
|
```python
|
||||||
|
FOO = 0
|
||||||
|
|
||||||
|
@kernel
|
||||||
|
def correct() -> int:
|
||||||
|
global FOO
|
||||||
|
return FOO + 1
|
||||||
|
|
||||||
|
@kernel
|
||||||
|
def fail_without_global() -> int:
|
||||||
|
return FOO + 2
|
||||||
|
|
||||||
|
@kernel
|
||||||
|
def fail_write() -> None:
|
||||||
|
FOO += 1
|
||||||
|
|
||||||
|
```
|
||||||
|
|
||||||
## Class and Functions
|
## Class and Functions
|
||||||
* Class fields must be annotated:
|
* Instance variables must be annotated: (Issue #1)
|
||||||
```py
|
```python
|
||||||
class Foo:
|
class Foo:
|
||||||
a: int
|
a: int
|
||||||
b: int
|
b: int
|
||||||
|
@ -29,24 +47,26 @@ Modification in kernel code to the global variables is not allowed.
|
||||||
self.a = a
|
self.a = a
|
||||||
self.b = b
|
self.b = b
|
||||||
```
|
```
|
||||||
|
* Three types of instance variables: (Issue #5)
|
||||||
|
* Host only variables: Do not add type annotation for it in the class.
|
||||||
|
* Kernel Invariants: Immutable in the kernel and in the host while the kernel
|
||||||
|
is executing. Type: `KernelInvariant[T]`. The types must be immutable.
|
||||||
|
(use tuple instead of list in the host, but the type annotation should still
|
||||||
|
be list?)
|
||||||
|
* Normal Variables: The host can only assign to them in the `__init__`
|
||||||
|
function. Not accessible afterwards.
|
||||||
* Functions require full type signature, including type annotation to every
|
* Functions require full type signature, including type annotation to every
|
||||||
parameter and return type.
|
parameter and return type.
|
||||||
```py
|
```python
|
||||||
def add(a: int, b: int) -> int:
|
def add(a: int, b: int) -> int:
|
||||||
return a + b
|
return a + b
|
||||||
```
|
```
|
||||||
* No implicit coercion, require implicit cast. Integers are int32 by default,
|
|
||||||
floating point numbers are double by default.
|
|
||||||
* RPCs: optional parameter type signature, require return type signature.
|
* RPCs: optional parameter type signature, require return type signature.
|
||||||
* Function default parameters are not allowed, as changes to the default value
|
* Function default parameters must be immutable.
|
||||||
would not be kept across kernel calls, which is a potential source of
|
* Function pointers are supported, and lambda expression is not supported
|
||||||
confusion. (maybe we can allow primitive default types?)
|
currently. (maybe support lambda after implementing type inference?)
|
||||||
* Cannot construct objects within kernel code.
|
|
||||||
|
|
||||||
Questions:
|
Its type is denoted by the typing library, e.g. `Call[[int32, int32], int32]`.
|
||||||
* Can we construct objects within kernel code?
|
|
||||||
* Should we support function pointers? What about subtyping with function
|
|
||||||
pointers, and generic types?
|
|
||||||
|
|
||||||
## Built-in Types
|
## Built-in Types
|
||||||
* Primitive types include:
|
* Primitive types include:
|
||||||
|
@ -57,14 +77,14 @@ Questions:
|
||||||
* `uint32`
|
* `uint32`
|
||||||
* `uint64`
|
* `uint64`
|
||||||
* `float`
|
* `float`
|
||||||
* `str` (note: fixed length, provide builtin methods?)
|
* `str`
|
||||||
* `bytes` (a list of `byte`, but with more convenient syntax)
|
* `bytes`
|
||||||
* Collections include:
|
* Collections include:
|
||||||
* `list`: homogeneous (elements must be of the same type) fixed-size (no
|
* `list`: homogeneous (elements must be of the same type) fixed-size (no
|
||||||
append) list.
|
append) list.
|
||||||
* `tuple`: inhomogeneous fixed-size list, only pattern
|
* `tuple`: inhomogeneous immutable list, only pattern
|
||||||
matching (e.g. `a, b, c = (1, True, 1.2)`) and constant indexing
|
matching (e.g. `a, b, c = (1, True, 1.2)`) and constant indexing is
|
||||||
is supported:
|
supported:
|
||||||
```
|
```
|
||||||
t = (1, True)
|
t = (1, True)
|
||||||
# OK
|
# OK
|
||||||
|
@ -75,23 +95,34 @@ Questions:
|
||||||
i = 0
|
i = 0
|
||||||
a = t[i]
|
a = t[i]
|
||||||
```
|
```
|
||||||
* `range` (over numerical types) (not sure if this is really useful)
|
* `range` (over numerical types)
|
||||||
|
|
||||||
### Numerical Types
|
### Numerical Types
|
||||||
* All binary operations expect the values to have the same type, no implicit
|
* All binary operations expect the values to have the same type.
|
||||||
coercion would be performed, explicit casting is required.
|
|
||||||
* Casting can be done by `T(v)` where `T` is the target type, and `v` is the
|
* Casting can be done by `T(v)` where `T` is the target type, and `v` is the
|
||||||
original value. Examples: `int64(123)`
|
original value. Examples: `int64(123)`
|
||||||
* Constant integers are treated as `int32` by default. If the value cannot
|
* Integers are treated as `int32` by default. Floating point numbers are double
|
||||||
be stored in `int32`, `uint64` would be used if the integer is non-negative,
|
by default.
|
||||||
and `int64` would be used it the integer is negative.
|
* No implicit coercion, require implicit cast.
|
||||||
|
For integers that don't fit in int32, users should cast them to `int64`
|
||||||
|
explicitly, i.e. `int64(2147483648)`. If the compiler found that the integer
|
||||||
|
does not fit into int32, it would raise an error. (Issue #2)
|
||||||
* Only `uint32`, `int32` (and range of them) can be used as index.
|
* Only `uint32`, `int32` (and range of them) can be used as index.
|
||||||
|
|
||||||
|
### Kernel Only class
|
||||||
|
* Annotate the class with `@kernel`/`@portable`.
|
||||||
|
* The instance can be created from within kernel functions, or the host if it is
|
||||||
|
portable. It can be passed into kernels.
|
||||||
|
* All methods, including the constructor, are treated as kernel/portable
|
||||||
|
functions that would be compiled by the compiler, no RPC function is allowed.
|
||||||
|
* If the instance is passed into the kernel, the host is not allowed to access
|
||||||
|
the instance data. Access would raise exception.
|
||||||
|
|
||||||
## Generics
|
## Generics
|
||||||
We use [type variable](https://docs.python.org/3/library/typing.html#typing.TypeVar) for denoting generics.
|
We use [type variable](https://docs.python.org/3/library/typing.html#typing.TypeVar) for denoting generics.
|
||||||
|
|
||||||
Example:
|
Example:
|
||||||
```py
|
```python
|
||||||
from typing import TypeVar
|
from typing import TypeVar
|
||||||
T = TypeVar('T')
|
T = TypeVar('T')
|
||||||
|
|
||||||
|
@ -102,16 +133,13 @@ class Foo(EnvExperiment):
|
||||||
return a == b
|
return a == b
|
||||||
```
|
```
|
||||||
|
|
||||||
* Type variables can only be used in functions/methods, but not in classes.
|
|
||||||
(this can be relaxed, only allow those with type variables fully defined
|
|
||||||
from the constructor)
|
|
||||||
* Type variable can be limited to a fixed set of types.
|
* Type variable can be limited to a fixed set of types.
|
||||||
* Type variables are invariant, same as the default in Python. We disallow
|
* Type variables are invariant, same as the default in Python. We disallow
|
||||||
covariant or contravariant. The compiler should mark as error if it encounters
|
covariant or contravariant. The compiler should mark as error if it encounters
|
||||||
a type variable used in kernel that is declared covariant or contravariant.
|
a type variable used in kernel that is declared covariant or contravariant.
|
||||||
* Code region protected by a type check, such as `if type(x) == int:`, would
|
* Code region protected by a type check, such as `if type(x) == int:`, would
|
||||||
treat `x` as `int`, similar to how [typescript type guard](https://www.typescripttutorial.net/typescript-tutorial/typescript-type-guards/) works.
|
treat `x` as `int`, similar to how [typescript type guard](https://www.typescripttutorial.net/typescript-tutorial/typescript-type-guards/) works.
|
||||||
```py
|
```python
|
||||||
def add1(x: Union[int, bool]) -> int:
|
def add1(x: Union[int, bool]) -> int:
|
||||||
if type(x) == int:
|
if type(x) == int:
|
||||||
# x is int
|
# x is int
|
||||||
|
@ -126,53 +154,6 @@ class Foo(EnvExperiment):
|
||||||
* Type variable cannot occur alone in the result type, i.e. must be bound to the
|
* Type variable cannot occur alone in the result type, i.e. must be bound to the
|
||||||
input parameters.
|
input parameters.
|
||||||
|
|
||||||
Questions:
|
## Lifetime
|
||||||
* Should we support things like optional type? (like the one in rust)
|
Probably need more discussions...
|
||||||
* Would it be better to assert on the type variable directly instead of
|
|
||||||
`type(x)` for type guards?
|
|
||||||
|
|
||||||
## Dynamic Dispatch
|
|
||||||
Type annotations are invariant, so subtype (derived types) cannot be used
|
|
||||||
when the base type is expected. Example:
|
|
||||||
```py
|
|
||||||
class Base:
|
|
||||||
def foo(self) -> int:
|
|
||||||
return 1
|
|
||||||
|
|
||||||
class Derived(Base):
|
|
||||||
def foo(self) -> int:
|
|
||||||
return 2
|
|
||||||
|
|
||||||
def bar(x: list[Base]) -> int:
|
|
||||||
sum = 0
|
|
||||||
for v in x:
|
|
||||||
sum += v.foo()
|
|
||||||
return sum
|
|
||||||
|
|
||||||
# incorrect, this list cannot be typed (inhomogeneous)
|
|
||||||
bar([Base(), Derived()])
|
|
||||||
```
|
|
||||||
|
|
||||||
Dynamic dispatch is supported, but requires explicit annotation, similar to
|
|
||||||
[trait object](https://doc.rust-lang.org/book/ch17-02-trait-objects.html) in rust.
|
|
||||||
`virtual[T]` is the type for `T` and its subtypes(derived types).
|
|
||||||
|
|
||||||
This is mainly for performance consideration, as virtual method table that is
|
|
||||||
required for dynamic dispatch would penalize performance, and prohibits function
|
|
||||||
inlining etc.
|
|
||||||
|
|
||||||
Type variables cannot be used inside `virtual[...]`, and type variables would not
|
|
||||||
range over `virtual[...]`.
|
|
||||||
|
|
||||||
> Not sure what is the best syntax for `virtual[...]`
|
|
||||||
|
|
||||||
Example:
|
|
||||||
```py
|
|
||||||
def bar2(x: list[virtual[Base]]) -> int:
|
|
||||||
sum = 0
|
|
||||||
for v in x:
|
|
||||||
sum += v.foo()
|
|
||||||
return sum
|
|
||||||
```
|
|
||||||
|
|
||||||
|
|
||||||
|
|
I am still confused what the idea is about the use of 'global'. Is this what you mean?
yes