Merge pull request 'Updated specification' (#8) from updated into master

Reviewed-on: #8
This commit is contained in:
pca006132 2021-06-07 10:06:51 +08:00
commit d06a044189
1 changed files with 69 additions and 88 deletions

157
README.md
View File

@ -5,23 +5,41 @@ Specification and discussions about language design.
A toy implementation is in [`toy-impl`](./toy-impl), requires python 3.9.
## Referencing Python Variables
The kernel is allowed to read host Python variables, but has to specify with
`global` before referencing them. This is to simplify and speed-up
implementation, and also warn the user about the variable being global. (prevent
calling the interpreter many times during compilation if there are many
references to host variables)
## Referencing Host Variables from Kernel
Host variable to be accessed must be declared as `global` in the kernel
function. This is to simplify and speed-up implementation, and also warn the
user about the variable being global. (prevent calling the interpreter many
times during compilation if there are many references to host variables)
Only primitive types and tuple/list of primitive types are allowed.
Kernel cannot modify host variables, this would be checked by the compiler.
Value that can be observed by the kernel would be frozen once the kernel has
been compiled, subsequence modification within the host would not affect the
kernel.
The value would be substituted at compile time, subsequent modification in the
host would not be known by the kernel.
Only types supported in the kernel can be referenced.
Modification in kernel code to the global variables is not allowed.
Examples:
```python
FOO = 0
@kernel
def correct() -> int:
global FOO
return FOO + 1
@kernel
def fail_without_global() -> int:
return FOO + 2
@kernel
def fail_write() -> None:
FOO += 1
```
## Class and Functions
* Class fields must be annotated:
```py
* Instance variables must be annotated: (Issue #1)
```python
class Foo:
a: int
b: int
@ -29,24 +47,26 @@ Modification in kernel code to the global variables is not allowed.
self.a = a
self.b = b
```
* Three types of instance variables: (Issue #5)
* Host only variables: Do not add type annotation for it in the class.
* Kernel Invariants: Immutable in the kernel and in the host while the kernel
is executing. Type: `KernelInvariant[T]`. The types must be immutable.
(use tuple instead of list in the host, but the type annotation should still
be list?)
* Normal Variables: The host can only assign to them in the `__init__`
function. Not accessible afterwards.
* Functions require full type signature, including type annotation to every
parameter and return type.
```py
```python
def add(a: int, b: int) -> int:
return a + b
```
* No implicit coercion, require implicit cast. Integers are int32 by default,
floating point numbers are double by default.
* RPCs: optional parameter type signature, require return type signature.
* Function default parameters are not allowed, as changes to the default value
would not be kept across kernel calls, which is a potential source of
confusion. (maybe we can allow primitive default types?)
* Cannot construct objects within kernel code.
* Function default parameters must be immutable.
* Function pointers are supported, and lambda expression is not supported
currently. (maybe support lambda after implementing type inference?)
Questions:
* Can we construct objects within kernel code?
* Should we support function pointers? What about subtyping with function
pointers, and generic types?
Its type is denoted by the typing library, e.g. `Call[[int32, int32], int32]`.
## Built-in Types
* Primitive types include:
@ -57,14 +77,14 @@ Questions:
* `uint32`
* `uint64`
* `float`
* `str` (note: fixed length, provide builtin methods?)
* `bytes` (a list of `byte`, but with more convenient syntax)
* `str`
* `bytes`
* Collections include:
* `list`: homogeneous (elements must be of the same type) fixed-size (no
append) list.
* `tuple`: inhomogeneous fixed-size list, only pattern
matching (e.g. `a, b, c = (1, True, 1.2)`) and constant indexing
is supported:
* `tuple`: inhomogeneous immutable list, only pattern
matching (e.g. `a, b, c = (1, True, 1.2)`) and constant indexing is
supported:
```
t = (1, True)
# OK
@ -75,23 +95,34 @@ Questions:
i = 0
a = t[i]
```
* `range` (over numerical types) (not sure if this is really useful)
* `range` (over numerical types)
### Numerical Types
* All binary operations expect the values to have the same type, no implicit
coercion would be performed, explicit casting is required.
* All binary operations expect the values to have the same type.
* Casting can be done by `T(v)` where `T` is the target type, and `v` is the
original value. Examples: `int64(123)`
* Constant integers are treated as `int32` by default. If the value cannot
be stored in `int32`, `uint64` would be used if the integer is non-negative,
and `int64` would be used it the integer is negative.
* Integers are treated as `int32` by default. Floating point numbers are double
by default.
* No implicit coercion, require implicit cast.
For integers that don't fit in int32, users should cast them to `int64`
explicitly, i.e. `int64(2147483648)`. If the compiler found that the integer
does not fit into int32, it would raise an error. (Issue #2)
* Only `uint32`, `int32` (and range of them) can be used as index.
### Kernel Only class
* Annotate the class with `@kernel`/`@portable`.
* The instance can be created from within kernel functions, or the host if it is
portable. It can be passed into kernels.
* All methods, including the constructor, are treated as kernel/portable
functions that would be compiled by the compiler, no RPC function is allowed.
* If the instance is passed into the kernel, the host is not allowed to access
the instance data. Access would raise exception.
## Generics
We use [type variable](https://docs.python.org/3/library/typing.html#typing.TypeVar) for denoting generics.
Example:
```py
```python
from typing import TypeVar
T = TypeVar('T')
@ -102,16 +133,13 @@ class Foo(EnvExperiment):
return a == b
```
* Type variables can only be used in functions/methods, but not in classes.
(this can be relaxed, only allow those with type variables fully defined
from the constructor)
* Type variable can be limited to a fixed set of types.
* Type variables are invariant, same as the default in Python. We disallow
covariant or contravariant. The compiler should mark as error if it encounters
a type variable used in kernel that is declared covariant or contravariant.
* Code region protected by a type check, such as `if type(x) == int:`, would
treat `x` as `int`, similar to how [typescript type guard](https://www.typescripttutorial.net/typescript-tutorial/typescript-type-guards/) works.
```py
```python
def add1(x: Union[int, bool]) -> int:
if type(x) == int:
# x is int
@ -126,53 +154,6 @@ class Foo(EnvExperiment):
* Type variable cannot occur alone in the result type, i.e. must be bound to the
input parameters.
Questions:
* Should we support things like optional type? (like the one in rust)
* Would it be better to assert on the type variable directly instead of
`type(x)` for type guards?
## Dynamic Dispatch
Type annotations are invariant, so subtype (derived types) cannot be used
when the base type is expected. Example:
```py
class Base:
def foo(self) -> int:
return 1
class Derived(Base):
def foo(self) -> int:
return 2
def bar(x: list[Base]) -> int:
sum = 0
for v in x:
sum += v.foo()
return sum
# incorrect, this list cannot be typed (inhomogeneous)
bar([Base(), Derived()])
```
Dynamic dispatch is supported, but requires explicit annotation, similar to
[trait object](https://doc.rust-lang.org/book/ch17-02-trait-objects.html) in rust.
`virtual[T]` is the type for `T` and its subtypes(derived types).
This is mainly for performance consideration, as virtual method table that is
required for dynamic dispatch would penalize performance, and prohibits function
inlining etc.
Type variables cannot be used inside `virtual[...]`, and type variables would not
range over `virtual[...]`.
> Not sure what is the best syntax for `virtual[...]`
Example:
```py
def bar2(x: list[virtual[Base]]) -> int:
sum = 0
for v in x:
sum += v.foo()
return sum
```
## Lifetime
Probably need more discussions...