NAC3 Specification

Specification and discussions about language design.

A toy implementation is in toy-impl; it requires Python 3.9.

Referencing Host Variables from Kernel

A host variable accessed from a kernel must be declared as global in the kernel function. This simplifies and speeds up the implementation (the compiler avoids calling the interpreter many times during compilation when there are many references to host variables), and it also makes it explicit to the user that the variable is global.

A kernel cannot modify host variables; the compiler checks this. Values that can be observed by the kernel are frozen once the kernel has been compiled: subsequent modifications on the host do not affect the kernel.

Only variables whose types are supported in the kernel can be referenced.

Examples:

FOO = 0

@kernel
def correct() -> int:
    global FOO
    return FOO + 1

@kernel
def fail_without_global() -> int:
    # error: FOO is not declared as global
    return FOO + 2

@kernel
def fail_write() -> None:
    global FOO
    # error: kernels cannot modify host variables
    FOO += 1
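
The freezing behaviour can be imitated on the host with plain Python. The sketch below is only an illustration: toy_kernel is a hypothetical stand-in for @kernel (it is not how NAC3 is implemented) that copies the referenced host globals once, at decoration time, and rebinds the function to that snapshot.

```python
import builtins
import copy
import types

FOO = 0

def toy_kernel(fn):
    """Hypothetical stand-in for @kernel: snapshot referenced host globals once."""
    frozen = {"__builtins__": builtins}
    for name in fn.__code__.co_names:
        if name in fn.__globals__:
            # Copy the value so later host-side changes are not observed.
            frozen[name] = copy.deepcopy(fn.__globals__[name])
    # Rebuild the function so that its global lookups hit the snapshot.
    return types.FunctionType(fn.__code__, frozen, fn.__name__)

@toy_kernel
def read_foo() -> int:
    global FOO
    return FOO + 1

before = read_foo()
FOO = 41  # the host modifies the variable after "compilation"
after = read_foo()  # still computed from the frozen value of FOO
```

Both calls see the value FOO had when the function was decorated, which mirrors the rule that later host-side modifications do not affect the kernel.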

Class and Functions

  • Instance variables must be annotated: (Issue #1)

    class Foo:
        a: int
        b: int
        def __init__(self, a: int, b: int):
            self.a = a
            self.b = b
    
  • The compiler warns about unused instance variables, except those marked with the pseudocomment # nac3:no_warn_unused. (#9) The comment can be placed either on the line above the variable or at the end of the variable's line. Example:

    class Foo:
        # nac3:no_warn_unused
        a: int
        b: int # nac3:no_warn_unused
        def __init__(self):
            pass
    
  • Use-before-define:

    • Host-only constructor: Error if a certain field f is used in other kernel methods but missing from the object constructed in the host.
    • @portable/@kernel constructor: Error if a certain field f is used in other kernel methods but not defined in the constructor, or used in the constructor before definition.
  • Four types of instance variables: (Issue #5)

    • Host only variables: Do not add type annotation for it in the class.
    • Kernel only variables: Denoted with type Kernel[T].
    • Kernel Invariants: Immutable in the kernel and in the host while the kernel is executing. Type: KernelImmutable[T]. The types must be immutable. In particular, the attribute cannot be modified during RPC calls.
    • Normal Variables: The host can only assign to them in the __init__ function. Not accessible afterwards.
  • Functions require a full type signature, including a type annotation for every parameter and for the return type.

    def add(a: int, b: int) -> int:
      return a + b
    
  • RPCs: parameter type annotations are optional, but the return type annotation is required.

  • Classes whose constructor is annotated with @kernel/@portable can be constructed within kernel functions. RPC calls on those objects pass the whole object back to the host.

  • Function default parameters must be immutable.

  • Function pointers are supported; lambda expressions are currently not supported. (maybe support lambda after implementing type inference?)

    The type of a function pointer is denoted using the typing library, e.g. Callable[[int32, int32], int32].
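
The instance variable categories listed above could look as follows in a class definition. This is a minimal sketch: Kernel and KernelImmutable are defined here as hypothetical placeholder generics so that the snippet runs in plain Python, and the class Sensor and its fields are invented for illustration.

```python
from typing import Generic, TypeVar

T = TypeVar("T")

class Kernel(Generic[T]):
    """Hypothetical marker type for kernel-only variables."""

class KernelImmutable(Generic[T]):
    """Hypothetical marker type for kernel invariants."""

class Sensor:
    samples: Kernel[int]           # kernel only
    channel: KernelImmutable[int]  # kernel invariant
    gain: int                      # normal variable, host assigns it only in __init__

    def __init__(self, channel: int, gain: int):
        self.channel = channel
        self.gain = gain
        self.label = "sensor"      # host only: no annotation in the class body

# Only the annotated fields would be visible to the compiler.
annotated = sorted(Sensor.__annotations__)
```

The host-only field label never appears in the class annotations, which is how the rule "do not add a type annotation" distinguishes it from the other categories.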

Built-in Types

  • Primitive types include:
    • bool
    • byte
    • int32
    • int64
    • uint32
    • uint64
    • float
    • str
    • bytes
  • Collections include:
    • list: homogeneous (elements must be of the same type) fixed-size (no append) list.
    • tuple: inhomogeneous immutable list; only pattern matching (e.g. a, b, c = (1, True, 1.2)) and constant indexing are supported:
      t = (1, True)
      # OK
      a, b = t
      # OK
      a = t[0]
      # Not OK
      i = 0
      a = t[i]
      
    • range (over numerical types)
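
The constant-indexing restriction on tuples can be checked syntactically. The sketch below uses Python's ast module (Python 3.9 or later, where a subscript's slice is the index expression itself) and, as a simplifying assumption, treats every subscript as a tuple access and accepts only non-negative integer literals; the function name is hypothetical and a real checker would consult type information.

```python
import ast

def non_constant_subscripts(source: str) -> list:
    """Return the line numbers of subscripts whose index is not an integer literal."""
    bad = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Subscript):
            index = node.slice
            is_literal = isinstance(index, ast.Constant) and type(index.value) is int
            if not is_literal:
                bad.append(node.lineno)
    return sorted(bad)

example = (
    "t = (1, True)\n"
    "a = t[0]\n"   # constant index: accepted
    "i = 0\n"
    "b = t[i]\n"   # variable index: rejected
)
rejected_lines = non_constant_subscripts(example)
```

Here rejected_lines contains only the line with t[i], matching the "Not OK" case in the example above.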

Numerical Types

  • All binary operations expect the values to have the same type.
  • Casting can be done by T(v) where T is the target type, and v is the original value. Example: int64(123)
  • Integers are treated as int32 by default. Floating point numbers are double by default.
  • No implicit coercion; an explicit cast is required. For integers that don't fit in int32, users should cast them to int64 explicitly, e.g. int64(2147483648). If the compiler finds an integer that does not fit into int32, it raises an error. (Issue #2)
  • Only uint32 and int32 (and ranges of them) can be used as indices.
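
The literal rule can be made concrete with a small range check. This is a sketch under stated assumptions: check_int_literal and its explicit_int64 flag are hypothetical names that model "the literal is wrapped in int64(...)"; the bounds are the usual two's complement ranges.

```python
INT32_MIN, INT32_MAX = -(2 ** 31), 2 ** 31 - 1
INT64_MIN, INT64_MAX = -(2 ** 63), 2 ** 63 - 1

def check_int_literal(value: int, explicit_int64: bool = False) -> str:
    """Return the type an integer literal would get, or raise if it does not fit."""
    if explicit_int64:
        if not INT64_MIN <= value <= INT64_MAX:
            raise OverflowError(f"{value} does not fit in int64")
        return "int64"
    if INT32_MIN <= value <= INT32_MAX:
        return "int32"
    # No implicit widening: the user has to write int64(value) explicitly.
    raise OverflowError(f"{value} does not fit in int32; write int64({value})")

small = check_int_literal(123)
large = check_int_literal(2147483648, explicit_int64=True)
```

Calling check_int_literal(2147483648) without the flag raises, which corresponds to the compiler error described above.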

Kernel Only class

  • Annotate the class with @kernel/@portable.
  • Instances can be created within kernel functions, or on the host if the class is portable. They can be passed into kernels.
  • All methods, including the constructor, are treated as kernel/portable functions and compiled by the compiler; RPC functions are not allowed.
  • If an instance is passed into the kernel, the host is not allowed to access the instance data. Such an access raises an exception.
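
A portable class following these rules might look like the sketch below. The portable decorator is defined here as a hypothetical no-op so the snippet runs on the host, and the class Accumulator is invented for illustration; every method is fully annotated and none of them is an RPC.

```python
def portable(obj):
    """Hypothetical stand-in for @portable: returns its argument unchanged on the host."""
    return obj

@portable
class Accumulator:
    total: int

    def __init__(self, start: int):
        self.total = start

    def add(self, x: int) -> int:
        self.total = self.total + x
        return self.total

acc = Accumulator(1)
result = acc.add(2)
```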

Generics

Type variables are used to denote generics.

Example:

from typing import TypeVar
T = TypeVar('T')

class Foo(EnvExperiment):
    @kernel
    # type of a is the same as type of b
    def run(self, a: T, b: T) -> bool:
        return a == b

  • Type variables can be limited to a fixed set of types.
  • Type variables are invariant, which is also the default in Python. Covariant and contravariant type variables are disallowed: the compiler should report an error if a type variable used in a kernel is declared covariant or contravariant.
  • A custom function is_type(x, T) would be provided to check whether x is an instance of T; other methods such as type(x) == int or isinstance(x, int) would not compile. The function can check generic types for list and tuple. When running on the host, the user can choose between a debug mode (recursively checks all elements, which is slower for large lists) and a performance mode, which only checks the first element of each list. (#15)
  • A code region protected by a type check, such as if is_type(x, int):, treats x as int, similar to how TypeScript type guards work.
    def add1(x: Union[int, bool]) -> int:
        if is_type(x, int):
            # x is int
            return x + 1
        else:
            # x must be bool
            return 2 if x else 1
    
  • Generics are instantiated at compile time, so all type checks such as is_type(x, int) are evaluated as constants. Type checks are not allowed outside generic code.
  • A type variable cannot occur alone in the result type, i.e. it must be bound by the input parameters.
  • Polymorphic methods (with type variables in the type signature) must be annotated with @final. This is because monomorphization needs to know where the method comes from, which is not known for virtual methods.
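
A host-side version of is_type with the two checking modes might be sketched as follows. This is an assumption-laden illustration, not the specified implementation: the debug keyword is a hypothetical way of selecting the mode, only list[T] and fixed-length tuple[...] generics are handled, and exact type identity is used so that a bool is not accepted as an int.

```python
from typing import get_args, get_origin

def is_type(x, t, debug: bool = True) -> bool:
    """Check x against t; debug=False inspects only the first element of each list."""
    origin = get_origin(t)
    if origin is list:
        if not isinstance(x, list):
            return False
        (elem_type,) = get_args(t)
        items = x if debug else x[:1]
        return all(is_type(v, elem_type, debug) for v in items)
    if origin is tuple:
        args = get_args(t)
        return (
            isinstance(x, tuple)
            and len(x) == len(args)
            and all(is_type(v, a, debug) for v, a in zip(x, args))
        )
    # Exact match, so that True is a bool and not an int.
    return type(x) is t

mixed = [1, "a"]
debug_result = is_type(mixed, list[int])              # every element is checked
fast_result = is_type(mixed, list[int], debug=False)  # only the first element is checked
```

The two results differ for the mixed list, which shows the trade-off: performance mode is cheaper for large lists but can miss a mismatch after the first element.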

For loop unrolling (#12)

A pseudocomment can be used to unroll for loops that iterate a fixed number of times. This makes it possible to iterate over inhomogeneous tuples. Example:

params = (1, 1.5, "foo")
# nac3:unroll
for p in params:
    print(p)
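
Conceptually, unrolling replaces the loop with one copy of the body per element, each using a constant index, so every copy can be typed separately. The sketch below shows that equivalence in plain Python; the helper describe is hypothetical, and the exact code the compiler generates is not specified here.

```python
params = (1, 1.5, "foo")

def describe(p) -> str:
    return f"{type(p).__name__}: {p}"

# Loop form, as written by the user.
looped = []
for p in params:
    looped.append(describe(p))

# Hand-unrolled form: one body copy per element, constant indices only.
unrolled = []
unrolled.append(describe(params[0]))  # p is an int here
unrolled.append(describe(params[1]))  # p is a float here
unrolled.append(describe(params[2]))  # p is a str here
```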

Dynamic Dispatch

Type annotations are invariant, so a subtype (derived type) cannot be used where the base type is expected. Example:

class Base:
    def foo(self) -> int:
        return 1

class Derived(Base):
    def foo(self) -> int:
        return 2

def bar(x: list[Base]) -> int:
    sum = 0
    for v in x:
        sum += v.foo()
    return sum

# incorrect, this list cannot be typed (inhomogeneous)
bar([Base(), Derived()])

Dynamic dispatch is supported, but requires explicit annotation, similar to trait objects in Rust. virtual[T] is the type for T and its subtypes (derived types).

The explicit annotation is mainly for performance reasons: the virtual method table required for dynamic dispatch penalizes performance and prohibits optimizations such as function inlining. Note that type variables cannot be used inside virtual[...].

Example:

def bar2(x: list[virtual[Base]]) -> int:
    sum = 0
    for v in x:
        sum += v.foo()
    return sum

The syntax for casting virtual objects is virtual(obj, T), which casts an object of type T1/virtual[T1] to virtual[T] where T1 <: T (T1 is a subtype of T).

The compiler may be able to infer the type cast. In that case, the cast is not required if obj is already of type virtual[T1], or the user can write the cast as virtual(obj) and the compiler would infer the type T automatically.
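
The cast can be mimicked on the host for experimentation. In the sketch below, virtual is a hypothetical plain-Python stand-in that only checks the subtype relation T1 <: T and returns the object unchanged; the annotation of bar2 is written as a string because virtual[...] is not subscriptable in this stand-in.

```python
class Base:
    def foo(self) -> int:
        return 1

class Derived(Base):
    def foo(self) -> int:
        return 2

def virtual(obj, base=None):
    """Hypothetical host-side cast: verify that obj is an instance of base."""
    if base is not None and not isinstance(obj, base):
        raise TypeError(f"{type(obj).__name__} is not a subtype of {base.__name__}")
    return obj

def bar2(x: "list[virtual[Base]]") -> int:
    total = 0
    for v in x:
        total += v.foo()  # dispatched on the runtime type of v
    return total

items = [virtual(Base(), Base), virtual(Derived(), Base)]
total = bar2(items)
```

With the explicit casts, the list mixing Base and Derived becomes a homogeneous list of virtual[Base], and each call to foo resolves to the method of the object's actual class.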

Methods are overridden automatically; the type signature, including parameter names and order, must be exactly the same.

Defining a method that is marked as final in the superclass is an error.
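
The "exactly the same signature" requirement can be approximated on the host with inspect.signature, whose equality compares parameter names, order, annotations and the return annotation. This is only a sketch of the rule; same_signature and the example classes are hypothetical, and the compiler's own check is not specified here.

```python
import inspect

class Base:
    def foo(self, x: int) -> int:
        return x

class Good(Base):
    def foo(self, x: int) -> int:
        return x + 1

class Bad(Base):
    # Same types, but the parameter name differs from the base method.
    def foo(self, y: int) -> int:
        return y

def same_signature(base, derived, name: str) -> bool:
    """Compare the signatures of base.name and derived.name."""
    return inspect.signature(getattr(base, name)) == inspect.signature(getattr(derived, name))

good_ok = same_signature(Base, Good, "foo")
bad_ok = same_signature(Base, Bad, "foo")
```

Good would be accepted as an override, while Bad would be rejected because its parameter is named y instead of x.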

Lifetime

Probably need more discussions...