Add documentation
Hydra nac3artiq-msys2 Hydra build #200160 of artiq:nac3:nac3artiq-msys2
Hydra nac3artiq-msys2-pkg Hydra build #200161 of artiq:nac3:nac3artiq-msys2-pkg
Hydra nac3artiq-profile Hydra build #200162 of artiq:nac3:nac3artiq-profile
Hydra nac3artiq Hydra build #200159 of artiq:nac3:nac3artiq

This commit was merged in pull request #750.
rclovis
2026-03-18 17:41:40 +01:00
committed by sb10q
parent f8c9ddb8b2
commit 7c4c262ca3
13 changed files with 484 additions and 2 deletions
+129
@@ -0,0 +1,129 @@
# Architecture
NAC3 follows a classic compiler pipeline: parse, analyze, generate. The codebase is split into several Rust crates that separate concerns cleanly enough that `nac3core` contains nothing specific to ARTIQ.
## Crate Layout
| Crate          | Purpose                                                                                       |
| -------------- | --------------------------------------------------------------------------------------------- |
| nac3ast | Python AST node definitions (based on [RustPython](https://github.com/RustPython/RustPython)) |
| nac3parser | Lexer + LALRPOP parser producing nac3ast trees |
| nac3core | Type checking, type inference, LLVM code generation |
| nac3artiq      | ARTIQ frontend - Python/PyO3 integration, timeline, RPC                                       |
| nac3standalone | Minimal frontend - compiles a .py file to an object file |
| nac3binutils | Linker (nac3ld), symbolizer, DWARF utilities |
| runkernel | Test harness that runs compiled ARTIQ kernels on the host |
`nac3core` is where most of the compiler lives. It is intentionally frontend-agnostic: the two frontends (`nac3artiq` and `nac3standalone`) plug in through a small set of traits described below.
## Compilation Pipeline
A complete compilation proceeds in five stages. The frontends drive the first and last stages; `nac3core` owns everything in between.
![Compilation pipeline diagram](compilation_pipeline.png)
### Stage 1: Parsing
`nac3parser` tokenizes Python source and feeds it into a LALRPOP-generated parser. The output is a `Vec<Stmt>`, a list of top-level AST statements from `nac3ast`. The parser is a lightly modified fork of RustPython's parser.
### Stage 2: Registration
The frontend walks the parsed statements and registers each class and function with `TopLevelComposer::register_top_level()`. This populates the global definition list with `TopLevelDef::Class` and `TopLevelDef::Function` entries, each identified by a `DefinitionId` (a plain `usize` index).
Assignments at module scope are handled separately by the frontend, typically to register `TypeVar` and `ConstGeneric` declarations or module-level constants.
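The idea of a `DefinitionId` being a plain index into the definition list can be sketched as follows. This is a toy model, not the `nac3core` API: `ToyComposer`, `ToyTopLevelDef`, and `register()` are hypothetical names introduced for illustration, and the real `TopLevelDef` variants carry far more data.

```rust
/// Index of a definition in the registry (mirrors the newtype-over-`usize` idea).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct DefinitionId(pub usize);

/// Hypothetical, heavily reduced stand-in for a top-level definition.
#[derive(Debug)]
pub enum ToyTopLevelDef {
    Class { name: String },
    Function { name: String },
}

/// Hypothetical registry: definitions are appended and never reordered,
/// so an index stays valid for the whole compilation.
#[derive(Default)]
pub struct ToyComposer {
    definitions: Vec<ToyTopLevelDef>,
}

impl ToyComposer {
    /// Appends a definition and returns its position as a `DefinitionId`.
    pub fn register(&mut self, def: ToyTopLevelDef) -> DefinitionId {
        self.definitions.push(def);
        DefinitionId(self.definitions.len() - 1)
    }

    /// Looks a definition up by ID; returns `None` for an unknown ID.
    pub fn get(&self, id: DefinitionId) -> Option<&ToyTopLevelDef> {
        self.definitions.get(id.0)
    }
}
```

Because IDs are indices, they are cheap to copy and hash, which is convenient when they appear inside type descriptions such as `TObj`.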
### Stage 3: Type Analysis
`TopLevelComposer::start_analysis()` processes all registered definitions:
1. Resolves type annotations (inheritance, field types, method signatures).
2. Runs the type inferencer on every function body.
3. Unifies type constraints using a union-find based `Unifier`.
After this stage the AST is annotated: every expression node carries an `Option<Type>` indicating its inferred type. `Type` is a `UnificationKey`: a lightweight handle into the unification table, not a concrete description. To inspect what a `Type` actually is, you query the `Unifier` for its `TypeEnum`.
The important `TypeEnum` variants are:
- `TObj`: a class instance, carrying its `DefinitionId`, field map, and type parameter bindings.
- `TFunc`: a function signature (argument types, return type, type variables).
- `TVar`: an unconstrained or range-constrained type variable, resolved during unification.
- `TRigidVar`: a type variable that must not be unified further (appears in generic class/function definitions).
- `TTuple`, `TLiteral`, `TVirtual`, `TCall`: tuples, literal types, virtual dispatch wrappers, and unresolved call sites respectively.
### Stage 4: Code Generation
Code generation is parallel and demand-driven. The frontend creates a `WorkerRegistry` with N worker threads, each owning a `CodeGenerator` and an independent LLVM `Context`.
The entry point function is submitted as a `CodeGenTask`. When a worker picks up a task, it generates LLVM IR for that function. If the function calls another generic function with concrete type arguments that has not been compiled yet, a new `CodeGenTask` is created and placed on the shared work queue. This continues until no more tasks remain.
Each task carries:
- The function body (typed AST).
- A `ConcreteTypeStore` with monomorphized types for this instantiation.
- Type substitutions mapping type variables to their concrete types.
- A `SymbolResolver` for looking up external names.
The per-function context is `CodeGenContext`, which holds the LLVM builder, variable assignments, type caches, and control-flow state (loop targets, unwind targets, return buffer). It derefs to `ModuleContext`, which holds the LLVM module and target-specific type information.
After all workers finish, the frontend links the per-worker LLVM modules together and links in the IRRT (runtime library). Optimization and object-file emission are described in Stage 5.
### Stage 5: Optimization and Linking
The merged LLVM module is run through LLVM's new pass manager. The pass string typically looks like `globaldce,strip-dead-prototypes,default<O2>`. After optimization, the target machine emits an object file. For ARTIQ, `nac3ld` performs final linking to produce an ELF suitable for loading onto the core
device.
## Frontend Integration Points
Frontends customize the compiler through four traits and one callback:
**`SymbolResolver`**: maps identifiers to types and values. The frontend implements this to bridge its own name resolution (Python runtime objects in nac3artiq, a simple hash map in nac3standalone) into `nac3core`'s type system. Key methods: `get_symbol_type()`, `get_identifier_def()`, `get_symbol_value()`.
**`CodeGenerator`**: controls IR generation for expressions, statements, calls, and control flow. `DefaultCodeGenerator` provides the standard implementation; `ArtiqCodeGenerator` overrides `gen_with()` and `gen_call()` to handle `with parallel` blocks and timeline manipulation.
**`BuiltinRegistry`**: determines how AST expressions are matched to builtin type/function definitions. `DefaultBuiltinRegistry` matches by name strings; nac3artiq's `ArtiqBuiltinRegistry` matches by Python object identity (via PyO3).
**`TimeFns`**: (nac3artiq only) emits LLVM IR for `now_mu()`, `at_mu()`, and `delay_mu()`. Implementations differ by target ISA (VexRiscv with 32-bit or 64-bit data bus, or external function calls for host execution).
**`GenCall`**: a callback stored on `TopLevelDef::Function` that overrides code generation for specific functions. nac3artiq uses this for RPC stubs, where the generated code must serialize arguments and invoke the host runtime instead of calling a compiled function.
## nac3standalone
The standalone frontend is a command-line tool that compiles a single Python file to an object file. It expects a `run()` function as the entry point. The implementation is under 500 lines and serves as the reference for how to drive `nac3core`.
The compilation flow:
1. Parse the input file.
2. Create a `TopLevelComposer` with `DefaultBuiltinRegistry`.
3. Register all top-level definitions; handle `TypeVar`/`ConstGeneric` assignments separately.
4. Run `start_analysis()`.
5. Look up the `run` function, create a `CodeGenTask` for it.
6. Spawn `WorkerRegistry` threads with `DefaultCodeGenerator`.
7. Link modules, optimize, write `module.o`.
## nac3artiq
The ARTIQ frontend is a Python extension module (built as a `cdylib` via PyO3). It is loaded by the ARTIQ runtime and compiles `@kernel` functions on demand.
Key differences from the standalone frontend:
- **Python interop**: `InnerResolver` implements `SymbolResolver` by inspecting live Python objects through PyO3. Class fields, method signatures, and default parameter values are all extracted from the Python runtime.
- **Decorators**: `@kernel`, `@portable`, `@rpc`, and `@extern` mark functions for different compilation strategies. `@rpc` functions get a `GenCall` callback that generates serialization/deserialization code instead of a normal function body.
- **Parallel blocks**: `with parallel` and `with sequential` are context managers that manipulate the RTIO timeline. `ArtiqCodeGenerator` overrides `gen_with()` to track timeline positions and reset/advance the cursor appropriately.
- **Timeline**: The `TimeFns` trait abstracts over different hardware targets. `NowPinningTimeFns64` directly reads/writes split 32-bit CSR registers on VexRiscv; `ExternTimeFns` calls out to external C functions for host-mode execution.
- **Target ISAs**: nac3artiq can target `riscv32-unknown-linux` (Kasli/core device), `armv7-unknown-linux-eabihf` (Zynq), or the host triple.
- **Attribute writeback**: After compilation, mutable object attributes may need to be written back to the Python runtime. This is handled by `attributes_writeback()`.
## IRRT (Inline Runtime)
The IRRT is a small runtime library written in C++ under `nac3core/irrt/`. It provides helper functions for operations that are too complex to emit inline (integer exponentiation, range slicing, string operations, list helpers, etc.).
The build process (in `nac3core/build.rs`):
1. Compile `irrt.cpp` to LLVM IR using `clang-irrt` targeting `wasm32` (to get target-independent IR).
2. Filter the IR with regexes to keep only function definitions, declarations, type definitions, and globals.
3. Strip debug metadata.
4. Assemble to bitcode with `llvm-as-irrt`.
5. Embed the bitcode via `include_bytes!()`.
At compile time, `load_irrt()` parses this embedded bitcode into an LLVM module and initializes exception ID globals. The module is then linked into the final output.
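As a rough illustration of the filtering step (step 2 above), the sketch below keeps function definitions, declarations, type definitions, and globals from textual LLVM IR and drops everything else. It uses simple line-prefix matching as a stand-in for the regexes in `build.rs`; the function name `filter_ir` and the exact rules are assumptions, not the real implementation.

```rust
/// Keeps only `define` bodies, `declare` lines, `%type` definitions, and
/// `@global` lines from textual LLVM IR. Prefix matching here stands in
/// for the regex-based filter; the real rules may differ.
pub fn filter_ir(ir: &str) -> String {
    let mut kept = Vec::new();
    let mut in_function = false;
    for line in ir.lines() {
        if in_function {
            // Inside a function body: keep every line up to the closing brace.
            kept.push(line);
            if line.trim_end() == "}" {
                in_function = false;
            }
        } else if line.starts_with("define ") {
            kept.push(line);
            // A definition normally opens a multi-line body.
            in_function = !line.trim_end().ends_with('}');
        } else if line.starts_with("declare ")
            || line.starts_with('%')
            || line.starts_with('@')
        {
            kept.push(line);
        }
        // Anything else (target triple, attributes, metadata) is dropped.
    }
    kept.join("\n")
}
```

Dropping the target triple and attribute groups is what lets the same IR be linked into modules for different targets.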
To debug IRRT issues, set `DEBUG_DUMP_IRRT=1` when building nac3core. This writes `irrt.ll` (raw) and `irrt-filtered.ll` (after regex filtering) to the build output directory.
+134
@@ -0,0 +1,134 @@
# Code Generation
This document covers the internals of `nac3core`'s type system and code generation pipeline. It is meant to orient developers on the critical types and the flow from typed AST to LLVM IR; the fine details live in Rustdoc comments on the relevant structs and functions.
## Type System
### Types and the Unifier
`Type` is a `UnificationKey`: a handle into the unification table. It is not a type description by itself. To inspect what a `Type` actually represents, look it up through the `Unifier`:
```rust
let ty_enum: &TypeEnum = &*unifier.get_ty(some_type);
match ty_enum {
    TypeEnum::TObj { obj_id, fields, params, .. } => { /* ... */ }
    TypeEnum::TFunc(sig) => { /* ... */ }
    // ...
}
```
The `Unifier` owns a `UnificationTable` implementing union-find. Unification merges two types by constraining them to be equal; if the constraint is contradictory, a `TypeError` is returned. During type inference, `TVar` nodes start unconstrained (or range-constrained) and get progressively pinned down as the inferencer processes the AST.
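To make the handle-versus-description distinction concrete, here is a minimal union-find unifier over a toy type language. It is a sketch under simplifying assumptions: `ToyType`, `ToyTypeEnum`, and `ToyUnifier` are hypothetical names, there are only unconstrained variables and named concrete types, and the error is a plain `String` rather than `TypeError`.

```rust
/// What a handle resolves to: an unresolved variable or a named concrete type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ToyTypeEnum {
    Var,
    Concrete(&'static str),
}

/// A lightweight handle into the table, playing the role of `Type`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ToyType(usize);

#[derive(Default)]
pub struct ToyUnifier {
    parent: Vec<usize>,
    value: Vec<ToyTypeEnum>,
}

impl ToyUnifier {
    /// Adds a fresh entry and returns its handle.
    pub fn add(&mut self, v: ToyTypeEnum) -> ToyType {
        let idx = self.parent.len();
        self.parent.push(idx);
        self.value.push(v);
        ToyType(idx)
    }

    /// Finds the representative of a set, halving paths along the way.
    fn find(&mut self, mut i: usize) -> usize {
        while self.parent[i] != i {
            self.parent[i] = self.parent[self.parent[i]];
            i = self.parent[i];
        }
        i
    }

    /// Resolves a handle to its current description.
    pub fn get_ty(&mut self, t: ToyType) -> ToyTypeEnum {
        let root = self.find(t.0);
        self.value[root].clone()
    }

    /// Constrains two handles to be equal, or reports a contradiction.
    pub fn unify(&mut self, a: ToyType, b: ToyType) -> Result<(), String> {
        let (ra, rb) = (self.find(a.0), self.find(b.0));
        if ra == rb {
            return Ok(());
        }
        let merged = match (&self.value[ra], &self.value[rb]) {
            // A variable takes on whatever the other side is.
            (ToyTypeEnum::Var, other) | (other, ToyTypeEnum::Var) => other.clone(),
            (x, y) if x == y => x.clone(),
            (x, y) => return Err(format!("cannot unify {x:?} with {y:?}")),
        };
        self.parent[rb] = ra;
        self.value[ra] = merged;
        Ok(())
    }
}
```

Unifying two variables and then pinning one of them to a concrete type resolves both, which is the "progressively pinned down" behaviour described above.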
`SharedUnifier` (`Arc<Mutex<(UnificationTable<TypeEnum>, u32, Vec<Call>)>>`) is used when unifiers need to be shared across threads. Each module gets its own unifier during analysis, and the shared form is stored in `TopLevelContext::unifiers`.
### PrimitiveStore
`PrimitiveStore` holds `Type` handles for all builtin primitive types (`int32`, `int64`, `uint32`, `uint64`, `float`, `bool`, `str`, `none`, `exception`, `option`, `ndarray`, etc.). It is created once during `TopLevelComposer` initialization and threaded through the entire pipeline.
### TopLevelDef and DefinitionId
Every class, function, and module registered with the compiler gets a `TopLevelDef` entry and a `DefinitionId` (index into the definition list).
`TopLevelDef::Function` has two important maps:
- `instance_to_symbol`: maps a string key (derived from concrete type variable bindings) to the LLVM symbol name for that instantiation.
- `instance_to_stmt`: maps the same key to a `FunInstance` containing the typed AST body, call site information, and type substitutions.
When a generic function is called with specific type arguments, the codegen looks up (or creates) an entry in these maps. If a new entry is created, a new `CodeGenTask` is queued.
`TopLevelDef::Function` can also carry a `codegen_callback` (`GenCall`), which entirely replaces normal code generation for that function. nac3artiq uses this for RPC functions, where instead of compiling a function body, the generated code serializes arguments and calls into the ARTIQ RPC runtime.
## Monomorphization
NAC3 compiles generic functions by monomorphization: each distinct combination of concrete type arguments produces a separate LLVM function. The `ConcreteTypeStore` manages this mapping.
The flow:
1. During codegen, a call to a generic function triggers `gen_func_instance()`.
2. The type variable bindings are collected into a substitution key (a sorted string of variable ID/type pairs).
3. If `instance_to_symbol` already has this key, the existing symbol is reused.
4. Otherwise a new `CodeGenTask` is created with the concrete substitutions and placed on the `WorkerRegistry` queue.
Because workers run in parallel, `gen_func_instance()` must handle the race where two workers try to instantiate the same function simultaneously. The default implementation uses the lock on the `TopLevelDef` to serialize this check.
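The lookup-or-create step can be sketched with a lock-protected map. Everything below is a toy model: `ToyFunctionDef`, `subst_key()`, `get_or_create_instance()`, the key format, and the symbol naming scheme are assumptions made for illustration, not what `gen_func_instance()` actually does.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical stand-in for the function entry holding its instantiations.
#[derive(Default)]
pub struct ToyFunctionDef {
    pub name: String,
    pub instance_to_symbol: HashMap<String, String>,
}

/// Builds a key from (type variable ID, concrete type name) pairs. Sorting
/// makes the key independent of the order in which bindings were collected.
pub fn subst_key(bindings: &[(u32, &str)]) -> String {
    let mut sorted = bindings.to_vec();
    sorted.sort();
    sorted
        .iter()
        .map(|(id, ty)| format!("{id}={ty}"))
        .collect::<Vec<_>>()
        .join(", ")
}

/// Returns the symbol for this instantiation and whether it was newly
/// created (in which case the caller would queue a codegen task).
pub fn get_or_create_instance(
    def: &RwLock<ToyFunctionDef>,
    bindings: &[(u32, &str)],
) -> (String, bool) {
    let key = subst_key(bindings);
    // Holding the write lock across check-and-insert serializes the race
    // between two workers requesting the same instantiation.
    let mut def = def.write().unwrap();
    if let Some(symbol) = def.instance_to_symbol.get(&key) {
        return (symbol.clone(), false);
    }
    let symbol = format!("{}.{}", def.name, def.instance_to_symbol.len());
    def.instance_to_symbol.insert(key, symbol.clone());
    (symbol, true)
}
```

The check and the insert happen under one lock acquisition; splitting them into a read lock followed by a write lock would reintroduce the race.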
## CodeGenContext
`CodeGenContext` is the per-function state during IR generation. It holds:
- **`builder`**: the LLVM `Builder` for emitting instructions.
- **`var_assignment`**: maps variable names to `VarValue` (an LLVM pointer plus an optional `StaticValue` for compile-time-known values).
- **`type_cache` / `alloca_type_cache`**: caches `Type` to LLVM `BasicTypeEnum` conversions. `alloca_type_cache` is specifically for in-memory representations (e.g., `bool` is `i8` in memory but `i1` in the ABI).
- **`loop_target`**: `(header, exit)` basic blocks for the current loop, used by `break`/`continue`.
- **`unwind_target`**: the landing pad for exception handling.
- **`return_buffer`** / **`return_target`**: for functions that need a single return point (e.g., when exception cleanup is involved).
`CodeGenContext` derefs to `ModuleContext`, which provides access to the LLVM `Context`, `Module`, target-specific integer types (`i32`, `i64`, `size_t`), and the type context for converting nac3 types to LLVM types.
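The "derefs to" relationship is the standard `Deref` pattern, sketched below with toy structs. The field names (`module_name`, `usize_bits`, `loop_depth`) and the `describe()` helper are hypothetical; only the shape of the relationship is taken from the text.

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical module-wide state shared by all functions in a module.
pub struct ToyModuleContext {
    pub module_name: String,
    pub usize_bits: u32,
}

/// Hypothetical per-function state that wraps the module-wide state.
pub struct ToyCodeGenContext {
    pub module: ToyModuleContext,
    pub loop_depth: usize,
}

impl Deref for ToyCodeGenContext {
    type Target = ToyModuleContext;
    fn deref(&self) -> &ToyModuleContext {
        &self.module
    }
}

impl DerefMut for ToyCodeGenContext {
    fn deref_mut(&mut self) -> &mut ToyModuleContext {
        &mut self.module
    }
}

/// A helper that only needs module-level information.
pub fn describe(module: &ToyModuleContext) -> String {
    format!("{} ({}-bit size_t)", module.module_name, module.usize_bits)
}
```

With this in place, `ctx.usize_bits` and `describe(&ctx)` both work on a `ToyCodeGenContext`, so helpers that only need module-level data do not have to take the full per-function context.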
## Expression and Statement Generation
Expression codegen (`codegen/expr.rs`) and statement codegen (`codegen/stmt.rs`) are the two largest files in the codebase. They follow the AST structure closely:
- `gen_expr()` dispatches on `ExprKind` and returns an `RtValue` (a pair of `Type` and an optional LLVM value).
- `gen_stmt()` dispatches on `StmtKind` and returns `()` (control flow is handled through the builder's current basic block).
Both are implemented as free functions that take a `&mut dyn CodeGenerator` and `&mut CodeGenContext`. The `CodeGenerator` trait methods delegate to these free functions by default, letting frontends override specific behaviors without duplicating the rest.
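The delegation pattern can be illustrated with a toy trait. This sketch uses a generic parameter on the free function rather than `&mut dyn`, which is a simplification; `ToyExpr`, `ToyCodeGenerator`, `DefaultGen`, and `CountingGen` are hypothetical names, and "code generation" here just produces a string.

```rust
/// Toy expression tree standing in for the typed AST.
pub enum ToyExpr {
    Const(i64),
    Add(Box<ToyExpr>, Box<ToyExpr>),
}

pub trait ToyCodeGenerator {
    /// Default behaviour: delegate to the shared free function.
    fn gen_expr(&mut self, expr: &ToyExpr) -> String
    where
        Self: Sized,
    {
        gen_expr(self, expr)
    }
}

/// Shared implementation. Recursion goes back through the trait method,
/// so an override also applies to nested expressions.
pub fn gen_expr<G: ToyCodeGenerator>(generator: &mut G, expr: &ToyExpr) -> String {
    match expr {
        ToyExpr::Const(v) => format!("const {v}"),
        ToyExpr::Add(lhs, rhs) => {
            let lhs = generator.gen_expr(lhs);
            let rhs = generator.gen_expr(rhs);
            format!("add({lhs}, {rhs})")
        }
    }
}

/// Uses every default method unchanged.
pub struct DefaultGen;
impl ToyCodeGenerator for DefaultGen {}

/// Overrides one method, adds its own behaviour, then reuses the default.
pub struct CountingGen {
    pub visited: usize,
}
impl ToyCodeGenerator for CountingGen {
    fn gen_expr(&mut self, expr: &ToyExpr) -> String {
        self.visited += 1;
        gen_expr(self, expr)
    }
}
```

`CountingGen` changes what happens at every expression node without copying the `match` over expression kinds, which is the point of routing the defaults through free functions.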
## Parallel Compilation
`WorkerRegistry` manages a pool of codegen worker threads. Each worker:
1. Receives `CodeGenTask` items from a shared channel.
2. Creates (or reuses) a `ModuleContext` with its own LLVM `Context`.
3. Calls `gen_func_impl()` to generate the function body.
4. When the function calls another function that needs a new instantiation, the worker calls `registry.add_task()` to queue it.
5. After finishing a task, writes the module bitcode to a buffer and signals completion.
The registry tracks outstanding tasks with a counter and a condvar. When all tasks are done, the main thread collects the per-worker LLVM bitcode buffers, links them into one module, and proceeds with optimization.
Workers are created with `WorkerRegistry::create_workers()`, which takes a `Vec<Box<G>>` of `CodeGenerator` instances (one per thread). This is where the frontend passes in its custom generator type.
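The queue-plus-counter scheme can be sketched with the standard library alone. This is a toy model, not `WorkerRegistry`: tasks are plain integers, `callees()` is a made-up dependency function, and `ToyRegistry` and `compile_all()` are hypothetical names. It does borrow the `Option`-wrapped task type from the real channel, with `None` used here as a shutdown signal (an assumption about its purpose).

```rust
use std::collections::HashSet;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Made-up dependency rule: "compiling" task `n` requires `n / 2` and `n / 3`.
fn callees(task: u32) -> Vec<u32> {
    if task == 0 { Vec::new() } else { vec![task / 2, task / 3] }
}

struct ToyRegistry {
    sender: Mutex<Sender<Option<u32>>>,
    receiver: Mutex<Receiver<Option<u32>>>,
    outstanding: Mutex<usize>,
    all_done: Condvar,
    seen: Mutex<HashSet<u32>>,
}

impl ToyRegistry {
    /// Queues a task unless it was already requested.
    fn add_task(&self, task: u32) {
        if !self.seen.lock().unwrap().insert(task) {
            return;
        }
        // Count the task before sending it, so the counter never reads
        // zero while work is still in flight.
        *self.outstanding.lock().unwrap() += 1;
        self.sender.lock().unwrap().send(Some(task)).unwrap();
    }

    fn finish_task(&self) {
        let mut n = self.outstanding.lock().unwrap();
        *n -= 1;
        if *n == 0 {
            self.all_done.notify_all();
        }
    }

    fn wait_all(&self) {
        let mut n = self.outstanding.lock().unwrap();
        while *n > 0 {
            n = self.all_done.wait(n).unwrap();
        }
    }
}

/// Runs the worklist to completion and returns the processed tasks, sorted.
pub fn compile_all(entry: u32, workers: usize) -> Vec<u32> {
    let workers = workers.max(1);
    let (tx, rx) = channel();
    let registry = Arc::new(ToyRegistry {
        sender: Mutex::new(tx),
        receiver: Mutex::new(rx),
        outstanding: Mutex::new(0),
        all_done: Condvar::new(),
        seen: Mutex::new(HashSet::new()),
    });
    let results = Arc::new(Mutex::new(Vec::new()));
    registry.add_task(entry);
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let registry = Arc::clone(&registry);
            let results = Arc::clone(&results);
            thread::spawn(move || loop {
                let msg = registry.receiver.lock().unwrap().recv().unwrap();
                let Some(task) = msg else { break };
                // New work is queued before this task is marked finished.
                for callee in callees(task) {
                    registry.add_task(callee);
                }
                results.lock().unwrap().push(task);
                registry.finish_task();
            })
        })
        .collect();
    registry.wait_all();
    for _ in 0..workers {
        registry.sender.lock().unwrap().send(None).unwrap();
    }
    for handle in handles {
        handle.join().unwrap();
    }
    let mut out = results.lock().unwrap().clone();
    out.sort();
    out
}
```

The ordering inside the worker loop matters: a task queues its callees before calling `finish_task()`, so the outstanding count cannot drop to zero while a dependency is still waiting to be discovered.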
## Type Layouts
The `codegen/types/` directory contains proxy types that map nac3 types to LLVM struct layouts. Each proxy type implements `ProxyType` and provides methods for accessing fields, creating instances, and generating related operations.
The important proxy types:
- `ListType`: a `{ptr, len}` struct. The pointer references a heap-allocated array of elements.
- `NDArrayType`: the representation of `numpy.ndarray`. Contains data pointer, number of dimensions, shape array, and strides. Broadcasting and indexing operations are in `codegen/types/ndarray/`.
- `StringType`: a `{ptr, len}` pair for UTF-8 data.
- `RangeType`: `{start, stop, step}` integers.
- `TupleType`: an LLVM struct with one field per element.
- `ExceptionType`: carries exception class ID, message, parameters, and source location fields.
- `OptionType`: a tagged union with a flag byte and optional value.
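For intuition, two of these layouts can be mirrored as `#[repr(C)]` Rust structs. This is only an analogy: the field widths (`usize` for the length, `i32` for the range bounds) and the names `ToyList` and `ToyRange` are assumptions, and the actual LLVM struct types are built by the proxy types at codegen time.

```rust
/// Rust-side picture of a `{ptr, len}` list layout.
#[repr(C)]
pub struct ToyList<T> {
    pub ptr: *mut T,
    pub len: usize,
}

/// Rust-side picture of a `{start, stop, step}` range layout.
#[repr(C)]
pub struct ToyRange {
    pub start: i32,
    pub stop: i32,
    pub step: i32,
}

impl ToyRange {
    /// Number of elements the range yields, following Python's `range`
    /// semantics. A zero step yields 0 here (Python raises instead).
    pub fn element_count(&self) -> u64 {
        // Widen to i64 so the arithmetic cannot overflow.
        let (start, stop, step) = (self.start as i64, self.stop as i64, self.step as i64);
        if step > 0 && start < stop {
            ((stop - start + step - 1) / step) as u64
        } else if step < 0 && start > stop {
            ((start - stop - step - 1) / -step) as u64
        } else {
            0
        }
    }
}
```

The length computation is the kind of operation that lives alongside a layout: it depends only on the struct's fields, so it can be generated once and reused wherever a range value appears.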
## Exception Handling
NAC3 uses LLVM's `landingpad`-based exception handling with a personality function. The personality symbol is set via `TopLevelContext::personality_symbol` (nac3artiq sets this to `__nac3_personality`).
The flow for a `try`/`except` block:
1. `gen_stmt` for `Try` sets `ctx.unwind_target` to a landing pad block.
2. Calls within the `try` body are emitted as `invoke` instructions targeting both a normal continuation and the landing pad.
3. The landing pad dispatches on exception class ID to the matching `except` clause.
4. `raise` compiles to a call to `__nac3_raise` followed by `unreachable`.
Each exception class is assigned a numeric ID via `SymbolResolver::get_exception_id()`, and the IRRT uses `SymbolResolver::get_string_id()` for exception name strings.
## IRRT Functions
When you need a runtime helper that is too complex for inline LLVM IR, add it to the IRRT (`nac3core/irrt/`). The C++ source is compiled to target-independent LLVM bitcode and linked into every compilation. See `irrt/irrt.cpp` and the submodule headers.
To call an IRRT function from Rust codegen, declare it in the appropriate `codegen/irrt/*.rs` module and call it through the LLVM builder. Functions that need to differ between 32-bit and 64-bit `size_t` use the `get_usize_dependent_function_name()` helper to select the right variant.
## Builtin Functions
Builtin functions (e.g., `int32()`, `len()`, `range()`, `np_zeros()`) are registered during `TopLevelComposer` initialization. The `PrimDef` enum in `toplevel/helper.rs` lists every builtin type and function.
Most builtins have their code generation in `codegen/builtin_fns.rs`. NumPy operations are in `codegen/numpy.rs`. The implementations receive `CodeGenContext` and the call arguments, and return the result as LLVM values.
When adding a new builtin:
1. Add a variant to `PrimDef`.
2. Register the type and signature in `make_primitives()`.
3. Write the codegen implementation.
4. If the builtin needs a `GenCall` callback (because it requires custom calling conventions), set `codegen_callback` on the `TopLevelDef::Function`.
Binary file not shown.


+148
@@ -0,0 +1,148 @@
# Developer Guide
Practical information for building, testing, debugging, and extending NAC3.
## Building
### With Nix
```
$ nix develop # enter the dev shell (bash)
$ nix develop --command zsh # or use your preferred shell
$ cargo build --release
```
The Nix flake provides LLVM 16, `clang-irrt`, `llvm-as-irrt`, and all other
dependencies.
### PGO Build
The flake includes a profile-guided optimization (PGO) build for nac3artiq. PGO recompiles LLVM itself using profiling data collected from a real ARTIQ compilation, which improves codegen throughput.
```
$ nix build .#nac3artiq-pgo -L
```
The PGO pipeline has three stages, all handled automatically by Nix:
1. **Instrumented build** (`nac3artiq-instrumented`): builds nac3artiq against an instrumented LLVM that records branch frequency data during execution.
2. **Profile collection** (`nac3artiq-profile`): runs the instrumented compiler on the `nac3devices` ARTIQ example to produce `llvm.profdata`.
3. **PGO build** (`nac3artiq-pgo`): rebuilds LLVM with the collected profile applied, then builds nac3artiq against this optimized LLVM.
The intermediate packages can also be built individually if needed (e.g., `nix build .#nac3artiq-profile` to just collect profile data).
### IRRT Build
The `nac3core` build script (`build.rs`) compiles the C++ runtime under `nac3core/irrt/` to LLVM bitcode. If you modify IRRT sources, `cargo` will automatically rebuild. To inspect the generated IR:
```
$ DEBUG_DUMP_IRRT=1 cargo build -p nac3core
```
This writes `irrt.ll` and `irrt-filtered.ll` to the build script's output directory (the `OUT_DIR` that Cargo sets for `build.rs`).
## Running nac3standalone
The standalone compiler expects a Python file with a `run()` entry point:
```
$ cargo run --release -p nac3standalone -- my_program.py
```
This produces `module.o`. Link it against your runtime stubs (e.g., the demo `output_*` functions) to get an executable.
Useful flags:
- `-O0` / `-O2` / `-O3`: optimization level
- `--emit-llvm-ir`: write `main.ll` for each compilation stage
- `--emit-llvm-bc`: write `main.bc` (bitcode)
- `-T 0`: use all available threads for compilation
### Running demos
The `nac3standalone/demo/` directory contains example programs and a helper
script that compiles, links, and runs them in one step. From the demo directory:
```
$ cd nac3standalone/demo
$ ./run_demo.sh -- src/demo_test.py
```
`run_demo.sh` does three things:
1. Compiles the Python source with `nac3standalone`, producing `module.o`.
2. Compiles `demo.c` (the C runtime stubs for `output_int32`, `output_bool`, etc.) with clang.
3. Links both object files (plus `liblinalg.a` for linear algebra demos) into an executable and runs it.
Options:
- `--debug`: use the debug build of nac3standalone instead of release.
- `-i686`: cross-compile to 32-bit x86 (uses `--triple i686-unknown-linux-gnu` and links against the 32-bit linalg stub).
- `--out OUTFILE`: redirect the program output to a file instead of stdout.
- Extra nac3standalone flags can be passed after `--`: e.g., `./run_demo.sh -- --emit-llvm-ir src/demo_test.py`.
### Checking demos
`check_demos.sh` runs every `src/*.py` demo through both the Python interpreter and the NAC3 compiler, then diffs the output:
```
$ cd nac3standalone/demo
$ ./check_demos.sh
```
This is the same check that the Nix build runs. Pass `-i686` to also verify 32-bit output. Individual demos can be checked with `check_demo.sh`:
```
$ ./check_demo.sh src/demo_test.py
```
## Running nac3artiq + runkernel locally
For testing ARTIQ kernels without hardware, use `runkernel`. It provides stub implementations of `now_mu`, `at_mu`, `delay_mu`, `rtio_output`, and a few other ARTIQ syscalls.
The workflow:
1. Compile your kernel. nac3artiq produces `module.elf` (and optionally `debug.elf`) when invoked through the ARTIQ `Core.run()` method. The demo under `nac3artiq/demo/` shows the minimal setup, including `min_artiq.py` (a self-contained ARTIQ-like environment) and `device_db.py`.
2. Run through runkernel:
```
$ cargo run --release -p runkernel -- module.elf
```
`runkernel` loads the ELF, looks up `__modinit__`, and executes it. RTIO calls print their arguments so you can trace the output timeline.
### Running the demo
```
$ cd nac3artiq/demo
$ python demo.py
```
This uses `min_artiq.py` to set up the compiler, compiles the demo kernels, and produces `module.elf`. You can then run it with `runkernel` as above.
## Testing
```
$ cargo test # all tests
$ cargo test -p nac3core # core tests only
$ cargo test -p nac3parser # parser tests only
```
## Extending the Compiler
### Adding a new type to codegen
The canonical pattern for adding type support in `codegen/types/`:
1. Create a new file (e.g., `codegen/types/mytype.rs`).
2. Define a struct that wraps the LLVM struct layout.
3. Implement `ProxyType` for it. This provides the interface for creating instances, accessing fields, and converting to/from LLVM values.
4. Register the type in `codegen/types/mod.rs`.
5. Add handling in `gen_expr` and `gen_stmt` where the type appears (attribute access, method calls, etc.).
### Adding a new builtin function
1. Add a variant to the `PrimDef` enum in `toplevel/helper.rs`.
2. In `make_primitives()` (same file), register the function's type signature with the `TopLevelComposer`.
3. If the function needs special type-checking logic (e.g., it accepts heterogeneous argument types, returns a type derived from its arguments, or cannot be expressed as a simple signature), add a branch to `try_fold_special_call()` in `typecheck/type_inferencer/mod.rs`. This is where builtins like `len()`, `virtual()`, and NumPy array constructors perform their custom type inference.
4. Implement code generation. For simple functions, add a branch in `codegen/builtin_fns.rs`. For NumPy functions, use `codegen/numpy.rs`.
5. If the function needs custom calling conventions (like RPC), create a `GenCall` callback and assign it to the `TopLevelDef::Function`'s `codegen_callback` field.
6. Register the function in the frontend's builtin registry (`DefaultBuiltinRegistry` or `ArtiqBuiltinRegistry`).
+14
@@ -0,0 +1,14 @@
# NAC3 Developer Documentation
NAC3 is a Python-to-machine-code compiler. It compiles a statically-typed subset of Python to LLVM IR, for use in
[ARTIQ](https://m-labs.hk/artiq). The compiler is written in Rust and uses
[inkwell](https://github.com/TheDan64/inkwell) as its LLVM binding.
This documentation is intended for developers working on NAC3 itself. For user-facing language documentation, see the
[ARTIQ manual](https://m-labs.hk/artiq/manual/).
## Contents
- [Architecture](architecture.md) - Crate layout, compilation pipeline, and how the pieces fit together.
- [Code Generation](codegen.md) - LLVM IR generation, the `CodeGenerator` trait, parallel compilation, IRRT, and type layouts.
- [Developer Guide](guide.md) - Building, running the demos, testing, running nac3artiq locally, and extending the compiler.
+2
@@ -64,6 +64,8 @@ enum ParallelMode {
Deep,
}
/// ARTIQ-specific code generator that extends the default with timeline manipulation,
/// `with parallel`/`with sequential` block handling, and RPC support.
pub struct ArtiqCodeGenerator<'a> {
name: String,
+5 -1
@@ -3,7 +3,11 @@ use nac3core::{
inkwell::{AtomicOrdering, values::BasicValueEnum},
};
/// Functions for manipulating the timeline.
/// Trait for emitting LLVM IR for ARTIQ timeline operations.
///
/// Different implementations target different hardware backends: `NowPinningTimeFns64`
/// directly reads/writes split 32-bit CSR registers on VexRiscv, while `ExternTimeFns`
/// calls external C functions (used for host-mode execution and `runkernel`).
pub trait TimeFns {
/// Emits LLVM IR for `now_mu`.
fn emit_now_mu<'ctx>(
+6
@@ -15,6 +15,11 @@ use crate::{
typecheck::typedef::{FunSignature, Type},
};
/// Trait for customizing LLVM IR generation.
///
/// The default implementations delegate to the free functions in `codegen::expr` and
/// `codegen::stmt`. Frontends override specific methods to change behavior -- for example,
/// `ArtiqCodeGenerator` overrides `gen_with()` to handle `with parallel` blocks.
pub trait CodeGenerator {
/// Return the module name for the code generator.
fn get_name(&self) -> &str;
@@ -221,6 +226,7 @@ pub trait CodeGenerator {
}
}
/// Default code generator with no frontend-specific behavior. Used by nac3standalone.
pub struct DefaultCodeGenerator {
name: String,
}
+7
@@ -326,6 +326,12 @@ impl WithCall {
}
}
/// Thread pool for parallel code generation.
///
/// Workers consume `CodeGenTask` items from a shared channel. Each worker has its own LLVM
/// `Context` and `CodeGenerator`. When a function call requires a new monomorphized instance,
/// the worker queues a new task. The main thread waits for all tasks to complete, then
/// collects the per-worker LLVM bitcode buffers for linking.
pub struct WorkerRegistry {
sender: Arc<Sender<Option<CodeGenTask>>>,
receiver: Arc<Receiver<Option<CodeGenTask>>>,
@@ -485,6 +491,7 @@ impl WorkerRegistry {
}
}
/// A unit of work for the codegen thread pool, representing one monomorphized function.
pub struct CodeGenTask {
pub subst: Vec<(Type, ConcreteType)>,
pub store: ConcreteTypeStore,
+5
@@ -349,6 +349,11 @@ impl<'ctx> ValueEnum<'ctx> {
}
}
/// Trait for resolving identifiers to types and values.
///
/// Frontends implement this trait to bridge their name resolution (e.g., Python runtime objects
/// in nac3artiq, or a simple hash map in nac3standalone) into the nac3core type system. The
/// resolver is consulted during type inference, type analysis, and code generation.
pub trait SymbolResolver {
/// Get type of type variable identifier or top-level function type,
fn get_symbol_type(
+9
@@ -34,6 +34,7 @@ use crate::{
/// for standalone mode. Use `DefaultBuiltinRegistry` when you need a simple
/// builtin registry without custom matching logic.
#[derive(Debug, Clone, Copy, Default)]
/// Name-based builtin registry used by nac3standalone. Matches builtins by string comparison.
pub struct DefaultBuiltinRegistry;
impl BuiltinRegistry for DefaultBuiltinRegistry {}
@@ -374,6 +375,14 @@ pub fn promote_expr_type(
}
pub type DefAst = (Arc<RwLock<TopLevelDef>>, Option<Stmt<()>>);
/// Orchestrates the registration and type analysis of all top-level definitions.
///
/// The typical usage is:
/// 1. Create with `TopLevelComposer::new()`.
/// 2. Call `register_top_level()` for each class and function definition.
/// 3. Call `start_analysis()` to run type inference and unification on all definitions.
/// 4. Call `make_top_level_context()` to produce a `TopLevelContext` for code generation.
pub struct TopLevelComposer {
// list of top level definitions, same as top level context
pub definition_ast_list: Vec<DefAst>,
+16
@@ -28,6 +28,7 @@ pub mod numpy;
mod test;
pub mod type_annotation;
/// Index of a top-level definition (class, function, or module) in the global definition list.
#[derive(PartialEq, Eq, PartialOrd, Ord, Clone, Copy, Hash, Debug)]
pub struct DefinitionId(pub usize);
@@ -40,6 +41,10 @@ type GenCallCallback = dyn for<'ctx, 'a> Fn(
+ Send
+ Sync;
/// A callback that overrides code generation for a specific function.
///
/// Used by frontends to implement custom calling conventions (e.g., RPC serialization in
/// nac3artiq) instead of generating a normal function call.
pub struct GenCall {
fp: Box<GenCallCallback>,
}
@@ -74,6 +79,8 @@ impl Debug for GenCall {
}
}
/// A monomorphized instance of a generic function, containing the typed AST body and the
/// type variable substitutions for this particular instantiation.
#[derive(Clone, Debug)]
pub struct FunInstance {
pub body: Arc<Vec<Stmt<Option<Type>>>>,
@@ -87,6 +94,11 @@ pub enum FunAttribute {
StaticMethod,
}
/// A top-level definition: module, class, or function.
///
/// Definitions are stored in a global list and referenced by [`DefinitionId`]. During type
/// analysis, fields and method signatures are populated. During code generation, function
/// instances are created on demand as generic functions are called with concrete type arguments.
#[derive(Debug, Clone)]
pub enum TopLevelDef {
Module {
@@ -167,6 +179,10 @@ pub enum TopLevelDef {
},
}
/// Global compilation context shared across all codegen workers.
///
/// Contains the full list of top-level definitions, per-module unifiers, and the builtin
/// registry. Created by `TopLevelComposer::make_top_level_context()` after type analysis.
pub struct TopLevelContext {
pub definitions: Arc<RwLock<Vec<Arc<RwLock<TopLevelDef>>>>>,
pub unifiers: Arc<RwLock<Vec<(SharedUnifier, PrimitiveStore)>>>,
+9 -1
@@ -229,7 +229,10 @@ impl AttrKind {
}
}
/// Category of variable and value types.
/// The concrete representation of a type, stored in the unification table.
///
/// `Type` handles are lightweight keys; to inspect what a type actually is, look it up
/// through `Unifier::get_ty()` to obtain a `TypeEnum`.
#[derive(Debug, Clone)]
pub enum TypeEnum {
TRigidVar {
@@ -314,6 +317,11 @@ impl TypeEnum {
pub type SharedUnifier = Arc<Mutex<(UnificationTable<TypeEnum>, u32, Vec<Call>)>>;
/// Type unification engine based on union-find.
///
/// Manages type constraints during inference and resolves type variables to concrete types.
/// Each module gets its own `Unifier` during analysis; during code generation, workers receive
/// a snapshot.
#[derive(Clone)]
pub struct Unifier {
pub(crate) top_level: Option<Arc<TopLevelContext>>,