ndstrides: Allow multi-file IRRT + Exceptions in IRRT #508

Merged
sb10q merged 9 commits from ndstrides-enhance-irrt into ndstrides 2024-08-27 22:50:15 +08:00
Collaborator
  • IRRT is now a multi-file source tree.
  • IRRT now has mechanisms to throw exceptions, either in 32-bits or 64-bits. The location details are set to be within the C++ source code itself using CPP magic macros. The exception IDs are initialized at link time.
  • IRRT can do debug assertions if compiled with -DIRRT_DEBUG_ASSERT in nac3core/build.rs. Debug assertions throw exceptions.
  • Add .clang-format for reformatting.
- IRRT is now a multi-file source tree. - IRRT now has mechanisms to throw exceptions, either in 32-bits or 64-bits. The location details are set to be within the C++ source code itself using CPP magic macros. The exception IDs are initialized at link time. - IRRT can do debug assertions if compiled with `-DIRRT_DEBUG_ASSERT` in nac3core/build.rs. Debug assertions throw exceptions. - Add `.clang-format` for reformatting.
lyken added 8 commits 2024-08-26 14:34:37 +08:00
lyken requested review from derppening 2024-08-26 14:34:40 +08:00
derppening requested changes 2024-08-26 14:56:11 +08:00
Dismissed
.clang-format Outdated
@ -0,0 +1,3 @@
BasedOnStyle: Microsoft
Collaborator

Why Microsoft style? Wouldn't this introduce inconsistencies with demo.c where opening braces are on the same line as the statement?

Why Microsoft style? Wouldn't this introduce inconsistencies with `demo.c` where opening braces are on the same line as the statement?
Author
Collaborator

I picked Microsoft arbitrarily. Could we reformat demo.c to be Microsoft style too?

I picked Microsoft arbitrarily. Could we reformat `demo.c` to be Microsoft style too?
Collaborator

I personally use LLVM, as Microsoft's opening brace style makes the source code unnecessarily bloated.

I personally use LLVM, as Microsoft's opening brace style makes the source code unnecessarily bloated.
Author
Collaborator

Sure I can reformat everything to LLVM, should take little work. Later commits too.

Sure I can reformat everything to LLVM, should take little work. Later commits too.
derppening marked this conversation as resolved
@ -33,6 +33,7 @@ use inkwell::{
OptimizationLevel,
};
use itertools::Itertools;
use nac3core::codegen::irrt::setup_irrt_exceptions;
Collaborator

Fold this into use nac3core::codegen::{...}.

Fold this into `use nac3core::codegen::{...}`.
derppening marked this conversation as resolved
@ -0,0 +1,4 @@
#include <irrt/exception.hpp>
#include <irrt/int_types.hpp>
#include <irrt/math_util.hpp>
#include <irrt/original.hpp>
Collaborator

Why is this necessary? Why not have multiple IRRT C++ files and link the LLVM bitcode at build-time?

Why is this necessary? Why not have multiple IRRT C++ files and link the LLVM bitcode at build-time?
Author
Collaborator

Sure, but I didn't want to deal with having to managing temporary files. I can implement that now.

Sure, but I didn't want to deal with having to managing temporary files. I can implement that now.
Author
Collaborator

Update: I think implementing it would make the process a bit too complicated, at least with the solution I came up with.

  • Write a Makefile
  • Write the multi-file compilation process in that Makefile.
  • Have build.rs do make and pass in all the necessary configurations as environment variables.
  • The Makefile finds all *.cpp files recursively and compile each of them, and somehow save them temporarily in a directory.
  • The Makefile does linking and output the aggregate.
  • build.rs reads the aggregate and filter the IR with regex.
  • Output the filtrate to irrt.o.
Update: I think implementing it would make the process a bit too complicated, at least with the solution I came up with. - Write a Makefile - Write the multi-file compilation process in that Makefile. - Have `build.rs` do `make` and pass in all the necessary configurations as environment variables. - The Makefile finds all `*.cpp` files recursively and compile each of them, and somehow save them temporarily in a directory. - The Makefile does linking and output the aggregate. - `build.rs` reads the aggregate and filter the IR with regex. - Output the filtrate to `irrt.o`.
Collaborator

If you have to do this ugly Makefile trick then let's just keep it the way it is for now.

If you have to do this ugly Makefile trick then let's just keep it the way it is for now.
derppening marked this conversation as resolved
@ -0,0 +4,4 @@
template <typename SizeT> struct CSlice
{
uint8_t *base;
Collaborator

* on the left, i.e. uint8_t*.

`*` on the left, i.e. `uint8_t*`.
Author
Collaborator

Forced by Microsoft

Forced by Microsoft
Collaborator

Still not reformatted.

Still not reformatted.
derppening marked this conversation as resolved
@ -0,0 +7,4 @@
/**
* @brief Implementation of `strlen()`.
*/
uint32_t length(const char *str)
Collaborator

Remove this and use __builtin_strlen instead.

Remove this and use `__builtin_strlen` instead.
@ -0,0 +10,4 @@
#define raise_debug_assert(SizeT, msg, param1, param2, param3) \
raise_exception(SizeT, EXN_ASSERTION_ERROR, "IRRT debug assert failed: " msg, param1, param2, param3);
#define debug_assert_eq(SizeT, lhs, rhs) \
Collaborator

Why not just

#ifdef IRRT_DEBUG_ASSERT
#define debug_assert_eq ((void) 0)
#else
#define do {                                                                        \
        if ((lhs) != (rhs))  {                                                      \
            raise_debug_assert(SizeT, "LHS = {0}. RHS = {1}", lhs, rhs, NO_PARAM);  \
        }                                                                           \
    } while (false)
#endif
Why not just ```cpp #ifdef IRRT_DEBUG_ASSERT #define debug_assert_eq ((void) 0) #else #define do { \ if ((lhs) != (rhs)) { \ raise_debug_assert(SizeT, "LHS = {0}. RHS = {1}", lhs, rhs, NO_PARAM); \ } \ } while (false) #endif ```
Author
Collaborator

IRRT_DEBUG_ASSERT_BOOL itself will be used to write stuff like:

if (IRRT_DEBUG_ASSERT_BOOL) {
  // Code that only runs if assertion is on.
}

I could place a bunch of #ifdef IRRT_DEBUG_ASSERT blocks but I find it ugly, though I am not sure what modern C++ people do nowadays.

`IRRT_DEBUG_ASSERT_BOOL` itself will be used to write stuff like: ```c++ if (IRRT_DEBUG_ASSERT_BOOL) { // Code that only runs if assertion is on. } ``` I could place a bunch of `#ifdef IRRT_DEBUG_ASSERT` blocks but I find it ugly, though I am not sure what modern C++ people do nowadays.
Collaborator

You can keep the _BOOL variable.

I would say I would do it as

if constexpr (IRRT_DEBUG_ASSERT_BOOL) {
    if ((lhs) != (rhs)) {
        raise_debug_assert(SizeT, "LHS = {0}. RHS = {1}", lhs, rhs, NO_PARAM);
    }
}

But I suggested the original version as it most resembles how assert in C is implemented in the <assert.h> header.

You can keep the `_BOOL` variable. I would say I would do it as ```cpp if constexpr (IRRT_DEBUG_ASSERT_BOOL) { if ((lhs) != (rhs)) { raise_debug_assert(SizeT, "LHS = {0}. RHS = {1}", lhs, rhs, NO_PARAM); } } ``` But I suggested the original version as it most resembles how `assert` in C is implemented in the `<assert.h>` header.
derppening marked this conversation as resolved
@ -0,0 +45,4 @@
int64_t params[3];
};
const int64_t NO_PARAM = 0;
Collaborator

constexpr

`constexpr`
Author
Collaborator

I would like to understand why.

I am not too familiar with constexpr and why people put them everywhere.

I would like to understand why. I am not too familiar with `constexpr` and why people put them everywhere.
Collaborator

constexpr indicates that this is to be a compile-time constant, and implies inline. This ensures that the variable does not escape this compilation unit and prevents ODR violations. AFAIK these variables also do not show up in the compiled ELF file.

https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#con5-use-constexpr-for-values-that-can-be-computed-at-compile-time

`constexpr` indicates that this is to be a compile-time constant, and implies `inline`. This ensures that the variable does not escape this compilation unit and prevents ODR violations. AFAIK these variables also do not show up in the compiled ELF file. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#con5-use-constexpr-for-values-that-can-be-computed-at-compile-time
@ -0,0 +53,4 @@
{
Exception<SizeT> e = {
.id = id,
.filename = {.base = (uint8_t *)filename, .len = (int32_t)cstr::length(filename)},
Collaborator

Add spaces between casts (or preferably, use reinterpret_cast).

Add spaces between casts (or preferably, use `reinterpret_cast`).
Author
Collaborator

Microsoft again.

I am not using reinterpret_cast because it is convenient. Do they make a difference? or should I use reinterpret_cast for consistency.

Microsoft again. I am not using `reinterpret_cast` because it is convenient. Do they make a difference? or should I use `reinterpret_cast` for consistency.
Collaborator

They are functionally the same, but you should use modern C++ regardless.

https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#es49-if-you-must-use-a-cast-use-a-named-cast

They are functionally the same, but you should use modern C++ regardless. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#es49-if-you-must-use-a-cast-use-a-named-cast
Author
Collaborator

Ah okay

Ah okay
derppening marked this conversation as resolved
@ -0,0 +57,4 @@
.line = line,
.column = 0,
.function = {.base = (uint8_t *)function, .len = (int32_t)cstr::length(function)},
.msg = {.base = (uint8_t *)msg, .len = (int32_t)cstr::length(msg)},
Collaborator

Same here.

Same here.
Author
Collaborator

Microsoft.

Microsoft.
derppening marked this conversation as resolved
@ -0,0 +62,4 @@
e.params[0] = param0;
e.params[1] = param1;
e.params[2] = param2;
__nac3_raise((void *)&e);
Collaborator

Same here.

Same here.
derppening marked this conversation as resolved
@ -0,0 +1,372 @@
#pragma once
Collaborator

Probably best to split original.hpp into separate files, since you're already doing that.

Probably best to split `original.hpp` into separate files, since you're already doing that.
derppening marked this conversation as resolved
@ -931,1 +931,4 @@
}
/// Initialize all global `EXN_*` exception IDs in IRRT with the [`SymbolResolver`].
pub fn setup_irrt_exceptions<'ctx>(
Collaborator

Couldn't you just set the exceptions while loading the IRRT? Since IRRT depends on the exception specifications anyways.

Couldn't you just set the exceptions while loading the IRRT? Since IRRT depends on the exception specifications anyways.
derppening marked this conversation as resolved
@ -14,6 +14,7 @@ use inkwell::{
memory_buffer::MemoryBuffer, passes::PassBuilderOptions, support::is_multithreaded, targets::*,
OptimizationLevel,
};
use nac3core::codegen::irrt::setup_irrt_exceptions;
Collaborator

Fold this into use nac3core::{codegen::{...}, ...}

Fold this into `use nac3core::{codegen::{...}, ...}`
derppening marked this conversation as resolved
derppening reviewed 2024-08-26 15:04:02 +08:00
@ -0,0 +52,4 @@
int64_t param0, int64_t param1, int64_t param2)
{
Exception<SizeT> e = {
.id = id,
Collaborator

Designated initializer syntax requires C++20. Either use the standard syntax or add -std=c++20 to build.rs.

Designated initializer syntax requires C++20. Either use the standard syntax or add `-std=c++20` to `build.rs`.
derppening marked this conversation as resolved
Author
Collaborator

Revised. Squashing commits when everything is ok.

Revised. Squashing commits when everything is ok.
lyken requested review from derppening 2024-08-26 16:34:53 +08:00
derppening requested changes 2024-08-26 17:34:56 +08:00
Dismissed
derppening left a comment
Collaborator

The positions of pointers * and references & are still incorrect, and I still see many uses of C-style casts. Please change those before submitting for re-review.

The positions of pointers `*` and references `&` are still incorrect, and I still see many uses of C-style casts. Please change those before submitting for re-review.
@ -0,0 +11,4 @@
raise_exception(SizeT, EXN_ASSERTION_ERROR, \
"IRRT debug assert failed: " msg, param1, param2, param3);
#define debug_assert_eq(SizeT, lhs, rhs) if constexpr (IRRT_DEBUG_ASSERT_BOOL)
Collaborator

You still need the trailing \ for newlines, no?

You still need the trailing `\` for newlines, no?
Collaborator

This still needs fixing AFAICT.

This still needs fixing AFAICT.
derppening marked this conversation as resolved
Collaborator

Do you want me to help you do a quick rebase pass to fix all the formatting issues instead?

Do you want me to help you do a quick rebase pass to fix all the formatting issues instead?
Author
Collaborator

I am fully relying on .clang-format to reformat, and it really does not like T*. Many of the formatting issues come from me just doing

$ find nac3core/irrt -name '*.hpp' -o -name '*.cpp' | xargs clang-format -i

blindly.

Perhaps we should settle on new .clang-format instead? Like using yours. Since I will have to keep reformatting my future PRs, and doing it manually is definitely not a sane option.

I am fully relying on `.clang-format` to reformat, and it really does not like `T*`. Many of the formatting issues come from me just doing ```console $ find nac3core/irrt -name '*.hpp' -o -name '*.cpp' | xargs clang-format -i ``` blindly. Perhaps we should settle on new `.clang-format` instead? Like using yours. Since I will have to keep reformatting my future PRs, and doing it manually is definitely not a sane option.
Author
Collaborator

Also I can't find any C-style casts in nac3core/irrt, or at least with rg '\*\)' nac3core/irrt and vim.

Also I can't find any C-style casts in `nac3core/irrt`, or at least with `rg '\*\)' nac3core/irrt` and vim.
Author
Collaborator

...unless you are talking about nacstandalone/

...unless you are talking about `nacstandalone/`
Author
Collaborator

Switched to your .clang-format and reformatted everything. How see?

Switched to your .clang-format and reformatted everything. How see?
lyken force-pushed ndstrides-enhance-irrt from ff1088b429 to 9ac4c73797 2024-08-27 10:01:31 +08:00 Compare
Collaborator

Switched to your .clang-format and reformatted everything. How see?

Much better, thanks.

> Switched to your .clang-format and reformatted everything. How see? Much better, thanks.
Collaborator

Thanks. You could squash the final few commits and then we can merge it.

Thanks. You could squash the final few commits and then we can merge it.
lyken force-pushed ndstrides-enhance-irrt from b326532099 to 048950b7f0 2024-08-27 10:40:03 +08:00 Compare
Author
Collaborator

Rebased to clean up the branch

Rebased to clean up the branch
derppening approved these changes 2024-08-27 14:36:25 +08:00
sb10q merged commit 048950b7f0 into ndstrides 2024-08-27 22:50:14 +08:00
sb10q deleted branch ndstrides-enhance-irrt 2024-08-27 22:50:15 +08:00
Sign in to join this conversation.
No reviewers
No Milestone
No Assignees
2 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: M-Labs/nac3#508
No description provided.