Implement string equality operator using IRRT and optimise LLVM implementation #561

Merged
sb10q merged 9 commits from ramtej/nac3:feature/string-equality into master 2024-12-30 13:02:09 +08:00
Collaborator
  1. Implement IRRT function for string equality operator
  2. Refactor LLVM string equality implementation

Closes #421

1. Implement IRRT function for string equality operator 2. Refactor LLVM string equality implementation Closes #421
ramtej added 1 commit 2024-12-12 01:24:00 +08:00
sb10q reviewed 2024-12-12 09:20:32 +08:00
@ -28,3 +34,3 @@
str_ne()
return 0
return 0
Owner

Again I don't see the point of these test changes.

Again I don't see the point of these test changes.
sb10q reviewed 2024-12-12 09:24:41 +08:00
@ -0,0 +4,4 @@
namespace {
template<typename SizeT>
int32_t __nac3_str_eq_impl(const char* str1, SizeT len1, const char* str2, SizeT len2) {
if (str1 == str2) return 1;
Owner

I think this case of pointer equality is rare enough that it does not warrant its special handling and corresponding IR bloat. LLVM is already sluggish enough processing all that IR.

It's also potentially incorrect since you could call the function with the same pointer and different lengths and it would return true.

So remove this line.

I think this case of pointer equality is rare enough that it does not warrant its special handling and corresponding IR bloat. LLVM is already sluggish enough processing all that IR. It's also potentially incorrect since you could call the function with the same pointer and different lengths and it would return true. So remove this line.
ramtej marked this conversation as resolved
sb10q reviewed 2024-12-12 09:25:55 +08:00
@ -0,0 +9,4 @@
for (SizeT i = 0; i < len1; ++i) {
if (static_cast<unsigned char>(str1[i]) != static_cast<unsigned char>(str2[i])) {
return 0;
}
Owner

Isn't that just memcmp but implemented inefficiently?

Isn't that just memcmp but implemented inefficiently?
Collaborator

LLVM has an intrinsic __builtin_memcmp for this exact purpose: https://clang.llvm.org/docs/LanguageExtensions.html#string-builtins

LLVM has an intrinsic `__builtin_memcmp` for this exact purpose: https://clang.llvm.org/docs/LanguageExtensions.html#string-builtins
Collaborator

In fact, LLVM has __builtin_strncmp. If that's the case, the entire IRRT can just be implemented in terms of that intrinsic function.

In fact, LLVM has `__builtin_strncmp`. If that's the case, the entire IRRT can just be implemented in terms of that intrinsic function.
ramtej marked this conversation as resolved
sb10q reviewed 2024-12-12 09:28:06 +08:00
@ -0,0 +16,4 @@
} // namespace
extern "C" {
int32_t nac3_str_eq(const char* str1, uint64_t len1, const char* str2, uint64_t len2) {
Owner

uint64_t-only length looks highly suspicious to me. I thought we had cleaned up all this size_t business already. @derppening

uint64_t-only length looks highly suspicious to me. I thought we had cleaned up all this size_t business already. @derppening
Collaborator

str is implemented as struct { i8*, usize } where usize is size_t. You will need two C functions for this, one with 32-bit and one with 64-bit lengths.

`str` is implemented as `struct { i8*, usize }` where `usize` is `size_t`. You will need two C functions for this, one with 32-bit and one with 64-bit lengths.
ramtej marked this conversation as resolved
ramtej added 2 commits 2024-12-12 18:14:15 +08:00
ramtej added 1 commit 2024-12-12 18:29:34 +08:00
ramtej force-pushed feature/string-equality from 8b3515294c to 9b0d37b1f0 2024-12-12 18:36:36 +08:00 Compare
sb10q reviewed 2024-12-13 08:49:12 +08:00
@ -0,0 +8,4 @@
if (len1 != len2){
return 0;
}
return (__builtin_strncmp(str1, str2, static_cast<SizeT>(len1)) == 0) ? 1 : 0;
Owner

Is IRRT really required at this point?

Is IRRT really required at this point?
Collaborator

Yes, __builtin_strncmp isn't exposed in LLVM IR so we will need clang to emit the equivalent bitcode for us.

Yes, `__builtin_strncmp` isn't exposed in LLVM IR so we will need `clang` to emit the equivalent bitcode for us.
Owner

Isn't clang also just a LLVM user?
How does it do it?

Isn't clang also just a LLVM user? How does it do it?
Owner

Also, NAC3 is just using regular Pascal-style strings, isn't it?
If so, then the correct function to use is memcmp. strncmp is intrinsically slower as it adds a check for NUL characters which is unnecessary and may even lead to incorrect behavior.

Also, NAC3 is just using regular Pascal-style strings, isn't it? If so, then the correct function to use is ``memcmp``. ``strncmp`` is intrinsically slower as it adds a check for NUL characters which is unnecessary and may even lead to incorrect behavior.
Collaborator

Just want to pitch my 2 cents - String equality operator (#421) is already implemented in a8e92212. I think this PR should focus on refactoring this operator to implement this using IRRT and either use strncmp/memcmp for comparison (if ARTIQ's libc has these functions already available), or handroll an implementation of these functions.

Just want to pitch my 2 cents - String equality operator (#421) is already implemented in a8e92212. I think this PR should focus on refactoring this operator to implement this using IRRT and either use `strncmp`/`memcmp` for comparison (if ARTIQ's libc has these functions already available), or handroll an implementation of these functions.
Owner

I don't think IRRT is necessary if it's just one comparison (for length) + the memcmp LLVM intrinsic (which may get turned into a libc call)?

Calling IRRT alone appears to be 47 lines of code in that PR, so it should be done with caution.

I don't think IRRT is necessary if it's just one comparison (for length) + the memcmp LLVM intrinsic (which may get turned into a libc call)? Calling IRRT alone appears to be 47 lines of code in that PR, so it should be done with caution.
Collaborator

I don't think IRRT is necessary if it's just one comparison (for length) + the memcmp LLVM intrinsic (which may get turned into a libc call)?

__builtin_memcmp is a Clang builtin function, not an LLVM intrinsic. AFAIK we can only call Clang builtins in C/C++ code, so IRRT is definitely necessary if we want to use that.

> I don't think IRRT is necessary if it's just one comparison (for length) + the memcmp LLVM intrinsic (which may get turned into a libc call)? `__builtin_memcmp` is a Clang builtin function, not an LLVM intrinsic. AFAIK we can only call Clang builtins in C/C++ code, so IRRT is definitely necessary if we want to use that.
ramtej added 1 commit 2024-12-13 23:45:28 +08:00
sb10q reviewed 2024-12-15 20:54:13 +08:00
@ -2086,1 +2084,4 @@
let lhs_ptr = ctx.build_in_bounds_gep_and_load(
plhs,
&[llvm_i32.const_zero(), llvm_i32.const_zero()],
Owner

i32?

i32?
Author
Collaborator

Yes, according to https://releases.llvm.org/14.0.0/docs/LangRef.html#getelementptr-instruction:

"The type of each index argument depends on the type it is indexing into. When indexing into a (optionally packed) structure, only i32 integer constants are allowed (when using a vector of indices they must all be the same i32 integer constant). When indexing into an array, pointer or vector, integers of any width are allowed, and they are not required to be constant. These integers are treated as signed values where relevant."

Yes, according to https://releases.llvm.org/14.0.0/docs/LangRef.html#getelementptr-instruction: "The type of each index argument depends on the type it is indexing into. When indexing into a (optionally packed) structure, only i32 integer constants are allowed (when using a vector of indices they must all be the same i32 integer constant). When indexing into an array, pointer or vector, integers of any width are allowed, and they are not required to be constant. These integers are treated as signed values where relevant."
Author
Collaborator

so since we are using string as a structure, i have to use i32. i tried using llvm_usize, but i kept running into segmentation error because of this GEP format

so since we are using string as a structure, i have to use i32. i tried using llvm_usize, but i kept running into segmentation error because of this GEP format
Collaborator

What about llvm_usize for the first gep and i32 for the second?

What about `llvm_usize` for the first `gep` and `i32` for the second?
Author
Collaborator

yea, that worked too, would it be better for me to use that implementation then?

yea, that worked too, would it be better for me to use that implementation then?
Collaborator

Yes, please use that implementation.

Yes, please use that implementation.
sb10q reviewed 2024-12-15 20:54:33 +08:00
@ -0,0 +14,4 @@
) -> IntValue<'ctx> {
let func = ctx.module.get_function("nac3_str_eq").unwrap_or_else(|| {
ctx.module.add_function(
"nac3_str_eq",
Owner

And when is nac3_str_eq64 called?

And when is ``nac3_str_eq64`` called?
ramtej marked this conversation as resolved
ramtej added 1 commit 2024-12-16 17:50:33 +08:00
ramtej added 1 commit 2024-12-16 22:36:35 +08:00
ramtej added 2 commits 2024-12-30 12:21:52 +08:00
sb10q merged commit 49a7469b4a into master 2024-12-30 13:02:09 +08:00
Sign in to join this conversation.
No reviewers
No Milestone
No Assignees
3 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: M-Labs/nac3#561
No description provided.