Implement string equality operator using IRRT and optimise LLVM implementation #561
No reviewers
Labels
No Milestone
No Assignees
3 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: M-Labs/nac3#561
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "ramtej/nac3:feature/string-equality"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Closes #421
@ -28,3 +34,3 @@
str_ne()
return 0
return 0
Again I don't see the point of these test changes.
@ -0,0 +4,4 @@
namespace {
template<typename SizeT>
int32_t __nac3_str_eq_impl(const char* str1, SizeT len1, const char* str2, SizeT len2) {
if (str1 == str2) return 1;
I think this case of pointer equality is rare enough that it does not warrant its special handling and corresponding IR bloat. LLVM is already sluggish enough processing all that IR.
It's also potentially incorrect since you could call the function with the same pointer and different lengths and it would return true.
So remove this line.
@ -0,0 +9,4 @@
for (SizeT i = 0; i < len1; ++i) {
if (static_cast<unsigned char>(str1[i]) != static_cast<unsigned char>(str2[i])) {
return 0;
}
Isn't that just memcmp but implemented inefficiently?
LLVM has an intrinsic
__builtin_memcmp
for this exact purpose: https://clang.llvm.org/docs/LanguageExtensions.html#string-builtinsIn fact, LLVM has
__builtin_strncmp
. If that's the case, the entire IRRT can just be implemented in terms of that intrinsic function.@ -0,0 +16,4 @@
} // namespace
extern "C" {
int32_t nac3_str_eq(const char* str1, uint64_t len1, const char* str2, uint64_t len2) {
uint64_t-only length looks highly suspicious to me. I thought we had cleaned up all this size_t business already. @derppening
str
is implemented asstruct { i8*, usize }
whereusize
issize_t
. You will need two C functions for this, one with 32-bit and one with 64-bit lengths.8b3515294c
to9b0d37b1f0
@ -0,0 +8,4 @@
if (len1 != len2){
return 0;
}
return (__builtin_strncmp(str1, str2, static_cast<SizeT>(len1)) == 0) ? 1 : 0;
Is IRRT really required at this point?
Yes,
__builtin_strncmp
isn't exposed in LLVM IR so we will needclang
to emit the equivalent bitcode for us.Isn't clang also just a LLVM user?
How does it do it?
Also, NAC3 is just using regular Pascal-style strings, isn't it?
If so, then the correct function to use is
memcmp
.strncmp
is intrinsically slower as it adds a check for NUL characters which is unnecessary and may even lead to incorrect behavior.Just want to pitch my 2 cents - String equality operator (#421) is already implemented in
a8e92212
. I think this PR should focus on refactoring this operator to implement this using IRRT and either usestrncmp
/memcmp
for comparison (if ARTIQ's libc has these functions already available), or handroll an implementation of these functions.I don't think IRRT is necessary if it's just one comparison (for length) + the memcmp LLVM intrinsic (which may get turned into a libc call)?
Calling IRRT alone appears to be 47 lines of code in that PR, so it should be done with caution.
__builtin_memcmp
is a Clang builtin function, not an LLVM intrinsic. AFAIK we can only call Clang builtins in C/C++ code, so IRRT is definitely necessary if we want to use that.@ -2086,1 +2084,4 @@
let lhs_ptr = ctx.build_in_bounds_gep_and_load(
plhs,
&[llvm_i32.const_zero(), llvm_i32.const_zero()],
i32?
Yes, according to https://releases.llvm.org/14.0.0/docs/LangRef.html#getelementptr-instruction:
"The type of each index argument depends on the type it is indexing into. When indexing into a (optionally packed) structure, only i32 integer constants are allowed (when using a vector of indices they must all be the same i32 integer constant). When indexing into an array, pointer or vector, integers of any width are allowed, and they are not required to be constant. These integers are treated as signed values where relevant."
so since we are using string as a structure, i have to use i32. i tried using llvm_usize, but i kept running into segmentation error because of this GEP format
What about
llvm_usize
for the firstgep
andi32
for the second?yea, that worked too, would it be better for me to use that implementation then?
Yes, please use that implementation.
@ -0,0 +14,4 @@
) -> IntValue<'ctx> {
let func = ctx.module.get_function("nac3_str_eq").unwrap_or_else(|| {
ctx.module.add_function(
"nac3_str_eq",
And when is
nac3_str_eq64
called?