Implement core_log and rtio_log to ARTIQ #488

Merged
sb10q merged 8 commits from feature/core-rtio-log into master 2024-08-13 15:19:03 +08:00
3 changed files with 27 additions and 10 deletions
Showing only changes of commit 1ccfad9c53

View File

@@ -8,12 +8,15 @@ if [ -z "$1" ]; then
fi
declare -a nac3args
while [ $# -ge 2 ]; do
while [ $# -gt 1 ]; do
case "$1" in
--help)
echo "Usage: check_demo.sh [-i686] -- demo [NAC3ARGS...]"
echo "Usage: check_demo.sh [--debug] [-i686] -- [NAC3ARGS...] demo"
exit
;;
--debug)
debug=1
;;
-i686)
i686=1
;;
@@ -22,18 +25,18 @@ while [ $# -ge 2 ]; do
break
;;
*)
break
echo "Unrecognized argument \"$1\""
exit 1
;;
esac
shift
done
demo="$1"
shift
while [ $# -gt 1 ]; do
nac3args+=("$1")
shift
done
demo="$1"
echo "### Checking $demo..."
@@ -43,12 +46,20 @@ echo ">>>>>> Running $demo with the Python interpreter"
if [ -n "$i686" ]; then
echo "...... Trying NAC3's 32-bit code generator output"
./run_demo.sh -i686 --out run_32.log "${nac3args[@]}" "$demo"
if [ -n "$debug" ]; then
./run_demo.sh --debug -i686 --out run_32.log -- "${nac3args[@]}" "$demo"
else
./run_demo.sh -i686 --out run_32.log -- "${nac3args[@]}" "$demo"
fi
diff -Nau interpreted.log run_32.log
fi
echo "...... Trying NAC3's 64-bit code generator output"
./run_demo.sh --out run_64.log "${nac3args[@]}" "$demo"
if [ -n "$debug" ]; then
./run_demo.sh --debug --out run_64.log -- "${nac3args[@]}" "$demo"
else
./run_demo.sh --out run_64.log -- "${nac3args[@]}" "$demo"
fi
diff -Nau interpreted.log run_64.log
echo "...... OK"
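The argument order in the new usage string (`check_demo.sh [--debug] [-i686] -- [NAC3ARGS...] demo`) can be sketched as a standalone function. This is a minimal sketch, not the script's actual code: the function name `parse_check_args` is hypothetical, and the handling of `--` (shift, then break) is assumed from the visible context lines.

```bash
#!/usr/bin/env bash
# Hypothetical helper mirroring the usage string:
#   check_demo.sh [--debug] [-i686] -- [NAC3ARGS...] demo
# Sets the globals debug, i686, nac3args and demo.
parse_check_args() {
    debug=""; i686=""; demo=""; nac3args=()
    # Options are consumed until "--"; the last argument is never an option.
    while [ $# -gt 1 ]; do
        case "$1" in
            --debug) debug=1 ;;
            -i686) i686=1 ;;
            --) shift; break ;;   # assumed: drop the separator and stop
            *) echo "Unrecognized argument \"$1\"" >&2; return 1 ;;
        esac
        shift
    done
    # Everything except the final argument is forwarded to NAC3.
    while [ $# -gt 1 ]; do
        nac3args+=("$1")
        shift
    done
    # The final argument is the demo file.
    demo="$1"
}
```

With this ordering, anything that is not a known option before `--` is rejected instead of being silently treated as the start of the NAC3 arguments.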

View File

@@ -2,6 +2,11 @@
set -e
if [ "$1" == "--help" ]; then
echo "Usage: check_demos.sh [CHECKARGS...] [--] [NAC3ARGS...]"
exit
fi
count=0
for demo in src/*.py; do
./check_demo.sh "$@" "$demo"

View File

@@ -14,7 +14,7 @@ declare -a nac3args
while [ $# -ge 1 ]; do
case "$1" in
--help)
echo "Usage: run_demo.sh [--help] [--out OUTFILE] [--debug] [-i686] -- [NAC3ARGS...]"
echo "Usage: run_demo.sh [--help] [--out OUTFILE] [--debug] [-i686] -- [NAC3ARGS...] demo"
exit
;;
--out)
@@ -32,7 +32,8 @@ while [ $# -ge 1 ]; do
break
;;
*)
break
echo "Unrecognized argument \"$1\""
exit 1
;;
esac
shift
@@ -59,7 +60,7 @@ if [ -z "$i686" ]; then
clang -c -std=gnu11 -Wall -Wextra -O3 -o demo.o demo.c
clang -o demo module.o demo.o $DEMO_LINALG_STUB -lm -Wl,--no-warn-search-mismatch
else
$nac3standalone --triple i686-unknown-linux-gnu "${nac3args[@]}"
$nac3standalone --triple i686-unknown-linux-gnu --target-features +sse2 "${nac3args[@]}"

@abdul124 I thought you said this sse2 option was no longer required when using i686 instead of i386?


SSE2 should be enabled by default on the rustc i686 target: https://github.com/rust-lang/rust/issues/82435#issuecomment-783939789


Okay, but that comment is about Rust, not LLVM. nac3 is only using the latter here.


LLVM does interpret "i686" as implying SSE2 for the most part (Debian's LLVM requires a patch, though). This commit describes the situation for SSE2 support on the LLVM i686 target: https://github.com/google/boringssl/commit/56d3ad9d23bc130aa9404bfdd1957fe81b3ba498

So the present change should be reverted, i.e. there is no need to add --target-features +sse2?


Yes, since the feature is already implied by the i686 target, --target-features +sse2 is redundant.

I don't think any of that is true. BoringSSL requires SSE2 on i686 targets because (from the commit message):

> As far as I know, all our supported 32-bit x86 consumers require SSE2.

So it should be seen as BoringSSL's own project requirement rather than a behavior of the compiler.

In fact, the quoted issue in LLVM's repo (https://github.com/llvm/llvm-project/issues/61347) says that while SSE2 support is implied by i686, it is ultimately treated as an i386 target:

> -m32 on Clang is i386 too.

When compiling without any flags, all of the f-prefixed instructions below are from the X87 ISA (List of X87 Instructions: https://www2.math.uni-wuppertal.de/~fpf/Uebungen/GdR-SS02/opcode_f.html).

$ ../../target/debug/nac3standalone --triple i686-unknown-linux-gnu src/mandelbrot.py
$ llvm-objdump -Cd -M intel --no-show-raw-insn module.o
[...]
      14:       fldz
      16:       fld     dword ptr [ebx]
      1c:       fstp    qword ptr [esp + 32]
      20:       fst     qword ptr [esp + 16]
[...]
      30:       fstp    st(1)
      32:       mov     dword ptr [esp], 4294967295
      39:       fstp    qword ptr [esp + 8]
      3d:       call    0x3e <run+0x3e>
      42:       fld     qword ptr [esp + 24]
      46:       fld     qword ptr [esp + 8]
      4a:       faddp   st(1), st
[...]
      56:       fst     qword ptr [esp + 24]
      5a:       fmul    qword ptr [ebx]
      60:       fdiv    dword ptr [ebx]
      66:       fadd    qword ptr [ebx]
      6c:       fstp    qword ptr [esp + 8]
      70:       fld     qword ptr [esp + 16]
      74:       xor     edi, edi
      76:       fldz
[...]

With --target-features +sse2, all the above X87 instructions are replaced with their SSE2 counterparts (note the use of xmm registers):

$ ../../target/debug/nac3standalone --triple i686-unknown-linux-gnu --target-features +sse2 src/mandelbrot.py
$ llvm-objdump -Cd -M intel --no-show-raw-insn module.o
[...]
      14:       xorpd   xmm0, xmm0
      18:       movsd   xmm1, qword ptr [ebx]   # xmm1 = mem[0],zero
      20:       movsd   qword ptr [esp + 32], xmm1
      26:       movsd   xmm1, qword ptr [ebx]   # xmm1 = mem[0],zero
      2e:       movsd   qword ptr [esp + 24], xmm1
      34:       movsd   xmm1, qword ptr [ebx]   # xmm1 = mem[0],zero
      3c:       movsd   qword ptr [esp + 16], xmm1
      42:       movsd   xmm1, qword ptr [ebx]   # xmm1 = mem[0],zero
      4a:       movsd   qword ptr [esp + 72], xmm1
      50:       movsd   xmm1, qword ptr [ebx]   # xmm1 = mem[0],zero
      58:       movsd   qword ptr [esp + 64], xmm1
      5e:       movsd   xmm1, qword ptr [ebx]   # xmm1 = mem[0],zero
      66:       movsd   qword ptr [esp + 56], xmm1
      6c:       movsd   xmm1, qword ptr [ebx]   # xmm1 = mem[0],zero
      74:       movsd   qword ptr [esp + 48], xmm1
      7a:       movsd   xmm1, qword ptr [ebx]   # xmm1 = mem[0],zero
      82:       movsd   qword ptr [esp + 8], xmm1
[...]
      9c:       movsd   xmm0, qword ptr [esp + 40] # xmm0 = mem[0],zero
      a2:       addsd   xmm0, qword ptr [esp + 8]
[...]
      b2:       movsd   qword ptr [esp + 40], xmm0
      b8:       movapd  xmm7, xmm0
      bc:       mulsd   xmm7, qword ptr [esp + 32]
      c2:       divsd   xmm7, qword ptr [esp + 24]
      c8:       addsd   xmm7, qword ptr [esp + 16]
      ce:       xorpd   xmm0, xmm0
      d2:       xor     edi, edi
      d4:       movsd   qword ptr [esp + 80], xmm7
[...]

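This also explains why src/mandelbrot.py fails its test when --target-features +sse2 is removed, presumably because the X87 code path keeps intermediates in 80-bit extended precision, so its floating-point output can differ from the interpreter's.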
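
The comparison between the two disassembly listings above can be automated with a small filter. This is a rough sketch under stated assumptions: `count_fp_isa` is a hypothetical helper, not part of nac3, and it relies on a simple heuristic (a mnemonic starting with `f` counts as X87, any other instruction touching an `xmm` register counts as SSE) applied to `llvm-objdump` output produced with `--no-show-raw-insn`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads llvm-objdump disassembly text on stdin and
# reports how many instructions look like X87 vs. SSE floating-point code.
count_fp_isa() {
    awk '
        # Instruction lines look like "  14:  fldz", so field 1 is the
        # hex address followed by a colon and field 2 is the mnemonic.
        $1 ~ /^[0-9a-f]+:$/ {
            if ($2 ~ /^f/) x87++            # heuristic: f-prefixed mnemonic
            else if ($0 ~ /xmm/) sse++      # heuristic: uses an xmm register
        }
        END { printf "x87=%d sse=%d\n", x87, sse }
    '
}
```

It could be used as `llvm-objdump -Cd -M intel --no-show-raw-insn module.o | count_fp_isa`; a non-zero X87 count on the i686 build would indicate that SSE2 code generation is not in effect.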

clang -m32 -c -std=gnu11 -Wall -Wextra -O3 -msse2 -o demo.o demo.c
clang -m32 -o demo module.o demo.o $DEMO_LINALG_STUB32 -lm -Wl,--no-warn-search-mismatch
fi