LibWasm: Fix SIMD shuffle and swizzle

`swizzle` had the wrong operands, and the boolean logic for vector masking in the internal `shuffle_or_0` implementation was incorrect. `shuffle` was previously implemented as a dynamic swizzle, whereas the spec defines it with an immediate operand for its lane indices.
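
As a rough illustration of the two fixes described above, the sketch below models one lane at a time in scalar C++. It is not the AK implementation: `old_in_bounds`, `new_in_bounds`, `swizzle_reference` and `shuffle_reference` are hypothetical names, and `N = 16` assumes an i8x16 vector. It shows why the old mask expression could never flag an out-of-bounds lane, and how a dynamic swizzle (indices taken from a second vector) differs from a shuffle (indices given as immediates that select from two input vectors).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-ins for the per-lane vector compares; not AK code.
constexpr int N = 16; // assumed lane count (i8x16)

// Old expression, (control > 0) | (control < N): true for every integer,
// so no lane was ever treated as out of bounds.
constexpr bool old_in_bounds(int control) { return (control > 0) || (control < N); }

// Fixed expression, (control >= 0) & (control < N): true only for 0..N-1.
constexpr bool new_in_bounds(int control) { return (control >= 0) && (control < N); }

static_assert(old_in_bounds(-1) && old_in_bounds(20), "old predicate accepts out-of-range lanes");
static_assert(!new_in_bounds(-1) && !new_in_bounds(16), "new predicate rejects out-of-range lanes");
static_assert(new_in_bounds(0) && new_in_bounds(15), "new predicate accepts 0..N-1");

using Lanes = std::array<uint8_t, 16>;

// Swizzle: indices come from a runtime vector; an out-of-range index yields 0.
inline Lanes swizzle_reference(Lanes const& a, Lanes const& indices)
{
    Lanes result {};
    for (size_t i = 0; i < result.size(); ++i)
        result[i] = indices[i] < 16 ? a[indices[i]] : 0;
    return result;
}

// Shuffle: indices are immediates in 0..31 that select from a (0..15) or b (16..31).
// Assumes the immediates were already validated to be below 32.
inline Lanes shuffle_reference(Lanes const& a, Lanes const& b, Lanes const& immediates)
{
    Lanes result {};
    for (size_t i = 0; i < result.size(); ++i) {
        uint8_t index = immediates[i];
        assert(index < 32);
        result[i] = index < 16 ? a[index] : b[index - 16];
    }
    return result;
}
```

With the old expression, the complement of the mask was zero in every lane, so the `control |= ~mask` step left out-of-bounds indices unmarked.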
2025-12-22 09:19:03 +00:00 · 2024-07-23 10:08:55 -07:00
parent d841742c35
commit 9cc3e7d32d
3 changed files with 15 additions and 9 deletions
--- a/AK/SIMDExtras.h
+++ b/AK/SIMDExtras.h
@@ -220,13 +220,14 @@ ALWAYS_INLINE static T shuffle_or_0_impl(T a, Control control, IndexSequence<Idx
    if constexpr (__has_builtin(__builtin_shuffle)) {
        // GCC does a very bad job at optimizing the masking, while not recognizing the shuffle idiom
        // So we jinx its __builtin_shuffle to work with out of bounds indices
+        // TODO: verify that this masking logic is correct (for machines with __builtin_shuffle)
        auto mask = (control >= 0) | (control < N);
        return __builtin_shuffle(a, control & mask) & ~mask;
    }
    // 1. Set all out of bounds values to ~0
    // Note: This is done so that  the optimization mentioned down below works
    // Note: Vector compares result in bitmasks, aka all 1s or all 0s per element
-    control |= ~((control > 0) | (control < N));
+    control |= ~((control >= 0) & (control < N));
    // 2. Selectively set out of bounds values to 0
    // Note: Clang successfully optimizes this to a few instructions on x86-ssse3, GCC does not
    //       Vector Optimizations/Instruction-Selection on ArmV8 seem to not be as powerful as of Clang18