Mirror of https://github.com/fergalmoran/ladybird.git (synced 2026-01-01 22:29:13 +00:00)
In occasional cases the accumulator could overflow and produce an incorrect result. It would be nice to use a smaller accumulator, but that does not appear to be correct. :^(

We now cast to i16 so that 128-bit vectorization can make use of one whole register instead of having to split the loop up. This results in about a 5% reduction in performance in my testing.
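
To illustrate the kind of overflow described here, the following is a minimal sketch and not the code touched by this commit. It assumes the loop sums absolute values of byte-sized signed inputs; the helper name `sum_of_abs` and the choice of `uint16_t` versus `uint32_t` accumulators are hypothetical and were picked only to make the wraparound visible. Each element is widened to 16 bits before use, which is roughly what "cast to i16" suggests, though whether a compiler vectorizes this particular loop is not guaranteed.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper (not from the Ladybird source): sums the absolute
// values of signed bytes into an accumulator of the requested unsigned type.
template<typename Accumulator>
Accumulator sum_of_abs(std::vector<int8_t> const& values)
{
    Accumulator sum = 0;
    for (auto value : values) {
        // Widen each element to 16 bits before taking the absolute value,
        // so that abs(-128) == 128 is representable.
        auto widened = static_cast<int16_t>(value);
        // The addition happens in a wider type; the cast back makes the
        // (well-defined) unsigned wraparound explicit for narrow accumulators.
        sum = static_cast<Accumulator>(sum + std::abs(static_cast<int>(widened)));
    }
    return sum;
}
```

With 1000 inputs of value 127, the true sum is 127000. A `uint32_t` accumulator holds that value, while a `uint16_t` accumulator wraps modulo 65536 and yields 61464, which is the sort of silently incorrect result a too-small accumulator can produce.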
110 KiB