Whenever the floating-point values are in integer range, we can use the
various FCVT functions with static rounding mode to perform fast
rounding. I took this opportunity to clean up the architecture
differentiation for these functions, which allows us to use the software
rounding implementation for all extreme and unimplemented cases,
including AArch64.
Also adds more round & trunc tests.