IntegerUtilities has a `RoundingRule` type whose cases are a superset of `FloatingPointRoundingRule`; many of the additional rules are also useful when working with floating-point. This change provides a `rounding(_:)` function, similar to the standard library's `rounded(_:)`, that makes all of them available. It has a different name rather than shadowing `rounded(_:)` because otherwise existing use sites like `rounded(.down)` would become ambiguous.
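As a rough illustration of what the extra to-nearest rules mean for floating-point, here is a minimal sketch built only on the standard library's `rounded(_:)`. The names `TieBreak` and `roundedToNearest(_:ties:)` are hypothetical and exist only for this example; this is not how `rounding(_:)` is implemented.

```swift
// Hypothetical names for illustration; not the IntegerUtilities API.
enum TieBreak { case up, down, towardZero, awayFromZero }

func roundedToNearest(_ x: Double, ties: TieBreak) -> Double {
  // x - trunc(x) is computed exactly, so the tie test below
  // has no false positives from intermediate rounding.
  let truncated = x.rounded(.towardZero)
  let fraction = x - truncated
  guard fraction.magnitude == 0.5 else {
    // Not a tie: every to-nearest rule gives the same answer.
    return x.rounded(.toNearestOrEven)
  }
  // Exactly halfway between two integers: apply the tie-breaking rule.
  switch ties {
  case .up:           return x.rounded(.up)
  case .down:         return x.rounded(.down)
  case .towardZero:   return truncated
  case .awayFromZero: return x.rounded(.awayFromZero)
  }
}
```

For example, `roundedToNearest(-2.5, ties: .up)` gives `-2`, while `roundedToNearest(-2.5, ties: .down)` gives `-3`; the standard library offers neither of these tie-breaking behaviors directly.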
These were omitted in the first pass over integer rounding rules, but are generally useful and make good sense to have available. In particular, toNearestOrUp is the natural mode for a lot of fixed-point arithmetic, and frequently has hardware support on SIMD units, so it should have a name. Once that exists, having nearestOrDown also makes sense, and then nearestOrZero should exist by analogy to nearestOrAway. Also beefed up testing and refactored the division rounding code somewhat.
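To show why round-to-nearest with ties up is so common in fixed-point code, here is a small sketch of a rounding right shift. The function name `shiftedRoundingNearestOrUp` is hypothetical and is not part of IntegerUtilities; the sketch assumes a shift count between 1 and 31.

```swift
// Hypothetical helper for illustration; not the IntegerUtilities API.
// Computes x / 2^count rounded to nearest, with ties rounded up.
func shiftedRoundingNearestOrUp(_ x: Int32, rightBy count: Int32) -> Int32 {
  precondition(count >= 1 && count < 32, "sketch only handles 1...31")
  // An arithmetic right shift rounds toward negative infinity.
  let floorQuotient = x >> count
  // The highest bit shifted out is 1 exactly when the discarded
  // fraction is at least one half, for either sign of x.
  let halfBit = (x >> (count - 1)) & 1
  return floorQuotient + halfBit
}
```

The whole operation is two shifts, a mask, and an add, with no branch on the sign of `x`. Adding the half bit after the shift, instead of adding `1 << (count - 1)` before it, also avoids overflow near `Int32.max`.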
These are used only in TestSupport, but it's still nice to get them right. Signed types had a long-standing bug in how overflow was computed for multiplication, and masking shifts would do the wrong thing when the bit width was not a power of two and the shift count was negative. I also added implementations of &*, &+, and &-.
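The masking-shift pitfall can be sketched as follows, assuming the intended semantics are to reduce the shift count modulo the bit width into the range `0..<bitWidth`. The helper `maskedShiftCount` is hypothetical; the actual test-support types may be written differently.

```swift
// Hypothetical helper for illustration only.
func maskedShiftCount(_ count: Int, bitWidth: Int) -> Int {
  precondition(bitWidth > 0)
  // `count & (bitWidth - 1)` is only a valid mask when bitWidth is a
  // power of two, so use a remainder instead. Swift's % truncates
  // toward zero, so a negative count gives a negative remainder
  // that has to be folded back into 0..<bitWidth.
  let remainder = count % bitWidth
  return remainder < 0 ? remainder + bitWidth : remainder
}
```

With a 12-bit type, a shift count of `-1` reduces to `11`, whereas a plain `count % 12` would leave it at `-1`.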
Replaces the rescaling algorithm for Complex division with one inspired by Doug Priest's "Efficient Scaling for Complex Division," with some further tweaks to:
- allow it to work for arbitrary FloatingPoint types, including Float16.
- get exactly the same rounding behavior as the un-rescaled path, so that z/w = tz/tw when tz and tw are computed exactly.
- allow future optimizations to hoist a rescaled reciprocal for more speedups.
Unlike Priest, we do not try to avoid spurious overflow in the final computation when the result is very near the overflow boundary but cancellation brings us just inside it. We do not believe that this is a good tradeoff, as complex multiplication overflows in exactly the same way. We will investigate providing opt-in API to avoid this overflow case in a future PR.
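The idea behind rescaling can be sketched with a deliberately simplified example. This is neither Priest's algorithm nor the implementation in this change: `Cplx`, `divideUnscaled`, and `divideRescaled` are hypothetical names, and the sketch simply scales both operands by the same power of two so that the divisor's larger component lands in `[1, 2)`. As long as nothing underflows, scaling by a power of two is exact, so the rescaled quotient rounds exactly as the unscaled one does whenever the unscaled path does not overflow or underflow.

```swift
// Hypothetical minimal type; not the library's Complex type.
struct Cplx { var re: Double; var im: Double }

// Textbook formula: z / w = z * conj(w) / |w|^2.
func divideUnscaled(_ z: Cplx, _ w: Cplx) -> Cplx {
  let d = w.re * w.re + w.im * w.im
  return Cplx(re: (z.re * w.re + z.im * w.im) / d,
              im: (z.im * w.re - z.re * w.im) / d)
}

func divideRescaled(_ z: Cplx, _ w: Cplx) -> Cplx {
  let m = max(w.re.magnitude, w.im.magnitude)
  // Zero, subnormal, infinite, or NaN divisors are out of scope here.
  guard m.isNormal else { return divideUnscaled(z, w) }
  // t is a power of two that brings m into [1, 2).
  let t = Double(sign: .plus, exponent: -m.exponent, significand: 1)
  return divideUnscaled(Cplx(re: z.re * t, im: z.im * t),
                        Cplx(re: w.re * t, im: w.im * t))
}
```

Dividing `1e200 + 1e200i` by itself overflows in the unscaled formula and produces NaN, while the rescaled version returns exactly `1 + 0i`.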
Improved how tolerance is applied close to infinity for complex division
(this still could use a more principled approach, but at least we're in
the right ballpark now). Tightened up the tolerance for semi-exhaustive
testing too, since that can be done now.
Use `PROJECT_SOURCE_DIR` to allow embedding within larger projects. `CMAKE_SOURCE_DIR` is the root of the top-level source tree, whereas `PROJECT_SOURCE_DIR` is the source directory of the current project.