This is used by memset() so we get a lot of mileage out of optimizing
this instruction.
Note that we currently audit every individual byte accessed separately.
This could be greatly improved by adding a range auditing mechanism to
MallocTracer.
To make SoftMMU::find_region() O(1), this patch invests 3MiB into a
lookup table where we track each possible page base address and map
them to the SoftMMU::Region corresponding to that address.
This is another large improvement to general emulation performance. :^)
We don't want the next_address pointer losing its alignment somehow.
This whole thing should be replaced at some point, since UE hosted
programs won't be able to run forever with this allocation strategy.
m32int is a 32-bit integer stored in memory, and should not be mistaken
for a floating point number. :^)
Also add missing handling of 64-bit FPU register operands to some of
the RM64 instructions.
There are some destruction order races that can cause hangs while
shutting down UE. Since there's no particular value right now in
destroying the Emulator object properly, just avoid destruction and
add a FIXME about looking into it later.
Instead of doing an O(n) scan over all the mallocations whenever we're
doing a read/write audit, UE now keeps track of ChunkedBlocks and their
chunks. Both the block lookup and the chunk lookup is O(1).
We know what ChunkedBlocks look like via mallocdefs.h from LibC.
Note that the old linear scan is still in use for big mallocations,
but the vast majority of mallocations are chunked, so this helps a lot.
This makes malloc auditing significantly faster! :^)
These instructions now operate on the specified FPU stack entry instead
of always using ST(0) and ST(1).
FUCOMI and FUCOMIP also handle NaN values slightly better.
Instead of always showing the preceding mallocation, prefer showing the
following one *if* it's closer to the audited address.
This makes it easier to find bugs where the access is just before an
allocation instead of just after it.
Start fleshing out basic support for floating-point instructions in the
UserspaceEmulator CPU.
This is all work done by @nico for #3576. I'm just merging it all in
this patch since it's a decent foundation to continue working on. :^)
When a mallocation is shrunk/grown without moving, UE needs to update
its precise metadata about the mallocation, since it tracks *exactly*
how many bytes were allocated, not just the malloc chunk size.
Problem:
- `constexpr` functions are decorated with the `inline` specifier
keyword. This is redundant because `constexpr` functions are
implicitly `inline`.
- [dcl.constexpr], §7.1.5/2 in the C++11 standard): "constexpr
functions and constexpr constructors are implicitly inline (7.1.2)".
Solution:
- Remove the redundant `inline` keyword.
* AK: Add formatter for JsonValue.
* Inspector: Use new format functions.
* Profiler: Use new format functions.
* UserspaceEmulator: Use new format functions.
In the future all (normal) output should be written by any of the
following functions:
out (currently called new_out)
outln
dbg (currently called new_dbg)
dbgln
warn (currently called new_warn)
warnln
However, there are still a ton of uses of the old out/warn/dbg in the
code base so the new functions are called new_out/new_warn/new_dbg. I am
going to rename them as soon as all the other usages are gone (this
might take a while.)
I also added raw_out/raw_dbg/raw_warn which don't do any escaping,
this should be useful if no formatting is required and if the input
contains tons of curly braces. (I am not entirely sure if this function
will stay, but I am adding it for now.)
Don't require clients to templatize modrm().read{8,16,32,64}() with
the ValueWithShadow type when we can figure it out automatically.
The main complication here is that ValueWithShadow is a UE concept
while the MemoryOrRegisterReference inlines exist at the lower LibX86
layer and so doesn't have direct access to those types. But that's
nothing we can't solve with some simple template trickery. :^)
m_cached_code_end points at the first invalid byte, so we need to
update the cache if the last byte we want to read points at the
end or past it. Previously we updated the cache 1 byte prematurely in
read16, read32, read64 (but not in read8).
Noticed by reading the code (the code looked different from read8() and
the other 3). I didn't find anything that actually hit this case.