* Fix Redwood tree height overgrowth when EaR and tenant page split are enabled, by removing the buildNewSubtree() logic.
* Fix the incorrect page upper bound for the last page created by writePages() once the buildNewSubtree() logic is removed.
* Enable tenant page split when the encryption mode is domain-aware encryption.
* Related test fixes:
  - In simulation, pass the encryption mode to storage/Redwood via knobs. This is a workaround to enable testing with Redwood encryption until the encryption mode is correctly passed via db config. Also temporarily disable tenant page split for restart tests.
  - Disable raw access in the FuzzApiCorrectness test if domain-aware encryption is enabled, to avoid test timeouts.
  - Disable encryption for the DrUpgradeRestart test, which is likely to fail due to a rare EKP deadlock issue blocking recovery. Will re-enable after the deadlock issue is fixed.

The simulator tracks only active processes. Rebooted or killed processes
are removed from the list of processes, and are only added back once the
process has rebooted and started up again. This causes a problem for the
`RebootProcessAndSwitch` kill type, which wants to simultaneously reboot
all machines in a cluster and change their cluster file. If a machine is
in the middle of a reboot when the signal is sent, it will miss the
`RebootProcessAndSwitch` command.

The fix is to add a check when a process is being started in simulation.
If the process has had its cluster file changed and the cluster is in a
state where all processes should have had their cluster files reverted
to the original value, the simulator will now send a
`RebootProcessAndSwitch` signal right when the process is started. This
will cause an extra reboot, but should correctly switch the process back
to its original, correct cluster file, allowing all clusters to fully
recover.

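The start-up check described above can be sketched roughly as follows. This
is a minimal, simplified model and not the actual simulator code: the
`SimProcess` and `Simulator` types, the `clusterFileSwitched` and
`clusterFilesReverted` flags, and the `onProcessStart` hook are all
hypothetical names chosen for illustration. Only the
`RebootProcessAndSwitch` kill type comes from this change.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified model of the simulator state; the real types differ.
enum class KillType { Reboot, RebootProcessAndSwitch };

struct SimProcess {
    std::string name;
    bool clusterFileSwitched = false; // assumed flag: process is not on its original cluster file
    std::vector<KillType> received;   // signals delivered to this process, in order
};

struct Simulator {
    // Assumed flag: set once every process is expected to be back on its
    // original cluster file.
    bool clusterFilesReverted = false;

    // Called when a process starts up and rejoins the list of active processes.
    // Returns true if a catch-up RebootProcessAndSwitch signal was sent.
    bool onProcessStart(SimProcess& p) {
        if (p.clusterFileSwitched && clusterFilesReverted) {
            // The process missed the cluster-wide signal while it was down:
            // deliver it now, at the cost of one extra reboot.
            p.received.push_back(KillType::RebootProcessAndSwitch);
            p.clusterFileSwitched = false; // model the switch back to the original file
            return true;
        }
        return false;
    }
};
```

In this sketch, a process that starts while its cluster file is still
legitimately switched is left alone. The extra signal is sent only once the
cluster-wide revert is expected to have happened.
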
Note that the above issue should only affect simulation, due to how the
simulator tracks processes and handles kill signals.

This commit also adds a field to each process struct indicating whether
the process is running in a DR cluster in the simulation run. This is
needed because simulation does not differentiate between processes in
different clusters (other than by IP): some processes need to switch
clusters, while others simply need to be rebooted.
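
As an illustration of why a per-process flag helps, the sketch below selects
the processes of one cluster without inspecting IP addresses. The
`ProcessRecord` struct, the `inDrCluster` field name, and the `selectCluster`
helper are hypothetical and stand in for the real simulator types. Which
group is switched and which is only rebooted is left to the caller.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical process record; only the idea of a DR-cluster flag mirrors
// the field described above.
struct ProcessRecord {
    std::string address;
    bool inDrCluster = false; // assumed name for the new per-process field
};

// Return pointers to the processes that belong to the requested cluster
// (dr == true selects the DR cluster, dr == false the other one).
std::vector<const ProcessRecord*> selectCluster(const std::vector<ProcessRecord>& all, bool dr) {
    std::vector<const ProcessRecord*> out;
    for (const auto& p : all) {
        if (p.inDrCluster == dr) {
            out.push_back(&p);
        }
    }
    return out;
}
```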