mirror of https://github.com/apple/foundationdb.git synced 2025-05-31 18:19:35 +08:00

214 Commits

Author SHA1 Message Date
Meng Xu
c16d76745d FastRestore:small fix compilation error 2020-09-22 13:36:26 -07:00
Meng Xu
17ece3d477 FastRestore:Fix FastRestoreApplierTransactionRateControl events 2020-09-22 13:25:43 -07:00
Meng Xu
b4254473d7 FastRestore:Add transaction rate info tracer 2020-09-22 09:10:11 -07:00
Meng Xu
002b1bec4c FastRestore:Control write traffic at each applier
Controller assigns each applier a write rate.
Applier keeps the write-rate worth of transactions outstanding to DB.

This avoids heavily overloading the DB while still keeping enough
traffic to the DB to get good write throughput.
2020-09-22 08:14:28 -07:00
Meng Xu
ce92f1a224 FR:Init StagingKey when created
The key field was used in various places, such as figuring out the conflict key range.
We should not leave it empty
2020-09-09 16:12:32 -07:00
Meng Xu
5c5abd7afa FastRestoreApplier:Calculate conflict range in applyStagingKeysBatch 2020-09-09 15:00:25 -07:00
Meng Xu
2febbe74ce FastRestoreApplier:Fix conflict range inverted due to invalid memory access
Report error on loader and applier if any error other than error_code_operation_cancelled ever happens
2020-09-09 14:40:21 -07:00
Meng Xu
f10e9ea679 FastRestoreApplier:Add write conflict range 2020-09-09 12:12:14 -07:00
Meng Xu
d8e73fddb6 FastRestore:Cancel actors when restore request finishes 2020-08-25 14:46:26 -07:00
Meng Xu
778daf20c0 FastRestore:Fix incorrect assert 2020-08-24 19:59:56 -07:00
Meng Xu
422784d545 FastRestore:Send reply before assert fail 2020-08-19 11:25:08 -07:00
Meng Xu
d9ea14ea6c FastRestore:fix:loader can receive reply from vb that has been processed and deleted 2020-08-19 10:39:49 -07:00
Meng Xu
22a2fac689 FastRestore:Fix segmentation fault when previous duplicate request is sent too late
This seg fault was not caught by the simulation test;
it is easily reproduced only in the circus test.

Add an ASSERT to check if the scenario happens in simulation
2020-08-19 09:15:18 -07:00
Meng Xu
9b2f667bbe FastRestore:Fix uninitialized variable 2020-08-18 11:58:57 -07:00
Meng Xu
046260b9d7 FastRestore:Applier:Assert to ensure batchData will not be used after deleted 2020-08-17 22:42:41 -07:00
Meng Xu
4a0315483b FastRestore:Safeguard when request of earlier vb may be sent after the vb has finished 2020-08-17 22:20:54 -07:00
Meng Xu
7b7490efe7 FastRestore:Debug trace for seg fault 2020-08-17 20:34:33 -07:00
Meng Xu
97e49f2f70 Resolve throttling events 2020-08-10 22:01:12 -07:00
Meng Xu
f071d81ad0 Report warning on FastRestoreApplierClearRangeMutationsStart if delayTime is too large 2020-08-03 14:08:31 -07:00
Meng Xu
f36d5aa180 FR:Applier received bytes per batch 2020-07-31 17:48:55 -07:00
Meng Xu
47c35a7a69 FastRestore:Add stats to ApplierBatchData 2020-07-31 14:59:45 -07:00
Meng Xu
37c3bd8615 FastRestore:Ensure FASTRESTORE_NOT_WRITE_DB only works in non-simulation mode 2020-07-30 20:27:54 -07:00
Meng Xu
8cace30bb2 FastRestore:change TXN_BATCH_MAX_BYTES default to 1KB from 1MB 2020-07-30 16:43:35 -07:00
Meng Xu
d16db8e733 FastRestore:Fix segmentation fault 2020-07-30 12:10:32 -07:00
Meng Xu
d71361245b FastRestore:Short cut DB for get and clear range 2020-07-30 11:17:05 -07:00
Meng Xu
ad915e462e Add knob FASTRESTORE_NOT_WRITE_DB to skip writing to DB 2020-07-30 10:17:17 -07:00
Meng Xu
efb61bcac0 Rename knob to FASTRESTORE_TXN_EXTRA_DELAY 2020-06-29 21:16:30 -07:00
Meng Xu
97e26d8eb0 FastRestore:Count appliedBytes 2020-06-29 10:18:18 -07:00
Meng Xu
82dfb5ce3f FastRestore:Update process metrics for restore master 2020-06-28 12:37:04 -07:00
Meng Xu
bc98c84346 RestoreLoader release data early and revert Lower priority for RestoreApplierReceiveMutations actor
A quick evaluation shows that lowering priority for the receive mutation actor does not help restore speed but hurts it.
2020-06-28 11:12:06 -07:00
Meng Xu
ca7beb5a26 Fix compilation 2020-06-27 15:21:06 -07:00
Meng Xu
e57dba00bd FastRestore:Lower priority for RestoreApplierReceiveMutations actor 2020-06-27 15:16:38 -07:00
Meng Xu
78c45c1200 Knob for txn delay and add back FlowLock to control txn concurrency 2020-06-27 10:13:34 -07:00
Meng Xu
ecd2d8b239 FastRestore:Add counters for applier and disable FlowLock on applyStagingKeysBatch 2020-06-27 00:20:54 -07:00
Meng Xu
5860a5b4db FastRestore:Suppress or mute spammy trace events 2020-06-24 22:10:54 -07:00
Meng Xu
3d6f69c8e2 FastRestore:addPrefix:Transform must clear both original and transformed range
Otherwise, anything left in the range can interfere with the result.
2020-06-21 22:18:12 -07:00
Jingyu Zhou
df064ac922
Merge pull request from xumengpanda/mengxu/fr-restore-ranges-PR
Fast Restore: Support restoring sub ranges in the framework
2020-06-09 21:01:39 -07:00
Meng Xu
d85dc5a4d3 FastRestore:Only clear ranges that will be restored
Instead of clearing the entire normal key space.

This commit also removes some unnecessary tr->reset() calls, which can invalidate the txn backoff time.
2020-06-08 22:41:49 -07:00
Meng Xu
28212d397d RestoreApplier:Remove getValue actor 2020-06-08 20:32:52 -07:00
Meng Xu
1edcee4e9d RestoreApplier:Rewrite getKeys because key_not_exists error is handled by txn internally 2020-06-08 20:27:25 -07:00
Meng Xu
5022566b35 Validate if key_not_found error ever happens 2020-06-08 16:59:00 -07:00
Meng Xu
f00deefd5a RestoreApplier:Remove unnecessary txn reset 2020-06-08 10:10:32 -07:00
Meng Xu
8c81fedf11 RestoreApplier:Better handling of key not exist 2020-06-07 21:49:35 -07:00
Meng Xu
94be3afcf8 RestoreApplier:Cosmetic change based on review 2020-06-06 21:17:57 -07:00
Meng Xu
f51fca0bf3 FastRestore:Sanity check actors do not throw error silently 2020-06-05 17:44:24 -07:00
Meng Xu
ffe949b04d Applier:getAndComputeStagingKeys:reset txn at first error
When tr->onError() is ready, the txn state has been reset.
We cannot wait on the get() future from the txn because its state has been deleted.
If we do that, it will throw txn_cancelled error, which will be thrown all the way
up to the RestoreApplier main loop.

The batchData->dbApplier, which is assigned by writeMutationsToDB(self->id(), req.batchIndex, batchData, cx),
will become ready but isError(). This will make all handleApplyToDBRequest throw error silently.
2020-06-05 16:40:19 -07:00
Meng Xu
e9af22085b Debug: getAndComputeStagingKeys may be stuck
Maybe wait(success(fValues[i])); never returns
2020-06-04 21:26:14 -07:00
Meng Xu
633587a95a RestoreApplier:getAndComputeStagingKeys:retry for keys that exist in DB
Test shows that we cannot just skip a key that exists in the DB but hits a
future_version error.
2020-06-03 21:17:35 -07:00
Meng Xu
87a557dcb4 FastRestore:Applier:Treat future_version as key not exist 2020-06-03 18:30:59 -07:00
Meng Xu
d5025a1779 getAndComputeStagingKeys: Improved handling of not exist keys 2020-06-03 15:32:36 -07:00