181 Commits

Author SHA1 Message Date
Meng Xu
ecd2d8b239 FastRestore:Add counters for applier and disable FlowLock on applyStagingKeysBatch 2020-06-27 00:20:54 -07:00
Meng Xu
5860a5b4db FastRestore:Suppress or mute spammy trace events 2020-06-24 22:10:54 -07:00
Meng Xu
3d6f69c8e2 FastRestore:addPrefix:Transform must clear both original and transformed range
Otherwise, anything left in the range can interfere with the result.
2020-06-21 22:18:12 -07:00
Jingyu Zhou
df064ac922
Merge pull request #3321 from xumengpanda/mengxu/fr-restore-ranges-PR
Fast Restore: Support restoring sub ranges in the framework
2020-06-09 21:01:39 -07:00
Meng Xu
d85dc5a4d3 FastRestore:Only clear ranges that will be restored
Instead of clearing the entire normal key space.

This commit also removes some unnecessary tr->reset() calls, which can invalidate the txn backoff time.
2020-06-08 22:41:49 -07:00
Meng Xu
28212d397d RestoreApplier:Remove getValue actor 2020-06-08 20:32:52 -07:00
Meng Xu
1edcee4e9d RestoreApplier:Rewrite getKeys because key_not_exists error is handled by txn internally 2020-06-08 20:27:25 -07:00
Meng Xu
5022566b35 Validate if key_not_found error ever happens 2020-06-08 16:59:00 -07:00
Meng Xu
f00deefd5a RestoreApplier:Remove unnecessary txn reset 2020-06-08 10:10:32 -07:00
Meng Xu
8c81fedf11 RestoreApplier:Better handling of key not exist 2020-06-07 21:49:35 -07:00
Meng Xu
94be3afcf8 RestoreApplier:Cosmetic change based on review 2020-06-06 21:17:57 -07:00
Meng Xu
f51fca0bf3 FastRestore:Sanity check actors do not throw error silently 2020-06-05 17:44:24 -07:00
Meng Xu
ffe949b04d Applier:getAndComputeStagingKeys:reset txn at first error
When tr->onError() is ready, the txn state has been reset.
We cannot wait on the get() future from the txn because its state has been deleted.
If we do that, it will throw txn_cancelled error, which will be thrown all the way
up to the RestoreApplier main loop.

The batchData->dbApplier, which is assigned by writeMutationsToDB(self->id(), req.batchIndex, batchData, cx),
will become ready but isError(). This will make all handleApplyToDBRequest throw error silently.
2020-06-05 16:40:19 -07:00
Meng Xu
e9af22085b Debug: getAndComputeStagingKeys may be stuck
Maybe wait(success(fValues[i])); never returns
2020-06-04 21:26:14 -07:00
Meng Xu
633587a95a RestoreApplier:getAndComputeStagingKeys:retry for keys that exist in DB
Test shows that we cannot just skip a key that exists in DB but has
a future_version error.
2020-06-03 21:17:35 -07:00
Meng Xu
87a557dcb4 FastRestore:Applier:Treat future_version as key not exist 2020-06-03 18:30:59 -07:00
Meng Xu
d5025a1779 getAndComputeStagingKeys: Improved handling of not exist keys 2020-06-03 15:32:36 -07:00
Meng Xu
f5aef706f6 FastRestore:Delay leader election until restore requests are set 2020-05-12 19:11:08 -07:00
Meng Xu
a93c23d239 Resolve review comments 2020-05-07 15:06:59 -07:00
Meng Xu
e4bf6d570f FastRestore:Add assertion and trace events for diagnosis 2020-05-05 19:12:15 -07:00
Meng Xu
4d90384c58 Correct suppression event 2020-05-05 12:36:32 -07:00
Meng Xu
c49b6756fe FastRestoreApplier:Trace clear range op when it has too many for debug 2020-05-05 09:28:50 -07:00
Meng Xu
759820cc61 FastRestoreApplier:Add warning when too many clears in a txn 2020-05-05 09:00:02 -07:00
Meng Xu
62de02fb2c FastRestoreApplier:Add delay to avoid overwhelming DB 2020-05-05 08:47:26 -07:00
Meng Xu
67b9e0b29a FastRestoreApplier:Add sanity check and trace for debugging stall 2020-05-04 22:32:57 -07:00
Meng Xu
d22af629cd FastRestoreApplier:Add applierID and batchIndex for precompute stage 2020-05-04 16:32:09 -07:00
Meng Xu
abda13e9df FastRestoreApplier:Free memory at each VB and refactor handleApplyToDBRequest 2020-05-04 15:29:27 -07:00
Meng Xu
135f6443da FastRestoreApplier:Add trace to track applying status 2020-05-04 15:02:53 -07:00
Meng Xu
0ba1551116 FastRestore:Trace memory usage periodically 2020-05-04 11:20:53 -07:00
Meng Xu
7b5d43da9c FastRestore:Remove unused field in RestoreRequest 2020-05-03 20:59:47 -07:00
Meng Xu
ae86b5bb68 FastRestoreApplier:Continue when a key does not exist in DB
Although we thought all keys cached in appliers should have
a base value in DB.
2020-05-03 20:47:21 -07:00
Meng Xu
528466e0e6 FastRestore:Fix Valgrind error InvalidSuppression
Trace.error() must explicitly include error_code_actor_cancelled
to handle the error.
2020-05-02 19:52:05 -07:00
Meng Xu
f9f1ac6594 FastRestore:Revise TraceEvent for better diagnosis 2020-05-01 16:31:55 -07:00
Meng Xu
134dbca0ee FastRestore:Use canonical way to trace error 2020-05-01 13:35:13 -07:00
Meng Xu
41c0a1768f FastRestore:Make FastRestore event type more descriptive 2020-05-01 10:27:08 -07:00
Meng Xu
038f3834fc Merge branch 'master' into mengxu/fr-code-improvement-PR 2020-05-01 09:26:29 -07:00
Meng Xu
6bd71560f0 FastRestore:Reduce trace events in real cluster environment 2020-04-30 19:12:31 -07:00
Meng Xu
f073049865 FastRestore:Revise trace events to be descriptive
Revert changes that send mutations to appliers out of order
2020-04-24 10:31:08 -07:00
Meng Xu
d21da5065a FastRestore:Loader:Merge MutationsVec and LogMessageVersionVec into VersionedMutationsVec
Remove the actor that sends one mutation message batch in the previous commit,
because that actor no longer reduces the code complexity.
2020-04-21 22:05:34 -07:00
Meng Xu
061bcd2fb4 FastRestore:Replace typeString with safe getTypeString func
Also fix compilation error in previous commit
2020-04-13 15:15:54 -07:00
Meng Xu
dbc9c23193 FastRestore:Loader:Send mutations at different versions in the same message to appliers
This increases the bandwidth sent from loaders to appliers.
2020-04-12 10:46:58 -07:00
Meng Xu
55ee034e7f
Merge pull request #2916 from jzhou77/backup-fix
Remove version stamp ops from RestoreApplier
2020-04-11 14:04:11 -07:00
Meng Xu
2325ab209f FastRestore:Applier:Avoid extra copy in getAndComputeStagingKeys 2020-04-08 12:22:08 -07:00
Meng Xu
5ebafdb94c FastRestore:Apply clang-format to changes 2020-04-07 15:57:03 -07:00
Meng Xu
e5b2cd81d5 FastRestore:Cleanup debug code 2020-04-07 15:56:44 -07:00
Jingyu Zhou
cd8215ecf2 Remove version stamp ops from RestoreApplier
Version stamp ops are converted into SET at the proxy, so the backup files
will never have them.
2020-04-06 22:27:47 -07:00
Meng Xu
a51ff7aaae FastRestore:Fix:buildVersionBatches may lose the last log file
If the last log file's endversion decides the last version batch's endversion,
the buildVersionBatches function may quit early before including the last log file.

This causes some mutations to be missing and leads to an incorrect DB.

This commit also adds an ASSERT(maxVBVersion >= targetVersion) to
flag such an error as early as possible to simplify debugging.
2020-04-06 12:24:26 -07:00
Meng Xu
536e65cd76 FastRestore:Introduce debugFRMutation for debug keys 2020-04-05 15:00:36 -07:00
Meng Xu
432c99afd0 FastRestore:Applier:Keep incompleteStagingKeys content before values are applied to DB
To avoid incompleteStagingKeys being cleared before getAndComputeStagingKeys() finishes using it.
2020-04-04 22:38:04 -07:00
Meng Xu
a81ec332a9 FastRestore:Fix:Master cannot throttle on in-progress version batches when it releases batches out of order in simulation 2020-04-04 17:34:26 -07:00