46 Commits

Author SHA1 Message Date
sfc-gh-tclinkenbeard
a71099471b Update copyright header dates 2022-03-21 13:36:23 -07:00
sfc-gh-tclinkenbeard
594e8944ae Move RestoreWorkerInterface into fdbserver 2021-05-30 11:51:47 -07:00
FDB Formatster
df90cc89de apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-03-10 10:18:07 -08:00
Meng Xu
83d1350d8d FastRestore:Handle retriable blob error 2020-09-09 07:29:17 -07:00
Meng Xu
198696bc1e Move transformRestoredDatabase from server to client
AtomicRestore workload turns out to rely on the FileBackupAgent
client. Keeping transformRestoredDatabase in server makes linking harder.
2020-06-23 15:48:43 -07:00
Meng Xu
4e27fd34e5 Refactor transformDatabaseContents into RestoreCommon
Prepare to enable addPrefix for atomicRestore
2020-06-23 14:33:13 -07:00
Meng Xu
e4bf6d570f FastRestore:Add assertion and trace events for diagnosis 2020-05-05 19:12:15 -07:00
Meng Xu
2fec56e7e2 FastRestore:Logging for getReplyBatches 2020-05-04 20:12:59 -07:00
Meng Xu
134dbca0ee FastRestore:Use canonical way to trace error 2020-05-01 13:35:13 -07:00
Meng Xu
28178f356f FastRestore:Minor knob change and revise comments 2020-05-01 10:47:44 -07:00
Meng Xu
41c0a1768f FastRestore:Make FastRestore event type more descriptive 2020-05-01 10:27:08 -07:00
Meng Xu
05ba743f96 Control number of replies on wait in getBatchReplies 2020-05-01 10:09:08 -07:00
Meng Xu
96855d9b47 FastRestore:Loader:Enable sending mutation messages out of order 2020-04-25 17:21:17 -07:00
Meng Xu
93112d0adb FastRestore:getBatchReplies:resetReply on errors unconditionally
This can avoid immediate error at the cost that the sampling mutation stats
can be off.
We can change this to reset only the error request later.
2020-04-24 10:31:30 -07:00
Meng Xu
f073049865 FastRestore:Revise trace events to be descriptive
Revert changes that send mutations to appliers out of order
2020-04-24 10:31:08 -07:00
Meng Xu
38193a3866 Merge branch 'master' into mengxu/fr-code-improvement-PR 2020-04-22 10:51:33 -07:00
Jingyu Zhou
6909f0b8fc Remove decodeRangeFileBlock from parallel restore
Reuse the one from fileBackup namespace.
2020-04-21 13:42:24 -07:00
Meng Xu
2960a2fe8a FastRestore:Add knob to control parallelism in waiting requests 2020-04-19 21:34:11 -07:00
Meng Xu
a0c32f7a67 FastRestore:getBatchReplies:Comment out trace for performance 2020-04-08 15:43:40 -07:00
Meng Xu
2325ab209f FastRestore:Applier:Avoid extra copy in getAndComputeStagingKeys 2020-04-08 12:22:08 -07:00
Jingyu Zhou
88ad28e576 Integrate parallel restore with partitioned logs
In parallel restore, use new getPartitionedRestoreSet() to get a set containing
partitioned mutation logs. The loader uses a new parser to extract mutations
from partitioned logs.

TODO: fix unable to restore errors.
2020-03-20 20:13:38 -07:00
Meng Xu
2c6f82e1ab FastRestore:Add unit name to threshold knob name 2020-03-02 10:52:44 -08:00
Meng Xu
2520e8d44c FastRestore:Use more concise code as suggested in review 2020-03-01 22:32:36 -08:00
Meng Xu
62b9043ff6 FastRestore:DB can be destroyed before master unlocks it in simulation
Because restore roles run as workloads in simulation,
they do not know when DB is destroyed by the backup and restore test workload.
So if DB is destroyed earlier than restore master unlocks DB, which is rare,
restore master should abort the unlocking DB step.
2020-02-28 14:25:58 -08:00
Meng Xu
fbf5020af9 FastRestore:Applier:Add fetchKeys counter 2020-02-26 11:37:40 -08:00
Meng Xu
8506bce493 FastRestore:Reuse getBatchReplies for sendBatchRequests
Remove old sendBatchRequests and getBatchReplies as well.
2020-02-21 16:15:53 -08:00
Meng Xu
4dd206b1b8 FastRestore:Use new getBatchReplies that profile request latency 2020-02-21 15:59:57 -08:00
Meng Xu
505997ba0a FastRestore:Switch to new sendBatchRequests that tracks performance and straggler 2020-02-21 15:45:32 -08:00
Meng Xu
05ea79f584 FastRestore:Profile performance for getBatchReplies
Generic approach to profile getBatchReplies performance
and detect straggler.
2020-02-21 15:20:22 -08:00
Meng Xu
ab2dd36bdc FastRestore:Generic way to detect straggler 2020-02-21 14:30:08 -08:00
Meng Xu
e76b6d824a FastRestore:Assign priority to actors to prioritize vb work
When we pipeline multiple version batches, we should prevent a later
version batch from blocking the earlier version batch by consuming
CPU resources.

To achieve the above, we should assign higher priority to actors
in later phases in a version batch.

Because the restore master will not invoke an actor at a later phase until
the actors at the earlier phases have finished, this priority assignment
will not cause deadlock.
2020-02-10 20:29:23 -08:00
Meng Xu
cab9d51e06 Merge branch 'master' into mengxu/fast-restore-pipeline-PR 2020-01-27 18:16:26 -08:00
Meng Xu
52e3d20d39 FastRestore:VersionBatch replace vector with set
This ensures that each backup file appears in a version batch only once.
2020-01-22 13:13:10 -08:00
Meng Xu
153b713b53 FastRestore:Add sampling on parsed mutations 2019-12-03 12:52:17 -08:00
Meng Xu
e345c9061f FastRestore:Refine debug messages 2019-11-04 11:47:38 -08:00
Meng Xu
3e2b3de4d0 FastRestore:RestoreMaster:Remove the extra lockDatabase in RestoreMaster 2019-10-17 00:50:13 -07:00
Meng Xu
2cd7010efb FastRestore:Add fileIndex to RestoreFileFR struct and bug fix
Fix bugs where RestoreMaster cannot properly lock or unlock the DB when
an exception occurs;
Fix a bug in ordering backup files
2019-10-17 00:50:13 -07:00
Meng Xu
2602cb3591 FastRestore:Rename RestoreConfig to RestoreConfigFR to fix link problem in windows
Because the current restore already defines RestoreConfig, the Windows linker complains.
This commit renames the RestoreConfig used in FastRestore to RestoreConfigFR.
2019-08-02 23:00:12 -07:00
Meng Xu
9cc832cfd6 FastRestore:Fix Mac and Windows compilation error 2019-08-02 14:33:08 -07:00
Meng Xu
3b54363780 FastRestore:Apply Clang-format 2019-08-01 18:09:12 -07:00
Meng Xu
45b9504ba6 FastRestore:Refactor distribute workload for version batch
Rewrite the code that collects files for a version batch and that
distributes the workload among loaders for files in a version batch.
The new code is easier to understand and maintain.
2019-05-30 17:39:50 -07:00
Meng Xu
620cdd411e FastRestore:Add comments for each restore file 2019-05-12 21:53:43 -07:00
Meng Xu
5406c74daf FastRestore: Ensure actorcompiler.h is included 2019-05-11 22:48:39 -07:00
Meng Xu
a08a6776f5 FastRestore: Refactor to smaller components
The current code uses one restore interface to handle the work
for all restore roles, i.e., master, loader and applier.
This makes it harder to review or maintain or scale.

This commit splits the restore into multiple roles by mimicking the FDB
transaction system:
1) It uses a RestoreWorker as the process to host restore roles;
   This commit assumes one restore role per RestoreWorker; but
   it should be easy to extend to support multiple roles per RestoreWorker;
2) It creates 3 restore roles:
   RestoreMaster: Coordinate the restore process and send commands to the other two roles;
   RestoreLoader: Parse backup files to mutations and send mutations to appliers;
   RestoreApplier: Sort received mutations and apply them to DB in order.

Compilable version. To be tested in correctness.
2019-05-10 14:20:06 -07:00
Meng Xu
25c75f4222 FastRestore: Add new empty files for restore roles
Add .h and .cpp files for RestoreLoader and RestoreApplier roles.
We will split the code for each restore role into a separate file.

This commit also fixes the bug in including RestoreCommon.actor.h, and
removes the unused code.
2019-05-06 16:59:41 -07:00
Meng Xu
19841f9ef5 FastRestore: Move copied code into a separate file
We re-use some code from the existing restore system.
To make code review easier and the code cleaner, we move the copied and
slightly modified code into two separate files:
RestoreCommon.actor.h and RestoreCommon.actor.cpp
2019-04-30 20:57:02 -07:00