214 Commits

Author SHA1 Message Date
Meng Xu
3eadb31798 FastRestore: Resolve two major review comments
1) Add sendBatchRequests and getBatchReplies

sendBatchRequests is a generic actor that sends requests without
processing replies.
getBatchReplies is similar to sendBatchRequests except that
it returns the replies to the caller.

2) Share the applier interfaces with loaders by using RequestStream,
instead of using the DB.
   Create a RestoreSysInfo struct, similar in purpose to DBInfo, for
   the restore system information that is shared among restore workers.
2019-05-24 21:53:21 -07:00
Meng Xu
fac63a83c4 FastRestore:Use NotifiedVersion to deduplicate requests
Add a NotifiedVersion to the applier data; it represents
the smallest version the applier is at.

When a loader sends a mutation vector to appliers, the request
contains prevVersion and commitVersion.

This commit also puts actors into an actorCollector for the
loop-choose-when situation.
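
A minimal sketch of how a prevVersion/commitVersion pair could be used to
deduplicate requests. This is not the code from this commit: the struct
names, the Outcome enum, and the plain int64_t standing in for the
NotifiedVersion are all assumptions made for illustration. A real actor
would presumably wait for the version to advance instead of returning
NotReady.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical request shape: the mutations for commitVersion, plus the
// version the applier must be at before they may be accepted.
struct SendMutationsRequest {
    int64_t prevVersion;
    int64_t commitVersion;
    std::vector<std::string> mutations;
};

enum class Outcome { Applied, Duplicate, NotReady };

// Hypothetical applier state; `version` stands in for the NotifiedVersion.
struct ApplierSketch {
    int64_t version = 0;
    std::vector<std::string> received;

    Outcome handle(const SendMutationsRequest& req) {
        // The applier is already at or past this version: a resent request.
        if (req.commitVersion <= version) return Outcome::Duplicate;
        // An earlier request in the version chain has not arrived yet.
        if (req.prevVersion != version) return Outcome::NotReady;
        received.insert(received.end(), req.mutations.begin(), req.mutations.end());
        version = req.commitVersion;  // advance only after accepting the mutations
        return Outcome::Applied;
    }
};
```

Under these assumptions, delivering the same request twice changes the
applier state only once, because the second delivery sees
commitVersion <= version.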
2019-05-22 22:09:54 -07:00
Meng Xu
e8cc3add16 FastRestore:Add a general getBatchReplies func
The getBatchReplies takes the RequestStream, a set of interfaces, and
a set of requests.
It sends the requests via the RequestStream of the interfaces and
ensures that each request has at least one reply returned.
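
A rough, synchronous sketch of the "at least one reply per request" idea.
It is not the Flow actor added by this commit: getBatchRepliesSketch,
trySend, and the pairing of interface and request are hypothetical, and
the real function is asynchronous.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in: `trySend` models one attempt to send a request
// over an interface; it returns false when no reply arrived.
template <class Interface, class Request, class Reply>
std::vector<Reply> getBatchRepliesSketch(
    const std::function<bool(const Interface&, const Request&, Reply&)>& trySend,
    const std::vector<std::pair<Interface, Request>>& batch) {
    std::vector<Reply> replies;
    replies.reserve(batch.size());
    for (const auto& item : batch) {
        Reply reply{};
        // Resend until this request has produced a reply.
        while (!trySend(item.first, item.second, reply)) {
        }
        replies.push_back(reply);
    }
    return replies;
}
```

Because a request may be resent, the receiver can see it more than once,
so a scheme like this only works if the receiver deduplicates requests.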
2019-05-20 20:18:49 -07:00
Meng Xu
35b169fd2d FastRestore:Fix bug in registerMutationsToApplier
We forgot to update the applierInterface reference to the iterated
applyID
2019-05-14 22:10:09 -07:00
Meng Xu
f54a1e1463 FastRestore:Fix bug in deciding applierID in splitMutation 2019-05-14 17:39:44 -07:00
Meng Xu
86c936522d FastRestore:CMDUID should serialize nodeIndex 2019-05-14 16:03:32 -07:00
Meng Xu
6c4c807801 FastRestore: Fix bug due to non-unique cmdid
This commit identifies the bug that may cause the DB to be
restored to an inconsistent state.

The cmdid is used to achieve exactly-once delivery even when
the network delivers a request twice.
This relies on the assumption that cmdid is unique for each request!

However, this assumption may not hold for
the phase Loader_Send_Mutations_To_Applier, when loaders send parsed
mutations to appliers:
1) When the same loader loads multiple files, we reset the cmdid
for the phase;
2) When different loaders load files, each loader's cmdid starts from
0 for the phase.
Both situations can break the assumption, which causes appliers to
skip some mutations they should apply. This breaks the cycle test.
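
A small sketch of the failure mode, under assumptions: CmdIdSketch, its
fields, and DedupFilter are hypothetical names and not the actual CMDUID
layout. It shows why a dedup filter keyed by a per-loader counter alone
drops requests from a second loader, while a key that also carries the
loader's node index does not.

```cpp
#include <cstdint>
#include <set>
#include <tuple>

// Hypothetical command id; field names are illustrative only.
struct CmdIdSketch {
    uint32_t nodeIndex;  // which loader issued the request
    uint64_t seq;        // per-loader counter, never reset within the phase
    bool operator<(const CmdIdSketch& o) const {
        return std::tie(nodeIndex, seq) < std::tie(o.nodeIndex, o.seq);
    }
};

// Applier-side filter: returns true only the first time an id is seen.
template <class Id>
struct DedupFilter {
    std::set<Id> seen;
    bool firstTime(const Id& id) { return seen.insert(id).second; }
};
```

With DedupFilter<uint64_t>, two loaders that both start counting from 0
collide on id 0 and the second request is treated as a duplicate. With
DedupFilter<CmdIdSketch>, {0, 0} and {1, 0} are distinct, and only a true
resend of {0, 0} is filtered out.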
2019-05-14 01:49:49 -07:00
Meng Xu
76dd8dc8a8 FastRestore: Fix splitMutation bug 2019-05-13 17:24:57 -07:00
Meng Xu
fd92ab64e4 FastRestore: Clean code for RestoreApplier
Remove unused code and add comments to actors
2019-05-12 22:05:55 -07:00
Meng Xu
620cdd411e FastRestore:Add comments for each restore file 2019-05-12 21:53:43 -07:00
Meng Xu
32c030b7d6 FastRestore: Clear RestoreRole key in DB at finishRestore
This commit passes the correctness tests after the fast restore
refactoring.
2019-05-11 22:25:36 -07:00
Meng Xu
879bf8dc7b FastRestore: Bug fix for refactored code 2019-05-10 16:48:01 -07:00
Meng Xu
a08a6776f5 FastRestore: Refactor to smaller components
The current code uses one restore interface to handle the work
for all restore roles, i.e., master, loader and applier.
This makes the code harder to review, maintain, and scale.

This commit splits the restore into multiple roles by mimicking the FDB
transaction system:
1) It uses a RestoreWorker as the process to host restore roles;
   This commit assumes one restore role per RestoreWorker; but
   it should be easy to extend to support multiple roles per RestoreWorker;
2) It creates 3 restore roles:
   RestoreMaster: Coordinate the restore process and send commands to the other two roles;
   RestoreLoader: Parse backup files to mutations and send mutations to appliers;
   RestoreApplier: Sort received mutations and apply them to DB in order.

Compilable version. To be tested in correctness.
2019-05-10 14:20:06 -07:00
Meng Xu
25c75f4222 FastRestore: Add new empty files for restore roles
Add .h and .cpp files for RestoreLoader and RestoreApplier roles.
We will split the code for each restore role into a separate file.

This commit also fixes the bug in including RestoreCommon.actor.h and
removes unused code.
2019-05-06 16:59:41 -07:00