2590 Commits

Author SHA1 Message Date
Ata E Husain Bohra
dfe9d184ff Refactor: ClusterController driving cluster-recovery state machine
At present, the cluster recovery process consists of the following steps:
1. ClusterController clusterWatchDatabase actor recruits the
   master/sequencer process.
2. The sequencer process implements the cluster recovery state machine
   and is responsible for recruiting all other processes as well as
   restoring the cluster state.

Patch proposes a scheme where the cluster recovery state machine
is implemented and driven by the ClusterController process instead
of the Sequencer process.

Advantages of the scheme could be:
1. Simplified design where ClusterController recruits the "sequencer"
   process like other worker processes, compared to the current scheme
   where the "sequencer" process gets special treatment. In the newer
   scheme the sequencer is responsible for maintaining/providing the
   "committed version" (as expected).
2. ClusterController is already responsible for worker process
   recruitment; in the current scheme the sequencer, though it
   orchestrates the recovery state machine, needs to reach out to the
   ClusterController to recruit worker processes, etc.

NOTE:
The patch moves the recovery state machine code from the
'sequencer' process to the 'cluster-controller' process; necessary
updates were also made for both functional and performance
reasons.

Next Steps:
Cluster recovery documentation will be updated in the near future.
2021-12-22 14:06:27 -08:00
A.J. Beamon
ff1cb58174 Convert hyphens to underscores for all prefix-based arguments (e.g. --knob-, --locality-) 2021-12-14 12:01:44 -08:00
A.J. Beamon
f29f487823
Unify flags (#25)
* Unify flags implementation and change help text in backup.actor.cpp

* Keep LOG_GROUP unchanged

* Convert hyphens to underscores for internal options and user's input, EXCEPT leading hyphens

* Use a deep copy of the user's input flag to do the match

* Convert the _ to - in Option arrays of backup.actor.cpp

* Convert _ to - for files:
        TLSConfig.actor.h, fdbcli.actor.cpp, fdbserver.actor.cpp, FileConverter.h, FileConverter.cpp

* Change to another way of unifying flags: using SO_O_ICASE_HYPHEN_AND_UNDERSCORE to determine whether we do the conversion in function IsEqual

* Change the config command's name from SO_O_ICASE_HYPHEN_AND_UNDERSCORE to SO_O_HYPHEN_TO_UNDERSCORE

* Update the comment for the SO_O_HYPHEN_TO_UNDERSCORE

* Fix left underscore in SOption arrays

* Convert _ to - in several files for commands

* Make the FDBService and fdbmonitor backward compatible

* Fix pointer bugs

* Check underscore and hyphen at the same time for --knob_, --locality_ and --test_
And fix bugs in fdbmonitor and FDBService

* Simplify the argument-retrieval functions in fdbmonitor and FDBService.
And fix some documentation in masterserver.actor.cpp

* Convert _ to - for knob in the setKnob functions

* Convert - to _ in the setKnob functions, since keys in the knob-related maps only contain _

* Rename variable names in the fdbmonitor and FDBService for clarification

Co-authored-by: Chang Liu <chang.liu@snowflake.com>
2021-12-14 08:44:39 -08:00
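
The hyphen/underscore unification described in the "Unify flags" commit above can be illustrated with a minimal sketch. This is not the code from the patch: `normalizeFlag` and `flagsEqual` are hypothetical helper names, and stopping at `=` so that option values are left untouched is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not taken from the patch): returns a copy of `flag`
// in which hyphens in the option name are replaced by underscores.
// Leading hyphens are preserved, and anything after '=' is left unchanged
// (the '=' cut-off is an assumption of this sketch).
std::string normalizeFlag(const std::string& flag) {
    std::string out = flag; // work on a copy so the user's input is untouched
    std::size_t i = 0;
    while (i < out.size() && out[i] == '-') {
        ++i; // skip the leading "--"
    }
    for (; i < out.size() && out[i] != '='; ++i) {
        if (out[i] == '-') {
            out[i] = '_';
        }
    }
    assert(out.size() == flag.size()); // normalization never changes the length
    return out;
}

// Hypothetical comparison: two flags match when their normalized forms match.
bool flagsEqual(const std::string& a, const std::string& b) {
    return normalizeFlag(a) == normalizeFlag(b);
}
```

With this sketch, `--knob-foo-bar` and `--knob_foo_bar` compare equal, while the leading `--` is kept as-is.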
Josh Slocum
3afe9fb6e0 MVC bug fixes 2021-12-10 12:47:53 -06:00
Andrew Noyes
def41697bf
Merge pull request #6083 from sfc-gh-tclinkenbeard/remove-temporaries
Avoid creating unnecessary temporary objects
2021-12-06 13:24:56 -08:00
Tao Lin
9b0a9c4503
Return error when getRangeAndFlatMap has more & Improve simulation tests (#6029) 2021-12-03 12:50:07 -08:00
sfc-gh-tclinkenbeard
3d36dfe5e9 Fix compilation error in resolveTCPEndpoint_impl 2021-12-02 12:51:00 -08:00
Evan Tschannen
b11ae4dae8
Merge pull request #5910 from sfc-gh-jslocum/bg_bindings
Blob Granule C bindings
2021-12-02 11:40:26 -08:00
sfc-gh-tclinkenbeard
70c8f98eb9 Apply clang-format to Net2.actor.cpp 2021-12-02 10:22:22 -08:00
sfc-gh-tclinkenbeard
464d9488ef Merge remote-tracking branch 'origin/master' into fix-unused-warnings 2021-12-01 23:52:09 -08:00
sfc-gh-tclinkenbeard
6b45ef98ca Merge remote-tracking branch 'origin/master' into remove-temporaries 2021-12-01 23:50:29 -08:00
sfc-gh-tclinkenbeard
d01a363e29 Avoid creating unnecessary temporary objects 2021-12-01 23:48:34 -08:00
FoundationDB CI
ca5d5ac942
apply formatting with clang 13
Signed-off-by: FoundationDB CI <foundationdb_ci@apple.com>
2021-12-02 05:13:59 +00:00
sfc-gh-tclinkenbeard
90ced244eb Fix -Wunused-but-set-variable warnings 2021-12-01 18:15:53 -08:00
Josh Slocum
9cb6fb5114 fixing unrelated code formatting 2021-12-01 17:20:11 -06:00
Josh Slocum
a82845af43 Merge branch 'master' into bg_bindings 2021-12-01 16:55:28 -06:00
Josh Slocum
0f2f5bc0b6 Cleanup of ThreadResult 2021-12-01 16:24:28 -06:00
Josh Slocum
7f4fcc8c2c Added FDBResult and made readBlobGranules use it 2021-12-01 16:22:05 -06:00
sfc-gh-tclinkenbeard
ec64890ac1 Remove some usages of PRId64 by using fmt library 2021-11-30 23:35:36 -08:00
A.J. Beamon
c47535245b
Merge pull request #6033 from sfc-gh-ajbeamon/improved-client-db-logging
Client logging improvements
2021-11-29 13:23:10 -08:00
Trevor Clinkenbeard
6429b82796
Merge pull request #6053 from RenxuanW/fromHostname
Change member variable fromHostname to type bool.
2021-11-29 13:13:21 -08:00
A.J. Beamon
b8bd89f88d Shorten the name of external client threads. Add a thread name for trace logging threads. 2021-11-29 09:57:10 -08:00
A.J. Beamon
264c75b9a6 Add some extra client logging details:
1. Add a trace event when a database is created and move the cluster file / connection string from ClientStart to the new trace event
2. Add a detail for the path to the image being loaded
3. Add a detail for whether a client library is primary or not
4. Set a thread name for each external client thread that includes the release version
2021-11-29 09:57:10 -08:00
Renxuan Wang
09fedc429a Remove unnecessary boost/bind.hpp.
Complement of #6026.
2021-11-24 16:33:05 -08:00
Ata E Husain Bohra
0962fcb243 Override commit/grv proxies_count if mutation supplied new value is -1
Patch improves handling of scenarios where either the commit or grv proxies
value is updated to -1 OR `proxies_count` is being reset.
The code splits the proxies between the two proxy roles: for an invalid
input configuration, the minimum number of proxies (read as 1) gets
provisioned; otherwise, the split is done based on the input values.

Patch also handles the scenario where the mutation-supplied value for
grv_proxies and/or commit_proxies is -1 but the total proxy count is > 1,
by using DEFAULT_COMMIT_GRV_PROXIES_RATIO to split proxies between
grv_proxies & commit_proxies.
2021-11-24 12:52:31 -08:00
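
The proxy split described in the commit above could look roughly like the following sketch. It is only an illustration of the stated rules (at least 1 proxy per role, ratio-based split when a value is -1): the function name `splitProxies`, the struct `ProxySplit`, and the ratio value 3 are assumptions, since the commit message does not give the value of DEFAULT_COMMIT_GRV_PROXIES_RATIO or the actual code.

```cpp
#include <algorithm>
#include <cassert>

// Assumed value for illustration only; the commit message does not state
// what DEFAULT_COMMIT_GRV_PROXIES_RATIO actually is.
constexpr int kAssumedCommitGrvRatio = 3;

struct ProxySplit {
    int commit;
    int grv;
};

// Hypothetical helper: splits `total` proxies (total >= 2) between commit
// and GRV roles. A value of -1 means "not set". Each role always gets at
// least one proxy.
ProxySplit splitProxies(int total, int commit, int grv) {
    assert(total >= 2);
    if (commit == -1 && grv == -1) {
        // Neither value is set: split by the ratio, keeping grv in [1, total - 1].
        grv = std::max(1, std::min(total - 1, total / (kAssumedCommitGrvRatio + 1)));
        commit = total - grv;
    } else if (commit == -1) {
        // Only grv is set: clamp it, give the rest to commit.
        grv = std::max(1, std::min(grv, total - 1));
        commit = total - grv;
    } else if (grv == -1) {
        // Only commit is set: clamp it, give the rest to grv.
        commit = std::max(1, std::min(commit, total - 1));
        grv = total - commit;
    } else {
        // Both are set: only enforce the minimum of 1 per role.
        commit = std::max(1, commit);
        grv = std::max(1, grv);
    }
    return { commit, grv };
}
```

Under the assumed ratio, `splitProxies(8, -1, -1)` yields 6 commit proxies and 2 GRV proxies.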
Renxuan Wang
46d17d748f Change member variable fromHostname to type bool.
So that it can be serialized.
2021-11-23 14:25:02 -10:00
Renxuan Wang
22e34bd6b9 Replace <boost/bind.hpp> with <boost/bind/bind.hpp>.
This eliminates many useless warnings when compiling.
`#pragma message: The practice of declaring the Bind placeholders (_1, _2, ...) in the global namespace is deprecated. Please use <boost/bind/bind.hpp> + using namespace boost::placeholders, or define BOOST_BIND_GLOBAL_PLACEHOLDERS to retain the current behavior.`
2021-11-18 14:00:13 -08:00
Lukas Joswiak
18243351e7 Fix possible data race
Transactions (created on a separate thread) can read the `globals` field
at the same time as `setGlobal` is called on the main thread, causing a
potential race. TSAN surfaced this issue.
2021-11-18 10:16:20 -08:00
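
The race described in the commit above (one thread reading `globals` while another calls `setGlobal`) is the kind of access pattern that a mutex can serialize. The sketch below shows that generic technique only; it is not claimed to be the fix applied in the patch, and the class `GlobalTable` and its layout are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for an object holding a `globals` table that is
// written on one thread and read on others.
class GlobalTable {
public:
    explicit GlobalTable(std::size_t slots) : globals_(slots, nullptr) {}

    // Writer side: takes the lock so readers never observe a partial update.
    void setGlobal(std::size_t id, void* value) {
        assert(id < globals_.size()); // size is fixed after construction
        std::lock_guard<std::mutex> lock(mutex_);
        globals_[id] = value;
    }

    // Reader side: takes the same lock before touching the table.
    void* global(std::size_t id) const {
        assert(id < globals_.size());
        std::lock_guard<std::mutex> lock(mutex_);
        return globals_[id];
    }

private:
    mutable std::mutex mutex_; // mutable so const readers can lock it
    std::vector<void*> globals_;
};
```

Because both accessors take the same lock, a tool such as TSAN would no longer see unsynchronized concurrent access to the table in this sketch.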
negoyal
8b0938c7b3 Added a TODO 2021-11-17 09:45:50 -08:00
Steve Atherton
3caca74ac2 Merge commit 'fd707c6d7ee80de6d9fda5796da2d0add10abd79' into bit-flipping-workload 2021-11-16 21:54:27 -08:00
Markus Pilman
b2019cd4f2
Merge pull request #5992 from sfc-gh-mpilman/features/fmt
added fmt dependency to flow
2021-11-16 18:58:55 -07:00
Steve Atherton
035e0d6e52
Merge branch 'master' into bit-flipping-workload 2021-11-16 14:42:22 -08:00
Markus Pilman
b1633b90f1 Added fmt to flow 2021-11-16 12:03:49 -07:00
Steve Atherton
867999a41a Rename wrong_format_version to unsupported_format_version. 2021-11-16 03:25:54 -08:00
Renxuan Wang
4630b0ccea Move DNS mock from SimExternalConnection to Sim2.
This is a revised PR of #5934. In simulation, we don't have direct access to SimExternalConnection.
2021-11-15 17:02:51 -08:00
Evan Tschannen
964d0209ca
Merge pull request #5637 from sfc-gh-ljoswiak/features/data-loss-prevention
Data loss protection when joining new cluster
2021-11-15 15:26:32 -08:00
Jingyu Zhou
02d0c43bc2
Merge pull request #5982 from sfc-gh-tclinkenbeard/improve-error-descriptions
Make snapshot errors more descriptive
2021-11-15 13:18:19 -08:00
Evan Tschannen
a546fb63ea
Merge pull request #5985 from sfc-gh-etschannen/feature-changefeed-empty-versions
Added a whenAtLeast function to change feeds to efficiently learn about empty versions
2021-11-15 10:51:28 -08:00
Markus Pilman
daf6dc22d4
Merge pull request #5959 from mpilman/features/apple-silicon-3
FDB compiles on Apple Silicon
2021-11-15 11:21:28 -07:00
Evan Tschannen
94a51e57a5 Merge branch 'master' into feature-changefeed-empty-versions
# Conflicts:
#	fdbclient/StorageServerInterface.h
2021-11-14 19:13:05 -08:00
Evan Tschannen
6909754b21 changefeeds now have a whenAtLeast function for efficiently learning when the version has updated but no mutations have been committed 2021-11-14 19:08:46 -08:00
sfc-gh-tclinkenbeard
dc756228f2 Make snapshot errors more descriptive 2021-11-14 13:46:17 -08:00
Steve Atherton
508429f30d
Redwood chunked file growth and low priority IO starvation prevention (#5936)
* Redwood files now grow in large page chunks controlled by a knob to reduce truncate() calls for expansion. PriorityMultiLock has a limit on consecutive same-priority lock releases. Increased Redwood max priority level to 3 for more separation at higher BTree levels.

* Simulation fix, don't mark certain IO timeout errors as injected unless the simulated process has been set to have an unreliable disk.

* Pager writes now truncate gradually upward, one chunk at a time, in response to writes, which wait on only the necessary truncate operations.   Increased buggified chunk size because truncate can be very slow in simulation.

* In simulation, ioTimeoutError() and ioDegradedOrTimeoutError() will wait until at least the target timeout interval past the point when simulation is sped up.

* PriorityMultiLock::toString() prints more info and is now public.

* Added queued time to PriorityMultiLock.

* Bug fix to handle when speedUpSimulation changes later than the configured time.

* Refactored mutation application in leaf nodes to do fewer comparisons and do in place value updates if the new value is the same size as the old value.

* Renamed updatingInPlace to updatingDeltaTree for clarity.  Inlined switchToLinearMerge() since it is only used in one place.

* Updated extendToCover to be more clear by passing in the old extension future as a parameter.  Fixed initialization warning.
2021-11-12 13:47:07 -08:00
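
The chunked file growth mentioned in the Redwood commit above comes down to rounding the required file size up to a multiple of the chunk size, so that many small extensions become one truncate() call per chunk. A minimal sketch of that arithmetic follows; `chunkedFileSize` is a hypothetical name and the chunk size is a plain parameter here rather than the actual knob.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns the size a file should be extended to so
// that the first `requiredBytes` bytes are covered, rounded up to a whole
// number of chunks of `chunkBytes` bytes each.
std::int64_t chunkedFileSize(std::int64_t requiredBytes, std::int64_t chunkBytes) {
    assert(requiredBytes >= 0 && chunkBytes > 0);
    std::int64_t chunks = (requiredBytes + chunkBytes - 1) / chunkBytes; // ceiling division
    return chunks * chunkBytes;
}
```

For example, with a 4096-byte chunk, a write ending at byte 4097 would extend the file to 8192 bytes, and later writes within that range would need no further truncate() call.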
Josh Slocum
329091e14f Merge branch 'master' into bg_bindings 2021-11-11 10:13:37 -06:00
Markus Pilman
28dde27cb1 Fix Linux compiler errors 2021-11-11 08:49:51 -07:00
Markus Pilman
0e6dd46f5a Correct code formatting 2021-11-11 08:38:25 -07:00
Josh Slocum
b8ac4213a1 Switched BG APIs to transaction instead of database 2021-11-11 08:59:06 -06:00
Markus Pilman
5af465aa29 FDB compiles on Apple Silicon 2021-11-10 20:05:38 -07:00
Lukas Joswiak
069a04c5e5 Removed outdated definition 2021-11-10 13:33:49 -08:00
Lukas Joswiak
1da288822f Remove distributed trace database option 2021-11-10 13:33:49 -08:00