mirror of https://github.com/apple/foundationdb.git synced 2025-05-31 18:19:35 +08:00

45 Commits

Author SHA1 Message Date
Renxuan Wang
c69a07a858
Check in the new Hostname logic. ()
* Revert .

20220407-031010-renxuan-c101052c21da8346           compressed=True data_size=31004844 duration=4310801 ended=100000 fail_fast=10 max_runs=100000 pass=100000 priority=100 remaining=0 runtime=1:04:15 sanity=False started=100047 stopped=20220407-041425 submitted=20220407-031010 timeout=5400 username=renxuan

* Revert .

20220407-051532-renxuan-470f0fe6aac1c217           compressed=True data_size=30982370 duration=3491067 ended=100002 fail_fast=10 max_runs=100000 pass=100002 priority=100 remaining=0 runtime=0:59:57 sanity=False started=100141 stopped=20220407-061529 submitted=20220407-051532 timeout=5400 username=renxuan

* Revert .

Remove resolving-related functionality from the connection string. The connection string will be used for storage purposes only and will be immutable.

20220407-175119-renxuan-55d30ee1a4b42c2f           compressed=True data_size=30970443 duration=5437659 ended=100000 fail_fast=10 max_runs=100000 pass=100000 priority=100 remaining=0 runtime=0:59:31 sanity=False started=100154 stopped=20220407-185050 submitted=20220407-175119 timeout=5400 username=renxuan

* Add hostname to coordinator interfaces.

* Turn on the new hostname logic.

* Add the corresponding change in config txns.

The most notable change is before calling basicLoadBalance(), we need to call tryInitializeRequestStream() to initialize request streams first.

Passed correctness tests.

* Return error when hostnames cannot be resolved in coordinators command.

* Minor fixes.
2022-04-27 21:54:13 -07:00
Renxuan Wang
e40cc8722c
A few hostname improvements. ()
* Add tryResolveHostnames() in connection string.

* Add missing hostname to related interfaces.

* Do not pass RequestStream into *GetReplyFromHostname() functions.

We use a new RequestStream for each request anyway. Also, the passed-in pointer could be nullptr, which results in segfaults.

* Add dynamic hostname resolve and reconnect intervals.

* Address comments.
2022-04-20 13:42:46 -07:00
Renxuan Wang
465ff712b6
Move Hostname to its own files. ()
* Change DNS cache to use std::map.

Revert commit 90c259d84e95dd35e01149c0a86bd18e82e33930, because if we use unordered_map, toString() can be inconsistent.

* Move ClientKnob::COORDINATOR_HOSTNAME_RESOLVE_DELAY to FlowKnob::HOSTNAME_RESOLVE_DELAY.

* Move Hostname to its own files.

Also, add resolve-related variables and functions in Hostname.
2022-04-04 19:04:51 -07:00
Renxuan Wang
2a59c5fd4e
Workers should monitor coordinators in submitCandidacy(). ()
* Workers should monitor coordinators in submitCandidacy().

* Change re-resolve delay to a knob.
2022-03-24 19:20:42 -07:00
sfc-gh-tclinkenbeard
a71099471b Update copyright header dates 2022-03-21 13:36:23 -07:00
Renxuan Wang
f9f3735f73 Add resolveHostnamesBlocking() in ConnectionString and IClusterConnectionRecord.
Also, combine IClusterConnectionRecord::getConnectionString() and IClusterConnectionRecord::getMutableConnectionString() to IClusterConnectionRecord::getConnectionString(), and rename setConnectionString() to setAndPersistConnectionString().
2022-01-28 12:20:41 -08:00
A.J. Beamon
e882eb33fc Abstract the cluster file into a cluster connection record that can be backed by something other than the filesystem. 2021-10-22 11:05:18 -07:00
Xiaoge Su
abf73047ca Enforce std:: specifier rather than using namespace 2021-09-16 19:40:28 -07:00
Renxuan Wang
96fcde45c2 Minor leader election code improvements.
1. Rename monitorLeaderRemotely* functions to monitorLeaderWithDelayedCandidacy*. "Remote" does not clearly describe what the functions do;
2. Rename monitorLeaderForProxies() to monitorLeaderAndGetClientInfo() to better describe the function;
3. Remove monitorLeaderRemotelyInternal() and monitorLeaderRemotely() in MonitorLeader.actor.cpp, to eliminate code duplication. They already exist in worker.actor.cpp;
4. Move the declaration of getLeader() from LeaderElection.actor.cpp to MonitorLeader.h;
5. Update a few comments.
2021-09-16 15:34:45 -07:00
Steve Atherton
507c1f11e3 Add .log() to bare TraceEvent() invocations without any .detail()s to avoid clang-tidy warning about immediate destruction of object without use. 2021-07-26 19:55:10 -07:00
sfc-gh-tclinkenbeard
5916c9903b Make FASTRESTORE_ATOMICOP_WEIGHT a client knob 2021-05-30 08:40:24 -07:00
RenxuanW
0145eea684 Make MonitorLeaderForwarding and LeaderForwarding trackLatest events. 2021-04-27 15:17:20 -07:00
FDB Formatster
df90cc89de apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-03-10 10:18:07 -08:00
Andrew Noyes
79cec09255 Apply clang-tidy's performance-inefficient-vector-operation fix
I ran this command in my build directory after compiling with
OPEN_FOR_IDE. It took a few small tweaks to get it to compile, which is
outside the scope of this commit.

    $ python run-clang-tidy.py -j $(nproc) -checks='-*,performance-inefficient-vector-operation' -fix
2021-03-04 03:58:25 +00:00
sfc-gh-tclinkenbeard
4669f837fa Add uses of makeReference 2020-11-07 22:10:18 -08:00
Evan Tschannen
048201717c Fixed a number of problems with monitorLeaderRemotely 2020-05-10 14:20:50 -07:00
Evan Tschannen
07cc0a8d74 code cleanup 2020-04-10 17:02:11 -07:00
Andrew Noyes
6aa0ada7b1 Replace scalar root types with proper messages 2019-08-28 14:40:50 -07:00
Alex Miller
7a500cd37f A giant translation of TaskFooPriority -> TaskPriority::Foo
This is so that APIs that take priorities don't take ints, which are
common and easy to accidentally pass the wrong thing.
2019-06-25 02:47:35 -07:00
A.J. Beamon
603721e125 Merge branch 'master' into thread-safe-random-number-generation
# Conflicts:
#	fdbclient/ManagementAPI.actor.cpp
#	fdbrpc/AsyncFileCached.actor.h
#	fdbrpc/genericactors.actor.cpp
#	fdbrpc/sim2.actor.cpp
#	fdbserver/DiskQueue.actor.cpp
#	fdbserver/workloads/BulkSetup.actor.h
#	flow/ActorCollection.actor.cpp
#	flow/Net2.actor.cpp
#	flow/Trace.cpp
#	flow/flow.cpp
2019-05-23 08:35:47 -07:00
mpilman
6afce01744 Implementation complete (not yet working) 2019-05-13 14:15:22 -07:00
A.J. Beamon
5f55f3f613 Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
Vishesh Yadav
3eb9b23024 Listen to multiple addresses and start using vector<NetworkAddress> in Endpoint
- This patch makes FDB listen to multiple addresses given via the
  command line. Although we still use the first address in most places,
  this patch starts using vector<NetworkAddress> in Endpoint in some basic
  places.
- When sending packets to an endpoint, pick a random network address from
  the endpoint's addresses
- Renames Endpoint::address to Endpoint::addresses since it
  now holds a vector of addresses.
2018-12-13 13:36:52 -08:00
Vishesh Yadav
43e5a46f9b Change Endpoint::address(NetworkAddress) to vector<NetworkAddress>
Extend `Endpoint` class to take multiple NetworkAddresses instead of
just one. Hence, to talk to an endpoint instead of one IP:PORT, we'll
have multiple IP:PORT pairs.

This patch simply adds the field and makes changes to compile the
codebase. The first element of the `address` field is used everywhere.
Hence the way we talk to endpoints remains the same with this patch.

NOTE:

Directly accessing the first member of Endpoint::address is unsafe,
as Endpoint() doesn't enforce a non-empty address list. However, since
the correctness tests pass for now and we are replacing all those
unsafe accesses with ones that consider the whole vector anyway, this
patch does not make those accesses safe.
2018-12-13 13:36:52 -08:00
Robert Escriva
268093a96d Adjust all includes to be relative to the root.
Remove the use of relative paths.  A header at foo/bar.h could be included by
files under foo/ with "bar.h", but would be included everywhere else as
"foo/bar.h".  Adjust so that every include references such a header with the
latter form.

Signed-off-by: Robert Escriva <rescriva@dropbox.com>
2018-10-19 17:35:33 +00:00
Alex Miller
fb31a6999f Rewrite all files to have #include actorcompiler.h as the last include. 2018-08-14 15:50:26 -07:00
Alex Miller
535b5701e5 Rewrite all Void _ = wait(...) -> wait(...).
This takes advantage of the new actorcompiler functionality to avoid
having duplicate definitions of `Void _` when trying to feed the
un-actorcompiled source through clang.
2018-08-14 15:50:26 -07:00
Evan Tschannen
13fb59cf11 fix: the cluster controller needs to update its priority immediately 2018-07-06 18:29:54 -07:00
Evan Tschannen
5fc8199abc Swapped OkayFit and UnsetFit, because generally if machine classes are set on one machine they are set everywhere and it helps with wait_for_good_recruitment logic
wait_for_good_recruitment now requires that you have the desired count of each role
remote recruitment is given a much longer wait_for_good_recruitment time interval, which does not start until enough remote machines have registered
2018-06-22 10:15:24 -07:00
Balachandar Namasivayam
59bfa74197 Address review comments. Refactor getLeader function to mask the first 7 bits of changeID and return the masked LeaderInfo. 2018-06-01 18:23:24 -07:00
Balachandar Namasivayam
9f55ccd4a5 Remove extraneous comments. 2018-05-31 15:32:47 -07:00
Balachandar Namasivayam
070366ca70 Optimize client and server connection times to cluster controller, especially in multi DC configurations.
Previously, a majority (quorum) answer from the coordinators was required to connect to the cluster controller.
Now a cluster controller is optimistically selected for connection even if there is no quorum.
2018-05-30 16:48:04 -07:00
Evan Tschannen
c74211bd92 fix: merge problem 2018-03-09 16:52:37 -08:00
Evan Tschannen
91bb8faa45 Merge commit 'f773b9460d31d31b7d421860fc647936f31aa1fa'
# Conflicts:
#	tests/fast/SidebandWithStatus.txt
#	tests/rare/LargeApiCorrectnessStatus.txt
#	tests/slow/DDBalanceAndRemoveStatus.txt
2018-03-09 14:47:03 -08:00
Evan Tschannen
f9625f5b2f fix: new cluster controllers should not consider anything failed until they have time to get failure monitoring updates
fix: storage and log class machines wait 100MS before attempting to become the cluster controller
2018-03-08 18:08:41 -08:00
Evan Tschannen
37a6a81634 Merge commit '7f6fc3e039c911cd84b8540f7f799fc38a1c1822' into feature-remote-logs
# Conflicts:
#	fdbserver/workloads/RestartRecovery.actor.cpp
2018-02-23 12:33:28 -08:00
Alec Grieser
0bae9880f1 remove trailing whitespace from our copyright headers ; fixed formatting of python setup.py 2018-02-21 10:25:11 -08:00
Evan Tschannen
c7b3be5b19 re-enabled better master exists
the cluster controller can choose a better data center for itself and let the workers know where the next cluster controller should be recruited
2018-02-09 16:48:55 -08:00
Yichi Chiang
df922bc973 Change excluded cluster controller 2017-11-14 13:57:37 -08:00
Yichi Chiang
5bcdd37c0d Move UID generation and add initialClass 2017-10-13 13:46:37 -07:00
Yichi Chiang
12edd27281 Introduce prevChangeID to CandidacyRequest and LeaderHeartbeatRequest 2017-10-12 17:11:58 -07:00
Yichi Chiang
636ce4a131 Replace leader when a better one is found 2017-09-29 16:34:55 -07:00
Yichi Chiang
6758c649fc Catch and update processClass change from DBSource 2017-09-25 10:36:03 -07:00
Yichi Chiang
9fe927127f choose leader on the preferred process class 2017-08-28 14:41:04 -07:00
FDB Dev Team
a674cb4ef4 Initial repository commit 2017-05-25 13:48:44 -07:00