40 Commits

Author SHA1 Message Date
Xiaoge Su
0a60142160 Extract ProcessInfo, MachineInfo, KillType out from ISimulator 2023-01-24 14:48:42 -08:00
FoundationDB CI
86d6106dc1
format source code after switch to clang 15 2022-12-08 17:26:45 +00:00
Markus Pilman
fb1390ef3d reformat code 2022-10-14 08:47:56 -06:00
Markus Pilman
49f0cf5ab0 Force name and description of workloads to be the same 2022-10-13 20:53:48 -06:00
Nim Wijetunga
37b93a6232
fix tests (#8286) 2022-09-22 11:34:06 -07:00
A.J. Beamon
4fd64630e8 Convert literal string ref instances to use _sr suffix 2022-09-19 11:35:58 -07:00
sfc-gh-tclinkenbeard
82adc1e856 Make g_simulator a pointer 2022-09-15 09:00:33 -07:00
Jingyu Zhou
3379f1e974 Move resolutionBalancing() back to master
This revert the behavior done by a recent refactor on master recovery in PR #6191.
2022-03-23 09:57:31 -07:00
sfc-gh-tclinkenbeard
a71099471b Update copyright header dates 2022-03-21 13:36:23 -07:00
A.J. Beamon
e882eb33fc Abstract the cluster file into a cluster connection record that can be backed by something other than the filesystem. 2021-10-22 11:05:18 -07:00
Chaoguang Lin
65956ae6b7 Refactor configure command; refactor changeConfig to template code to reuse existing tests 2021-09-21 10:06:04 -07:00
Xiaoge Su
abf73047ca Enforce std:: specifier rather than using namespace 2021-09-16 19:40:28 -07:00
Josh Slocum
6aabd9a03e Adding FIXME for simulation issue 2021-08-17 18:18:28 -05:00
Steve Atherton
507c1f11e3 Add .log() to bare TraceEvent() invocations without any .detail()s to avoid clang-tidy warning about immediate destruction of object without use. 2021-07-26 19:55:10 -07:00
FDB Formatster
df90cc89de apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-03-10 10:18:07 -08:00
sfc-gh-tclinkenbeard
7f0d14c8e4 Modernize/refactor workloads directory 2020-10-04 22:29:07 -07:00
Balachandar Namasivayam
a5af31de23 Addressed simple review comments 2020-03-31 18:34:13 -07:00
Balachandar Namasivayam
b1c3893d40 Fix some corner case bugs exposed by simulation.
In one case, when a SS joins the cluster and DD doesn't find any healthy server to form a team with the newly added server, then the SS does not get added to any team even when the other servers get healthy.
Another is an extreme case where a data center is down, and a SS in the active DC joins and then dies immediately but not before DD adds it to a destination team for a relocating shard which will result in DD waiting indefinitely for the dead data center to come back up for the cluster to be fully recovered.
2020-03-31 18:33:12 -07:00
Evan Tschannen
4a866290b7 Clients keep a persistent connection open with coordinators to get updates to the list of proxies
Status still needs to be updated with client information with information from the coordinators
2019-07-23 19:22:44 -07:00
A.J. Beamon
5f55f3f613 Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
Andrew Noyes
d7612a4426 Fix OPEN_FOR_IDE build errors 2019-04-05 16:30:42 -07:00
Evan Tschannen
b8910ba7cd Merge branch 'master' into feature-fix-force-recovery
# Conflicts:
#	fdbclient/ManagementAPI.actor.h
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	fdbserver/workloads/KillRegion.actor.cpp
2019-02-22 14:38:13 -08:00
Evan Tschannen
27e3617548 fix: remove bad teams needed to use dd_stall_check delay, because in simulation the buggified delay time could make us remove bad teams before they submit their ranges to the queue 2019-02-20 14:18:36 -08:00
mpilman
999ea09bfd Use correct fwd decls in TesterInterface
Also TesterInterface.h -> TesterInterface.actor.h
2019-02-19 15:16:59 -08:00
mpilman
699216f713 Use fwd decls in workloads
Also workloads.h -> workloads.actor.h
2019-02-19 15:16:59 -08:00
mpilman
3f0fd2a20c Use fwd decls in WorkerInterface
Also WorkerInterface.h -> WorkerInterface.actor.h
2019-02-19 15:16:59 -08:00
mpilman
0bb60e5a3b Use proper fwd decl in NativeAPI
Also NativeAPI.h -> NativeAPI.actor.h
2019-02-19 15:16:59 -08:00
mpilman
3cb2391b58 use proper fwd declarations in ManagementAPI
Also ManagementAPI.h -> ManagementAPI.actor.h
2019-02-19 15:16:59 -08:00
Evan Tschannen
8ed89fd711 fixed review comments 2019-02-19 11:26:53 -08:00
Evan Tschannen
ed9e20ce17 forgot to fix merge conflicts 2019-02-18 17:09:55 -08:00
Evan Tschannen
065a45e05f Merge branch 'master' into feature-fix-force-recovery
# Conflicts:
#	fdbclient/ManagementAPI.actor.cpp
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/workloads/KillRegion.actor.cpp
2019-02-18 17:09:06 -08:00
Evan Tschannen
62603d11a1 updated the killRegion simulation test to test a much larger variety of failure scenarios 2019-02-18 15:32:51 -08:00
Andrew Noyes
067a445e06 Replace unused _ variables with wait(success(...)) 2019-02-12 17:30:30 -08:00
Evan Tschannen
4b5d0b4e2c Merge branch 'release-6.0'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/AsyncFileBlobStore.actor.cpp
#	fdbclient/AsyncFileBlobStore.actor.h
#	fdbclient/BlobStore.actor.cpp
#	fdbclient/BlobStore.h
#	fdbclient/HTTP.actor.cpp
#	fdbclient/ManagementAPI.actor.cpp
#	fdbclient/NativeAPI.actor.cpp
#	fdbrpc/LoadBalance.actor.h
#	fdbrpc/batcher.actor.h
#	fdbrpc/fdbrpc.vcxproj
#	fdbrpc/sim2.actor.cpp
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/DataDistributionTracker.actor.cpp
#	fdbserver/SimulatedCluster.actor.cpp
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/masterserver.actor.cpp
2018-11-10 13:04:24 -08:00
Evan Tschannen
c02690471d added protection against configuration changes which cannot be immediately reverted
the configure database workload tests region configurations
2018-11-04 19:53:55 -08:00
Robert Escriva
268093a96d Adjust all includes to be relative to the root.
Remove the use of relative paths.  A header at foo/bar.h could be included by
files under foo/ with "bar.h", but would be included everywhere else as
"foo/bar.h".  Adjust so that every include references such a header with the
latter form.

Signed-off-by: Robert Escriva <rescriva@dropbox.com>
2018-10-19 17:35:33 +00:00
Evan Tschannen
3922e477a5 Merge branch 'release-6.0'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/ManagementAPI.actor.cpp
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/LogSystemDiskQueueAdapter.actor.cpp
#	fdbserver/SimulatedCluster.actor.cpp
#	fdbserver/TLogServer.actor.cpp
2018-10-03 16:57:18 -07:00
Evan Tschannen
c9f4109539 fix: add some additional time in the kill region workload to detect if we recovered successfully 2018-10-02 17:47:15 -07:00
Evan Tschannen
b560b94ebc fix: do not force a recovery if the master was already in the other region (and therefore already recovered)
fix: reboot the remaining DC, because any storage server rejoins that were rolled back will cause that server to be unusable
2018-09-28 12:10:04 -07:00
Evan Tschannen
200e65fe61 added a workload which tests killing an entire region, and recovering from the failure with data loss.
fix: we cannot pop the txs tag from remote logs until they have a full copy of the txnStateStore
fix: we have to modify all of history, we cannot stop after finding a local remote
2018-09-17 18:32:39 -07:00