1704 Commits

Author SHA1 Message Date
mpilman
1c16f87a4e Remove trace-calls to printable (in non-workloads) 2019-04-05 13:12:19 -07:00
mpilman
ea67b742c7 Implemented Traceable for printable types 2019-04-05 13:12:19 -07:00
mpilman
bb82f8560a process all volatile ints correctly in traces 2019-04-05 13:12:19 -07:00
mpilman
02e3b634fb Compile sqlite with NDEBUG so we can debug 2019-04-05 13:12:19 -07:00
mpilman
c008e16c81 Defer formatting in traces to make them cheaper
This is the first part of making `TraceEvent` cheaper. The main idea is
to defer calls to any code that formats strings. These are the main
changes:

- TraceEvent::detail now takes a c-string instead of std::string for
  literals. This prevents unnecessary allocations if the trace is not
  going to be printed in the first place (for example for SevDebug).
  Previously, `detail` expected a `std::string` as key, which meant that
  any string literal would be copied on each call.
- Templates Traceable and SpecialTraceMetricType. These templates can be
  specialized for any type that needs to be printed. The actual
  formatting will be deferred to after the `enabled` check. This
  provides two benefits: (1) if a TraceEvent is disabled, we don't pay
  for the formatting and (2) TraceEvent can trace types that it doesn't
  know about.
- TraceEvent::enabled will be set in the constructor if the Severity is
  passed. This will make sure that `TraceEvent::init` is not called.
- `TraceEvent::detail` will be inlined. So for disabled TraceEvent
  calls, a call to detail will only introduce an if-branch, which is much
  cheaper than a function call.
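
The changes listed above can be illustrated with a minimal, self-contained sketch. It is not the FoundationDB implementation: `SketchTraceEvent`, `Endpoint`, `Counted`, `formatCalls`, the `toString` member name, the numeric severity values, and the `minSev` parameter are all assumptions made for illustration. It only shows the general pattern: the key is a c-string, `enabled` is decided in the constructor, and `detail` formats through a `Traceable` specialization only after the `enabled` check.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical severity levels (values are illustrative only).
enum Severity { SevDebug = 5, SevInfo = 10 };

// Primary template: a type is not traceable until it is specialized.
template <class T>
struct Traceable : std::false_type {};

template <>
struct Traceable<int> : std::true_type {
    static std::string toString(int v) { return std::to_string(v); }
};

// A user-defined type the trace class knows nothing about.
struct Endpoint { std::string host; int port; };

template <>
struct Traceable<Endpoint> : std::true_type {
    static std::string toString(const Endpoint& e) {
        return e.host + ":" + std::to_string(e.port);
    }
};

// Helper type that counts how often it is actually formatted.
static int formatCalls = 0;
struct Counted { int v; };

template <>
struct Traceable<Counted> : std::true_type {
    static std::string toString(const Counted& c) {
        ++formatCalls;
        return std::to_string(c.v);
    }
};

class SketchTraceEvent {
public:
    // `enabled` is fixed in the constructor from the severity.
    SketchTraceEvent(Severity sev, const char* type, Severity minSev = SevInfo)
      : enabled(sev >= minSev), type(type) {}

    // Defined in-class (implicitly inline). The key is a c-string, so a
    // literal is not copied into a std::string when the event is disabled.
    template <class T>
    SketchTraceEvent& detail(const char* key, const T& value) {
        static_assert(Traceable<T>::value, "specialize Traceable<T> for this type");
        if (enabled) {
            // Formatting happens only after the enabled check.
            fields.emplace_back(key, Traceable<T>::toString(value));
        }
        return *this;
    }

    // Renders the event as "Type key=value ..."; empty if disabled.
    std::string str() const {
        if (!enabled) return "";
        std::string out = type;
        for (const auto& f : fields) out += " " + f.first + "=" + f.second;
        return out;
    }

private:
    bool enabled;
    const char* type;
    std::vector<std::pair<std::string, std::string>> fields;
};
```

In this sketch, calling `detail` on an event constructed with `SevDebug` leaves `formatCalls` unchanged, because the `Traceable<Counted>::toString` call sits behind the `enabled` branch.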
2019-04-05 13:12:19 -07:00
Jingyu Zhou
acf60c5e9a
Merge pull request #1414 from jzhou77/pprof
Add manually triggered heap profiling
2019-04-04 22:27:33 -07:00
Jingyu Zhou
5be592632b Change trace event message
If heap profiler is not running, we can't take a snapshot of the profile.
2019-04-04 15:29:50 -07:00
Jingyu Zhou
f538df5e6c Add TraceEvent if unable to invoke heap profiler 2019-04-04 15:26:41 -07:00
Alex Miller
8f49be480b
Update fdbserver/worker.actor.cpp
Co-Authored-By: jzhou77 <jingyuzhou@gmail.com>
2019-04-04 13:32:10 -07:00
Jingyu Zhou
eaaf58ee34 Refactor profiler into cpu and heap profilers 2019-04-03 20:54:30 -07:00
Jingyu Zhou
3371cf22d4 Add manually triggered heap profiling
At client side:
fdb> profile
ERROR: Usage: profile <client|list|flow|heap>
fdb> profile heap 127.0.0.1:4500

On the server side:
$ HEAPPROFILE=/tmp/fdbserver bin/fdbserver -C ../test.cluster -p 127.0.0.1:4500
Starting tracking the heap
FDBD joined cluster.
Dumping heap profile to /tmp/fdbserver.0001.heap (1024 MB allocated cumulatively, 13 MB currently in use)
Dumping heap profile to /tmp/fdbserver.0002.heap (User triggered heap dump)
2019-04-03 16:00:54 -07:00
Markus Pilman
101a05ae77
Merge branch 'master' into features/client-simulator 2019-04-03 10:03:56 -08:00
Jingyu Zhou
fc59587b3c
Merge pull request #1393 from jzhou77/pprof
Gperftools Profiling fix.
2019-04-03 10:35:31 -07:00
Evan Tschannen
39c595223b Merge branch 'release-6.1' 2019-04-02 22:30:02 -07:00
Evan Tschannen
30133a30e0
Merge pull request #1403 from etschannen/release-6.1
Ported a bug fix to the 6.0 log system, and updated documentation
2019-04-02 17:56:18 -07:00
Jingyu Zhou
56a1128a9b Enhance cmake's gperftools support
Add compiler flags and link flags for gperftools.
2019-04-02 17:34:29 -07:00
Evan Tschannen
31ed73d9f5 Ported the bug fix https://github.com/apple/foundationdb/pull/1379 to OldTLogServer_6_0 2019-04-02 15:27:37 -07:00
Evan Tschannen
1d4a6ab551 cleaned up status to keep the healthyZone read separated from relicaFutures 2019-04-02 14:46:56 -07:00
Evan Tschannen
a38c396283 made all maintenance transactions lock aware 2019-04-02 14:27:48 -07:00
Evan Tschannen
628fec8c8b updated status with information about ongoing maintenance
clear the maintenance zone if a different storage server is detected failed
2019-04-02 14:15:51 -07:00
mpilman
371a41dbba Allow classPath to be modified at runtime 2019-04-02 11:56:40 -07:00
mpilman
e19901186f Fixed buggy register preparation for natives 2019-04-02 11:56:03 -07:00
Evan Tschannen
72203ba47a Merge commit '56f3f0b1bc60604f965152d856ae29a591227703' 2019-04-01 18:45:38 -07:00
Evan Tschannen
781cf9b5a0 added the ability to make a zoneId for maintenance in fdbcli 2019-04-01 17:55:13 -07:00
Evan Tschannen
f5de52de91 fix: cancel the previous log system recruitment before calling newEpoch, to avoid multiple actors attempting to modify oldLogSystem at the same time 2019-04-01 16:38:25 -07:00
Jingyu Zhou
49fdc35e5e Gperftools Profiling fix.
Fix a bug and update gperftools compiling flags

The added flags are recommended by gperftools here:
https://github.com/gperftools/gperftools

Verified that heap profiles are saved with the following command:
HEAPPROFILE=/tmp/fdbserver fdbserver [args...]
2019-04-01 14:42:18 -07:00
mpilman
b148981bba Fixed compilation issues with char* 2019-04-01 14:29:45 -07:00
mpilman
e23e63c6ac Implemented JavaWorkload
This change allows a user to write a workload in Java.

The way this is implemented is by creating a JVM within the
simulator and calling the corresponding workload class. A
workload can then run in the simulator or on a testing cluster.

If the workload is executed within the simulator, the resulting
test will no longer be deterministic, as it will execute in a
different thread (and even without that, it is not clear whether
we could get determinism, as the JVM does a lot of things that are
not deterministic).

This is intended to improve testing of the Java client. Layer
authors can use the simulator to test their layers on a single
machine while still simulating failing machines etc.
2019-03-31 17:57:43 -07:00
Evan Tschannen
a46620fbee Merge branch 'release-6.1' 2019-03-30 17:59:28 -07:00
Evan Tschannen
8ebf771392 cleanup cluster controller trace events 2019-03-30 14:17:18 -07:00
Alex Miller
e7ad39246c
Fix typo 2019-03-29 20:16:26 -07:00
Evan Tschannen
a44ffd851e fix: the shared tlog could fail to update a stopped tlog’s queueCommitVersion to version if a second tlog registered before it could issue the first commit for the tlog 2019-03-29 20:11:30 -07:00
Evan Tschannen
d882c060bf Merge commit '5dd6396eed0de0dfea6cf9eecc307995eff5cedc' 2019-03-28 18:00:55 -07:00
Balachandar Namasivayam
0bbdc15f71 Multi-test processes wait until a timeout if any of the tester processes restarts. Use getReplyUnlessFailedFor instead of getReply to detect the restarts and fail quickly instead of waiting for the timeout, which is usually large. 2019-03-28 17:05:30 -07:00
Evan Tschannen
b6008558d3 renamed BinaryWriter.toStringRef() to .toValue(), because the function now returns a Standalone<StringRef>()
eliminated an unnecessary copy from the proxy commit path
eliminated an unnecessary copy from buffered peek cursor
2019-03-28 11:52:50 -07:00
Evan Tschannen
836bb95a7a
Merge pull request #1372 from etschannen/master
Merge 6.1 into master
2019-03-27 21:00:49 -07:00
Evan Tschannen
34b9d5e722
Merge pull request #1364 from etschannen/feature-fast-serialize
A few performance optimizations
2019-03-27 20:57:25 -07:00
Evan Tschannen
e5a80f2c94 optimized IPaddress 2019-03-27 18:21:13 -07:00
A.J. Beamon
91014d4529 Add file changes that I accidentally failed to commit; fix naming issue in worker. 2019-03-27 08:41:19 -07:00
A.J. Beamon
71e2fdafb8 Changes to ratekeeper camel case 2019-03-27 08:24:25 -07:00
A.J. Beamon
d508658569 Make ratekeeper one word to match our existing convention 2019-03-27 08:15:19 -07:00
Jingyu Zhou
38c6681349 Fix some signed and unsigned mismatch warnings. 2019-03-26 14:54:11 -07:00
Jingyu Zhou
c0b58080ee Fix type name warning for DDTeamCollection
Seen using 'class' now seen using 'struct' in DataDistribution.actor.cpp
2019-03-26 14:18:25 -07:00
Jingyu Zhou
7c02ee6fdd Fix compiler warning about unreferenced exception variable 2019-03-26 13:43:47 -07:00
Jingyu Zhou
466a59a99d Merge remote-tracking branch 'apple/release-6.1' into ratekeeper 2019-03-25 15:27:38 -07:00
Jingyu Zhou
f57a22e2ed Add data distributor and ratekeeper to status output 2019-03-25 15:11:29 -07:00
Evan Tschannen
5e03e178de
Merge pull request #1345 from ajbeamon/support-multiple-client-or-worker-issues
Add support for a client or worker having multiple issues.
2019-03-24 17:27:50 -07:00
Evan Tschannen
d45159ebf7
Merge pull request #1307 from jzhou77/ratekeeper
Monitor placement of Ratekeeper and DataDistributor
2019-03-24 17:26:07 -07:00
Evan Tschannen
d6ad027d37 ratekeeper needs to be recruited for proxies to make progress, so if one has not registered with the cluster controller by the time we are accepting commits, recruit a new one 2019-03-24 16:48:24 -07:00
Evan Tschannen
f426d732ea fix: forgot to remove one location where id_used was incremented for distributor and ratekeeper 2019-03-24 16:04:59 -07:00