76 Commits

Author SHA1 Message Date
sfc-gh-tclinkenbeard
3755b25c43 Make IDiskQueue const-correct 2020-07-21 14:45:04 -07:00
Evan Tschannen
519ac70a2a Revert "Enable -Wclass-memaccess and fix warnings" 2020-04-29 15:51:29 -07:00
tclinken
b1f525583a Added -Wclass-memaccess compiler option and fixed warnings 2020-04-22 21:53:42 -07:00
Alex Miller
da73164eda Move crc32c from fdbrpc to flow
So that we can use it from a piece of flow code without breaking module
boundaries.

Also rename generated-constants to crc32c-generated-constants so that
it's more apparent that they're related files.
2020-01-13 18:19:30 -08:00
Alvin Moore
3bf971ba8b Merge branch 'release-6.2' of github.com:apple/foundationdb into release_6.2_merge
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/storageserver.actor.cpp
2019-12-12 07:13:12 -08:00
Andrew Noyes
46d10dc7dc Fix "null passed as argument declared not null"
Fix several such reports from ubsan

E.g.

/Users/anoyes/workspace/foundationdb/flow/Arena.h:794:16: runtime error: null pointer passed as argument 1, which is declared to never be null
2019-12-03 14:46:53 -08:00
Alex Miller
35a0fc948d Make DiskQueue V1 not ignore min recovery location.
I can't figure out why I made this branch on version, and it's breaking
having value and reference tlogs in the same SharedTLog
2019-10-03 01:45:10 -07:00
Meng Xu
c9c50ceff8 Comments:Add comments to DiskQueue
No functional change.
2019-08-01 15:20:01 -07:00
A.J. Beamon
e5381e0612 Fix some new usages of g_random 2019-05-23 09:23:27 -07:00
A.J. Beamon
603721e125 Merge branch 'master' into thread-safe-random-number-generation
# Conflicts:
#	fdbclient/ManagementAPI.actor.cpp
#	fdbrpc/AsyncFileCached.actor.h
#	fdbrpc/genericactors.actor.cpp
#	fdbrpc/sim2.actor.cpp
#	fdbserver/DiskQueue.actor.cpp
#	fdbserver/workloads/BulkSetup.actor.h
#	flow/ActorCollection.actor.cpp
#	flow/Net2.actor.cpp
#	flow/Trace.cpp
#	flow/flow.cpp
2019-05-23 08:35:47 -07:00
Alex Miller
4a7e0319c7 Refactor away pushlock.
Pushing was already a serialized, sequential operation.

Instead make it explicit that there are two waits as part of a push:
1. The setup work to reserve a spot in the file
2. The work of writing and sync'ing the data

And we return a Future<Future<Void>> to force these to be done sequentially.
2019-05-10 20:30:52 -10:00
Alex Miller
ea12a54946 Rename DISK_QUEUE_MAX_TRUNCATE_EXTENTS -> ..._BYTES
So as to not make filesystem assumptions.  This knob did technically
appear in (only the) 6.1.5 release, but this feature was broken in 6.1.5,
and thus impossible to use anyway.
2019-05-10 18:26:22 -10:00
Alex Miller
c95d09f9fd Convert truncate(0) to truncate(4KB) on Windows.
Blindly, in case Windows doesn't like 0 length truncates too.
2019-05-10 14:55:11 -10:00
Alex Miller
c502ed3d15 Fix a variety of problems stemming from a wait() being added to push().
And that this code was previously insufficiently tested.
2019-05-10 14:55:11 -10:00
A.J. Beamon
5f55f3f613 Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
Alex Miller
510b0b2fcd Fix DiskQueue not replaceFile'ing frequently enough for the final time. 2019-05-08 23:08:25 -10:00
Alex Miller
c6c33a4daa Make replaceFile more likely to be tested. 2019-05-08 21:23:42 -10:00
Alex Miller
0d0f54d1e6 Fix IAsyncFileSystem::open() flags to stop a crash.
OPEN_ATOMIC_WRITE_AND_CREATE was missing a required OPEN_CREATE.

I'm honestly baffled how this was missed in testing.
2019-05-08 21:22:40 -10:00
Alex Miller
b50926c792 replaceFile is truncate(0) on windows 2019-05-08 21:22:14 -10:00
Alex Miller
e4ba2f5788 Add an ending TraceEvent. 2019-05-08 12:35:12 -10:00
Alex Miller
c093017c2f Add a TraceEvent and release note. 2019-05-08 12:34:25 -10:00
Alex Miller
0685e6c1c7 Avoid large truncates in the DiskQueue.
And instead create a new file while incrementally truncating the old one
down.  This avoids queueing up a massive number of filesystem metadata
operations in one call, thus flooding the disk with requests and
stalling out all other filesystem operations.

This sets the knobs so that a truncate of >10GB causes us to create a
new file rather than trying to truncate the old one.
2019-05-08 12:33:31 -10:00
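
The decision described in 0685e6c1c7 can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `ShrinkAction`, `chooseShrinkAction`, and `kMaxTruncateBytes` are hypothetical names, and the threshold is assumed to be 10 * 10^9 bytes only because the message says ">10GB".

```cpp
#include <cstdint>

// Hypothetical names throughout; the real knob and logic live in the DiskQueue source.
enum class ShrinkAction { None, Truncate, ReplaceFile };

// Assumed threshold, mirroring the ">10GB" figure in the commit message.
constexpr int64_t kMaxTruncateBytes = 10LL * 1000 * 1000 * 1000;

// Decide how to shrink a file from currentSize down to targetSize (both in bytes).
ShrinkAction chooseShrinkAction(int64_t currentSize, int64_t targetSize) {
    if (targetSize >= currentSize) {
        return ShrinkAction::None; // nothing to reclaim
    }
    const int64_t excess = currentSize - targetSize;
    // A very large truncate queues many filesystem metadata operations at once,
    // so past the threshold we switch to a new file and shrink the old one incrementally.
    return excess > kMaxTruncateBytes ? ShrinkAction::ReplaceFile : ShrinkAction::Truncate;
}
```

Under these assumptions, shrinking a 50 GB file to 4 GB would select `ReplaceFile`, while shrinking a 6 GB file to 4 GB would select `Truncate`.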
Alex Miller
36dfbf4fb3 Only truncate DiskQueues down to TLOG_HARD_LIMIT*2.
DiskQueue shrinking was implemented for spill-by-reference, as now
a DiskQueue could grow "unboundedly" large.

Without a minimum file size, write burst workloads would cause the
DiskQueue to shrink down to 100MB, and then grow back to its usual ~4GB
size in a cycle.  File growth means filesystem metadata mutations, which
we'd prefer to avoid if possible since they're more unpredictable in
terms of latency.

In a healthy cluster, the TLog never spills, so the size of a single
DiskQueue file should stay less than 2*TLOG_SPILL_THRESHOLD.  In the
worst case of spill-by-value, the DiskQueue could grow to
2*TLOG_HARD_LIMIT.  Therefore, having this limit will cause DiskQueue
shrinking to never behave sub-optimally for spill-by-value, and will
cause the DiskQueue files to return to the optimal size with
spill-by-reference.
2019-05-08 12:33:31 -10:00
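
As a rough sketch of the floor described in 36dfbf4fb3: the shrink target is clamped so that it never drops below twice the hard limit. `shrinkTarget` and its parameters are hypothetical names for illustration; the actual knob values are not stated in this log and are passed in here as plain arguments.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: pick the size (in bytes) a DiskQueue file may be truncated down to.
// activeBytes is the amount of data still needed; hardLimitBytes stands in for TLOG_HARD_LIMIT.
int64_t shrinkTarget(int64_t activeBytes, int64_t hardLimitBytes) {
    // Never shrink below 2 * hard limit, so a burst of writes does not force
    // the file to grow again right after it was truncated.
    return std::max(activeBytes, 2 * hardLimitBytes);
}
```

With an assumed hard limit of 1000 bytes, a queue holding 100 active bytes would shrink only to 2000 bytes, while one holding 5000 active bytes would keep all 5000.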
Alex Miller
a269a784cc Convert push() into an actor. 2019-05-08 12:33:31 -10:00
Alex Miller
37ea71b117 Implement limiting how many bytes recovery will read.
This time, track what location in the DiskQueue has been spilled in
persistent state, and then feed it back into the disk queue before
recovery.

This also introduces an ASSERT that recovery only reads exactly the
bytes that it needs to have in memory.
2019-03-18 15:09:43 -07:00
Alex Miller
ee4721a63f Make checking or ignoring checksums part of the IDiskQueue::read API. 2019-03-15 21:01:18 -07:00
Alex Miller
bf247eeed0 If TLogVersion >= 3, use crc32c for the DiskQueue hash for TLogs.
We don't have a forward compatibility story for the memory storage
engine, so its DiskQueue will still be hashlittle2 until one exists.
2019-03-15 21:01:16 -07:00
Alex Miller
686b097397 Remove verification code from DiskQueue and TLogServer. 2019-03-15 21:01:15 -07:00
Alex Miller
bdd7d5d3df Initialize firstPages with 0xFF.
There are various ASSERT()s that assume firstPages is empty, and enforce
things about `seq`.  Some of these asserts have spuriously passed, since
uninitialized pages look like they have a `seq` of 0, which would be the
beginning of the disk queue.

Now they'll look like the end of the disk queue, which is far easier to
fail on.
2019-03-15 21:01:14 -07:00
Alex Miller
baa3e1af2c Replace /sizeof(Page)*sizeof(Page) with pageFloor(). 2019-03-04 01:42:39 -08:00
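
The idiom replaced in baa3e1af2c rounds a byte count down to a page boundary using integer division. A minimal sketch of such a helper follows; `kPageSize` is an assumed stand-in for `sizeof(Page)`, whose real value is not given in this log, and the signature is a guess rather than the one in the source.

```cpp
#include <cstdint>

// Assumed page size for illustration only; the real code uses sizeof(Page).
constexpr int64_t kPageSize = 4096;

// Round a non-negative byte count down to the nearest multiple of the page size.
// Integer division discards the remainder, so multiplying back yields the floor.
constexpr int64_t pageFloor(int64_t bytes) {
    return bytes / kPageSize * kPageSize;
}
```

Naming the operation makes call sites read as intent (`pageFloor(offset)`) rather than as arithmetic that has to be decoded each time.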
Alex Miller
ee64b43366 Change DQ shrink logic to consider "active" bytes rather than file size.
We know what the current ideal size of the DQ file should be, so we
should use it.
2019-03-04 01:42:39 -08:00
Alex Miller
94bf75cb00 Allow the disk queue to shrink if it has unneeded slack space. 2019-03-04 01:42:38 -08:00
Alex Miller
52d5a721a6 Don't allocate 2x the memory for a read to save 1% of allocated memory. 2019-03-04 01:42:38 -08:00
Alex Miller
ee8964c8ec Plumb through getNext{Commit,Push}Location 2019-02-26 18:00:55 -08:00
Alex Miller
b725d841ea Restore a hash check as an ASSERT_WE_THINK 2019-02-19 22:30:15 -08:00
Alex Miller
334730ce7d Do not re-hash firstPages.
They're already known to be valid.
2019-02-19 22:10:46 -08:00
Alex Miller
12123f41d6 Plumb a read function up the stack to IDiskQueue 2019-02-12 23:44:13 -08:00
Alex Miller
6c7229ec07 read fix while recovery 2019-02-12 23:44:13 -08:00
Alex Miller
8b21d1ac8f Add a standalone recovery initialization function. 2019-02-12 23:44:13 -08:00
Alex Miller
2f49acc8a0 Add a read function. 2019-02-12 23:44:13 -08:00
Alex Miller
63eb62cd36 Fix a bug when a read was delayed until after the entire disk queue has been rewritten. 2019-02-12 23:44:13 -08:00
Alex Miller
9886386a83 temporarily verify committed data as a test for read 2019-02-12 23:44:13 -08:00
Alex Miller
efa8aa7e2e Adjust findPhysicalLocation to not spam.
Context is now optional, so that our high-volume calls don't get logged,
but low-volume calls still get logged the same way that they did before.
2019-02-12 23:44:13 -08:00
Alex Miller
f1c31e2305 Add a read function to disk queue 2019-02-12 23:44:13 -08:00
Alex Miller
2d2b03a9ff prepare DiskQueue for actors 2019-02-12 23:44:13 -08:00
Alex Miller
40fe29c29b Abstract TrackMe into a reusable CRTP class. 2019-02-12 23:44:13 -08:00
Alex Miller
018d12fe90 use firstpages instead of recoveryfirstpages 2019-02-12 23:43:10 -08:00
Alex Miller
dbf7cefcd8 Add firstPages to DiskQueue 2019-02-12 23:43:10 -08:00
Alex Miller
2570b37e6e Add function to read pages from RawDiskQueue_TwoFiles 2019-02-12 23:43:10 -08:00
Andrew Noyes
067a445e06 Replace unused _ variables with wait(success(...)) 2019-02-12 17:30:30 -08:00