Alex Miller
4a7e0319c7
Refactor away pushlock.
...
Pushing was already a serialized, sequential operation.
Instead make it explicit that there are two waits as part of a push:
1. The setup work to reserve a spot in the file
2. The work of writing and sync'ing the data
And we return a Future<Future<Void>> to force these to be done sequentially.
2019-05-10 20:30:52 -10:00
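The two waits described in 4a7e0319c7 can be illustrated with a minimal sketch. It uses Python's asyncio rather than Flow, and `SketchQueue`, `_write_and_sync`, and the event log are hypothetical names invented for illustration, not FoundationDB code. The outer await plays the role of the outer Future (reserving a spot), and the returned awaitable plays the role of the inner Future (write and sync).

```python
import asyncio


class SketchQueue:
    """Hypothetical stand-in for a disk queue; not FoundationDB's DiskQueue."""

    def __init__(self):
        self.next_offset = 0
        self.log = []  # records the order in which stages complete

    async def push(self, data: bytes):
        # Wait 1: setup work to reserve a spot in the file.
        await asyncio.sleep(0)  # simulated setup wait
        offset = self.next_offset
        self.next_offset += len(data)
        self.log.append(("reserved", offset))
        # Wait 2 is handed back to the caller as a second awaitable.
        return self._write_and_sync(offset, data)

    async def _write_and_sync(self, offset: int, data: bytes):
        await asyncio.sleep(0)  # simulated write + sync
        self.log.append(("synced", offset))


async def demo():
    q = SketchQueue()
    for payload in (b"abcd", b"ef"):
        written = await q.push(payload)  # first wait: reservation
        await written                    # second wait: write + sync
    return q.log


events = asyncio.run(demo())
```

Because the caller cannot obtain the second awaitable without first awaiting the reservation, the two stages are forced to run in order for each push.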
Alex Miller
ea12a54946
Rename DISK_QUEUE_MAX_TRUNCATE_EXTENTS -> ..._BYTES
...
So as to not make filesystem assumptions. This knob did technically
appear in (only) the 6.1.5 release, but this feature was broken in 6.1.5,
and thus impossible to use anyway.
2019-05-10 18:26:22 -10:00
Alex Miller
c95d09f9fd
Convert truncate(0) to truncate(4KB) on Windows.
...
Blindly, in case Windows doesn't like 0 length truncates too.
2019-05-10 14:55:11 -10:00
Alex Miller
c502ed3d15
Fix a variety of problems stemming from a wait() being added to push().
...
And that this code was previously insufficiently tested.
2019-05-10 14:55:11 -10:00
Alex Miller
510b0b2fcd
Fix DiskQueue not replaceFile'ing frequently enough for the final time.
2019-05-08 23:08:25 -10:00
Alex Miller
c6c33a4daa
Make replaceFile more likely to be tested.
2019-05-08 21:23:42 -10:00
Alex Miller
0d0f54d1e6
Fix IAsyncFileSystem::open() flags to stop a crash.
...
OPEN_ATOMIC_WRITE_AND_CREATE was missing a required OPEN_CREATE.
I'm honestly baffled how this was missed in testing.
2019-05-08 21:22:40 -10:00
Alex Miller
b50926c792
replaceFile is truncate(0) on Windows
2019-05-08 21:22:14 -10:00
Alex Miller
e4ba2f5788
Add an ending TraceEvent.
2019-05-08 12:35:12 -10:00
Alex Miller
c093017c2f
Add a TraceEvent and release note.
2019-05-08 12:34:25 -10:00
Alex Miller
0685e6c1c7
Avoid large truncates in the DiskQueue.
...
And instead create a new file while incrementally truncating the old one
down. This avoids queueing up a massive number of filesystem metadata
operations in one call, thus flooding the disk with requests and
stalling out all other filesystem operations.
This sets the knobs so that a truncate of >10GB causes us to create a
new file rather than trying to truncate the old one.
2019-05-08 12:33:31 -10:00
Alex Miller
36dfbf4fb3
Only truncate DiskQueues down to TLOG_HARD_LIMIT*2.
...
DiskQueue shrinking was implemented for spill-by-reference, as now
a DiskQueue could grow "unboundedly" large.
Without a minimum file size, write burst workloads would cause the
DiskQueue to shrink down to 100MB, and then grow back to its usual ~4GB
size in a cycle. File growth means filesystem metadata mutations, which
we'd prefer to avoid if possible since they're more unpredictable in
terms of latency.
In a healthy cluster, the TLog never spills, so the size of a single
DiskQueue file should stay less than 2*TLOG_SPILL_THRESHOLD. In the
worst case of spill-by-value, the DiskQueue could grow to
2*TLOG_HARD_LIMIT. Therefore, having this limit will cause DiskQueue
shrinking to never behave sub-optimally for spill-by-value, and will
cause the DiskQueue files to return to the optimal size with
spill-by-reference.
2019-05-08 12:33:31 -10:00
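Taken together, 0685e6c1c7 and 36dfbf4fb3 describe a shrink policy that can be sketched roughly as follows. This is an assumption-laden illustration, not the DiskQueue implementation: `plan_shrink`, its parameters, and the returned action names are hypothetical, `min_size` stands in for 2*TLOG_HARD_LIMIT, and the 10 GiB default only mirrors the ">10GB" figure from the commit message.

```python
GiB = 1 << 30


def plan_shrink(file_size, bytes_in_use, min_size, max_truncate_bytes=10 * GiB):
    """Pick a shrink action for one queue file (hypothetical sketch).

    min_size stands in for 2*TLOG_HARD_LIMIT; max_truncate_bytes stands in
    for the knob that caps how much a single truncate may remove.
    """
    # Never shrink below the floor, so bursty workloads do not cause a
    # shrink/grow cycle.
    target = max(bytes_in_use, min_size)
    if target >= file_size:
        return ("keep", file_size)
    if file_size - target > max_truncate_bytes:
        # Too much to drop in one call: switch to a new file and truncate
        # the old one down incrementally.
        return ("replace_file", target)
    return ("truncate", target)


big = plan_shrink(50 * GiB, 1 * GiB, 8 * GiB)
small = plan_shrink(12 * GiB, 1 * GiB, 8 * GiB)
tiny = plan_shrink(4 * GiB, 1 * GiB, 8 * GiB)
```

In this sketch, a 50 GiB file is replaced, a 12 GiB file is truncated in place to the 8 GiB floor, and a 4 GiB file is left alone because it is already under the floor.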
Alex Miller
a269a784cc
Convert push() into an actor.
2019-05-08 12:33:31 -10:00
Evan Tschannen
68c773987c
Merge pull request #1544 from etschannen/release-6.1
...
The team tracker does not provide data movement priority information for non-failure related data movement
2019-05-08 11:39:17 -07:00
Balachandar Namasivayam
d45e7bf0b1
Addressed review comments
2019-05-07 17:19:59 -07:00
Evan Tschannen
d9a4553270
fix: The team tracker does not provide data movement priority information for non-failure related data movement
2019-05-07 17:06:54 -07:00
Balachandar Namasivayam
5d824f5fbc
Address review comments
2019-05-07 17:06:52 -07:00
Balachandar Namasivayam
a0cc3d98a1
Add a workload to trigger repeated recoveries.
2019-05-06 18:16:44 -07:00
Evan Tschannen
93eb2a9395
Merge pull request #1527 from alexmiller-apple/tstlog-6.1
...
Spill-by-reference knob + TLog6.0 Spilled Peek deprioritization
2019-05-03 17:19:45 -07:00
Alex Miller
c918b21137
Deprioritize spilled peeks in spill-by-value, and improve its logic.
...
This deprioritizes before calling peekMessagesFromMemory, which should
improve the memory usage of the TLog, and makes sure to keep txsTag
peeks at a high priority to help recoveries stay fast.
2019-05-03 15:27:11 -07:00
Alex Miller
4052f3826a
Add a knob to limit the number of commits indexed per key.
...
Theoretically, we could spill 20MB of 22B mutations for one key, which
would generate a very long value being stored in SQLite, and very
inefficiently read back. This stops that from being a problem, at the
cost of some extra write calls.
2019-05-03 15:27:10 -07:00
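The trade-off in 4052f3826a (shorter stored values at the cost of extra writes) can be sketched as simple batching. `batch_spilled_refs` and `max_per_value` are hypothetical names for illustration; the actual knob name and on-disk encoding are not given in the commit message.

```python
def batch_spilled_refs(refs, max_per_value):
    """Split commit references into values of bounded length (hypothetical sketch).

    Each returned batch would be written as its own value, so no single
    stored value grows without bound.
    """
    if max_per_value <= 0:
        raise ValueError("max_per_value must be positive")
    return [refs[i:i + max_per_value] for i in range(0, len(refs), max_per_value)]


batches = batch_spilled_refs(list(range(7)), 3)
```

Seven references with a limit of three produce three writes instead of one, which is the "extra write calls" cost the message mentions.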
Evan Tschannen
12088119d2
Merge pull request #1517 from alexmiller-apple/tstlog-6.1
...
Add a knob to limit amount of data read from sqlite for one PeekRequest.
2019-05-03 11:01:11 -07:00
Alex Miller
f4e48c3851
Add a knob to limit amount of data read from sqlite for one PeekRequest.
...
This prevents peeking from degrading over time if there are a very large
number of SpilledData entries for one particular tag.
2019-05-02 17:26:45 -07:00
Evan Tschannen
c91ac03ec6
LogRouterStats did not need to be a separate struct
2019-05-02 17:24:39 -07:00
Evan Tschannen
8590b710bf
added additional logging on the logs and log routers
2019-05-02 17:24:39 -07:00
Evan Tschannen
cacd82758e
Reduced data distribution speeds
2019-04-26 13:54:49 -07:00
Evan Tschannen
9ff8aca1da
Increased the SQLITE_CHUNK_SIZE to 100MB (left at 4MB for simulation)
2019-04-26 13:53:56 -07:00
Evan Tschannen
1f37f82b87
invalid knob overrides do not prevent fdbserver from starting
2019-04-25 17:08:13 -07:00
Evan Tschannen
6c77864731
separate GetStorageServerRejoinInfoRequest from GetKeyServerLocationsRequest, to avoid yielding for the rejoin requests
2019-04-25 17:07:35 -07:00
A.J. Beamon
253d2400ef
Merge branch 'release-6.1' into speed-up-and-parameterize-spring-cleaning
...
# Conflicts:
# documentation/sphinx/source/release-notes.rst
2019-04-23 14:38:52 -07:00
A.J. Beamon
ea7abff9df
Clean up from review
2019-04-23 14:16:52 -07:00
A.J. Beamon
4ad0496b39
Increase the frequency that lazy deletes are run. Add more parameters for better control over the spring cleaning process.
2019-04-23 14:01:51 -07:00
Stephen Atherton
df0548503d
Merge branch 'release-6.1' of https://github.com/apple/foundationdb into sqlite-grow-bigger
2019-04-23 13:43:58 -07:00
Stephen Atherton
83db547306
Implemented the chunk size and db size hint fileControl options in our SQLite VFS implementation. KeyValueStoreSQLite now sets file chunk size based on a new knob, SQLITE_CHUNK_SIZE_PAGES.
2019-04-23 04:50:58 -07:00
Evan Tschannen
e0f7ec96aa
Data distribution needs to build new teams as old teams are removed to ensure data remains balanced across servers
2019-04-22 17:29:46 -07:00
A.J. Beamon
43533b3d72
Don't validate the shard size estimate unless enough keys are sampled with a less than 100% probability.
2019-04-17 11:01:23 -07:00
Balachandar Namasivayam
04e9aa6afd
For small clusters that are growing quickly, it could happen that the rateLimit is set to a low value and it would take very long to read the entire database. Fix this by setting the rateLimit to the maximum allowed value if reading the entire database is taking a long time.
2019-04-10 17:13:37 -07:00
Evan Tschannen
d126730b4d
fixed a spurious test error where process_behind was treated as an error
2019-04-08 17:09:54 -07:00
A.J. Beamon
538b431656
Apply suggestions from code review
2019-04-08 14:55:58 -07:00
A.J. Beamon
a7288e1325
Throw process_behind instead of future_version when all storage nodes on a team are behind. process_behind gets the same backoff behavior as not_committed. Add proxy_memory_limit_exceeded to the retryable predicate.
2019-04-08 14:21:24 -07:00
Evan Tschannen
05869a8383
do not log a degraded reset message if the previous reset was more than a week ago
2019-04-07 23:00:58 -07:00
Evan Tschannen
390ab9cfed
A process will mark itself as degraded if it continually disconnects from a different process which the failure monitor thinks is healthy
2019-04-04 14:11:12 -07:00
Evan Tschannen
30133a30e0
Merge pull request #1403 from etschannen/release-6.1
...
Ported a bug fix to the 6.0 log system, and updated documentation
2019-04-02 17:56:18 -07:00
Evan Tschannen
31ed73d9f5
Ported the bug fix https://github.com/apple/foundationdb/pull/1379 to OldTLogServer_6_0
2019-04-02 15:27:37 -07:00
Evan Tschannen
1d4a6ab551
cleaned up status to keep the healthyZone read separated from relicaFutures
2019-04-02 14:46:56 -07:00
Evan Tschannen
a38c396283
made all maintenance transactions lock aware
2019-04-02 14:27:48 -07:00
Evan Tschannen
628fec8c8b
updated status with information about ongoing maintenance
...
clear the maintenance zone if a different storage server is detected failed
2019-04-02 14:15:51 -07:00
Evan Tschannen
781cf9b5a0
added the ability to make a zoneId for maintenance in fdbcli
2019-04-01 17:55:13 -07:00
Evan Tschannen
f5de52de91
fix: cancel the previous log system recruitment before calling newEpoch, to avoid multiple actors attempting to modify oldLogSystem at the same time
2019-04-01 16:38:25 -07:00
Evan Tschannen
8ebf771392
cleanup cluster controller trace events
2019-03-30 14:17:18 -07:00