yuanbiao/emqx - emqx

Commit Graph

Author	SHA1	Message	Date
Andrew Mayorov	86f99959b0	Merge pull request #13054 from keynslug/fix/EMQX-12365/node-leave fix(dsrepl): anticipate and handle nodes leaving the cluster	2024-05-17 09:43:15 +02:00
ieQu1	ee6e7174cf	fix(sessds): Rename the durable messages DB to `messages`	2024-05-16 21:31:32 +02:00
Andrew Mayorov	5157e61418	fix(dsrepl): verify if shards already allocated first	2024-05-16 18:56:54 +02:00
Andrew Mayorov	0119728d45	feat(dsrepl): also reflect pending transitions in ds status	2024-05-16 18:56:21 +02:00
Andrew Mayorov	26c4a4f597	feat(dsrepl): reflect conflicts and inconsistencies in ds status	2024-05-16 18:32:08 +02:00
Andrew Mayorov	7e86e3e61c	fix(dsrepl): anticipate and handle nodes leaving the cluster Also make `claim_site/2` safer by refusing to claim a site for a node that is already there.	2024-05-16 18:32:07 +02:00
Andrew Mayorov	35e360fcbe	feat(api-ds): provide more information on nonexistent site leave	2024-05-14 15:57:41 +02:00
ieQu1	525e4dac95	Merge pull request #13036 from ieQu1/dev/reduce-log-spam tests: Reduce log spam in the failed test suites	2024-05-14 10:53:30 +02:00
ieQu1	ac3f5a083d	test: Reduce log spam in the failed test suites	2024-05-13 22:00:33 +02:00
ieQu1	8506ca7919	Merge pull request #12998 from ieQu1/dev/improve-latency Use leader's clock when calculating LTS cutoff timestamp	2024-05-13 21:54:06 +02:00
ieQu1	3da3a36863	test(ds): Add generation in the replication suite	2024-05-13 19:51:04 +02:00
ieQu1	9f7ef9f34f	fix(ds): Apply review remarks	2024-05-13 19:35:24 +02:00
ieQu1	07aa708894	test(ds): Refactor replication suite	2024-05-09 03:56:56 +02:00
ieQu1	63e51fca66	test(ds): Use streams to fill the storage	2024-05-09 02:46:57 +02:00
ieQu1	a0a3977043	feat(ds): Assign latest timestamp deterministically	2024-05-08 23:17:57 +02:00
ieQu1	2236af84ba	feat(ds): two-stage storage commit on the storage level	2024-05-08 23:17:57 +02:00
ieQu1	1ddbbca90e	feat(ds): Allow incremental update of the LTS trie	2024-05-08 23:17:57 +02:00
ieQu1	68ca891f41	test(ds): Use streams to create traffic	2024-05-08 23:17:57 +02:00
Andrew Mayorov	d84c180ccb	feat(dsrepl): avoid contacting unreachable ra servers Assuming established Erlang distribution channel is a reliable way to tell whether a remote node is reachable.	2024-05-08 18:12:13 +02:00
ieQu1	3642bcd1b6	docs(ds): Fix comment for the builtin DS metrics	2024-05-06 11:21:32 +02:00
ieQu1	b2a633aca1	fix(ds): Use leader's clock for computing LTS safe cutoff time	2024-05-06 11:21:32 +02:00
ieQu1	1ff2e02fd9	feat(ds): Pass current time to the storage layer via argument	2024-05-06 11:21:32 +02:00
ieQu1	8ac9700aab	feat(ds): Add an API for DB-global variables	2024-05-06 11:21:32 +02:00
ieQu1	86d45522e3	fix(dsrepl): Don't reverse elements of batches	2024-05-06 11:21:32 +02:00
ieQu1	bcfa7b2209	fix(ds): Destroy LTS tries when the generation is dropped	2024-05-06 11:21:32 +02:00
ieQu1	9999ccd36c	feat(ds): Ignore safe cutoff time for streams without varying levels	2024-05-06 11:21:32 +02:00
ieQu1	e4c3283c9c	docs(ds): Update README with CLI and REST API endpoints	2024-04-23 16:28:35 +02:00
ieQu1	4c76a2574d	fix(ds): Fix egress flush condition	2024-04-21 21:51:31 +02:00
Andrew Mayorov	43f8346c00	fix(dssnap): ensure idempotent write of empty chunks	2024-04-19 18:52:33 +02:00
ieQu1	93bb840365	docs(ds): Update README	2024-04-17 01:21:52 +02:00
Andrew Mayorov	5d7b2e2ce6	fix(dsrepl): attempt leadership transfer on terminate In addition to on removal. The reasoning is basically the same: try to avoid situations when log entries are replicated (or will be considered replicated when the new leader is elected) but the leader terminates before replying to the client. To be clear: this is a stupid solution. Something much more robust is needed.	2024-04-15 22:05:24 +02:00
Andrew Mayorov	89f42f1171	fix(dsrepl): make placeholder shard process permanent under supervisor	2024-04-15 16:43:52 +02:00
Andrew Mayorov	c4d1360b96	fix(dsrepl): trigger election for new ra servers unconditionally Otherwise we might end up in a situation when there's no member online yet at the time of the election trigger, and the election will never happen.	2024-04-15 16:42:29 +02:00
Andrew Mayorov	d12e907209	fix(dsrepl): correctly handle ra membership change command results Before this change, results similar to `{error, {no_more_servers_to_try, [{error, nodedown}, {error, not_member}]}}` were considered retryable failures, which is incorrect.	2024-04-08 22:44:34 +02:00
Andrew Mayorov	3223797ae5	fix(dsrepl): attempt leadership transfer before server removal This should make it much less likely to hit weird edge cases that lead to duplicate Raft log entries because of client retries upon receiving `shutdown` from the leader being removed.	2024-04-08 22:43:58 +02:00
Andrew Mayorov	1e95bd4da6	test(dsrepl): test unresponsive nodes removal / node restarts	2024-04-08 21:27:56 +02:00
Andrew Mayorov	7a836317ac	fix(dsrepl): trigger unfinished shard transition upon startup Also provide a trivial API to trigger them by hand.	2024-04-08 16:12:42 +02:00
Andrew Mayorov	75bb7f5cdc	fix(dsrepl): retry only `{add, Site}` crashed membership transitions To minimize the potential negative impact of removal transitions that crash for some unknown and unusual reasons.	2024-04-08 16:04:33 +02:00
Andrew Mayorov	4c0cc079c2	fix(dsrepl): apply unnecessary rebalancing transitions cleanly	2024-04-08 13:25:45 +02:00
Andrew Mayorov	dcde30c38a	test(dsrepl): add two more testcases for rebalancing	2024-04-08 13:22:31 +02:00
Andrew Mayorov	2ace9bb893	chore(dsrepl): sprinkle few comments and typespecs for exports	2024-04-07 22:51:56 +02:00
Andrew Mayorov	ecaad348a7	chore(dsrepl): update few outdated comments / TODOs	2024-04-07 22:51:56 +02:00
Andrew Mayorov	6293efb995	fix(dsrepl): retry crashed membership transitions	2024-04-07 22:51:56 +02:00
Andrew Mayorov	826ce5806d	fix(dsrepl): ensure that new member UID matches server's UID Before that change, UIDs supplied in the `ra:add_member/3` were not the same as those servers were using. This hasn't caused any issues for some reason, but it's better to ensure that UIDs are the same.	2024-04-07 22:31:24 +02:00
Andrew Mayorov	556ffc78c9	feat(dsrepl): implement membership changes and rebalancing	2024-04-05 18:57:28 +02:00
Andrew Mayorov	d6058b7f51	feat(dsrepl): allow to subscribe to DB metadata changes Currently, only shard metadata changes are announced to the subscribers.	2024-04-05 17:40:55 +02:00
Andrew Mayorov	a07295d3bc	fix(ds): address shards in the supervisor properly	2024-04-05 17:40:38 +02:00
ieQu1	a62db08676	feat(ds): Add REST API for durable storage	2024-04-05 15:22:06 +02:00
ieQu1	d09787d1a6	fix(ds): Fix return types in replication_layer_meta	2024-04-05 15:22:06 +02:00
Andrew Mayorov	70396e9766	Merge pull request #12825 from keynslug/feat/EMQX-12110/repl-meta-api feat(dsrepl): add APIs to manage DB replication sites	2024-04-04 22:32:03 +02:00
Andrew Mayorov	df6c5b35fe	feat(dsrepl): add more primitive operations to modify DB sites	2024-04-04 21:22:49 +02:00
Andrew Mayorov	bb8ffee18c	feat(dsrepl): add API to get current DB replication sites	2024-04-04 21:22:02 +02:00
Andrew Mayorov	ad52f7838e	feat(dsrepl): add APIs to manage DB replication sites	2024-04-04 21:22:01 +02:00
Thales Macedo Garitezi	c57c36adb2	feat(ds): clear all checkpoints when (re)starting storage layer Fixes https://emqx.atlassian.net/browse/EMQX-12143	2024-04-04 14:05:52 -03:00
ieQu1	f37ed3a40a	fix(ds): Limit the number of retries in egress to 0	2024-04-03 16:38:49 +02:00
ieQu1	2bbfada7af	fix(ds): Make async batches truly async	2024-04-03 11:57:47 +02:00
ieQu1	92ca90c0ca	fix(ds): Improve egress logging	2024-04-03 11:57:47 +02:00
ieQu1	ae5935e7f7	test(ds): Attempt to stabilize metrics_worker tests in CI	2024-04-02 19:14:10 +02:00
ieQu1	4382971443	fix(ds): Preserve errors in the egress	2024-04-02 16:47:43 +02:00
ieQu1	94ca7ad0f8	feat(ds): Report counters for LTS storage layout	2024-04-02 16:47:43 +02:00
ieQu1	b379f331de	fix(sessds): Handle errors when storing messages	2024-04-02 16:47:41 +02:00
ieQu1	f41e538526	feat(sessds): Observe next time	2024-04-02 16:45:52 +02:00
ieQu1	75b092bf0e	fix(ds): Actually retry sending batch	2024-04-02 16:45:49 +02:00
ieQu1	0de255cac8	feat(ds): Report egress flush time	2024-04-02 16:25:04 +02:00
ieQu1	044f3d4ef5	fix(ds): Don't reverse entries in the atomic batch	2024-04-02 16:25:04 +02:00
ieQu1	606f2a88cd	feat(ds): Add egress metrics	2024-04-02 16:25:04 +02:00
ieQu1	c9de336234	feat(ds): Add metrics worker to the builtin db supervision tree	2024-04-02 16:25:04 +02:00
Andrew Mayorov	778e897f1f	chore(dsrepl): describe snapshot ownership and few shortcomings	2024-04-02 13:48:51 +02:00
Andrew Mayorov	c666c65c6a	test(ds): factor out storage iteration into helper module	2024-04-02 13:48:51 +02:00
Andrew Mayorov	7cebf598a8	chore(dsrepl): simplify snapshot transfer code a bit Co-Authored-By: Thales Macedo Garitezi <thalesmg@gmail.com>	2024-04-02 13:48:51 +02:00
Andrew Mayorov	e029b8f996	test(dsrepl): wait for whole cluster readiness To minimize the chance of flaky tests due to the shards not being completely online. Co-Authored-By: Thales Macedo Garitezi <thalesmg@gmail.com>	2024-04-02 13:48:50 +02:00
Andrew Mayorov	e8b06a6a9f	chore(dsrepl): mark few more BPAPI targets as obsolete	2024-04-02 13:48:50 +02:00
Andrew Mayorov	d31cd0c728	feat(ds): ensure LTS state ids are deterministic	2024-04-02 13:48:50 +02:00
Andrew Mayorov	2cd357a5bd	fix(ds): ensure store batch is idempotent wrt generations	2024-04-02 13:48:50 +02:00
Andrew Mayorov	77a022bd93	feat(dsrepl): transfer storage snapshot during ra snapshot recovery	2024-04-02 13:48:49 +02:00
Andrew Mayorov	b8b9b7739b	chore(ds): slightly simplify working with storage generations	2024-04-02 13:48:08 +02:00
Andrew Mayorov	fa66a640c3	fix(dsrepl): handle RPC errors gracefully when storage is down	2024-03-28 15:17:01 +01:00
Ivan Dyachkov	db9efb9317	chore: bump apps versions	2024-03-28 10:19:09 +01:00
Thales Macedo Garitezi	796c04e7a8	test: fix flaky test We should emit the trace event before replying to callers. Example failure: https://github.com/emqx/emqx/actions/runs/8378977952/job/22946318696#step:6:182 ``` =CRITICAL REPORT==== 21-Mar-2024::17:45:37.676024 === "check stage" failed: error {assertMatch,[{module,emqx_ds_storage_bitfield_lts_SUITE}, {line,270}, {expression,"? of_kind ( emqx_ds_replication_layer_egress_flush , Trace )"}, {pattern,"[ # { batch := [ _ , _ , _ ] } ]"}, {value,[]}]} Stacktrace: [{emqx_ds_storage_bitfield_lts_SUITE, '-t_atomic_store_batch/1-fun-1-',1, [{file, "/__w/emqx/emqx/apps/emqx_durable_storage/test/emqx_ds_storage_bitfield_lts_SUITE.erl"}, {line,270}]}, {emqx_ds_storage_bitfield_lts_SUITE,t_atomic_store_batch,1, [{file, "/__w/emqx/emqx/apps/emqx_durable_storage/test/emqx_ds_storage_bitfield_lts_SUITE.erl"}, {line,249}]}] ```	2024-03-21 15:47:29 -03:00
Thales Macedo Garitezi	68af211130	fix(ds): reply sync callers after raft store failure	2024-03-21 15:40:21 -03:00
Thales Macedo Garitezi	70737a437a	fix(ds): add caller to pending replies before flushing	2024-03-21 14:39:21 -03:00
Andrew Mayorov	fe50a1711b	fix(ds-egress): drop pending batch on failures Before this commit, messages in the current batch will be retried as part of next batch. This could have led to message duplication which is probably not what the user wants by default.	2024-03-20 13:20:25 +01:00
Andrew Mayorov	a1f5de3f5b	fix(dsrepl): turn memoize into a safer function	2024-03-20 13:20:24 +01:00
Andrew Mayorov	d39ca41070	chore(dsrepl): mark per-node `add_generation` RPC target obsolete Also annotate internal exports with comments according with their intended use.	2024-03-20 13:20:24 +01:00
Andrew Mayorov	35b18f9125	fix(dsrepl): properly handle error conditions in generation mgmt Also update few outdated typespecs. Also make error reasons easier to comprehend.	2024-03-20 13:20:24 +01:00
Andrew Mayorov	f2268aa69a	fix(dsrepl): use correct base dir for ra system stuff Co-Authored-By: Thales Macedo Garitezi <thalesmg@gmail.com>	2024-03-20 13:20:24 +01:00
Andrew Mayorov	404e919494	refactor(dsrepl): make shard allocator more robust and consistent Co-Authored-By: Thales Macedo Garitezi <thalesmg@gmail.com>	2024-03-20 13:20:24 +01:00
Andrew Mayorov	0e18bd6e80	fix(dsrepl): increase replication site id bitsize back In order to minimize chances of site id collision to practically zero.	2024-03-20 13:20:24 +01:00
Andrew Mayorov	ac9700dd28	fix(dsrepl): split shard allocator into a separate module	2024-03-20 13:20:23 +01:00
Andrew Mayorov	1b647035d0	chore(dsrepl): make dialyzer a bit happier	2024-03-20 13:20:23 +01:00
Andrew Mayorov	611b3f0e07	feat(dsrepl): use more straightforward way to drop ra shards	2024-03-20 13:20:23 +01:00
Andrew Mayorov	74881e8706	feat(dsrepl): make storage layer unaware of granularity of time Storage also becomes a bit more pure, depending on the upper layer to provide the timestamps, which also makes it possible to handle more operations idempotently.	2024-03-20 13:20:23 +01:00
Andrew Mayorov	3cb36a5619	feat(ds-lts): extract timestamp from storage key itself 1. This avoids the need to deserialize the message to get the timestamp. 2. It also makes it possible to decouple the storage key timestamp from the message timestamp, which might be useful for replication purposes.	2024-03-19 20:21:56 +01:00
Andrew Mayorov	5cc0246351	feat(dsrepl): allow to tune select ra options	2024-03-19 20:21:55 +01:00
Andrew Mayorov	54b5adf868	feat(dsrepl): allocate shards predictably To ensure strictly optimal and fair shard allocation across cluster. Before this commit it was quite easy to end up with an allocation significantly skewed towards some node, because of the nature of randomness and relatively small number of shards.	2024-03-19 20:21:55 +01:00
Andrew Mayorov	887e151be5	fix(dsrepl): handle errors gracefully in shard egress process Also add cooldown on timeout / unavailability.	2024-03-19 20:21:53 +01:00
Andrew Mayorov	e16aee99b4	chore(dsrepl): clarify how to perform leadership transfer in runtime	2024-03-19 20:21:18 +01:00
Andrew Mayorov	00d509f27b	feat(dsrepl): prefer local replica in read path To optimize out any unnecessary RPCs. Given the load should be smoothed evenly across the cluster, choosing non-leader node is not a priority.	2024-03-19 20:11:42 +01:00
Andrew Mayorov	19305c223c	fix(dsrepl): require majority for replication-related tables	2024-03-19 20:11:42 +01:00
Andrew Mayorov	f89909f60c	fix(dsrepl): tolerate trigger election timeouts for existing servers	2024-03-19 20:11:42 +01:00

320 Commits