Commit Graph

11489 Commits

Author SHA1 Message Date
Thales Macedo Garitezi 04ba2aaf8a
Merge pull request #12830 from thalesmg/async-channel-hc-m-20240404
feat(resource): non-blocking channel health checks
2024-04-05 14:24:09 -03:00
ieQu1 3f7b14c861
Merge pull request #12833 from ieQu1/dev/ds-cluster-api
feat(ds): Add REST API for durable storage
2024-04-05 17:33:20 +02:00
ieQu1 2504b8126b
feat(ds): Pass mgmt_ds REST API calls to the application 2024-04-05 15:22:06 +02:00
ieQu1 46261440cb
feat(ds): Add a CLI for managing DB replicas 2024-04-05 15:22:06 +02:00
ieQu1 a62db08676
feat(ds): Add REST API for durable storage 2024-04-05 15:22:06 +02:00
ieQu1 d09787d1a6
fix(ds): Fix return types in replication_layer_meta 2024-04-05 15:22:06 +02:00
Ivan Dyachkov be47fe49ad
chore: bump ecql version to 0.7.0
PR: https://github.com/emqx/ecql/pull/13
No functional changes, just switch gen_fsm to gen_statem.
2024-04-05 13:31:33 +02:00
Andrew Mayorov 70396e9766
Merge pull request #12825 from keynslug/feat/EMQX-12110/repl-meta-api
feat(dsrepl): add APIs to manage DB replication sites
2024-04-04 22:32:03 +02:00
Andrew Mayorov df6c5b35fe
feat(dsrepl): add more primitive operations to modify DB sites 2024-04-04 21:22:49 +02:00
Andrew Mayorov bb8ffee18c
feat(dsrepl): add API to get current DB replication sites 2024-04-04 21:22:02 +02:00
Andrew Mayorov ad52f7838e
feat(dsrepl): add APIs to manage DB replication sites 2024-04-04 21:22:01 +02:00
Thales Macedo Garitezi 60cad74286
feat(resource): non-blocking channel health checks
Fixes https://emqx.atlassian.net/browse/EMQX-12015

Continuation of https://github.com/emqx/emqx/pull/12812
2024-04-04 16:13:30 -03:00
Thales Macedo Garitezi 8d58b40f33
Merge pull request #12831 from thalesmg/ds-checkpoint-clean-m-20240404
feat(ds): clear all checkpoints when (re)starting storage layer
2024-04-04 15:41:50 -03:00
Thales Macedo Garitezi 217b35bce5
Merge pull request #12798 from thalesmg/ds-client-api-v2-m-20240327
feat(client mgmt api): add cursor-based list API
2024-04-04 15:10:49 -03:00
Thales Macedo Garitezi c57c36adb2
feat(ds): clear all checkpoints when (re)starting storage layer
Fixes https://emqx.atlassian.net/browse/EMQX-12143
2024-04-04 14:05:52 -03:00
Thales Macedo Garitezi 069cd4fbb4
Merge pull request #12812 from thalesmg/async-res-manager-health-check-m-20240328
feat(resource manager): perform non-blocking health checks
2024-04-04 09:07:02 -03:00
ieQu1 f37ed3a40a
fix(ds): Limit the number of retries in egress to 0 2024-04-03 16:38:49 +02:00
ieQu1 2bbfada7af
fix(ds): Make async batches truly async 2024-04-03 11:57:47 +02:00
ieQu1 92ca90c0ca
fix(ds): Improve egress logging 2024-04-03 11:57:47 +02:00
ieQu1 ae5935e7f7
test(ds): Attempt to stabilize metrics_worker tests in CI 2024-04-02 19:14:10 +02:00
ieQu1 4382971443
fix(ds): Preserve errors in the egress 2024-04-02 16:47:43 +02:00
ieQu1 94ca7ad0f8
feat(ds): Report counters for LTS storage layout 2024-04-02 16:47:43 +02:00
ieQu1 f14c253dea
fix(prometheus): Don't add DS metrics when feature is disabled 2024-04-02 16:47:43 +02:00
ieQu1 b379f331de
fix(sessds): Handle errors when storing messages 2024-04-02 16:47:41 +02:00
ieQu1 f41e538526
feat(sessds): Observe next time 2024-04-02 16:45:52 +02:00
ieQu1 b9ad241658
feat(sessds): Add metrics for the number of persisted messages 2024-04-02 16:45:52 +02:00
ieQu1 75b092bf0e
fix(ds): Actually retry sending batch 2024-04-02 16:45:49 +02:00
ieQu1 0de255cac8
feat(ds): Report egress flush time 2024-04-02 16:25:04 +02:00
ieQu1 044f3d4ef5
fix(ds): Don't reverse entries in the atomic batch 2024-04-02 16:25:04 +02:00
ieQu1 606f2a88cd
feat(ds): Add egress metrics 2024-04-02 16:25:04 +02:00
ieQu1 c9de336234
feat(ds): Add metrics worker to the builtin db supervision tree 2024-04-02 16:25:04 +02:00
ieQu1 d8204021dc
refactor(metrics): Move metrics worker to emqx_utils application 2024-04-02 16:25:04 +02:00
Thales Macedo Garitezi 2097e854fc
feat(client mgmt api): add cursor-based list API
Fixes https://emqx.atlassian.net/browse/EMQX-12028
2024-04-02 10:55:28 -03:00
Andrew Mayorov 778e897f1f
chore(dsrepl): describe snapshot ownership and few shortcomings 2024-04-02 13:48:51 +02:00
Andrew Mayorov c666c65c6a
test(ds): factor out storage iteration into helper module 2024-04-02 13:48:51 +02:00
Andrew Mayorov 7cebf598a8
chore(dsrepl): simplify snapshot transfer code a bit
Co-Authored-By: Thales Macedo Garitezi <thalesmg@gmail.com>
2024-04-02 13:48:51 +02:00
Andrew Mayorov e029b8f996
test(dsrepl): wait for whole cluster readiness
To minimize the chance of flaky tests due to the shards not being
completely online.

Co-Authored-By: Thales Macedo Garitezi <thalesmg@gmail.com>
2024-04-02 13:48:50 +02:00
Andrew Mayorov e8b06a6a9f
chore(dsrepl): mark few more BPAPI targets as obsolete 2024-04-02 13:48:50 +02:00
Andrew Mayorov d31cd0c728
feat(ds): ensure LTS state ids are deterministic 2024-04-02 13:48:50 +02:00
Andrew Mayorov 2cd357a5bd
fix(ds): ensure store batch is idempotent wrt generations 2024-04-02 13:48:50 +02:00
Andrew Mayorov 77a022bd93
feat(dsrepl): transfer storage snapshot during ra snapshot recovery 2024-04-02 13:48:49 +02:00
Andrew Mayorov b8b9b7739b
chore(ds): slightly simplify working with storage generations 2024-04-02 13:48:08 +02:00
Andrew Mayorov 2d074df209
Merge pull request #12797 from keynslug/fix/dsrepl-error-handling
fix(dsrepl): handle RPC errors gracefully when storage is down
2024-04-02 13:40:31 +02:00
Thales Macedo Garitezi bade09b56e
feat(resource manager): perform non-blocking resource health checks
Fixes https://emqx.atlassian.net/browse/EMQX-12015

This introduces only _resource_ non-blocking health checks.  _Channel_ non-blocking health
checks may be introduced later.
2024-04-01 14:46:15 -03:00
Serge Tupchii c62410ff75
refactor: remove already bound variable 2024-04-01 17:03:50 +03:00
Andrew Mayorov 35c43eb8a0
feat(sessds): handle recoverable errors in stream scheduler 2024-03-28 15:17:01 +01:00
Andrew Mayorov fa66a640c3
fix(dsrepl): handle RPC errors gracefully when storage is down 2024-03-28 15:17:01 +01:00
Thales Macedo Garitezi 8fb4ef9fe3
test: fix flaky test 2024-03-28 10:53:44 -03:00
ieQu1 b1855f95c1
fix(bpapi): Add exceptions for experimental features 2024-03-28 12:07:45 +01:00
ieQu1 8c6d8bdd12
fix(bpapi): Add exceptions for experimental features 2024-03-28 12:03:41 +01:00