zhongwencool
a1495689c0
fix: clean self node's cluster commit when leaving the cluster
2024-04-09 14:13:25 +08:00
JimMoen
6cb00cc8c7
Merge pull request #12844 from JimMoen/EMQX-12012/cpu-use-idle-two-decimal-places
...
fix: cpu usage and idle use two decimal places
2024-04-09 10:11:15 +08:00
Andrew Mayorov
d12e907209
fix(dsrepl): correctly handle ra membership change command results
...
Before this change, results similar to `{error, {no_more_servers_to_try,
[{error, nodedown}, {error, not_member}]}}` were considered retryable
failures, which is incorrect.
2024-04-08 22:44:34 +02:00
Andrew Mayorov
3223797ae5
fix(dsrepl): attempt leadership transfer before server removal
...
This should make it much less likely to hit weird edge cases that lead
to duplicate Raft log entries because of client retries upon receiving
`shutdown` from the leader being removed.
2024-04-08 22:43:58 +02:00
Andrew Mayorov
1e95bd4da6
test(dsrepl): test unresponsive nodes removal / node restarts
2024-04-08 21:27:56 +02:00
zmstone
41677eb785
refactor: make elvis happy
2024-04-08 21:25:58 +02:00
zmstone
bf12efac6d
fix(variform): add basic tests
2024-04-08 21:08:43 +02:00
Andrew Mayorov
7a836317ac
fix(dsrepl): trigger unfinished shard transition upon startup
...
Also provide a trivial API to trigger them by hand.
2024-04-08 16:12:42 +02:00
Andrew Mayorov
75bb7f5cdc
fix(dsrepl): retry only `{add, Site}` crashed membership transitions
...
To minimize the potential negative impact of removal transitions that
crash for some unknown and unusual reasons.
2024-04-08 16:04:33 +02:00
Kjell Winblad
9628a00a82
docs(emqx_rule_api apply rule): fix doc strings
2024-04-08 15:34:29 +02:00
Thales Macedo Garitezi
ba96edb061
fix(clients api): use alternative base64 function for OTP 25
...
Fixes https://github.com/emqx/emqx/pull/12798#discussion_r1555524603
2024-04-08 10:23:10 -03:00
Kjell Winblad
600526a0e4
test(emqx_bridge_http_SUITE): test case after name change
2024-04-08 14:54:33 +02:00
Kjell Winblad
02ee873094
docs(emqx_rule_api_schema): fix type spec
2024-04-08 13:42:15 +02:00
Andrew Mayorov
4c0cc079c2
fix(dsrepl): apply unnecessary rebalancing transitions cleanly
2024-04-08 13:25:45 +02:00
Andrew Mayorov
dcde30c38a
test(dsrepl): add two more testcases for rebalancing
2024-04-08 13:22:31 +02:00
Kjell Winblad
b57725f996
test(resource_SUITE): do test case fixes needed due to rule tracing work
2024-04-08 12:41:35 +02:00
Kjell Winblad
79440064fe
style: fix problems reported by elvis
2024-04-08 11:03:55 +02:00
JianBo He
e11c4a9c83
Merge pull request #12826 from emqx/import-source-bridges
...
fix: source bridges missing after restoring the backup files
2024-04-08 16:02:15 +08:00
Zaiming (Stone) Shi
3e87d4bf9f
Merge pull request #12772 from killme2008/feature/upgrade-greptimedb-ingester-ci
...
feat: update greptimedb client lib and ci version
2024-04-08 09:02:03 +02:00
JimMoen
282cbb18be
fix: cpu usage and idle use two decimal places
...
- prometheus
- opentelemetry
2024-04-08 14:14:09 +08:00
Andrew Mayorov
2ace9bb893
chore(dsrepl): sprinkle a few comments and typespecs for exports
2024-04-07 22:51:56 +02:00
Andrew Mayorov
ecaad348a7
chore(dsrepl): update a few outdated comments / TODOs
2024-04-07 22:51:56 +02:00
Andrew Mayorov
6293efb995
fix(dsrepl): retry crashed membership transitions
2024-04-07 22:51:56 +02:00
Andrew Mayorov
826ce5806d
fix(dsrepl): ensure that new member UID matches server's UID
...
Before this change, UIDs supplied to `ra:add_member/3` were not
the same as those the servers were using. This hasn't caused any issues
for some reason, but it's better to ensure that UIDs are the same.
2024-04-07 22:31:24 +02:00
Shawn
e89dc32c90
ci: run emqx_management both with ee and ce profile
2024-04-07 18:33:52 +08:00
firest
f40e47365c
fix(iotdb): correctly handle undefined value of bool type
2024-04-07 17:26:15 +08:00
Shawn
1c81c79a2c
chore: add testcase for importing retained msgs and sources
2024-04-07 17:24:26 +08:00
Shawn
9d1a69aaa9
fix: cannot import retained messages
2024-04-07 17:03:14 +08:00
Kjell Winblad
5479932190
feat(apply rule test): make option to stop action after render work
...
This commit makes the apply rule HTTP API option to stop an action work
for the HTTP action, and adds infrastructure that makes it easy to add
this functionality to other actions.
2024-04-06 17:21:12 +02:00
Kjell Winblad
ef705c2285
feat: add apply rule API, clientid/ruleid tracing for rule and connector
...
This commit adds:
* Support for forwarding the rule id and client id to the connector so
that events such as template rendered successfully can be traced.
* HTTP API for applying/activating a rule with the given context
2024-04-06 17:20:47 +02:00
Thales Macedo Garitezi
04ba2aaf8a
Merge pull request #12830 from thalesmg/async-channel-hc-m-20240404
...
feat(resource): non-blocking channel health checks
2024-04-05 14:24:09 -03:00
Andrew Mayorov
556ffc78c9
feat(dsrepl): implement membership changes and rebalancing
2024-04-05 18:57:28 +02:00
Andrew Mayorov
d6058b7f51
feat(dsrepl): allow to subscribe to DB metadata changes
...
Currently, only shard metadata changes are announced to the
subscribers.
2024-04-05 17:40:55 +02:00
Andrew Mayorov
a07295d3bc
fix(ds): address shards in the supervisor properly
2024-04-05 17:40:38 +02:00
ieQu1
3f7b14c861
Merge pull request #12833 from ieQu1/dev/ds-cluster-api
...
feat(ds): Add REST API for durable storage
2024-04-05 17:33:20 +02:00
ieQu1
2504b8126b
feat(ds): Pass mgmt_ds REST API calls to the application
2024-04-05 15:22:06 +02:00
ieQu1
46261440cb
feat(ds): Add a CLI for managing DB replicas
2024-04-05 15:22:06 +02:00
ieQu1
a62db08676
feat(ds): Add REST API for durable storage
2024-04-05 15:22:06 +02:00
ieQu1
d09787d1a6
fix(ds): Fix return types in replication_layer_meta
2024-04-05 15:22:06 +02:00
Ivan Dyachkov
be47fe49ad
chore: bump ecql version to 0.7.0
...
PR: https://github.com/emqx/ecql/pull/13
No functional changes, just switch gen_fsm to gen_statem.
2024-04-05 13:31:33 +02:00
Andrew Mayorov
70396e9766
Merge pull request #12825 from keynslug/feat/EMQX-12110/repl-meta-api
...
feat(dsrepl): add APIs to manage DB replication sites
2024-04-04 22:32:03 +02:00
Andrew Mayorov
df6c5b35fe
feat(dsrepl): add more primitive operations to modify DB sites
2024-04-04 21:22:49 +02:00
Andrew Mayorov
bb8ffee18c
feat(dsrepl): add API to get current DB replication sites
2024-04-04 21:22:02 +02:00
Andrew Mayorov
ad52f7838e
feat(dsrepl): add APIs to manage DB replication sites
2024-04-04 21:22:01 +02:00
Thales Macedo Garitezi
60cad74286
feat(resource): non-blocking channel health checks
...
Fixes https://emqx.atlassian.net/browse/EMQX-12015
Continuation of https://github.com/emqx/emqx/pull/12812
2024-04-04 16:13:30 -03:00
Thales Macedo Garitezi
8d58b40f33
Merge pull request #12831 from thalesmg/ds-checkpoint-clean-m-20240404
...
feat(ds): clear all checkpoints when (re)starting storage layer
2024-04-04 15:41:50 -03:00
Thales Macedo Garitezi
217b35bce5
Merge pull request #12798 from thalesmg/ds-client-api-v2-m-20240327
...
feat(client mgmt api): add cursor-based list API
2024-04-04 15:10:49 -03:00
Thales Macedo Garitezi
c57c36adb2
feat(ds): clear all checkpoints when (re)starting storage layer
...
Fixes https://emqx.atlassian.net/browse/EMQX-12143
2024-04-04 14:05:52 -03:00
Kjell Winblad
59a442cdb5
feat(rule trace): add support for ruleid as a trace type
2024-04-04 14:55:32 +02:00
Thales Macedo Garitezi
069cd4fbb4
Merge pull request #12812 from thalesmg/async-res-manager-health-check-m-20240328
...
feat(resource manager): perform non-blocking health checks
2024-04-04 09:07:02 -03:00
zmstone
0e79b543cf
refactor: move variform to emqx_utils
2024-04-04 11:10:56 +02:00
ieQu1
f37ed3a40a
fix(ds): Limit the number of retries in egress to 0
2024-04-03 16:38:49 +02:00
Shawn
319ec50c0d
fix: source bridges missing after restoring the backup files
2024-04-03 18:26:51 +08:00
ieQu1
2bbfada7af
fix(ds): Make async batches truly async
2024-04-03 11:57:47 +02:00
ieQu1
92ca90c0ca
fix(ds): Improve egress logging
2024-04-03 11:57:47 +02:00
ieQu1
ae5935e7f7
test(ds): Attempt to stabilize metrics_worker tests in CI
2024-04-02 19:14:10 +02:00
ieQu1
4382971443
fix(ds): Preserve errors in the egress
2024-04-02 16:47:43 +02:00
ieQu1
94ca7ad0f8
feat(ds): Report counters for LTS storage layout
2024-04-02 16:47:43 +02:00
ieQu1
f14c253dea
fix(prometheus): Don't add DS metrics when feature is disabled
2024-04-02 16:47:43 +02:00
ieQu1
b379f331de
fix(sessds): Handle errors when storing messages
2024-04-02 16:47:41 +02:00
ieQu1
f41e538526
feat(sessds): Observe next time
2024-04-02 16:45:52 +02:00
ieQu1
b9ad241658
feat(sessds): Add metrics for the number of persisted messages
2024-04-02 16:45:52 +02:00
ieQu1
75b092bf0e
fix(ds): Actually retry sending batch
2024-04-02 16:45:49 +02:00
ieQu1
0de255cac8
feat(ds): Report egress flush time
2024-04-02 16:25:04 +02:00
ieQu1
044f3d4ef5
fix(ds): Don't reverse entries in the atomic batch
2024-04-02 16:25:04 +02:00
ieQu1
606f2a88cd
feat(ds): Add egress metrics
2024-04-02 16:25:04 +02:00
ieQu1
c9de336234
feat(ds): Add metrics worker to the builtin db supervision tree
2024-04-02 16:25:04 +02:00
ieQu1
d8204021dc
refactor(metrics): Move metrics worker to emqx_utils application
2024-04-02 16:25:04 +02:00
Thales Macedo Garitezi
2097e854fc
feat(client mgmt api): add cursor-based list API
...
Fixes https://emqx.atlassian.net/browse/EMQX-12028
2024-04-02 10:55:28 -03:00
Andrew Mayorov
778e897f1f
chore(dsrepl): describe snapshot ownership and a few shortcomings
2024-04-02 13:48:51 +02:00
Andrew Mayorov
c666c65c6a
test(ds): factor out storage iteration into helper module
2024-04-02 13:48:51 +02:00
Andrew Mayorov
7cebf598a8
chore(dsrepl): simplify snapshot transfer code a bit
...
Co-Authored-By: Thales Macedo Garitezi <thalesmg@gmail.com>
2024-04-02 13:48:51 +02:00
Andrew Mayorov
e029b8f996
test(dsrepl): wait for whole cluster readiness
...
To minimize the chance of flaky tests due to the shards not being
completely online.
Co-Authored-By: Thales Macedo Garitezi <thalesmg@gmail.com>
2024-04-02 13:48:50 +02:00
Andrew Mayorov
e8b06a6a9f
chore(dsrepl): mark a few more BPAPI targets as obsolete
2024-04-02 13:48:50 +02:00
Andrew Mayorov
d31cd0c728
feat(ds): ensure LTS state ids are deterministic
2024-04-02 13:48:50 +02:00
Andrew Mayorov
2cd357a5bd
fix(ds): ensure store batch is idempotent wrt generations
2024-04-02 13:48:50 +02:00
Andrew Mayorov
77a022bd93
feat(dsrepl): transfer storage snapshot during ra snapshot recovery
2024-04-02 13:48:49 +02:00
Andrew Mayorov
b8b9b7739b
chore(ds): slightly simplify working with storage generations
2024-04-02 13:48:08 +02:00
Andrew Mayorov
2d074df209
Merge pull request #12797 from keynslug/fix/dsrepl-error-handling
...
fix(dsrepl): handle RPC errors gracefully when storage is down
2024-04-02 13:40:31 +02:00
JimMoen
5759ba5162
chore: bump app version
2024-04-02 17:09:22 +08:00
JimMoen
50bceee9ab
fix(stats): `'subscribers.count'` contains shared-subscriber
2024-04-02 16:56:40 +08:00
JimMoen
0f4b148294
refactor: uniform shared_sub table macros
2024-04-02 16:56:39 +08:00
JimMoen
1a4cfc2a2d
fix(api_schema): removed metrics schema in api spec
...
- Follow-up to [PR#6622](https://github.com/emqx/emqx/pull/6622).
2024-04-02 16:56:36 +08:00
Thales Macedo Garitezi
bade09b56e
feat(resource manager): perform non-blocking resource health checks
...
Fixes https://emqx.atlassian.net/browse/EMQX-12015
This introduces only _resource_ non-blocking health checks. _Channel_ non-blocking health
checks may be introduced later.
2024-04-01 14:46:15 -03:00
Serge Tupchii
c62410ff75
refactor: remove already bound variable
2024-04-01 17:03:50 +03:00
Serge Tupchii
ceb04ba06d
fix(emqx_mgmt): do not attempt to get a stacktrace of a remote client connection process
2024-04-01 16:42:12 +03:00
Serge Tupchii
42af1f9d63
fix: handle internal timeout errors in client Mqueue/Inflight APIs
2024-03-29 23:03:35 +02:00
Serge Tupchii
f5a820cb10
fix(emqx_mgmt): catch OOM shutdown exits properly when calling a client conn process
...
The exit reason is expected to include gen_server `Location`:
`{{shutdown, OOMInfo}, Location}`.
2024-03-29 13:09:08 +02:00
zmstone
bfca3ebc71
feat(variform): support array syntax '[' and ']'
2024-03-28 19:34:57 +01:00
zmstone
5f26e4ed5e
feat(variform): implement variform engine
2024-03-28 19:34:57 +01:00
SergeTupchiy
2e528d1dd8
Merge pull request #12802 from SergeTupchiy/EMQX-11826-prevent-left-node-from-rejoining-5.6.1
...
prevent a left node from rejoining the same cluster
2024-03-28 19:49:18 +02:00
zmstone
ad95473aae
refactor: move string functions to emqx_variform
2024-03-28 18:03:37 +01:00
Serge Tupchii
3eda182e9a
fix: prevent a node from discovering and re-joining the same cluster after it has (manually) left it.
2024-03-28 18:09:27 +02:00
zmstone
9bf65a415b
feat(variform): add a variable transformer
2024-03-28 16:11:26 +01:00
Andrew Mayorov
35c43eb8a0
feat(sessds): handle recoverable errors in stream scheduler
2024-03-28 15:17:01 +01:00
Andrew Mayorov
fa66a640c3
fix(dsrepl): handle RPC errors gracefully when storage is down
2024-03-28 15:17:01 +01:00
SergeTupchiy
63c017a72f
Merge pull request #12804 from SergeTupchiy/EMQX-12058-improve-force-shutdown-error-reason-5.6.1
...
chore: rename `message_queue_too_long` error reason to `mailbox_overflow` (5.6.1 port)
2024-03-28 16:14:51 +02:00
Thales Macedo Garitezi
8fb4ef9fe3
test: fix flaky test
2024-03-28 10:53:44 -03:00
SergeTupchiy
93c87fcb25
Merge pull request #12803 from SergeTupchiy/EMQX-11808-remove-uploaded-invalid-backups-5.6.1
...
fix(emqx_mgmt_data_backup): remove an uploaded backup file if it's not valid (5.6.1 port)
2024-03-28 15:37:10 +02:00
Thales Macedo Garitezi
04bf763890
fix(kafka-based bridges): avoid trying to get raw config for replayq dir
...
Fixes https://emqx.atlassian.net/browse/EMQX-12049
2024-03-28 09:13:34 -03:00