yuanbiao/emqx - emqx

Commit Graph

Author	SHA1	Message	Date
zhongwencool	a953b951fe	Merge branch 'master' into sync-release-50-to-master	2023-05-12 18:01:58 +08:00
Thales Macedo Garitezi	64dc9ed46a	perf(metrics): avoid increasing counters by 0 Some performance tests indicate that calling `telemetry` is costly in hot paths. Since increasing a counter by 0 is a no-op, we should avoid calling `telemetry` if the amount to increase is 0.	2023-05-11 15:13:37 -03:00
Kjell Winblad	70cf1533db	feat: add RabbitMQ bridge	2023-05-09 14:32:26 +02:00
Zaiming (Stone) Shi	13dcb5732f	Merge remote-tracking branch 'origin/release-50' into 0508-prepare-for-e5.0.4	2023-05-08 21:29:35 +02:00
Thales Macedo Garitezi	eba627b365	fix(buffer_worker): fix inflight count when updating inflight item	2023-05-08 09:27:51 -03:00
Zhongwen Deng	4f396a36a9	Merge remote-tracking branch 'upstream/master' into release-50	2023-05-08 14:58:03 +08:00
Thales Macedo Garitezi	8aa7c014e7	perf(buffer_worker): avoid calling `ets:info/2` (Almost?) fixes https://emqx.atlassian.net/browse/EMQX-9637 During the course of performance tests comparing the performance of e5.0.3 and e4.4.16 regarding the webhook bridge in sync mode, we observed that the throughput in e5.0.3 (sync) was much lower than in e4.4.16: ~ 9 k msgs / s vs. ~ 50 k msgs / s, respectively. Analyzing `observer_cli` output, we noticed that a lot of the time of both buffer workers and ehttpc processes was spent in `ets:info/2`. That function was called to check the size of the inflight table when updating metrics and checking if the inflight table was full. Other uses of `ets:info/2` were contained inside the arguments to some `?tp/2` macro usages (https://github.com/kafka4beam/snabbkaffe/pull/60). By using a specific record to track the size of the table, we managed to improve the bridge performance to ~ 45 k msgs / s in sync mode.	2023-05-02 17:05:32 -03:00
Andrew Mayorov	670709f746	feat(resource): ensure uniqueness through `gproc` Also use it instead of a custom ETS table for simplicity and better consistency. This has drawbacks though: expect slightly increased load on gproc gen_server due to how `gproc:set_value/2` works.	2023-05-02 17:29:22 +03:00
Andrew Mayorov	4575167607	feat(resource): drop `manager_id()` type	2023-05-02 17:29:20 +03:00
Andrew Mayorov	aaef95b1da	feat(resman): stop adding uniqueness to manager ids Before this change, a separate `manager_id` / `instance_id` was used as resource manager id, which made connector interface somewhat inconsistent: part of function calls to connector implementation used instance id as first argument while the rest used resource id itself.	2023-05-02 17:28:26 +03:00
Thales Macedo Garitezi	7853a4c36e	chore: bump app vsns	2023-04-27 11:58:28 -03:00
Thales Macedo Garitezi	567413389c	Merge pull request #10519 from thalesmg/fix-flaky-res-test-v50 test(resource): fix flaky test	2023-04-27 09:33:40 -03:00
Thales Macedo Garitezi	c53741a08c	fix(buffer_worker): avoid sending late reply messages to callers Fixes https://emqx.atlassian.net/browse/EMQX-9635 During a sync call from process `A` to a buffer worker `B`, its call to the underlying resource `C` can be very slow. In those cases, `A` will receive a timeout response and expect no more messages from `B` nor `C`. However, prior to this fix, if `B` is stuck in a long sync call to `C` and then gets its response after `A` timed out, `B` would still send the late response to `A`, polluting its mailbox.	2023-04-26 13:18:28 -03:00
Thales Macedo Garitezi	d78312e10e	test(resource): fix flaky test	2023-04-26 09:25:33 -03:00
zhongwencool	9d893b49eb	Merge branch 'master' into sync-release-50-to-master	2023-04-26 10:54:46 +08:00
Thales Macedo Garitezi	ad4be08bb2	feat: implement Pulsar Producer bridge (e5.0) Fixes https://emqx.atlassian.net/browse/EMQX-8398	2023-04-24 10:28:26 -03:00
firest	7d2c336ab7	fix(resource): make sure resource will not crash when stopping	2023-04-23 15:31:08 +08:00
Serge Tupchii	423a30fbb3	fix(emqx_alarm): add safe call API to activate/deactivate alarms and use it in resource_manager Don't let 'emqx_resource_manager' crash because of emqx_alarm timeouts. Fixes: EMQX-9529/#10357	2023-04-20 17:15:13 +03:00
Serge Tupchii	b5eda9f0d1	perf(emqx_resource): don't reactivate alarms on reoccurring errors Avoid unnecessary calls to activate an alarm if it has been already activated. Fixes: EMQX-9529/#10357	2023-04-20 16:37:33 +03:00
Thales Macedo Garitezi	cb995e2033	fix(buffer_worker): avoid sending late reply messages to callers Fixes https://emqx.atlassian.net/browse/EMQX-9635 During a sync call from process `A` to a buffer worker `B`, its call to the underlying resource `C` can be very slow. In those cases, `A` will receive a timeout response and expect no more messages from `B` nor `C`. However, prior to this fix, if `B` is stuck in a long sync call to `C` and then gets its response after `A` timed out, `B` would still send the late response to `A`, polluting its mailbox.	2023-04-19 18:27:10 -03:00
Ivan Dyachkov	dc78ecb41c	chore: merge upstream/master	2023-04-18 17:33:32 +02:00
Andrew Mayorov	21e19a33ce	feat(respool): switch to `emqx_resource_pool` Which was previously known as `emqx_plugin_libs_pool`. This is part of the effort to get rid of `emqx_plugin_libs` application.	2023-04-18 12:51:14 +03:00
Ivan Dyachkov	9fc8a498f8	chore: bump apps versions	2023-04-17 09:09:08 +02:00
Stefan Strigler	7df0493312	Merge pull request #10390 from sstrigler/EMQX-9549-new-emqx-utils-app-to-collect-utility-modules New emqx_utils app to collect utility modules	2023-04-14 20:33:11 +02:00
Thales Macedo Garitezi	e073bc90bc	refactor(buffer_worker): rename `s/queue/buffer/g`	2023-04-14 11:37:19 -03:00
Thales Macedo Garitezi	14ed4a7ada	feat(buffer_worker): set default queue mode to `memory_only` Fixes https://emqx.atlassian.net/browse/EMQX-9367 For better user experience and performance for the average bridge, we should change the default queue mode to `memory_only`, as was the behavior of most bridges in e4.x. This leads to better performance when message rate is high enough and the remote resource is not keeping up with EMQX. Also, we set the default segment size to equal max queue bytes.	2023-04-14 11:37:19 -03:00
Thales Macedo Garitezi	4de13d2800	feat(buffer_worker): change default max queue bytes to 256 MB	2023-04-14 09:31:33 -03:00
Stefan Strigler	9c11bfce80	refactor: rename emqx_misc to emqx_utils	2023-04-14 13:41:27 +02:00
Andrew Mayorov	5e92ba6fa9	Merge pull request #10359 from ft/EMQX-9136/no-ask-metrics feat(resource): ask for metrics only when needed	2023-04-14 12:28:52 +03:00
Ivan Dyachkov	bdffa925db	chore: merge upstream/master release-50	2023-04-12 15:30:20 +02:00
Andrew Mayorov	9c9f39d0f7	feat(resman): also move out metrics collection for debugging Now `emqx_resource:list_instances_verbose/0` will populate the metrics for each instance, for the sake of simplicity.	2023-04-12 16:14:42 +03:00
Andrew Mayorov	e70deae1c3	feat(resource): ask for metrics only when needed	2023-04-11 12:00:19 +03:00
Zaiming (Stone) Shi	a9bf633e03	Merge pull request #10320 from zmstone/0403-sync-release-50-back-to-master 0403 sync release 50 back to master	2023-04-04 23:31:24 +02:00
Zaiming (Stone) Shi	68c15ffd48	Merge remote-tracking branch 'origin/release-50' into 0403-sync-release-50-back-to-master	2023-04-04 16:42:58 +02:00
Thales Macedo Garitezi	0b6fd7fe14	fix(buffer_worker): check request timeout and health check interval Port of https://github.com/emqx/emqx/pull/10154 for `release-50` Fixes https://emqx.atlassian.net/browse/EMQX-9099 Originally, the `resume_interval`, which is what defines how often a buffer worker will attempt to retry its inflight window, was set to the same as the `health_check_interval`. This had the problem that, with default values, `health_check_interval = request_timeout`. This meant that, if a buffer worker with those configs were ever blocked, all requests would have timed out by the time it retried them. Here we change the default `resume_interval` to a reasonable value dependent on `health_check_interval` and `request_timeout`, and also expose that as a hidden parameter for fine tuning if necessary.	2023-04-04 08:58:36 -03:00
Thales Macedo Garitezi	f3ffc02bff	feat(bridges): enable async query mode for all bridges with buffer workers Fixes https://emqx.atlassian.net/browse/EMQX-9130 Since buffer workers always support async calls ("outer calls"), we should decouple those two call modes (inner and outer), and avoid exposing the inner call configuration to user to avoid complexity. For bridges that currently only allow sync query modes, we should allow them to be configured with async. That means basically all bridge types except Kafka Producer.	2023-04-03 14:49:51 -03:00
Zaiming (Stone) Shi	36000abf51	refactor: relocate i18n files for apps/emqx	2023-04-03 13:12:24 +02:00
zhongwencool	d63680cf25	Merge pull request #10307 from emqx/release-50 Sync release-50 back to master	2023-04-02 11:36:41 +08:00
Thales Macedo Garitezi	246a792965	Merge pull request #10273 from thalesmg/refactor-kprod-start-error-msg-rv50 fix: return friendly message when kafka producer and consumer fails to start (rv5.0)	2023-03-31 16:25:26 -03:00
Thales Macedo Garitezi	5011486b18	fix(kafka_consumer): return better error messages when probing kafka consumer bridge Fixes https://emqx.atlassian.net/browse/EMQX-9422	2023-03-31 11:33:15 -03:00
Zaiming (Stone) Shi	bcde52383b	docs: fix max batch size desc	2023-03-31 12:35:27 +02:00
Thales Macedo Garitezi	632bffd451	fix: return friendly message when kafka producer fails to start (rv5.0) Fixes https://emqx.atlassian.net/browse/EMQX-9392 The returned information does not make it possible to diagnose the issue (e.g., a connection issue due to a wrong host and port, or a wrong password failing authn). However, such information is printed to the logs. This changes the error returned by the API so that the user is pointed to the logs for further investigation of the error.	2023-03-30 11:51:36 -03:00
Kjell Winblad	8e0d315b7b	Merge pull request #10197 from kjellwinblad/0321-fix-inflight-window-hand-over-to-kjell fix: add inflight window setting to the clickhouse bridge	2023-03-29 09:38:24 +02:00
Zaiming (Stone) Shi	d07987288a	chore: add some example annotations for config importance level	2023-03-28 14:29:24 +02:00
Zaiming (Stone) Shi	dd996ad1dc	chore: bump app vsns	2023-03-24 21:47:15 +01:00
Thales Macedo Garitezi	ff272a2071	Merge pull request #10206 from thalesmg/decouple-buffer-worker-query-call-mode-v50 feat(buffer_worker): decouple query mode from underlying connector call mode	2023-03-24 13:49:00 -03:00
Thales Macedo Garitezi	f8d5d53908	feat(buffer_worker): decouple query mode from underlying connector call mode Fixes https://emqx.atlassian.net/browse/EMQX-9129 Currently, if a user configures a bridge with query mode sync, then all calls to the underlying driver/connector ("inner calls") will always be synchronous, regardless of its support for async calls. Since buffer workers always support async queries ("outer calls"), we should decouple those two call modes (inner and outer), and avoid exposing the inner call configuration to the user to avoid complexity. There are two situations when we want to force synchronous calls to the underlying connector even if it supports async: 1) When using `simple_sync_query`, since we are bypassing the buffer workers; 2) When retrying the inflight window, to avoid overwhelming the driver.	2023-03-23 13:40:31 -03:00
Kjell Winblad	35474578ca	refactor: rename async_inflight_window to inflight_window everywhere	2023-03-23 14:21:57 +01:00
Kjell Winblad	9d3f369cca	docs: fix spelling mistake Co-authored-by: Thales Macedo Garitezi <thalesmg@gmail.com>	2023-03-23 14:09:57 +01:00
Thales Macedo Garitezi	ddffba0355	Merge pull request #10154 from thalesmg/fix-buffer-worker-default-req-timeout fix(buffer_worker): calculate default `resume_interval` based on `request_timeout` and `health_check_interval`	2023-03-22 20:21:04 -03:00


422 Commits