yuanbiao/emqx - emqx

Commit Graph

Author	SHA1	Message	Date
Thales Macedo Garitezi	29ae45c39d	fix(connector): don't start buffer workers for the connector itself Fixes https://emqx.atlassian.net/browse/EMQX-11448	2023-12-01 17:20:38 -03:00
Thales Macedo Garitezi	9e1796ec4f	feat(gcp_pubsub_producer): migrate GCP PubSub producer to actions Fixes https://emqx.atlassian.net/browse/EMQX-11157	2023-11-21 14:22:42 -03:00
Thales Macedo Garitezi	b92821188b	fix(kafka_producer): make status `connecting` while the client fails to connect Fixes https://emqx.atlassian.net/browse/EMQX-11408 To make it consistent with the previous bridge behavior. Also, introduces macros for resource status to avoid problems with typos.	2023-11-16 14:50:23 -03:00
Thales Macedo Garitezi	7dcdbc9e51	fix(resource): take error from action/connector before attempting query Fixes https://emqx.atlassian.net/browse/EMQX-11284 Fixes https://emqx.atlassian.net/browse/EMQX-11298	2023-11-07 10:04:04 -03:00
Kjell Winblad	9dc3a169b3	feat: split bridges into a connector part and a bridge part Co-authored-by: Thales Macedo Garitezi <thalesmg@gmail.com> Co-authored-by: Stefan Strigler <stefan.strigler@emqx.io> Co-authored-by: Zaiming (Stone) Shi <zmstone@gmail.com> Several bridges should be able to share a connector pool defined by a single connector. The connectors should be possible to enable and disable similar to how one can disable and enable bridges. There should also be an API for checking the status of a connector and for add/edit/delete connectors similar to the current bridge API. Issues: https://emqx.atlassian.net/browse/EMQX-10805	2023-10-30 14:48:47 +01:00
Thales Macedo Garitezi	cf2075d7d8	chore: remove mention of `is_buffer_supported` from typespec	2023-10-10 09:49:18 -03:00
Thales Macedo Garitezi	902b1d6ec5	fix(pulsar_producer): use `simple_async_internal_buffer` query mode for Pulsar Since it has internal buffering, it necessitates the same fix as Kafka producer.	2023-10-09 15:02:25 -03:00
Thales Macedo Garitezi	79cf0a2ced	fix(kafka_producer): correctly handle metrics for connector that have internal buffers Fixes https://emqx.atlassian.net/browse/EMQX-11086 There’s currently a metric inconsistency due to the internal buffering nature of Kafka Producer (wolff). We use simple_sync_query to call the Kafka Producer bridge. If that times out, the call is accounted as failed, even though the message is buffered in wolff and later sent successfully.	2023-10-09 15:02:25 -03:00
Thales Macedo Garitezi	eb41b77de4	fix(rule_metrics): notify rule metrics of late replies and expired requests Fixes https://emqx.atlassian.net/browse/EMQX-10600	2023-07-19 11:39:28 -03:00
zhongwencool	b5cc8fb3c3	fix: start_after_created's default value	2023-07-07 16:39:26 +08:00
Stefan Strigler	321fd53132	fix: use ReplyTo in QUERY for async	2023-06-29 16:09:45 +02:00
Thales Macedo Garitezi	7f850f7499	fix(resource): fix `query_mode/0` type and usage	2023-06-19 15:59:00 -03:00
Kjell Winblad	2671e8ecf9	fix: dialyzer type problem	2023-06-09 11:00:05 +02:00
Thales Macedo Garitezi	99796224d8	refactor(resource): rename `request_timeout` -> `request_ttl` See https://emqx.atlassian.net/wiki/spaces/P/pages/612368639/open+e5.1+remove+auto+restart+interval+from+buffer+worker+resource+options	2023-06-01 13:01:53 -03:00
Thales Macedo Garitezi	f42ccb6262	feat(resource): increase default request timeout to 45 s See https://emqx.atlassian.net/wiki/spaces/P/pages/612368639/open+e5.1+remove+auto+restart+interval+from+buffer+worker+resource+options	2023-06-01 11:20:06 -03:00
Thales Macedo Garitezi	10425eb925	feat(resource): deprecate `auto_restart_interval` in favor of `health_check_interval` See: https://emqx.atlassian.net/wiki/spaces/P/pages/612368639/open+e5.1+remove+auto+restart+interval+from+buffer+worker+resource+options Current problem: In 5.0.x, we have two timer options that control the state changing of buffer worker resources: auto_restart_interval and health_check_interval. - auto_restart_interval controls how often the resource attempts to transition from disconnected to connected. - health_check_interval controls how often the resource is checked and potentially moved from connected to disconnected or connecting. The existence of two independent timers for very similar purposes is confusing to users, QA and even developers. Also, an intimately related configuration is request_timeout, which can interact badly with auto_restart_interval if the latter is poorly configured: requests may always expire if request_timeout < auto_restart_interval and if the resource enters the disconnected state. For health_check_interval, we attempt to derive a sane default that gives requests a chance to retry (if request timeout is finite, then the resource retries requests with a period of min(health_check_interval, request_timeout / 3). Another problem with the separate auto_restart_interval is that its default value (60 s) is too high when compared to the default request timeout and health check, leading to the problems described above if not tuned. Proposed solution: We propose to drop auto_restart_interval in favor of health_check_interval, which will be used for both disconnected -> connected and connected -> {disconnected, connecting} transition checks. With that, the resource will attempt to reconnect at the same interval as the health check, which currently is 15 s. Also, as two smaller changes to accompany this one: - Increase the default request_timeout from 15 s to 45 s. - Rename request_timeout to request_ttl.	2023-06-01 11:20:06 -03:00
Zaiming (Stone) Shi	cc5b4d3748	Merge remote-tracking branch 'origin/release-50' into 0526-ci-delete-otp-24-from-standalone-app-test	2023-05-26 15:58:16 +02:00
JianBo He	71b636e321	fix: fix auto_restart_interval checker	2023-05-25 12:04:23 +08:00
Thales Macedo Garitezi	fd2940cd77	feat(pulsar): ensure allocated resources are removed on failures (v5.0) Fixes https://emqx.atlassian.net/browse/EMQX-9937	2023-05-24 12:29:00 -03:00
Thales Macedo Garitezi	7d798c10e9	perf(buffer_worker): flush metrics periodically inside buffer worker process Fixes https://emqx.atlassian.net/browse/EMQX-9905 Since calling `telemetry` is costly in a hot path, we instead collect metrics inside the buffer workers state and periodically flush them, rather than immediately as events happen.	2023-05-22 09:11:23 -03:00
Andrew Mayorov	4575167607	feat(resource): drop `manager_id()` type	2023-05-02 17:29:20 +03:00
Thales Macedo Garitezi	e073bc90bc	refactor(buffer_worker): rename `s/queue/buffer/g`	2023-04-14 11:37:19 -03:00
Thales Macedo Garitezi	14ed4a7ada	feat(buffer_worker): set default queue mode to `memory_only` Fixes https://emqx.atlassian.net/browse/EMQX-9367 For better user experience and performance for the average bridge, we should change the default queue mode to `memory_only`, as was the behavior of most bridges in e4.x. This leads to better performance when message rate is high enough and the remote resource is not keeping up with EMQX. Also, we set the default segment size to equal max queue bytes.	2023-04-14 11:37:19 -03:00
Thales Macedo Garitezi	4de13d2800	feat(buffer_worker): change default max queue bytes to 256 MB	2023-04-14 09:31:33 -03:00
Andrew Mayorov	9c9f39d0f7	feat(resman): also move out metrics collection for debugging Now `emqx_resource:list_instances_verbose/0` will populate the metrics for each instance, for the sake of simplicity.	2023-04-12 16:14:42 +03:00
Kjell Winblad	8e0d315b7b	Merge pull request #10197 from kjellwinblad/0321-fix-inflight-window-hand-over-to-kjell fix: add inflight window setting to the clickhouse bridge	2023-03-29 09:38:24 +02:00
Kjell Winblad	35474578ca	refactor: rename async_inflight_window to inflight_window everywhere	2023-03-23 14:21:57 +01:00
Stefan Strigler	53825b9aba	fix(emqx_bridge): propagate connection error to resource status	2023-03-21 15:02:29 +01:00
Thales Macedo Garitezi	d464e2aad5	refactor: rename test resource prefix Co-authored-by: Zaiming (Stone) Shi <zmstone@gmail.com>	2023-03-16 13:43:01 -03:00
Thales Macedo Garitezi	03342923b9	fix(bridge): use the same dry run prefix Kafka Producer and Consumer bridges rely on this prefix for detecting a dry run and avoid leaking atoms. At some point, this prefix was changed, effectively disabling the check in Kafka Producer.	2023-03-16 13:43:01 -03:00
Thales Macedo Garitezi	e9d3fc511f	chore(buffer_worker): change default `batch_time` to 0 and improve docs	2023-03-06 15:31:28 -03:00
Zaiming (Stone) Shi	fb61c2b266	perf: avoid getting metrics (gen_server:call) for each resource lookup	2023-02-10 19:40:37 +01:00
Zaiming (Stone) Shi	c0d478bd41	fix(buffer_worker): type spec	2023-02-02 14:11:12 +01:00
Zaiming (Stone) Shi	5fdf7fd24c	fix(kafka): use async callback to bump success counters some telemetry events from wolff are discarded: * dropped: this is double counted in wolff, we now only subscribe to the dropped_queue_full event * retried_failed: it has different meanings in wolff, in wolff, it means it's the 2nd (or onward) produce attempt in EMQX, it means it's eventually failed after some retries * retried_success since we are going to handle the success counters in callbac this having this reported from wolff will only make things harder to understand * failed wolff never fails (unelss drop which is a different counter)	2023-01-24 21:12:36 +01:00
Zaiming (Stone) Shi	8fde169abb	Merge pull request #9821 from thalesmg/buffer-worker-expiry-v50 feat(buffer_worker): add expiration time to requests	2023-01-24 13:54:04 +01:00
Thales Macedo Garitezi	6fa6c679bb	feat(buffer_worker): add expiration time to requests With this, we avoid performing work or replying to callers that are no longer waiting on a result. Also introduces two new counters: - `dropped.expired` :: happens when a request expires before being sent downstream - `late_reply` :: when a response is receive from downstream, but the caller is no longer for a reply because the request has expired, and the caller might even have retried it.	2023-01-20 11:36:52 -03:00
Ilya Averyanov	44a6e5ed15	chore(resources): add missing parameters to emqx_resource schema	2023-01-18 14:33:45 +02:00
Thales Macedo Garitezi	fd360ac6c0	feat(buffer_worker): refactor buffer/resource workers to always use queue This makes the buffer/resource workers always use `replayq` for queuing, along with collecting multiple requests in a single call. This is done to avoid long message queues for the buffer workers and rely on `replayq`'s capabilities of offloading to disk and detecting overflow. Also, this deprecates the `enable_batch` and `enable_queue` resource creation options, as: i) queuing is now always enables; ii) batch_size > 1 <=> batch_enabled. The corresponding metric `dropped.queue_not_enabled` is dropped, along with `batching`. The batching is too ephemeral, especially considering a default batch time of 20 ms, and is not shown in the dashboard, so it was removed.	2023-01-05 10:15:09 -03:00
Zaiming (Stone) Shi	dbc10c2eed	chore: update copyright year 2023	2023-01-02 09:22:27 +01:00
Zaiming (Stone) Shi	479e191dcf	refactor: refine worker pool config and doc worker pool is a buffer pool the description hinted connection pool which is wrong.	2022-12-20 09:02:51 +01:00
Thales Macedo Garitezi	1cd91a24e9	feat(gcp_pubsub): implement GCP PubSub bridge (ee5.0)	2022-12-12 17:18:19 -03:00
Shawn	f41adb0997	refactor: change some default values of resource_opts	2022-09-14 15:18:07 +08:00
Shawn	2b33ca6d49	fix: no error log print if insert bool values into mysql	2022-09-07 16:00:09 +08:00
Shawn	0ef0b68de4	refactor: change '{recoverable_error,Reason}' to '{error,{recoverable_error,Reason}}'	2022-08-31 18:25:00 +08:00
Shawn	9e50866cd0	fix: rename queue_max_bytes -> max_queue_bytes	2022-08-30 17:18:54 +08:00
Shawn	6b0ccfbc43	refactor: rename the error return resource_down -> recoverable_error	2022-08-26 17:11:12 +08:00
Shawn	86577365e4	fix: use gen_statem:cast/3 for async query	2022-08-23 22:41:45 +08:00
JimMoen	f0c2b53868	fix(bpapi): make bpapi static_checks happy	2022-08-22 10:51:44 +08:00
JimMoen	7c4ea38c06	fix(resource): make some resource opts internal Resource options `start_after_created` and `start_timeout` are internal opts. Not provided to users anymore.	2022-08-22 02:22:57 +08:00
JimMoen	06363e63d9	fix(influxdb): connector use a fallbacke `pool_size` for influxdb client	2022-08-19 15:54:19 +08:00

1 2

88 Commits