yuanbiao/emqx - emqx

Commit Graph

Author	SHA1	Message	Date
Thales Macedo Garitezi	ebecbd1545	fix(bridge): make dryrun health check timeout more malleable Fixes https://emqx.atlassian.net/browse/EMQX-10773 - Makes the timeout for probing a bridge more malleable to account for differences between each database. - Increases GCP PubSub Consumer default health check timeout to account for GCP slowness/throttling.	2023-08-17 09:21:19 -03:00
zhongwencool	b5cc8fb3c3	fix: start_after_created's default value	2023-07-07 16:39:26 +08:00
Thales Macedo Garitezi	1d791d7a8c	fix(resource): validate maximum worker pool size Fixes https://emqx.atlassian.net/browse/EMQX-10297	2023-06-20 14:26:42 -03:00
Zaiming (Stone) Shi	ddef751527	fix(mongodb): hide batch_size for mongodb resource MongoDB connector currently does not support batching so the batch_size option has no effect. However we cannot remove the field, so we choose to hide it from schema	2023-06-11 11:08:58 +02:00
Thales Macedo Garitezi	46393343e2	chore: use `timeout_duration` types for timer fields Fixes https://emqx.atlassian.net/browse/EMQX-10020	2023-06-05 11:46:38 -03:00
Thales Macedo Garitezi	0790c88aaf	refactor: use default's type as first union member Co-authored-by: Zaiming (Stone) Shi <zmstone@gmail.com>	2023-06-02 09:08:11 -03:00
Thales Macedo Garitezi	99796224d8	refactor(resource): rename `request_timeout` -> `request_ttl` See https://emqx.atlassian.net/wiki/spaces/P/pages/612368639/open+e5.1+remove+auto+restart+interval+from+buffer+worker+resource+options	2023-06-01 13:01:53 -03:00
Thales Macedo Garitezi	f42ccb6262	feat(resource): increase default request timeout to 45 s See https://emqx.atlassian.net/wiki/spaces/P/pages/612368639/open+e5.1+remove+auto+restart+interval+from+buffer+worker+resource+options	2023-06-01 11:20:06 -03:00
Thales Macedo Garitezi	10425eb925	feat(resource): deprecate `auto_restart_interval` in favor of `health_check_interval` See: https://emqx.atlassian.net/wiki/spaces/P/pages/612368639/open+e5.1+remove+auto+restart+interval+from+buffer+worker+resource+options Current problem: In 5.0.x, we have two timer options that control the state changing of buffer worker resources: auto_restart_interval and health_check_interval. - auto_restart_interval controls how often the resource attempts to transition from disconnected to connected. - health_check_interval controls how often the resource is checked and potentially moved from connected to disconnected or connecting. The existence of two independent timers for very similar purposes is confusing to users, QA and even developers. Also, an intimately related configuration is request_timeout, which can interact badly with auto_restart_interval if the latter is poorly configured: requests may always expire if request_timeout < auto_restart_interval and if the resource enters the disconnected state. For health_check_interval, we attempt to derive a sane default that gives requests a chance to retry (if request timeout is finite, then the resource retries requests with a period of min(health_check_interval, request_timeout / 3)). Another problem with the separate auto_restart_interval is that its default value (60 s) is too high when compared to the default request timeout and health check, leading to the problems described above if not tuned. Proposed solution: We propose to drop auto_restart_interval in favor of health_check_interval, which will be used for both disconnected -> connected and connected -> {disconnected, connecting} transition checks. With that, the resource will attempt to reconnect at the same interval as the health check, which currently is 15 s. Also, as two smaller changes to accompany this one: - Increase the default request_timeout from 15 s to 45 s. - Rename request_timeout to request_ttl.	2023-06-01 11:20:06 -03:00
Zaiming (Stone) Shi	cc5b4d3748	Merge remote-tracking branch 'origin/release-50' into 0526-ci-delete-otp-24-from-standalone-app-test	2023-05-26 15:58:16 +02:00
JianBo He	71b636e321	fix: fix auto_restart_interval checker	2023-05-25 12:04:23 +08:00
Paulo Zulato	122ebcac24	fix: add user-friendly message when interval is out of range	2023-05-24 15:46:00 -03:00
Zaiming (Stone) Shi	732a7be187	Merge remote-tracking branch 'origin/release-50'	2023-05-22 17:46:54 +02:00
Thales Macedo Garitezi	7d798c10e9	perf(buffer_worker): flush metrics periodically inside buffer worker process Fixes https://emqx.atlassian.net/browse/EMQX-9905 Since calling `telemetry` is costly in a hot path, we instead collect metrics inside the buffer workers state and periodically flush them, rather than immediately as events happen.	2023-05-22 09:11:23 -03:00
Paulo Zulato	5d289ade56	fix: validate range for some bridge options Fixes https://emqx.atlassian.net/browse/EMQX-9864 Setting a very large interval can cause `erlang:start_timer` to crash. Also, setting auto_restart_interval or health_check_interval to "0s" causes the state machine to be in loop as time 0 is handled separately: | state_timeout() = timeout() | integer() | (...) | If Time is relative and 0 no timer is actually started, instead the the | time-out event is enqueued to ensure that it gets processed before any | not yet received external event. from "https://www.erlang.org/doc/man/gen_statem.html#type-state_timeout" Therefore, both fields are now validated against the range [1ms, 1h], which doesn't cause above issues.	2023-05-18 10:10:58 -03:00
Thales Macedo Garitezi	e073bc90bc	refactor(buffer_worker): rename `s/queue/buffer/g`	2023-04-14 11:37:19 -03:00
Thales Macedo Garitezi	14ed4a7ada	feat(buffer_worker): set default queue mode to `memory_only` Fixes https://emqx.atlassian.net/browse/EMQX-9367 For better user experience and performance for the average bridge, we should change the default queue mode to `memory_only`, as was the behavior of most bridges in e4.x. This leads to better performance when message rate is high enough and the remote resource is not keeping up with EMQX. Also, we set the default segment size to equal max queue bytes.	2023-04-14 11:37:19 -03:00
Zaiming (Stone) Shi	a9bf633e03	Merge pull request #10320 from zmstone/0403-sync-release-50-back-to-master 0403 sync release 50 back to master	2023-04-04 23:31:24 +02:00
Zaiming (Stone) Shi	68c15ffd48	Merge remote-tracking branch 'origin/release-50' into 0403-sync-release-50-back-to-master	2023-04-04 16:42:58 +02:00
Thales Macedo Garitezi	0b6fd7fe14	fix(buffer_worker): check request timeout and health check interval Port of https://github.com/emqx/emqx/pull/10154 for `release-50` Fixes https://emqx.atlassian.net/browse/EMQX-9099 Originally, the `resume_interval`, which is what defines how often a buffer worker will attempt to retry its inflight window, was set to the same as the `health_check_interval`. This had the problem that, with default values, `health_check_interval = request_timeout`. This meant that, if a buffer worker with those configs were ever blocked, all requests would have timed out by the time it retried them. Here we change the default `resume_interval` to a reasonable value dependent on `health_check_interval` and `request_timeout`, and also expose that as a hidden parameter for fine tuning if necessary.	2023-04-04 08:58:36 -03:00
Thales Macedo Garitezi	f3ffc02bff	feat(bridges): enable async query mode for all bridges with buffer workers Fixes https://emqx.atlassian.net/browse/EMQX-9130 Since buffer workers always support async calls ("outer calls"), we should decouple those two call modes (inner and outer), and avoid exposing the inner call configuration to user to avoid complexity. For bridges that currently only allow sync query modes, we should allow them to be configured with async. That means basically all bridge types except Kafka Producer.	2023-04-03 14:49:51 -03:00
Kjell Winblad	8e0d315b7b	Merge pull request #10197 from kjellwinblad/0321-fix-inflight-window-hand-over-to-kjell fix: add inflight window setting to the clickhouse bridge	2023-03-29 09:38:24 +02:00
Zaiming (Stone) Shi	d07987288a	chore: add some example annotations for config importance level	2023-03-28 14:29:24 +02:00
Thales Macedo Garitezi	61cb03b45a	fix(buffer_worker): change the default `resume_interval` value and expose it as hidden config Also removes the previously added alarm for request timeout. There are situations where having a short request timeout and a long health check interval make sense, so we don't want to alarm the user for those situations. Instead, we automatically attempt to set a reasonable `resume_interval` value.	2023-03-22 11:47:36 -03:00
Kjell Winblad	27b8445337	fix: add inflight window setting to the clickhouse bridge This commit makes sure the inflight window setting is present for the clickhouse bridge. It also changes emqx_resource_schema that previously removed the inflight window setting from resources with query mode `always_sync`. We don't need to do that because all bridges that uses the buffer worker queue will get async call handling even if the bridge don't support the async callback. Co-authored-by: Zaiming (Stone) Shi <zmstone@gmail.com>	2023-03-21 17:14:03 +01:00
Zhongwen Deng	f8936013b7	chore: replace async with sync	2023-02-02 17:37:18 +08:00
Zhongwen Deng	22c3f50020	fix: add query_mode_sync_only for mysql pgsql redis mongodb bridge	2023-02-02 17:37:18 +08:00
Ilya Averyanov	44a6e5ed15	chore(resources): add missing parameters to emqx_resource schema	2023-01-18 14:33:45 +02:00
Thales Macedo Garitezi	fa01deb3eb	chore: retry as much as possible, don't reply to caller too soon	2023-01-17 16:49:15 -03:00
Thales Macedo Garitezi	fd360ac6c0	feat(buffer_worker): refactor buffer/resource workers to always use queue This makes the buffer/resource workers always use `replayq` for queuing, along with collecting multiple requests in a single call. This is done to avoid long message queues for the buffer workers and rely on `replayq`'s capabilities of offloading to disk and detecting overflow. Also, this deprecates the `enable_batch` and `enable_queue` resource creation options, as: i) queuing is now always enabled; ii) batch_size > 1 <=> batch_enabled. The corresponding metric `dropped.queue_not_enabled` is dropped, along with `batching`. The batching is too ephemeral, especially considering a default batch time of 20 ms, and is not shown in the dashboard, so it was removed.	2023-01-05 10:15:09 -03:00
Zaiming (Stone) Shi	dbc10c2eed	chore: update copyright year 2023	2023-01-02 09:22:27 +01:00
Zaiming (Stone) Shi	479e191dcf	refactor: refine worker pool config and doc worker pool is a buffer pool the description hinted connection pool which is wrong.	2022-12-20 09:02:51 +01:00
Shawn	f41adb0997	refactor: change some default values of resource_opts	2022-09-14 15:18:07 +08:00
Shawn	b45f3de8db	refactor(resource): rename metrics batched,queued -> batching,queuing	2022-09-02 12:41:14 +08:00
Shawn	9e50866cd0	fix: rename queue_max_bytes -> max_queue_bytes	2022-08-30 17:18:54 +08:00
JimMoen	7c4ea38c06	fix(resource): make some resource opts internal Resource options `start_after_created` and `start_timeout` are internal opts. Not provided to users anymore.	2022-08-22 02:22:57 +08:00
JimMoen	06363e63d9	fix(influxdb): connector use a fallback `pool_size` for influxdb client	2022-08-19 15:54:19 +08:00
Shawn	9e35032d78	fix: make resume_interval defaults to health_check_interval	2022-08-16 10:09:02 +08:00
Xinyu Liu	2898966439	Merge branch 'dev/ee5.0' into resource_opts	2022-08-15 21:43:22 +08:00
Shawn	19d85d485b	refactor(resource): add resource_opts level into config structure	2022-08-15 21:40:10 +08:00
JimMoen	3678673124	fix: schema default value using raw type before convert	2022-08-12 16:38:46 +08:00
Shawn	0cdf4b47f1	feat: add more resource creation opts	2022-08-12 13:47:45 +08:00
Shawn	2872f0b668	fix(bridges): support create resources with options	2022-08-11 19:11:44 +08:00
JimMoen	22a4ca311c	feat(resource): resource batch/async/queue config schema	2022-08-11 16:59:18 +08:00

44 Commits