yuanbiao/emqx - emqx

Commit Graph

Author	SHA1	Message	Date
Thales Macedo Garitezi	4c24b08244	fix(rule_action): fix metrics for bridges returning `async_return` Kafka Producer, when called asynchronously, will return `{async_return, {ok, pid()}}`, which currently counts as an unknown failure.	2023-04-06 16:00:01 -03:00
Thales Macedo Garitezi	5d5b7ea215	Merge pull request #10306 from thalesmg/enable-async-buffer-workers-all-bridges-v50 feat(bridges): enable async query mode for all bridges with buffer workers	2023-04-04 17:10:46 -03:00
Thales Macedo Garitezi	0b6fd7fe14	fix(buffer_worker): check request timeout and health check interval Port of https://github.com/emqx/emqx/pull/10154 for `release-50` Fixes https://emqx.atlassian.net/browse/EMQX-9099 Originally, the `resume_interval`, which is what defines how often a buffer worker will attempt to retry its inflight window, was set to the same as the `health_check_interval`. This had the problem that, with default values, `health_check_interval = request_timeout`. This meant that, if a buffer worker with those configs were ever blocked, all requests would have timed out by the time it retried them. Here we change the default `resume_interval` to a reasonable value dependent on `health_check_interval` and `request_timeout`, and also expose that as a hidden parameter for fine tuning if necessary.	2023-04-04 08:58:36 -03:00
Thales Macedo Garitezi	a8f8228a12	Merge pull request #10308 from thalesmg/test-increase-peer-timeout-v50 test(peer): increase init and startup timeout for peer nodes	2023-04-03 15:46:13 -03:00
Thales Macedo Garitezi	f3ffc02bff	feat(bridges): enable async query mode for all bridges with buffer workers Fixes https://emqx.atlassian.net/browse/EMQX-9130 Since buffer workers always support async calls ("outer calls"), we should decouple those two call modes (inner and outer), and avoid exposing the inner call configuration to user to avoid complexity. For bridges that currently only allow sync query modes, we should allow them to be configured with async. That means basically all bridge types except Kafka Producer.	2023-04-03 14:49:51 -03:00
Thales Macedo Garitezi	8b5a717a1f	test(peer): increase init and startup timeout for peer nodes Attempt to stabilize tests that use cluster nodes.	2023-04-03 13:20:22 -03:00
Thales Macedo Garitezi	ec1871ffde	test(janitor): catch each callback invocation	2023-04-03 10:20:19 -03:00
zhongwencool	d63680cf25	Merge pull request #10307 from emqx/release-50 Sync release-50 back to master	2023-04-02 11:36:41 +08:00
Kjell Winblad	58898ea11d	Merge pull request #10294 from kjellwinblad/kjell/feat/collection_var_syntax_mongodb/EMQX-9246 feat: (MongoDB bridge) use ${var} syntax in MongoDB collection field	2023-03-31 17:18:27 +02:00
Thales Macedo Garitezi	5011486b18	fix(kafka_consumer): return better error messages when probing kafka consumer bridge Fixes https://emqx.atlassian.net/browse/EMQX-9422	2023-03-31 11:33:15 -03:00
Kjell Winblad	e808fef1e4	feat: (MongoDB bridge) use ${var} syntax for MongoDB collection This commit makes it possible to use the ${var} syntax to refer to variables in the payload of the message in the collection field. This makes it possible to select which collection to insert into dynamically. Fixes: https://emqx.atlassian.net/browse/EMQX-9246	2023-03-30 17:49:56 +02:00
Thales Macedo Garitezi	632bffd451	fix: return friendly message when kafka producer fails to start (rv5.0) Fixes https://emqx.atlassian.net/browse/EMQX-9392 The returned information does not allow to diagnose the issue (i.e.: a connection issue due to the wrong host and port, the wrong password failing authn). However, such information is printed to the logs. This changes the returned error to the API so that the user is hinted at looking at the logs for further investigation of the error.	2023-03-30 11:51:36 -03:00
Zaiming (Stone) Shi	81a104690d	test: fix flaky influxdb test	2023-03-30 16:19:22 +02:00
Zaiming (Stone) Shi	80eb9d7542	Merge pull request #10252 from emqx/release-50 0327 merge release-50 to master	2023-03-29 12:33:17 +02:00
Thales Macedo Garitezi	64faccf50b	test: fix flaky kafka consumer test	2023-03-28 14:50:55 -03:00
Thales Macedo Garitezi	1a7ca7235e	Merge pull request #10249 from thalesmg/fix-kafka-offset-doc-rv50 feat(kafka_consumer): tie `offset_reset_policy` and `begin_offset` together (rv5.0)	2023-03-28 11:37:46 -03:00
Thales Macedo Garitezi	69fc1123ee	refactor: change enum constructors and improve docs	2023-03-27 17:30:17 -03:00
Thales Macedo Garitezi	5cf09209cd	feat: tie `offset_reset_policy` and `begin_offset` together To make the configuration more intuitive and avoid exposing more parameters to the user, we should: 1) Remove reset_by_subscriber as an enum constructor for `offset_reset_policy`, as that might make the consumer hang indefinitely without manual action. 2) Set the `begin_offset` `brod_consumer` parameter to `earliest` or `latest` depending on the value of `offset_reset_policy`, as that’s probably the user’s intention.	2023-03-27 14:20:31 -03:00
Zaiming (Stone) Shi	da5e6f3d0a	test: test with only one Kafka partition for bad config recover test	2023-03-27 17:38:34 +02:00
JianBo He	bfa5922209	Merge pull request #10140 from HJianBo/cassa feat: support cassandra data bridge	2023-03-27 10:23:02 +08:00
Thales Macedo Garitezi	ff272a2071	Merge pull request #10206 from thalesmg/decouple-buffer-worker-query-call-mode-v50 feat(buffer_worker): decouple query mode from underlying connector call mode	2023-03-24 13:49:00 -03:00
Thales Macedo Garitezi	f8d5d53908	feat(buffer_worker): decouple query mode from underlying connector call mode Fixes https://emqx.atlassian.net/browse/EMQX-9129 Currently, if an user configures a bridge with query mode sync, then all calls to the underlying driver/connector ("inner calls") will always be synchronous, regardless of its support for async calls. Since buffer workers always support async queries ("outer calls"), we should decouple those two call modes (inner and outer), and avoid exposing the inner call configuration to user to avoid complexity. There are two situations when we want to force synchronous calls to the underlying connector even if it supports async: 1) When using `simple_sync_query`, since we are bypassing the buffer workers; 2) When retrying the inflight window, to avoid overwhelming the driver.	2023-03-23 13:40:31 -03:00
JianBo He	9b63bdc1e0	chore: apply review suggestions - Rename sql to cql - Add tests for `bridges_probe` API	2023-03-23 15:27:34 +08:00
JianBo He	8cbbc9f271	Merge remote-tracking branch 'upstream/master' into cassa	2023-03-23 11:53:17 +08:00
Thales Macedo Garitezi	ddffba0355	Merge pull request #10154 from thalesmg/fix-buffer-worker-default-req-timeout fix(buffer_worker): calculate default `resume_interval` based on `request_timeout` and `health_check_interval`	2023-03-22 20:21:04 -03:00
Thales Macedo Garitezi	1ca6a51425	Merge pull request #10198 from thalesmg/fix-flaky-kconsumer-test-v50 test: attempt to fix flaky kafka consumer test	2023-03-22 17:19:50 -03:00
Thales Macedo Garitezi	127a075b66	test(dynamo): attempt to fix dynamo tests Those tests in the `flaky` test are really flaky and require lots of CI retries. Apparently, the flakiness comes from race conditions from restarting bridges with the same name too fast between test cases. Previously, all test cases were sharing the same bridge name (the module name).	2023-03-22 14:34:37 -03:00
Thales Macedo Garitezi	61cb03b45a	fix(buffer_worker): change the default `resume_interval` value and expose it as hidden config Also removes the previously added alarm for request timeout. There are situations where having a short request timeout and a long health check interval make sense, so we don't want to alarm the user for those situations. Instead, we automatically attempt to set a reasonable `resume_interval` value.	2023-03-22 11:47:36 -03:00
lafirest	84def357a9	Merge pull request #10143 from lafirest/feat/rocketmq feat(bridges): integrate RocketMQ into data bridges	2023-03-22 20:43:22 +08:00
firest	4ad3579966	test(bridges): add test suite for RocketMQ	2023-03-22 10:36:58 +08:00
JianBo He	65c2da7ef5	Merge remote-tracking branch 'ce/master' into cassa	2023-03-22 09:30:50 +08:00
Zaiming (Stone) Shi	e6091db351	Merge remote-tracking branch 'origin/release-50' into 0321-merge-release-50-to-master	2023-03-21 22:03:31 +01:00
Thales Macedo Garitezi	7e6f52e8fe	test: attempt to fix flaky kafka consumer test It might need some time for the metrics to be set.	2023-03-21 17:45:58 -03:00
JianBo He	539ec2f774	chore(bridge): cover username/password auth for cassandra bridges	2023-03-21 13:55:53 +08:00
Serge Tupchii	3a46681dde	feat: handle escaped characters in InfluxDB data bridge write_syntax Closes: EMQX-7834	2023-03-20 16:42:23 +02:00
JianBo He	d1689f6957	chore: correct api examples follow https://github.com/emqx/emqx/pull/10114	2023-03-20 17:03:05 +08:00
Erik Timan	2d75c7d6d9	fix(emqx_bridge): remove metrics from non-dedicated bridge API endpoints Metrics should only be exposed via the /bridges/:id/metrics endpoint, and not in other operations such as getting the list of all bridges, or in the response when a bridge has been created. This commit removes all traces of metrics for the non-dedicated API endpoints.	2023-03-20 09:43:11 +01:00
JianBo He	12942b676d	Merge remote-tracking branch 'upstream/master' into cassa	2023-03-20 09:50:27 +08:00
JianBo He	678cc937c0	test(bridge): cover ssl testing for cassandra bridge	2023-03-17 18:25:05 +08:00
JianBo He	5f0828a2ea	ci: add certs for cassandra tls	2023-03-17 16:39:10 +08:00
JianBo He	c0a216a740	feat(bridge): support cassandra bridge	2023-03-17 11:34:48 +08:00
Thales Macedo Garitezi	966276127e	test: trying to make tests more stable	2023-03-16 13:43:01 -03:00
Thales Macedo Garitezi	41b8d47696	test(kafka_consumer): add more clusterized tests	2023-03-16 13:43:01 -03:00
Thales Macedo Garitezi	fc5dfa108a	fix(kafka_consumer): rename `force_utf8` to `none` We actually enforce the key/value to be a valid UTF-8 string when using `emqx_json:encode`, if we do encode using that, which is template-dependent.	2023-03-16 13:43:01 -03:00
Thales Macedo Garitezi	53979b6261	feat(kafka_consumer): support multiple topic mappings, payload templates and key/value encodings Added after feedback from the product team.	2023-03-16 13:43:01 -03:00
Thales Macedo Garitezi	f31f15e44e	chore(kafka_producer): make schema changes more backwards compatible This will still require fixes to the frontend.	2023-03-16 13:43:01 -03:00
Thales Macedo Garitezi	c182a4053e	fix(kafka_consumer): avoid leaking atoms in bridge probe API	2023-03-16 13:43:01 -03:00
Thales Macedo Garitezi	e1fdd041b3	feat(kafka_consumer): add `offset_commit_interval_seconds` kafka parameter	2023-03-16 13:43:01 -03:00
Thales Macedo Garitezi	1d5fe14a30	test: remove sleeps	2023-03-16 13:43:01 -03:00
Thales Macedo Garitezi	969fa03ecc	feat: implement kafka consumer	2023-03-16 13:43:01 -03:00
Thales Macedo Garitezi	5ab5236ad3	test: fix flaky test	2023-03-16 13:42:38 -03:00
Andrew Mayorov	f7c0d29478	test(tde): add testcase for a nasty string in SQL query Similar to what we have in mysql and pgsql testsuites.	2023-03-10 18:43:30 +03:00
Andrew Mayorov	0a7f6c7d03	fix(mysql): ensure proper escaping in batch inserts Also hexencode non-utf8 binaries. This is essentially an heuristic. We don't know column types in runtime, and there's no simple way to find them out. Since we're already doing full binary scan during escaping it should be cheap to bail out on non-utf8 strings and hexencode them instead. Also introduce separate function to highlight that this escaping is MySQL-specific.	2023-03-10 18:42:04 +03:00
Andrew Mayorov	fc37d9b3cd	fix(mysql): be explicit that batch queries are parameterless So that mysql client won't attempt to prepare them automatically, thus trashing the server's prepared statements table and making interaction overall heavier.	2023-03-10 18:42:04 +03:00
Zaiming (Stone) Shi	fe27604010	Merge remote-tracking branch 'origin/release-50' into 0308-merge-release-50-back-to-master	2023-03-08 16:46:45 +01:00
Serge Tupchii	97e71c54d4	fix: use default template if timestamp is empty (undefined) in InfluxDB bridge Closes EMQX-8926	2023-03-08 11:58:23 +01:00
firest	984dd3446d	test(bridges): add test suite for DynamoDB	2023-03-08 11:13:51 +08:00
Thales Macedo Garitezi	e9d3fc511f	chore(buffer_worker): change default `batch_time` to 0 and improve docs	2023-03-06 15:31:28 -03:00
Thales Macedo Garitezi	9b087a21f5	fix(gcp_pubsub): remove conflicting `request_timeout` option Since the buffer worker schema already contains that configuration, having it two places can lead to quite confusing behavior.	2023-03-06 10:12:38 -03:00
Kjell Winblad	67acdf0888	feat: add clickhouse database bridge This commit adds a Clickhouse bridge to EMQX 5. The bridge is similar to the Clickhouse bridge in the 4.4, but adds the possibility to use different formats (such as JSON) for values to be inserted.	2023-03-02 12:22:11 +01:00
Zaiming (Stone) Shi	083330ad80	Merge remote-tracking branch 'origin/master' into 0301-merge-release-50-to-master	2023-03-01 08:53:03 +01:00
Zaiming (Stone) Shi	24f476e35f	test: add README to influxdb test script	2023-02-28 19:38:43 +01:00
Ivan Dyachkov	6ce5029d79	Merge pull request #9881 from olcai/log-influxdb-is-alive-reason fix(emqx_ee_connector): log reason for failure when starting influxdb connector	2023-02-28 09:49:08 +00:00
Andrew Mayorov	9cbe64a132	fix(test): make strings json-friendly in kafka testsuite	2023-02-24 15:05:20 +03:00
Erik Timan	da42c91fb2	test(emqx_ee_bridge): check influxdb:is_alive/2 return	2023-02-24 09:03:34 +01:00
Ivan Dyachkov	c869eff6e8	test(kafka): disable overload protection in 2 more places	2023-02-20 21:38:14 +01:00
Ivan Dyachkov	9d30a52910	test: disable replayq memory overload protection	2023-02-20 19:29:26 +01:00
firest	0420e9acb5	test(bridges): add test cases for TDEngine	2023-02-14 22:04:29 +08:00
Stefan Strigler	c7535fce56	test: fix test expecting 200 instead of 204	2023-02-08 09:14:35 +01:00
Zhongwen Deng	1c9035d24c	test: remove async from redis ct	2023-02-02 17:37:18 +08:00
Ilya Averyanov	9d91ebe266	Merge pull request #9842 from savonarola/fix-redis-cluster-recover fix: fix redis cluster resource recovery	2023-02-01 10:38:52 +02:00
Ilya Averyanov	fce1e74c3d	fix(connector): fix redis cluster resource recovery	2023-01-31 16:55:05 +02:00
Zaiming (Stone) Shi	d47941601d	refactor(buffer_worker): rename trace points	2023-01-28 11:52:11 +01:00
Zaiming (Stone) Shi	52b75ada04	Merge pull request #9832 from sstrigler/EMQX-8774-failure-to-handle-timeout-error-in-resource-worker EMQX 8774 failure to handle timeout error in resource worker	2023-01-27 14:36:44 +01:00
Andrew Mayorov	d35e46b2d5	Merge pull request #9838 from keynslug/fix/redis-cluster-batching feat(redis): disable batching in redis_cluster bridges	2023-01-27 15:27:57 +04:00
Stefan Strigler	7d18128ba9	test: async write can return noproc	2023-01-27 11:43:51 +01:00
Stefan Strigler	2d62de5188	test: fix expected result from timeout error	2023-01-27 11:43:48 +01:00
Zaiming (Stone) Shi	f6b3b930b0	chore: improve an error log	2023-01-26 14:21:27 +01:00
Andrew Mayorov	26fcaecad7	fix(redis): disable batching in `redis_cluster` bridges Through configuration subsystem.	2023-01-25 17:28:11 +03:00
Andrew Mayorov	903a77b471	test(redis): ensure batch query hit different cluster shards This will inevitably fail: it's not generally possible to update different keys through the same cluster connection, one or more update will fail with `MOVED` status. This testcase should serve as a regression test later.	2023-01-25 15:33:05 +03:00
Zaiming (Stone) Shi	5fdf7fd24c	fix(kafka): use async callback to bump success counters some telemetry events from wolff are discarded: * dropped: this is double counted in wolff, we now only subscribe to the dropped_queue_full event * retried_failed: it has different meanings: in wolff, it means it's the 2nd (or onward) produce attempt; in EMQX, it means it eventually failed after some retries * retried_success: since we are going to handle the success counters in the callback, having this reported from wolff will only make things harder to understand * failed: wolff never fails (unless it drops, which is a different counter)	2023-01-24 21:12:36 +01:00
Zaiming (Stone) Shi	8fde169abb	Merge pull request #9821 from thalesmg/buffer-worker-expiry-v50 feat(buffer_worker): add expiration time to requests	2023-01-24 13:54:04 +01:00
Thales Macedo Garitezi	ca4a262b75	refactor: re-organize dealing with unrecoverable errors	2023-01-20 12:00:17 -03:00
Thales Macedo Garitezi	6fa6c679bb	feat(buffer_worker): add expiration time to requests With this, we avoid performing work or replying to callers that are no longer waiting on a result. Also introduces two new counters: - `dropped.expired` :: happens when a request expires before being sent downstream - `late_reply` :: when a response is received from downstream, but the caller is no longer waiting for a reply because the request has expired, and the caller might even have retried it.	2023-01-20 11:36:52 -03:00
Thales Macedo Garitezi	47f796dd12	refactor: rename `emqx_resource_worker` -> `emqx_resource_buffer_worker` To make it clearer that its purpose is to serve as a buffering layer.	2023-01-18 16:15:34 -03:00
Ilya Averyanov	f9843de7ae	Merge pull request #9628 from savonarola/fix-flaky-redis-bridge-test chore(ee bridge): fix Redis bridge test flakyness	2023-01-18 20:56:13 +02:00
Zaiming (Stone) Shi	2d01e604a5	Merge pull request #9799 from zmstone/0118-fix-key_dispatch-kafka-produce-strategy 0118 fix key dispatch kafka produce strategy	2023-01-18 13:52:49 +01:00
Ilya Averyanov	f6fbbf3ee3	chore(bridges): reduce Redis bridge flakyness	2023-01-18 14:34:11 +02:00
Zaiming (Stone) Shi	8f275a66d0	test: add coverage for key_dispatch partition strategy	2023-01-18 11:47:37 +01:00
Zaiming (Stone) Shi	d4f3b4c8c2	Merge remote-tracking branch 'origin/master' into fix-buffer-clear-replayq-on-delete-v50	2023-01-18 11:39:47 +01:00
Thales Macedo Garitezi	087b667263	fix(buffer_worker): allow signalling unrecoverable errors	2023-01-17 19:50:30 -03:00
Thales Macedo Garitezi	fa01deb3eb	chore: retry as much as possible, don't reply to caller too soon	2023-01-17 16:49:15 -03:00
Thales Macedo Garitezi	32a9e60313	feat(buffer_worker): also use the inflight table for sync requests Related: https://emqx.atlassian.net/browse/EMQX-8692 This should also correctly account for `retried.*` metrics for sync requests. Also fixes cases where race conditions for retrying async requests could potentially lead to inconsistent metrics. Fixes more cases where a stale reference to `replayq` was being held accidentally after a `pop`.	2023-01-17 16:48:48 -03:00
Stefan Strigler	f37b3e4bc4	test: test against `bridges_probe` API	2023-01-17 15:29:19 +01:00
Zaiming (Stone) Shi	a7fc5e8fe1	Merge pull request #9761 from zmstone/0114-fix-kafka-value-template-and-docs feat: introduce 'this' concept for placeholder, and use it in Kafka bridge	2023-01-16 13:37:29 +01:00
Zaiming (Stone) Shi	47414e0d53	docs: improve kafka key and value field description	2023-01-16 11:32:09 +01:00
Zaiming (Stone) Shi	91c5a89985	test: wait for redis connected state the case is sometimes flaky because the health check sometimes return connecting	2023-01-14 18:33:55 +01:00
Stefan Strigler	e08c1d2229	Merge remote-tracking branch 'olcai/refactor-bridges-api' into dev/api-refactor	2023-01-13 15:49:52 +01:00
Erik Timan	7a17fb7308	test(emqx_ee_bridge): fix bridge enable/disable in kafka producer suite	2023-01-13 14:40:54 +01:00
Thales Macedo Garitezi	f25bd288ad	Merge pull request #9742 from thalesmg/expose-resource-opts-mongo-v50 feat(mongo): expose buffer worker opts to the bridge frontend (5.0)	2023-01-13 10:23:49 -03:00

217 Commits