diff --git a/changes/e5.0.2.en.md b/changes/e5.0.2.en.md
new file mode 100644
index 000000000..8d552ec17
--- /dev/null
+++ b/changes/e5.0.2.en.md
@@ -0,0 +1,135 @@
+# e5.0.2
+
+## Enhancements
+
+- [#10022](https://github.com/emqx/emqx/pull/10022) Start releasing Rocky Linux 9 (compatible with Enterprise Linux 9) and macOS 12 packages.
+
+- [#10139](https://github.com/emqx/emqx/pull/10139) Add `extraVolumeMounts` to the EMQX Helm Chart so that user-owned files, such as the ACL rule files mentioned in [#9052](https://github.com/emqx/emqx/issues/9052), can be mounted into the EMQX instance.
+  Resolves [#10116](https://github.com/emqx/emqx/issues/10116).
+
+- [#9893](https://github.com/emqx/emqx/pull/9893) When a client connects with the flag `clean_start=false`, EMQX filters out messages that were published by banned clients.
+  Previously, the messages sent by banned clients could still be delivered to subscribers in this scenario.
+
+- [#9986](https://github.com/emqx/emqx/pull/9986) Add MQTT ingress to helm charts and update helm charts documentation.
+
+- [#10083](https://github.com/emqx/emqx/pull/10083) Add `DynamoDB` support for Data-Bridge.
+
+- [#9564](https://github.com/emqx/emqx/pull/9564) Implement Kafka Consumer bridge.
+  Now it's possible to consume messages from Kafka and publish them to MQTT topics.
+
+- [#9881](https://github.com/emqx/emqx/pull/9881) Enhance the error logs related to InfluxDB connectivity health checks.
+  Previously, if InfluxDB failed to pass the health checks using the specified parameters, the only message provided was "timed out waiting for it to become healthy".
+  With the updated implementation, the error message is displayed in both the logs and the dashboard, making the issue easier to identify and resolve.
+
+- [#10123](https://github.com/emqx/emqx/pull/10123) Improve the performance of `/bridges` API.
+  Previously, requests to this API could time out when the cluster had many nodes or a node was busy.
+
+- [#9998](https://github.com/emqx/emqx/pull/9998) Redact the HTTP request body in the authentication error logs for security reasons.
+
+## Bug Fixes
+
+- [#10013](https://github.com/emqx/emqx/pull/10013) Fix return type structure for error case in API schema for `/gateways/:name/clients`.
+
+- [#10014](https://github.com/emqx/emqx/pull/10014) Ensure Monitor API `/monitor(_current)/nodes/:node` returns `404` instead of `400` if the node does not exist.
+
+- [#10026](https://github.com/emqx/emqx/pull/10026) Metrics are now only exposed via the `/bridges/:id/metrics` endpoint. Metrics are no longer returned in other API operations such as getting the list of all bridges, or in the response when a bridge has been created.
+
+- [#10027](https://github.com/emqx/emqx/pull/10027) Allow setting node name from `EMQX_NODE__NAME` when running in docker.
+  Prior to this fix, only `EMQX_NODE_NAME` was allowed.
+
+- [#10050](https://github.com/emqx/emqx/pull/10050) Ensure Bridge API returns `404` status code consistently for resources that don't exist.
+
+- [#10052](https://github.com/emqx/emqx/pull/10052) Improve daemon mode startup failure logs.
+
+  Before this change, it was difficult for users to understand why the EMQX 'start' command failed to boot the node.
+  The only information they received was that the node did not start within the expected time frame,
+  and they were instructed to boot the node with the 'console' command in the hope of obtaining some logs.
+  However, the node might actually be running, which could cause 'console' mode to fail for a different reason.
+
+  With this new change, when daemon mode fails to boot, a diagnosis is issued. Here are the possible scenarios:
+
+  * If the node cannot be found from `ps -ef`, the user is instructed to find information in log files `erlang.log.*`.
+  * If the node is found to be running but not responding to pings, the user is advised to check if the host name is resolvable and reachable.
+  * If the node is responding to pings, but the EMQX app is not running, it is likely a bug. In this case, the user is advised to report a GitHub issue.
+
+- [#10055](https://github.com/emqx/emqx/pull/10055) The configuration parameter `mqtt.max_awaiting_rel` was not functional and has now been corrected.
+
+- [#10056](https://github.com/emqx/emqx/pull/10056) Fix `/bridges` API status code.
+  - Return `400` instead of `403` when removing a data bridge that an active rule depends on.
+  - Return `400` instead of `403` when calling operations (start|stop|restart) while Data-Bridging is not enabled.
+
+- [#10066](https://github.com/emqx/emqx/pull/10066) Improve error messages for `/bridges_probe` and `[/node/:node]/bridges/:id/:operation` API calls to make them more readable, and set the HTTP status code to `400` instead of `500`.
+
+- [#10074](https://github.com/emqx/emqx/pull/10074) Check if type in `PUT /authorization/sources/:type` matches `type` given in body of request.
+
+- [#10079](https://github.com/emqx/emqx/pull/10079) Fix description of `shared_subscription_strategy`.
+
+- [#10085](https://github.com/emqx/emqx/pull/10085) Consistently return `404` for all requests on a non-existent source in `/authorization/sources/:source[/*]`.
+
+- [#10098](https://github.com/emqx/emqx/pull/10098) Fix a crash, reported as an error in the log file, that occurred when the MongoDB authorization module queried the database.
+
+- [#10100](https://github.com/emqx/emqx/pull/10100) Fix channel crash for slow clients with enhanced authentication.
+  Previously, when the client was using enhanced authentication, but the Auth message was sent slowly or was lost, the client process would crash.
+
+- [#10107](https://github.com/emqx/emqx/pull/10107) For operations on Bridges API, if `bridge-id` is unknown we now return `404`
+  instead of `400`. Also, a bug was fixed that caused a crash if that was a node
+  operation. Additionally, we now check whether the given bridge is enabled when
+  doing the cluster operation `start`. Affected endpoints:
+    * [cluster] `/bridges/:id/:operation`,
+    * [node] `/nodes/:node/bridges/:id/:operation`, where `operation` is one of
+  `[start|stop|restart]`.
+  Moreover, for a node operation, EMQX checks whether the node name is in the cluster and
+  returns `404` instead of `501` if it is not.
+
+- [#10117](https://github.com/emqx/emqx/pull/10117) Fix an error occurring when a joining node doesn't have plugins that are installed on other nodes in the cluster.
+  After this fix, the joining node will copy all the necessary plugins from other nodes.
+
+- [#10118](https://github.com/emqx/emqx/pull/10118) Fix problems related to manual joining of EMQX replicant nodes to the cluster.
+  Previously, after a `replicant` node manually joined and then left the cluster, it could run normally after joining the cluster again only if the node was restarted.
+
+  [Mria PR](https://github.com/emqx/mria/pull/128)
+
+- [#10119](https://github.com/emqx/emqx/pull/10119) Fix crash when `statsd.server` is set to an empty string.
+
+- [#10124](https://github.com/emqx/emqx/pull/10124) The default heartbeat period for MongoDB has been increased to reduce the risk of excessive logging to the MongoDB log file.
+
+- [#10130](https://github.com/emqx/emqx/pull/10130) Fix garbled config display in dashboard when the value is originally from environment variables.
+  For example, `env EMQX_STATSD__SERVER='127.0.0.1:8124' ./bin/emqx start` results in an unreadable string (not '127.0.0.1:8124') displayed in Dashboard's Statsd settings page.
+  Related PR: [HOCON#234](https://github.com/emqx/hocon/pull/234).
+
+- [#10132](https://github.com/emqx/emqx/pull/10132) Fix some error logs generated by `systemctl stop emqx` command.
+  Prior to the fix, the command was not stopping jq and os_mon applications properly.
+
+- [#10144](https://github.com/emqx/emqx/pull/10144) Add `-setcookie` emulator flag when invoking `emqx ctl` to prevent problems with emqx cli when the home directory is read-only. Fixes [#10142](https://github.com/emqx/emqx/issues/10142).
+
+- [#10154](https://github.com/emqx/emqx/pull/10154) Change the default `resume_interval` for bridges and connectors to be
+  the minimum of `health_check_interval` and `request_timeout / 3`.
+  Also expose it as a hidden configuration to allow fine tuning.
+
+  Before this change, the default values for `resume_interval` meant
+  that, if a buffer ever got blocked due to resource errors or high
+  message volumes, then, by the time the buffer would try to resume its
+  normal operations, almost all requests would have timed out.
+
+- [#10157](https://github.com/emqx/emqx/pull/10157) Fix default rate limit configuration not being applied correctly when creating a new listener.
+
+- [#10237](https://github.com/emqx/emqx/pull/10237) Ensure we return `404` status code for unknown node names in `/nodes/:node[/metrics|/stats]` API.
+
+- [#10251](https://github.com/emqx/emqx/pull/10251) Consider bridges referenced in `FROM` rule clauses as dependencies.
+
+  Before this fix, when one tried to delete an ingress bridge referenced in a rule like `select * from "$bridges/mqtt:ingress"`, the UI would not trigger a warning about dependent rules.
+
+- [#10313](https://github.com/emqx/emqx/pull/10313) Ensure that when a core or replicant node is starting, the `cluster-override.conf` file is only copied from a core node.
+  Previously, when sorting nodes by startup time, a core node may have copied this file from a replicant node.
+
+- [#10314](https://github.com/emqx/emqx/pull/10314) Fix `/monitor_current` API so that it only looks at the current node.
+  Fix `/stats` API to not crash when one or more nodes in the cluster are down.
+
+- [#10327](https://github.com/emqx/emqx/pull/10327) Don't increment 'actions.failed.unknown' rule metrics counter upon receiving unrecoverable bridge errors.
+  This counter is displayed on the dashboard's rule overview tab ('Action statistics' - 'Unknown').
+  The fix is only applicable for synchronous bridges, as all rule actions for asynchronous bridges
+  are counted as successful (they increment 'actions.success' which is displayed as 'Action statistics' - 'Success').
+
+- [#10095](https://github.com/emqx/emqx/pull/10095) Stop MySQL client from bombarding the server repeatedly with unnecessary `PREPARE` queries on every batch, thrashing the server and exhausting its internal limits. This was happening when the MySQL bridge was in batch mode.
+
+  Ensure safer and more careful escaping of strings and binaries in batch insert queries when the MySQL bridge is in batch mode.
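The new `resume_interval` default from #10154 can be illustrated with a small calculation. This is only a sketch in Python, not EMQX code: the function name, the millisecond unit, the integer division, and the sample values are all assumptions made for illustration.

```python
def default_resume_interval(health_check_interval_ms: int, request_timeout_ms: int) -> int:
    """Return min(health_check_interval, request_timeout / 3).

    Hypothetical helper: the name, the millisecond unit, and the use of
    integer division are assumptions, not taken from the EMQX source.
    """
    return min(health_check_interval_ms, request_timeout_ms // 3)


# With both settings at 15 seconds, a third of the request timeout is the
# smaller value, so a blocked buffer retries well before requests expire.
print(default_resume_interval(15_000, 15_000))  # prints 5000
```

Taking a third of the request timeout leaves room for more than one resume attempt before queued requests time out, which is the problem the entry describes.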
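diff --git a/changes/e5.0.2.zh.md b/changes/e5.0.2.zh.md
new file mode 100644
index 000000000..a73372570
--- /dev/null
+++ b/changes/e5.0.2.zh.md
@@ -0,0 +1,118 @@
+# e5.0.2
+
+## Enhancements
+
+- [#10022](https://github.com/emqx/emqx/pull/10022) Start releasing Rocky Linux 9 (compatible with Enterprise Linux 9) and macOS 12 packages.
+
+- [#10139](https://github.com/emqx/emqx/pull/10139) Add `extraVolumeMounts` to the EMQX Helm Chart so that users can mount their own files into the EMQX instance, for example the ACL rule files mentioned in [#9052](https://github.com/emqx/emqx/issues/9052).
+  Fixes issue [#10116](https://github.com/emqx/emqx/issues/10116)
+
+- [#9893](https://github.com/emqx/emqx/pull/9893) When a client connects with the `clean_start=false` flag, EMQX filters messages published by banned clients out of the message queue so that they are not delivered to subscribers.
+  Previously, messages published by banned clients could still be delivered to subscribers in this scenario.
+
+- [#9986](https://github.com/emqx/emqx/pull/9986) Add configuration parameters for MQTT bridge ingress to the helm chart, and remove the `mgmt` configuration left over from older versions.
+
+- [#10083](https://github.com/emqx/emqx/pull/10083) Add `DynamoDB` support to data bridges.
+
+- [#9564](https://github.com/emqx/emqx/pull/9564) Implement the Kafka consumer bridge.
+  It is now possible to consume messages from Kafka and publish them to MQTT topics.
+
+- [#9881](https://github.com/emqx/emqx/pull/9881) Enhance the error logs related to InfluxDB connection health checks.
+  Before this change, if InfluxDB failed the health check with the configured parameters, the user only received a "timed out" message.
+  Now, a detailed error message is shown in both the logs and the Dashboard, making it easier to identify and resolve the problem.
+
+- [#10123](https://github.com/emqx/emqx/pull/10123) Improve the performance of the `/bridges` API.
+  Previously, requests to this API could time out when the cluster had many nodes or a node was busy.
+
+- [#9998](https://github.com/emqx/emqx/pull/9998) For security reasons, redact the HTTP request body in authentication error logs.
+
+## Bug Fixes
+
+- [#10013](https://github.com/emqx/emqx/pull/10013) Fix the incorrect return type structure of the `/gateways/:name/clients` API.
+
+- [#10014](https://github.com/emqx/emqx/pull/10014) Return `404` instead of `400` if the node queried through the API does not exist.
+
+- [#10026](https://github.com/emqx/emqx/pull/10026) Metrics are now available only by explicitly calling the `/bridges/:id/metrics` endpoint; other API endpoints no longer return them.
+
+- [#10027](https://github.com/emqx/emqx/pull/10027) Allow using the `EMQX_NODE__NAME` environment variable to configure the node name when starting in docker.
+  Before this fix, only `EMQX_NODE_NAME` could be used.
+
+- [#10050](https://github.com/emqx/emqx/pull/10050) Ensure the Bridge API consistently returns the `404` status code for resources that do not exist.
+
+- [#10052](https://github.com/emqx/emqx/pull/10052) Improve the logs for EMQX daemon mode startup failures.
+
+  Before this change, when EMQX failed to start with the `start` command, it was hard for users to understand what went wrong.
+  All they knew was that the node had not started within the expected time, and they were then instructed to boot the node in `console` mode to obtain some logs.
+  However, the node might actually be running, which could cause `console` mode to fail for a different reason.
+
+  After this fix, the startup script issues a diagnosis:
+
+  * If the node cannot be found in `ps -ef`, the user is instructed to look for information in `erlang.log.*`.
+  * If the node is found to be running but does not respond to pings, the user is advised to check whether the node's host name is valid and reachable.
+  * If the node responds to pings but the EMQX application is not running, it is most likely a bug. In this case, the user is advised to report a GitHub issue.
+
+- [#10055](https://github.com/emqx/emqx/pull/10055) Fix updates to the configuration item `mqtt.max_awaiting_rel` not taking effect.
+
+- [#10056](https://github.com/emqx/emqx/pull/10056) Fix the HTTP status codes of the `/bridges` API.
+  - Return `400` instead of `403` when deleting a data bridge that an active rule depends on.
+  - Return `400` instead of `403` when calling operations (start|stop|restart) while the data bridge is not enabled.
+
+- [#10066](https://github.com/emqx/emqx/pull/10066) Improve the error messages of `/bridges_probe` and `[/node/:node]/bridges/:id/:operation` API calls to make them more readable, and set the HTTP status code to `400` instead of `500`.
+
+- [#10074](https://github.com/emqx/emqx/pull/10074) Check whether the type in `PUT /authorization/sources/:type` matches the `type` in the request body.
+
+- [#10079](https://github.com/emqx/emqx/pull/10079) Fix the description of `shared_subscription_strategy`.
+
+
+- [#10085](https://github.com/emqx/emqx/pull/10085) Consistently return `404` if the `source` requested from `/authorization/sources/:source[/*]` does not exist.
+
+- [#10098](https://github.com/emqx/emqx/pull/10098) Fix the crash and the error in the log file that occurred when the MongoDB authorization module queried the database.
+
+- [#10100](https://github.com/emqx/emqx/pull/10100) Fix a possible crash for slow clients that use enhanced authentication.
+  Previously, when a client used enhanced authentication but sent the Auth packet slowly or the Auth packet was lost, the client process would crash.
+
+- [#10107](https://github.com/emqx/emqx/pull/10107) Calls to the bridge API now return `404` instead of `400` if `bridge-id` does not exist.
+  This also fixes a possible crash when such an API call was made at the node level.
+  In addition, starting a bridge now first checks whether the specified bridge is enabled.
+  Affected endpoints:
+    * [cluster] `/bridges/:id/:operation`,
+    * [node] `/nodes/:node/bridges/:id/:operation`,
+  where `operation` is one of `[start|stop|restart]`.
+  Moreover, for node operations, EMQX checks whether the node exists in the cluster and returns `404` instead of `501` if it does not.
+
+- [#10117](https://github.com/emqx/emqx/pull/10117) Fix an error that occurred when a node joining the cluster lacked plugins already installed on other nodes in the cluster.
+  After this fix, the joining node copies all required plugins from the other nodes.
+
+- [#10118](https://github.com/emqx/emqx/pull/10118) Fix problems caused by manually joining a `replicant` node to an EMQX cluster.
+  Previously, after manually performing `join cluster - leave cluster`, a `replicant` node that joined the cluster again could run normally only after the node was restarted.
+
+  [Mria PR](https://github.com/emqx/mria/pull/128)
+
+- [#10119](https://github.com/emqx/emqx/pull/10119) Fix a startup crash when `statsd.server` is set to an empty string.
+
+- [#10124](https://github.com/emqx/emqx/pull/10124) Increase the default heartbeat period for MongoDB to reduce the risk of excessive logging to the MongoDB log file.
+
+- [#10130](https://github.com/emqx/emqx/pull/10130) Fix EMQX nodes started with configuration from environment variables not returning the correct configuration through the HTTP API.
+  For example, after `EMQX_STATSD__SERVER='127.0.0.1:8124' ./bin/emqx start`, the Statsd configuration shown in the Dashboard is garbled.
+  Related PR: [HOCON:234](https://github.com/emqx/hocon/pull/234).
+
+- [#10132](https://github.com/emqx/emqx/pull/10132) Fix some error logs produced because the `systemctl stop emqx` command did not stop the jq and os_mon components properly.
+
+- [#10144](https://github.com/emqx/emqx/pull/10144) Add the `-setcookie` flag to the emqx executable to avoid problems when running commands provided by the emqx cli, such as `emqx ctl`, with a read-only home directory. Fixes [#10142](https://github.com/emqx/emqx/issues/10142).
+
+- [#10154](https://github.com/emqx/emqx/pull/10154) Set the `resume_interval` parameter of data bridges and connectors to the smaller of `health_check_interval` and `request_timeout / 3` to resolve request timeouts.
+
+- [#10157](https://github.com/emqx/emqx/pull/10157) Fix the default rate limit configuration not being applied correctly when creating a new listener.
+
+- [#10237](https://github.com/emqx/emqx/pull/10237) Return the `404` status code when the `/nodes/:node[/metrics|/stats]` API is called for a node that does not exist.
+
+- [#10251](https://github.com/emqx/emqx/pull/10251) Fix the missing warning about rule dependencies when deleting an ingress bridge that is in use.
+
+- [#10313](https://github.com/emqx/emqx/pull/10313) Ensure that when a core or replicant node starts, the `cluster-override.conf` file is copied only from a core node.
+  Previously, when nodes were sorted by startup time, a core node could copy this file from a replicant node.
+
+- [#10314](https://github.com/emqx/emqx/pull/10314) Fix the `/monitor_current` API so that it only looks at the current node. Fix the `/stats` API so that it does not crash when one or more nodes in the cluster are down.
+
+- [#10327](https://github.com/emqx/emqx/pull/10327) Do not increment the 'actions.failed.unknown' rule metrics counter upon receiving unrecoverable errors.
+
+- [#10095](https://github.com/emqx/emqx/pull/10095) Make the MySQL bridge use prepared statements more efficiently in batch mode, reducing the write pressure on the MySQL server, and ensure safer and more careful escaping of SQL statements.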
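Entries #10027 and #10130 both involve environment-variable overrides such as `EMQX_NODE__NAME` and `EMQX_STATSD__SERVER`. The naming convention, in which the `EMQX_` prefix is dropped and a double underscore separates configuration path segments, can be sketched in Python as follows. The helper name is hypothetical, and the sketch covers only the name mapping, not how EMQX parses the values.

```python
def env_to_config_path(var_name: str, prefix: str = "EMQX_") -> str:
    """Map an override variable name to a dotted configuration path.

    Hypothetical helper for illustration only: it strips the prefix,
    splits on double underscores, and lowercases each segment.
    """
    if not var_name.startswith(prefix):
        raise ValueError(f"{var_name!r} does not start with {prefix!r}")
    segments = var_name[len(prefix):].split("__")
    return ".".join(segment.lower() for segment in segments)


print(env_to_config_path("EMQX_NODE__NAME"))      # prints node.name
print(env_to_config_path("EMQX_STATSD__SERVER"))  # prints statsd.server
```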