e5.6.0

Enhancements

  • #12251 Optimize performance of the RocksDB-based persistent session. Reduce RAM usage and frequency of database requests.

    • Introduce dirty session state to avoid frequent mria transactions
    • Introduce an intermediate buffer for the persistent messages
    • Use separate tracks of PacketIds for QoS1 and QoS2 messages
    • Limit the number of continuous ranges of inflight messages to one per stream
  • #12326 Add session registration history.

    Setting config broker.session_history_retain allows EMQX to keep track of expired sessions for the retained period.

    API GET /api/v5/sessions_count?since=1705682238 can be called to count the cluster-wide sessions that have been alive (unexpired) since the provided timestamp (UNIX epoch, with second precision).

    A new gauge cluster_sessions is added to the metrics collection. It is exposed to Prometheus as:

    # TYPE emqx_cluster_sessions_count gauge
    emqx_cluster_sessions_count 1234
    

    NOTE: The counter can only be used for an approximate estimation as the collection and calculations are async.
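
    As a minimal sketch of how a caller might build this request, the Python below computes the since timestamp and assembles the URL. The host and port are hypothetical, and authentication is left out entirely.

```python
import time
from typing import Optional
from urllib.parse import urlencode


def sessions_count_url(base_url: str, lookback_seconds: int,
                       now: Optional[float] = None) -> str:
    """Build the URL that counts sessions alive since `lookback_seconds` ago."""
    current = time.time() if now is None else now
    # The API expects a UNIX epoch timestamp with second precision.
    since = int(current) - lookback_seconds
    return f"{base_url}/api/v5/sessions_count?{urlencode({'since': since})}"


# Hypothetical host; one hour before 1705685838 is the timestamp used above.
print(sessions_count_url("http://localhost:18083", 3600, now=1705685838))
```

    Passing now explicitly keeps the helper deterministic for testing; in normal use it can be omitted so that the current time is used.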

  • #12338 Added time-based message garbage collection to the RocksDB-based persistent session backend.

  • #12398 Exposed the swagger_support option in configuration for Dashboard to disable the swagger API document.

  • #12467 Support cluster discovery using AAAA DNS record type.

  • #12483 Renamed emqx ctl conf cluster_sync tnxid ID to emqx ctl conf cluster_sync inspect ID. The tnxid form is kept for backward compatibility, but it is deprecated and will be removed in 5.7.

  • #12499 Added ability to ban clients by extended rules:

    • by matching the client ID against a regular expression;
    • by matching the client's username against a regular expression;
    • by matching the client's peer address against a CIDR range.

    Warning: a large number of matching rules (rules not tied to a concrete clientid, username or host) will impact performance.
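
    The three rule kinds can be illustrated with a short Python sketch. This is not EMQX's implementation: the rule kind names are made up for the example, Python's re stands in for whatever regular expression engine EMQX uses, and substring-style matching (re.search) is an assumption. It does show why such rules cost more than exact bans: every connection has to be checked against every rule.

```python
import ipaddress
import re


def is_banned(clientid: str, username: str, peer_ip: str, rules) -> bool:
    """Return True if any rule matches; `rules` is a list of (kind, pattern)."""
    for kind, pattern in rules:
        if kind == "clientid_regex" and re.search(pattern, clientid):
            return True
        if kind == "username_regex" and re.search(pattern, username):
            return True
        if kind == "peer_cidr":
            # strict=False tolerates host bits set in the CIDR string.
            network = ipaddress.ip_network(pattern, strict=False)
            if ipaddress.ip_address(peer_ip) in network:
                return True
    return False


# Hypothetical rules, one of each kind.
rules = [
    ("clientid_regex", r"^test-\d+$"),
    ("username_regex", r"^guest"),
    ("peer_cidr", "10.0.0.0/8"),
]
print(is_banned("test-42", "alice", "192.168.1.5", rules))   # clientid matches
print(is_banned("sensor-1", "alice", "10.1.2.3", rules))     # peer address matches
print(is_banned("sensor-1", "alice", "192.168.1.5", rules))  # nothing matches
```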

  • #12509 Implement API to re-order all authenticators / authorization sources.

  • #12517 Configuration files now support multi-line string values with indentation.

    Introduced the """~ and ~""" delimiters to quote indented lines. For example:

    rule_xlu4 {
      sql = """~
        SELECT
          *
        FROM
          "t/#"
      ~"""
    }
    

    See HOCON 0.42.0 release notes for more details.

  • #12520 Implement log throttling. The feature reduces the number of potentially flooding logged events by dropping all but the first event within a configured time window. Throttling is applied to the following log events:

    • authentication_failure
    • authorization_permission_denied
    • cannot_publish_to_topic_due_to_not_authorized
    • cannot_publish_to_topic_due_to_quota_exceeded
    • connection_rejected_due_to_license_limit_reached
    • dropped_msg_due_to_mqueue_is_full
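
    The "all but the first event within a time window" behaviour can be sketched in a few lines of Python. This is only an illustration of the idea under simplifying assumptions (one window per event name, timestamps passed in by the caller); the class and method names are hypothetical and EMQX's actual implementation may differ.

```python
class LogThrottler:
    """Allow the first occurrence of each event per time window, drop the rest."""

    def __init__(self, window_seconds: float):
        self.window = window_seconds
        self._window_start = {}  # event name -> start time of its current window

    def allow(self, event: str, now: float) -> bool:
        start = self._window_start.get(event)
        if start is None or now - start >= self.window:
            # First event seen in a fresh window: log it and open the window.
            self._window_start[event] = now
            return True
        # Still inside the window opened by an earlier event: drop it.
        return False


throttler = LogThrottler(window_seconds=60)
print(throttler.allow("authentication_failure", now=0))    # logged
print(throttler.allow("authentication_failure", now=10))   # dropped
print(throttler.allow("authentication_failure", now=60))   # new window, logged
```

    Each event name is tracked independently, so a flood of one event does not suppress the first occurrence of another.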
  • #12561 Implement HTTP APIs to list a client's in-flight and mqueue messages.

    To get the first chunk of data:

    • GET /clients/{clientid}/mqueue_messages?limit=100
    • GET /clients/{clientid}/inflight_messages?limit=100

    Alternatively:

    • GET /clients/{clientid}/mqueue_messages?limit=100&position=none
    • GET /clients/{clientid}/inflight_messages?limit=100&position=none

    To get the next chunk of data:

    • GET /clients/{clientid}/mqueue_messages?limit=100&position={position}
    • GET /clients/{clientid}/inflight_messages?limit=100&position={position}

    Where {position} is the value (an opaque string token) of the meta.position field from the previous response.

    Mqueue messages are ordered by priority, from higher to lower, and in queue (FIFO) order within the same priority. By default, all messages in Mqueue have the same priority of 0.

    In-flight messages are ordered by the time at which they were inserted into the in-flight storage (from older to newer messages).
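
    The chunked retrieval described above amounts to a simple loop over position tokens. The Python sketch below shows that loop with the HTTP call abstracted behind a caller-supplied fetch_page function (hypothetical). It assumes the messages arrive in a data field and that the last chunk is signalled by an empty list or a missing meta.position; neither detail is specified here, so check the API reference before relying on them.

```python
def iter_messages(fetch_page, limit=100):
    """Yield all messages by following meta.position from chunk to chunk."""
    position = "none"  # equivalent to omitting the parameter on the first call
    while True:
        page = fetch_page(limit=limit, position=position)
        messages = page.get("data", [])
        yield from messages
        position = page.get("meta", {}).get("position")
        # Assumption: an empty chunk or a missing position marks the end.
        if not messages or position is None:
            break


# In-memory stand-in for the HTTP call, for demonstration only. A real
# position token is opaque; this fake just uses a list offset.
def fake_fetch(limit, position):
    items = ["m1", "m2", "m3", "m4", "m5"]
    start = 0 if position == "none" else int(position)
    chunk = items[start:start + limit]
    end = start + len(chunk)
    meta = {"position": str(end)} if end < len(items) else {}
    return {"data": chunk, "meta": meta}


print(list(iter_messages(fake_fetch, limit=2)))
```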

  • #12590 Removed mfa metadata from log messages to improve clarity.

  • #12641 Improve the field order of the text log formatter:

    tag > clientid > msg > peername > username > topic > [other fields]
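
    A rough Python sketch of this ordering rule follows. It assumes the remaining fields keep their original relative order, which is not stated here, and the printed layout is illustrative rather than EMQX's actual log format.

```python
PRIORITY_FIELDS = ["tag", "clientid", "msg", "peername", "username", "topic"]


def order_fields(fields: dict) -> list:
    """Return (key, value) pairs with the priority fields first."""
    head = [(k, fields[k]) for k in PRIORITY_FIELDS if k in fields]
    # Assumption: other fields follow in their original order.
    tail = [(k, v) for k, v in fields.items() if k not in PRIORITY_FIELDS]
    return head + tail


event = {"topic": "t/1", "reason": "timeout", "msg": "publish_failed", "clientid": "c1"}
print(" ".join(f"{k}: {v}" for k, v in order_fields(event)))
```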

  • #12670 Add the shared_subscriptions field to the /monitor_current and /monitor_current/nodes/:node endpoints.

  • #12679 Upgrade docker image base from Debian 11 to Debian 12

  • #12700 Support the "b" and "B" units in bytesize HOCON fields.

    For example, all three fields below will have the value of 1024 bytes:

    bytesize_field = "1024b"
    bytesize_field2 = "1024B"
    bytesize_field3 = 1024
    
  • #12719 Support multiple clientid and username query string parameters in the /clients API, and make it possible to specify which client info fields must be included in the response.

    Examples of queries with multiple clientid/username values:

    • /clients?clientid=client1&clientid=client2
    • /clients?username=user11&username=user2
    • /clients?clientid=client1&clientid=client2&username=user1&username=user2

    Examples of selecting which fields to include in the response:

    • /clients?fields=all (omitting fields query string parameter defaults to returning all fields)
    • /clients?fields=clientid,username
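
    Query strings with repeated parameters like these can be produced with Python's standard library, as in the minimal sketch below. The helper name is hypothetical; only the parameter names clientid, username and fields come from the examples above.

```python
from urllib.parse import urlencode


def clients_query(clientids=(), usernames=(), fields=()):
    """Build a /clients path with repeated clientid/username parameters."""
    params = [("clientid", c) for c in clientids]
    params += [("username", u) for u in usernames]
    if fields:
        # The fields parameter is a single comma-separated value.
        params.append(("fields", ",".join(fields)))
    if not params:
        return "/clients"
    # safe="," keeps the comma in the fields list unescaped.
    return "/clients?" + urlencode(params, safe=",")


print(clients_query(clientids=["client1", "client2"], usernames=["user1", "user2"]))
print(clients_query(fields=["clientid", "username"]))
```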
  • #12330 The Cassandra bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12353 The OpenTSDB bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12376 The Kinesis bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12386 The GreptimeDB bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12423 The RabbitMQ bridge has been split into connector, action and source components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12425 The ClickHouse bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12439 The Oracle bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12449 The TDEngine bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12488 The RocketMQ bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12512 The HStreamDB bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically. However, it is recommended to do the upgrade manually, as new fields have been added to the configuration.

  • #12543 The DynamoDB bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12595 The Kafka Consumer bridge has been split into connector and source components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12619 The Microsoft SQL Server bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12381 Added new SQL functions: map_keys(), map_values(), map_to_entries(), join_to_string(), join_to_sql_values_string(), is_null_var(), is_not_null_var().

    For more information on the functions and their usage, refer to the documentation.

  • #12427 Made it possible to limit the number of Kafka partitions used for Kafka data integration.

  • #12577 Changed the type of service_account_json of both GCP PubSub Producer and Consumer connectors to a string. Now, it's possible to set this field to a JSON-encoded string. Using the previous format (a HOCON map) is still supported but not encouraged.

  • #12581 Add JSON schema to schema registry.

    JSON Schema supports Draft 03, Draft 04 and Draft 06.

  • #12602 Enhanced health checking for IoTDB connector, using its ping API instead of just checking for an existing socket connection.

  • #12336 Isolate channels cleanup from other async tasks (like routes cleanup) by using a dedicated pool, as this task can be quite slow under high network latency conditions.

  • #12494 Improve MongoDB connector performance.

  • #12746 Add username log field.

    If an MQTT client is connected with a non-empty username, the logs and traces will include the username field.

Bug Fixes

  • #11868 Fix a bug where the will message was not published after session takeover.

  • #12347 Always render valid messages for egress MQTT data bridge from the data fetched by Rule SQL, even if the data is incomplete and placeholders used in the bridge configuration are missing. Previously, some messages were rendered as invalid and were discarded by the MQTT egress data bridge.

    Render undefined variables as empty strings in payload and topic templates of the MQTT egress data bridge. Previously, undefined variables were rendered as the literal string undefined.

  • #12472 Fixed an issue that could cause some read operations on /api/v5/actions/ and /api/v5/sources/ to return 500 while rolling upgrades are underway.

  • #12492 Return Receive-Maximum in CONNACK for MQTT v5 clients.

    EMQX takes the min value of client's Receive-Maximum and server's max_inflight config as the max number of inflight (unacknowledged) messages allowed. Prior to this fix, the value was not sent back to the client in CONNACK message.
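
    The calculation described above is a plain minimum, shown here as a small Python sketch. The function name is hypothetical and the value 32 for max_inflight is only an example. The fallback of 65535 is the MQTT v5 default that applies when a client omits the Receive-Maximum property.

```python
from typing import Optional

# MQTT v5 default when the client does not send Receive-Maximum.
MQTT5_DEFAULT_RECEIVE_MAXIMUM = 65535


def effective_max_inflight(client_receive_maximum: Optional[int],
                           server_max_inflight: int) -> int:
    """Return the limit on unacknowledged messages for one client."""
    if client_receive_maximum is None:
        client_receive_maximum = MQTT5_DEFAULT_RECEIVE_MAXIMUM
    return min(client_receive_maximum, server_max_inflight)


print(effective_max_inflight(10, 32))    # client limit is lower
print(effective_max_inflight(100, 32))   # server config is lower
print(effective_max_inflight(None, 32))  # property absent, server config applies
```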

  • #12500 Disconnected persistent sessions are now returned by the GET /clients and GET /clients/:clientid HTTP APIs.

    Known issue: the total count returned by this API may overestimate the total number of clients.

  • #12505 Upgrade the Kafka producer client wolff from version 1.10.1 to 1.10.2.

    The new client version keeps a long-lived metadata connection for each connector. As a result, EMQX establishes fewer new connections for action and connector health checks.

  • #12513 Change level of several flooding log events from warning to info.

  • #12530 Enhanced frame_too_large and malformed CONNECT packet parse failures to include more information to help troubleshooting.

  • #12541 Added a config validation to check if node.name is compatible with cluster.discovery_strategy.

    For the dns strategy with the a or aaaa record type, all nodes must use a (static) IP address as the host name.

  • #12562 Add a new configuration root: durable_storage.

    This configuration tree contains the settings related to the new persistent session feature.

  • #12566 Enhanced the bootstrap file for REST API keys:

    • Empty lines are now skipped instead of causing an error.

    • Keys from the bootstrap file now have the highest priority: if one of them conflicts with an old key, the old key is deleted.
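
    The two rules can be sketched in Python as follows. The one-entry-per-line name:secret layout is an assumption made for the example, as is treating a "conflict" as two keys sharing a name; consult the EMQX documentation for the actual file format.

```python
def load_bootstrap_keys(text: str, existing: dict) -> dict:
    """Merge bootstrap entries into `existing`; bootstrap entries win."""
    merged = dict(existing)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # empty lines are skipped rather than rejected
        name, sep, secret = line.partition(":")
        if not sep:
            raise ValueError(f"malformed bootstrap line: {line!r}")
        # A conflicting old key is replaced by the bootstrap entry.
        merged[name] = secret
    return merged


# Hypothetical file content with an empty line in the middle.
bootstrap = "app1:new-secret\n\napp2:secret2\n"
print(load_bootstrap_keys(bootstrap, {"app1": "old-secret", "app3": "secret3"}))
```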

  • #12646 Fix rule engine date time string parser.

    Prior to this fix, the time zone shift only worked when the date time string had second-level precision.

  • #12652 The subbits functions with 4 and 5 parameters were documented but did not exist in the implementation. These functions have now been added.

  • #12663 Fixed an issue where the emqx_vm_cpu_use and emqx_vm_cpu_idle metrics in the Prometheus endpoint /prometheus/stats always reported the average usage since operating system boot.

  • #12668 Refactor the SQL function: date_to_unix_ts() by using calendar:datetime_to_gregorian_seconds/1. This change also added validation for the input date format.

  • #12672 Load {data_dir}/configs/cluster.hocon when generating node boot config.

    Logging-related config changes made from the dashboard are persisted in {data_dir}/configs/cluster.hocon. Prior to this change, only etc/emqx.conf was used to generate the boot config (including the logger part), and {data_dir}/configs/cluster.hocon was loaded to reconfigure the logger after boot was complete.

    This late reconfiguration could cause some log segment files to be lost.

    Now {data_dir}/configs/cluster.hocon and etc/emqx.conf are both loaded (emqx.conf overlaying on top) to generate boot config.
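
    The overlay can be pictured as a recursive merge in which values from emqx.conf win over values from cluster.hocon. The Python sketch below uses plain dictionaries in place of parsed HOCON, and the config keys in the example are illustrative only.

```python
def deep_merge(base: dict, overlay: dict) -> dict:
    """Return `base` with `overlay` applied on top, merging nested maps."""
    result = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = value  # the overlay value replaces the base value
    return result


# Stand-ins for the parsed files; the keys are illustrative.
cluster_hocon = {"log": {"console": {"level": "debug"}, "file": {"enable": True}}}
emqx_conf = {"log": {"console": {"level": "warning"}}}

boot_config = deep_merge(cluster_hocon, emqx_conf)
print(boot_config)
```

    Settings that appear only in cluster.hocon survive into the boot config, while any key also set in emqx.conf takes the emqx.conf value.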

  • #12696 Fixed an issue where attempting to reconnect an action or source could lead to the wrong error message being returned in the HTTP API.

  • #12714 Fixed some field errors in the Prometheus API /prometheus/stats.

    Related metrics names:

    • emqx_cluster_sessions_count
    • emqx_cluster_sessions_max
    • emqx_cluster_nodes_running
    • emqx_cluster_nodes_stopped
    • emqx_subscriptions_shared_count
    • emqx_subscriptions_shared_max

    Also fixed an issue in the /stats endpoint where the values of the subscriptions.shared.count and subscriptions.shared.max fields were not updated in time when a client disconnected or unsubscribed from a shared subscription.

  • #12390 Fixed an issue where a /license API request could crash while the node was joining a cluster.

  • #12411 Fixed a bug where null values would be inserted as 1853189228 in int columns in Cassandra data integration.
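
    The number is less arbitrary than it looks: it equals the four ASCII bytes of the text "null" read as a big-endian 32-bit integer. The changelog does not state the cause, so treat the check below as an observation about the value rather than a description of the bug.

```python
# The four ASCII bytes of "null", read as a big-endian 32-bit integer.
value = int.from_bytes(b"null", byteorder="big")
print(value)

# And the reverse: the reported number decodes back to the text "null".
print((1853189228).to_bytes(4, byteorder="big"))
```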

  • #12522 Improved parsing for Kafka bootstrap hosts.

    Previously, spaces following commas in the Kafka bootstrap hosts list were included in the parsing result. This inclusion led to connection timeouts or DNS resolution failures due to the malformed host entries.
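
    A minimal Python sketch of whitespace-tolerant parsing is shown below. It is not the EMQX implementation; it assumes host:port entries, falls back to Kafka's conventional port 9092 when no port is given, and does not handle IPv6 literals.

```python
def parse_bootstrap_hosts(hosts: str, default_port: int = 9092):
    """Split a comma-separated host list, ignoring spaces around entries."""
    result = []
    for entry in hosts.split(","):
        entry = entry.strip()  # drop the spaces that used to leak into host names
        if not entry:
            continue
        host, sep, port = entry.rpartition(":")
        if sep:
            result.append((host, int(port)))
        else:
            result.append((entry, default_port))
    return result


print(parse_bootstrap_hosts("kafka-1:9092, kafka-2:9093,kafka-3"))
```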

  • #12656 Added a topic check when creating a GCP PubSub Producer action, so it now fails when the topic does not exist or the provided credentials do not have enough permissions to use it.

  • #12678 The DynamoDB connector now explicitly reports the error reason upon connection failure. This update addresses the previous limitation where connection failures did not result in any explanation.

  • #12681 When sending messages to a RocketMQ bridge/action while debug level logging was activated, secrets could be emitted in debug level log messages. This has now been fixed.

  • #12715 Fixed an issue where a configuration update could crash if the connector for the corresponding ingress data integration source had active channels.

  • #12740 Fixed an issue where a durable session could not be kicked out.

Breaking Changes

  • #12576 Starting from 5.6, the "Configuration Manual" document will no longer include the bridges config root.

    A bridge is now either action + connector for egress data integration, or source + connector for ingress data integration. Please note that the bridges config (in cluster.hocon) and the REST API path api/v5/bridges still work but are considered deprecated.

  • #12634 Triple-quote string values in HOCON config files no longer support escape sequences.

    The detailed information can be found in this pull request. Here is a summary for the impact on EMQX users:

    • EMQX 5.6 is the first version to generate triple-quote strings in cluster.hocon, meaning for generated configs, there is no compatibility issue.
    • For hand-crafted configs (such as emqx.conf), a thorough review is needed to check whether escape sequences (such as \n, \r, \t and \\) are used. If so, such strings should be changed to regular quotes (one pair of ") instead of triple quotes.