Rancher Release v2.9.0

Important: Review the Install/Upgrade Notes before upgrading to any Rancher version.

Rancher v2.9.0 is the latest minor release of Rancher. This is a Community version release that introduces new features, enhancements, and various updates.

Highlights

Rancher General

Features and Enhancements

  • Rancher now supports Kubernetes v1.29 and v1.30. See #43110 for information on Rancher support for Kubernetes v1.29, and see #45089 for information on Rancher support for Kubernetes v1.30. Additionally, the upstream Kubernetes changelogs for v1.29 and v1.30 can be viewed for a full list of changes.

Behavior Changes

  • Kubernetes v1.25 and v1.26 are no longer supported. Before you upgrade to Rancher v2.9.0, make sure that all clusters are running Kubernetes v1.27 or later. See #45882.
  • The external-rules feature flag functionality is removed in Rancher v2.9.0 as the behavior is enabled by default. The feature flag is still present when upgrading from v2.8.5; however, enabling or disabling the feature won’t have any effect. For more information, see CVE-2023-32196 and #45863.
  • Rancher now validates Container Default Resource Limit on Projects. Validation mimics the upstream behavior of the Kubernetes API server when it validates LimitRanges. The container default resource configuration must have properly formatted quantities for all requests and limits. Limits for any resource must not be less than requests. See #39700.
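
The validation rules for the container default resource configuration can be sketched as follows. This is a minimal illustration, not Rancher's or the Kubernetes API server's implementation: the function names are hypothetical, and the parser handles only a subset of the Kubernetes quantity syntax.

```python
import re
from decimal import Decimal

# Subset of Kubernetes quantity suffixes. The real grammar also allows
# exponent forms such as "1e3", which this sketch does not handle.
_SUFFIXES = {
    "": Decimal(1),
    "m": Decimal("0.001"),
    "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9,
    "T": Decimal(10) ** 12,
    "Ki": Decimal(2) ** 10,
    "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30,
    "Ti": Decimal(2) ** 40,
}
_QUANTITY = re.compile(r"(\d+(?:\.\d+)?)(m|k|M|G|T|Ki|Mi|Gi|Ti)?")


def parse_quantity(text):
    """Convert a quantity string such as '500m' or '1Gi' to a Decimal."""
    match = _QUANTITY.fullmatch(text)
    if match is None:
        raise ValueError(f"malformed quantity: {text!r}")
    number, suffix = match.groups()
    return Decimal(number) * _SUFFIXES[suffix or ""]


def validate_container_defaults(requests, limits):
    """Return a list of problems; an empty list means the config is valid."""
    problems = []
    parsed = {}
    # Rule 1: every request and limit must be a well-formed quantity.
    for kind, values in (("request", requests), ("limit", limits)):
        for resource, text in values.items():
            try:
                parsed[(kind, resource)] = parse_quantity(text)
            except ValueError as err:
                problems.append(f"{resource} {kind}: {err}")
    # Rule 2: for each resource, the limit must not be less than the request.
    for resource in requests.keys() & limits.keys():
        req = parsed.get(("request", resource))
        lim = parsed.get(("limit", resource))
        if req is not None and lim is not None and lim < req:
            problems.append(
                f"{resource}: limit {limits[resource]} is less than "
                f"request {requests[resource]}"
            )
    return problems
```

For example, a CPU request of 1 with a CPU limit of 500m is reported as a problem, because 500m (half a core) is less than the request.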

Known Issues

  • Rancher v2.9.0 does not currently support vSphere CSI charts in RKE2 v1.30.2 instances that run Kubernetes v1.30. A fix in RKE2 is in progress and being tracked by #6334. Once the fix is completed, support for vSphere CSI charts while using RKE2 in Rancher v2.9.0 will be added via a KDM release. See #46132.

Rancher App (Global UI)

Features and Enhancements

  • A deprecation toggle has been added to the UI Apps > Charts view. When toggled on, it shows all deprecated charts, each labeled with a Deprecated badge. See #11042.
  • The Cluster Dashboard view has replaced the Install Monitoring link with a Cluster Tools link. Clicking Cluster Tools takes you to the UI view to install cluster tools. The Add Cluster Badge link in the Cluster Dashboard has also been changed to an icon button with a tooltip to customize appearance. See #10034 and #9228.
  • Improved Rancher UI navigation menu performance for environments with many CRDs, namespaces, or resource churn. See #7773.
  • Improved UI performance by supporting a new API feature to independently fetch schemas related to the primary resource. See #7716.

Major Bug Fixes

  • Fixed a bug where a duplicate row was shown after a user created a resource and was returned to the list view for that resource. See #11151.
  • Fixed an issue where global permissions in the UI were displayed in random order. See #11013.

Known Issues

  • The Azure in-tree cloud provider has been removed for Kubernetes versions v1.30 and later. The Rancher UI has a known issue with not properly updating the cloud provider dropdown options for RKE1 and RKE2 clusters running Kubernetes v1.30 and later. As a workaround, you can modify your configuration with the Edit Yaml option located in the dropdown attached to your respective cluster in the Cluster Management view. See #11363.

Security

Features and Enhancements

  • Added a new setting, agent-tls-mode, which allows users to specify if agents will use strict certificate verification when connecting to Rancher. The field can be set to strict (which requires the agent to verify the certificate using only the Certificate Authority in the cacerts setting) or system-store (which allows the agent to verify the certificate using any Certificate Authority in the operating system’s trust store). This setting will default to strict on new installs of v2.9.0+. When upgrading from a prior version, the current value will be kept. For example, when upgrading from 2.8 (where the setting defaults to system-store), the value will still be system-store after the upgrade to 2.9. See #45628.
    Important: Agents running on Windows nodes do not yet respect the agent-tls-mode setting, but will continue to function as expected. See #46396.
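
The difference between the two agent-tls-mode values can be illustrated with Python's standard ssl module. This is only a sketch of the trust-store distinction described above, not how Rancher agents are implemented. build_agent_tls_context and its parameters are hypothetical names, and cacerts_pem stands in for the PEM text held in the cacerts setting.

```python
import ssl


def build_agent_tls_context(agent_tls_mode, cacerts_pem=None):
    """Build a TLS client context that mirrors the two agent-tls-mode values.

    Hypothetical helper for illustration only.
    """
    if agent_tls_mode == "strict":
        if not cacerts_pem:
            raise ValueError("strict mode needs the CA from the cacerts setting")
        # Start from an empty trust store and add only the configured CA.
        context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
        context.load_verify_locations(cadata=cacerts_pem)
        return context
    if agent_tls_mode == "system-store":
        # Trust the Certificate Authorities from the operating system's store.
        return ssl.create_default_context()
    raise ValueError(f"unknown agent-tls-mode: {agent_tls_mode!r}")
```

In the strict branch, a server certificate signed by a public CA that is not in cacerts_pem would fail verification. That is the situation the Breaking Changes section warns about.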

Breaking Changes

  • When agent-tls-mode is set to strict, users must provide the certificate authority to Rancher or downstream clusters will disconnect from Rancher, and require manual intervention to fix. This applies to several setup types, including:
    • Let’s Encrypt - When set to strict, users must upload the Let’s Encrypt Certificate Authority and provide privateCA=true when installing the chart.
    • Bring Your Own Cert - When set to strict, users must upload the Certificate Authority used to generate the cert and provide privateCA=true when installing the chart.
    • Proxy/External - When set to strict, users must upload the Certificate Authority used by the proxy and provide privateCA=true when installing the chart.

RKE Provisioning

Important: With the release of Rancher Kubernetes Engine (RKE) v1.6.0, we are informing customers that RKE is now deprecated. RKE will be maintained for two more versions, following our deprecation policy.

Please note, End-of-Life (EOL) for RKE is July 31st, 2025. Prime customers must re-platform from RKE to RKE2 or k3s.

RKE2 and k3s provide stronger security, and move away from upstream-deprecated Docker machine. Learn more about re-platforming here.

Features and Enhancements

  • Rancher now supports Docker v26.0 and v26.1 for RKE provisioning. See #45013 for information on v26.0 support and #45439 for information on v26.1 support.

Major Bug Fixes

  • Fixed an issue where scaling up etcd nodes in RKE would lead to nodes being stuck waiting to register with Kubernetes. This caused clusters to hang. See #43356.

Behavior Changes

  • Rancher has added support for external Azure cloud providers in downstream RKE clusters. Note that migration to an external Azure cloud provider is required when running Kubernetes v1.30 and recommended when running Kubernetes v1.29. See #44857.
  • Weave CNI support for RKE clusters is removed in response to Weave CNI not being supported by upstream Kubernetes v1.30 and later. See #45954.

Known Issues

  • The Weave CNI plugin for RKE v1.27 and later is now deprecated, due to the plugin being deprecated for upstream Kubernetes v1.27 and later. Creating an RKE cluster that uses Weave will not proceed, because it raises a validation warning. See #11322.

RKE2 Provisioning

Features and Enhancements

  • Rancher now provides support for configuring the data directory for RKE2 clusters through the UI Cluster Configuration > Advanced tab. Note that values can only be set during cluster creation. See #45038 and #10824. For known limitations of this feature, see the Known Issues for RKE2 Provisioning.
  • Added a new annotation, rke.cattle.io/delete-missing-custom-machines-after, to enable machine deletion on corresponding Node deletion. Users can set the annotation to the clusters.provisioning.cattle.io/v1 or rkecluster.rke.cattle.io objects. Once set, it causes missing machines to be deleted after a specified duration (example: rke.cattle.io/delete-missing-custom-machines-after: '5s'). See #43686.

Major Bug Fixes

  • Fixed an issue where scaling down etcd node pools on RKE2 machine-provisioned clusters caused unexpected behavior and clusters to hang in various states. See #43097 and #42582.

Behavior Changes

  • Rancher has added support for external Azure cloud providers in downstream RKE2 clusters. Note that migration to an external Azure cloud provider is required when running Kubernetes v1.30 and recommended when running Kubernetes v1.29. See #44856.
  • Added a new annotation, provisioning.cattle.io/allow-dynamic-schema-drop. When set to true, it drops the dynamicSchemaSpec field from machine pool definitions. This prevents cluster nodes from re-provisioning unintentionally when the cluster object is updated from an external source such as Terraform or Fleet. See #44618.

Known Issues

  • The data directory feature currently has the following known issues:
    • Provisioning an RKE2 cluster with the feature enabled causes snapshot restores to fail. See #46066.
    • For custom RKE2 clusters, the environment variable CATTLE_AGENT_VAR_DIR must be set in the system agent install command, because the default variable is missing. As a workaround, manually specify CATTLE_AGENT_VAR_DIR in the registration command, per this comment. Node driver provisioned clusters can currently make use of the data directory. See #46362.
    • K3s does not support the data directory feature. See #10589.
    • The system-agent-upgrader SUC plan is not applied correctly after cluster provisioning, so the specified system-agent data directory is not used. See #46361.
    • Currently, selecting Use the same path for System-agent, Provisioning and K8s Distro data directories configuration results in Rancher using the same data directory for the system agent, provisioning, and distribution components, instead of appending the specified component names to the root directory. To mitigate this issue, configure the three paths separately. The paths must follow these guidelines:
      - Be absolute (start with /)
      - Be clean (not contain env vars, shell expressions, ., or ..)
      - Not be set to the same value
      - Not be nested one within another
      See #11566.
  • When adding the provisioning.cattle.io/allow-dynamic-schema-drop annotation through the cluster config UI, the annotation disappears before adding the value field. When viewing the YAML, the respective value field is not updated and is displayed as an empty string. As a workaround, when creating the cluster, set the annotation by using the Edit Yaml option located in the dropdown attached to your respective cluster in the Cluster Management view. See #11435.
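
The path guidelines for the three data directories listed above can be checked before creating a cluster. The sketch below is one possible reading of those guidelines, not a Rancher tool. validate_data_dirs, the component names, and the example paths are hypothetical, and the env var and shell expression check is a crude heuristic.

```python
from pathlib import PurePosixPath


def validate_data_dirs(paths):
    """Check data directory paths against the guidelines.

    paths maps a component name to its directory string.
    Returns a list of problems; an empty list means all checks passed.
    """
    problems = []
    for name, raw in paths.items():
        if not raw.startswith("/"):
            problems.append(f"{name}: {raw!r} is not an absolute path")
        # Heuristic: "$" and backticks suggest env vars or shell expressions.
        if "$" in raw or "`" in raw:
            problems.append(f"{name}: {raw!r} contains an env var or shell expression")
        if any(part in (".", "..") for part in raw.split("/")):
            problems.append(f"{name}: {raw!r} contains a '.' or '..' segment")

    # Compare every pair once for equality and nesting.
    names = list(paths)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            a = PurePosixPath(paths[first])
            b = PurePosixPath(paths[second])
            if a == b:
                problems.append(f"{first} and {second} use the same path")
            elif a in b.parents or b in a.parents:
                problems.append(f"{first} and {second} are nested")
    return problems
```

For instance, sibling directories such as /data/agent, /data/provisioning, and /data/rke2 pass every check, while /data together with /data/rke2 is reported as nested.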

Rancher CLI

Features and Enhancements

  • Added support for authentication with an Azure AD provider with kubectl through the Rancher CLI. Note that you must allow public client flows to log in from the Rancher CLI. See #42939.

Known Issues

  • The Rancher CLI currently lists the Azure authentication provider options out of order. See #46128.

Authentication

Features and Enhancements

  • Extended Rancher’s Azure AD (aka Microsoft Entra ID) authentication provider with a filter field that applies to the list of groups attached to a user. When you enable the setting, a user’s groups are filtered according to the provided clauses. See #42940.
    Caution: This filter does not affect the general group listing, only the list of groups for a user. Additionally, if you filter out a group, all permissions the group would have given will not apply. As the filter prevents Rancher from seeing that the user belongs to the group, it also does not see any permissions from that group. This means that filtering can have the side effect of denying users permissions they should have.
  • Added authentication support for generic OpenID Connect providers, such as Keycloak and Okta. See #10053.

Major Bug Fixes

  • Upon login using a SAML provider, the lastLogin user attribute is now set at login time. See #46124.
  • An issue was fixed where Rancher was not updating additional groups a user had been added to in their SAML provider. See #45956.

Known Issues

  • There are some known issues with the OpenID Connect provider support:
    • When the generic OIDC auth provider is enabled, and you attempt to add auth provider users to a cluster or project, users are not populated in the dropdown search bar. This is expected behavior as the OIDC auth provider alone is not searchable. See #46104.
    • When the generic OIDC auth provider is enabled, auth provider users that are added to a cluster/project by their username are not able to access resources upon logging in. A user will only have access to resources upon login if the user is added by their userID. See #46105.
    • When the generic OIDC auth provider is enabled and an auth provider user in a nested group is logged into Rancher, the user will see the following error when they attempt to create a Project: projectroletemplatebindings.management.cattle.io is forbidden: User "u-gcxatwsnku" cannot create resource "projectroletemplatebindings" in API group "management.cattle.io" in the namespace "p-9t5pg". However, the project is still created. See #46106.

Pod Security Admissions (PSA)

Major Bug Fixes

  • Rancher now updates the value of the PSACT rancher-restricted to include cattle-provisioning-capi-system and cattle-fleet-local-system under the exemptions.namespaces list. See #43150.

Extensions

Features and Enhancements

  • Rancher now enables UI extensions to load without the need for authentication. To do so, set the extension chart’s annotation noAuth to true. For more information, see Extensions configuration | Rancher UI Extensions. See #43090.
  • A new feature flag uiextensions has been added for enabling and disabling the UI extension feature (this replaces the need to install the ui-plugin-operator). The first time it’s set to true (the default value is true) it will create the CRD and enable the controllers and endpoints necessary for the feature to work. If set to false, it won’t create the CRD if it doesn’t already exist, but it won’t delete it if it does. It will also disable the controllers and endpoints used by the feature. Enabling or disabling the feature flag will cause Rancher to restart. See #44230 and #43089.

Behavior Changes

  • UI extension owners must update and publish a new version of their extensions to be compatible with Rancher v2.9.0 and later. For more information see the Rancher v2.9 extension support page.

Role-Based Access Control (RBAC)

Features and Enhancements

  • Users can now provide rules that apply only in specific namespaces using the new field namespacedRules, which has been added to the GlobalRole type. View this example configuration and context and see #42215 for general information.

  • Added the new field, InheritedFleetWorkspacePermissions, to the GlobalRole type. The field allows users to deploy resources using Fleet in downstream clusters. The Restricted Admin user, which was once needed to deploy resources, is now deprecated and can be replaced by using InheritedFleetWorkspacePermissions. Note that Admin or Restricted Admin users can create GlobalRoles with InheritedFleetWorkspacePermissions that allow users to deploy Fleet resources in all workspaces except fleet-local.
    Each fleetworkspace has a backing namespace in the local cluster. For a user to be able to deploy resources using Fleet in a downstream cluster they need:

    • Permissions to deploy Fleet resources in the backing namespace of the fleetworkspace. For example, permission to create GitRepos in fleet-default.
    • Permissions to get fleetworkspace cluster-wide resources in the local cluster.

    Two new fields were added to GlobalRole in order to provide access to all Fleet workspaces except fleet-local:

    • InheritedFleetWorkspacePermissions.ResourceRules: rules granted in all backing namespaces for all Fleet workspaces excluding fleet-local.
    • InheritedFleetWorkspacePermissions.WorkspaceVerbs: verbs used to grant permissions to the cluster-wide fleetworkspace resources. ResourceNames for this rule will contain all Fleet workspace names except fleet-local.

    View this example configuration and context and see #42170 for general information.

vSphere Charts

Major Bug Fixes

  • Fixed support for using snapshots to back up vSphere CSI volumes by adding a missing csi-snapshotter field used by images in the Rancher-provided chart configuration. See #41321.

VM Management (Harvester)

Major Bug Fixes

  • Fixed a bug where provisioning a Harvester RKE2 cluster with two vGPUs added caused the provisioning process to get stuck in a starting state and the VMs to not start. See #10947.

Known Issues

  • In the Rancher UI, when navigating between Harvester clusters of different versions, a refresh may be required to view version-specific functionality. See #11559.

Install/Upgrade Notes

Upgrade Requirements

  • Creating backups: Create a backup before you upgrade Rancher. To roll back Rancher after an upgrade, you must first back up and restore Rancher to the previous Rancher version. Because Rancher will be restored to the same state as when the backup was created, any changes post-upgrade will not be included after the restore.
  • CNI requirements:
    • For Kubernetes v1.19 and later, disable firewalld as it’s incompatible with various CNI plugins. See #28840.
    • When upgrading or installing a Linux distribution that uses nf_tables as the backend packet filter, such as SLES 15, RHEL 8, Ubuntu 20.10, Debian 10, or later, upgrade to RKE v1.19.2 or later to get Flannel v0.13.0. Flannel v0.13.0 supports nf_tables. See Flannel #1317.
  • Requirements for air gapped environments:
    • When using a proxy in front of an air-gapped Rancher instance, you must pass additional parameters to NO_PROXY. See the documentation and issue #2725.
    • When installing Rancher with Docker in an air-gapped environment, you must supply a custom registries.yaml file to the docker run command, as shown in the K3s documentation. If the registry has certificates, then you’ll also need to supply those. See #28969.
  • Requirements for general Docker installs:
    • When starting the Rancher Docker container, you must use the privileged flag. See documentation.
    • When upgrading a Docker installation, a panic may occur in the container, which causes it to restart. After restarting, the container will come up and work as expected. See #33685.

Versions

Please refer to the README for the latest and stable Rancher versions.

Please review our version documentation for more details on versioning and tagging conventions.

Images

  • rancher/rancher:v2.9.0

Tools

Kubernetes Versions for RKE

  • v1.30.2 (Default)
  • v1.29.6
  • v1.28.11
  • v1.27.15

Kubernetes Versions for RKE2/K3s

  • v1.30.2 (Default)
  • v1.29.6
  • v1.28.11
  • v1.27.15

Rancher Helm Chart Versions

In Rancher v2.6.0 and later, in the Apps & Marketplace UI, many Rancher Helm charts are named with a major version that starts with 100. This avoids simultaneous upstream changes and Rancher changes from causing conflicting version increments. This also complies with semantic versioning (SemVer), which is a requirement for Helm. You can see the upstream version number of a chart in the build metadata, for example: 100.0.0+up2.1.0. See #32294.
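
As an illustration of this versioning scheme, the upstream version can be read back out of the build metadata. The sketch below assumes the "up" prefix seen in the example 100.0.0+up2.1.0; split_chart_version is a hypothetical helper, not part of Rancher or Helm.

```python
def split_chart_version(version):
    """Split a chart version into (rancher_version, upstream_version).

    The upstream version is None when no '+up...' build metadata is present.
    """
    rancher_version, separator, build = version.partition("+")
    if separator and build.startswith("up"):
        return rancher_version, build[len("up"):]
    return rancher_version, None


print(split_chart_version("100.0.0+up2.1.0"))  # ('100.0.0', '2.1.0')
```

Because SemVer ignores build metadata when ordering versions, the upstream part is informational only; precedence comes from the portion before the + sign.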

Other Notes

Experimental Features

Rancher now supports using an OCI Helm chart registry for Apps & Marketplace. See the documentation on using OCI-based Helm chart repositories, and note that this feature is experimental. See #29105 and #45062.

Deprecated Upstream Projects

In June 2023, Microsoft deprecated the Azure AD Graph API that Rancher had been using for authentication via Azure AD. When updating Rancher, update the configuration to make sure that users can still use Rancher with Azure AD. See the documentation and issue #29306 for details.

Removed Legacy Features

Apps functionality in the cluster manager has been deprecated as of the Rancher v2.7 line. This functionality has been replaced by the Apps & Marketplace section of the Rancher UI.

Also, rancher-external-dns and rancher-global-dns have been deprecated as of the Rancher v2.7 line.

The following legacy features have been removed as of Rancher v2.7.0. The deprecation and removal of these features was announced in previous releases. See #6864.

UI and Backend

  • CIS Scans v1 (Cluster)
  • Pipelines (Project)
  • Istio v1 (Project)
  • Logging v1 (Project)
  • RancherD

UI

  • Multiclusterapps (Global): Apps within the Multicluster Apps section of the Rancher UI.

Previous Rancher Behavior Changes

Previous Rancher Behavior Changes - Rancher General

  • Rancher v2.8.4:
    • The controller now cleans up instances of ClusterUserAttribute that have no corresponding UserAttribute. See #44985.
  • Rancher v2.8.3:
    • When Rancher starts, it now identifies all deprecated and unrecognized setting resources and adds a cattle.io/unknown label. You can list these settings with the command kubectl get settings -l 'cattle.io/unknown==true'. In Rancher v2.9 and later, these settings will be removed instead. See #43992.
  • Rancher v2.8.0:
    • Rancher Compose is no longer supported, and all parts of it are being removed in the v2.8 release line. See #43341.
    • Kubernetes v1.23 and v1.24 are no longer supported. Before you upgrade to Rancher v2.8.0, make sure that all clusters are running Kubernetes v1.25 or later. See #42828.

Previous Rancher Behavior Changes - Cluster Provisioning

  • Rancher v2.8.4:
    • Docker CLI 20.x is at end-of-life and no longer supported in Rancher. Please update your local Docker CLI versions to 23.0.x or later. Earlier versions may not recognize OCI compliant Rancher image manifests. See #45424.
  • Rancher v2.8.0:
    • Kontainer Engine v1 (KEv1) provisioning and the respective cluster drivers are now deprecated. KEv1 provided plug-ins for different targets using cluster drivers. The Rancher-maintained cluster drivers for EKS, GKE and AKS have been replaced by the hosted provider drivers, EKS-Operator, GKE-Operator and AKS-Operator. Node drivers are now available for self-managed Kubernetes.
  • Rancher v2.7.2:
    • When you provision a downstream cluster, the cluster’s name must conform to RFC-1123. Previously, characters that did not follow the specification, such as ., were permitted and would result in clusters being provisioned without the necessary Fleet components. See #39248.
    • Privilege escalation is disabled by default when creating deployments from the Rancher API. See #7165.
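
The cluster naming requirement introduced in Rancher v2.7.2 can be checked up front. The sketch below assumes the RFC 1123 label form as Kubernetes applies it to object names (lowercase alphanumerics and hyphens, starting and ending with an alphanumeric, at most 63 characters). The release note itself only says RFC-1123, so treat the exact pattern as an assumption; is_rfc1123_label is a hypothetical helper.

```python
import re

# RFC 1123 label as used for Kubernetes names: 1 to 63 characters,
# lowercase alphanumerics or '-', alphanumeric at both ends.
_RFC1123_LABEL = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?")


def is_rfc1123_label(name):
    """Return True if name is a valid RFC 1123 label."""
    return _RFC1123_LABEL.fullmatch(name) is not None


print(is_rfc1123_label("my-cluster"))  # True
print(is_rfc1123_label("my.cluster"))  # False
```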

Previous Rancher Behavior Changes - RKE Provisioning

  • Rancher v2.8.0:
    • Rancher no longer supports the Amazon Web Services (AWS) in-tree cloud provider for RKE clusters. This is in response to upstream Kubernetes removing the in-tree AWS provider in Kubernetes v1.27. You should instead use the out-of-tree AWS cloud provider for any Rancher-managed clusters running Kubernetes v1.27 or later. See #43175.
    • The Weave CNI plugin for RKE v1.27 and later is now deprecated. Weave will be removed in RKE v1.30. See #42730.

Previous Rancher Behavior Changes - RKE2 Provisioning

  • Rancher v2.8.0:
    • Rancher no longer supports the Amazon Web Services (AWS) in-tree cloud provider for RKE2 clusters. This is in response to upstream Kubernetes removing the in-tree AWS provider in Kubernetes v1.27. You should instead use the out-of-tree AWS cloud provider for any Rancher-managed clusters running Kubernetes v1.27 or later. See #42749.
    • Similar to Rancher v2.7.9, when you upgrade to Rancher v2.8.0 with provisioned RKE2/K3s clusters in an unhealthy state, you may encounter the error message, implausible joined server for entry. This requires manually marking the nodes in the cluster with a joined server. See #42856.

Previous Rancher Behavior Changes - Cluster API

  • Rancher v2.7.7:
    • The cluster-api core provider controllers run in a pod in the cattle-provisioning-cattle-system namespace, within the local cluster. These controllers are installed with a Helm chart. Previously, Rancher ran cluster-api controllers in an embedded fashion. This change makes it easier to maintain cluster-api versioning. See #41094.
    • The token hashing algorithm generates new tokens using SHA3. Existing tokens that don’t use SHA3 won’t be re-hashed. This change affects ClusterAuthTokens (the downstream synced version of tokens for ACE) and Tokens (only when token hashing is enabled). SHA3 tokens should work with ACE and Token Hashing. Tokens that don’t use SHA3 may not work when ACE and token hashing are used in combination. If, after upgrading to Rancher v2.7.7, you experience issues with ACE while token hashing is enabled, re-generate any applicable tokens. See #42062.

Previous Rancher Behavior Changes - Rancher App (Global UI)

  • Rancher v2.8.0:
    • The built-in restricted-admin role is being deprecated in favor of a more flexible global role configuration, which is now available for different use cases other than only the restricted-admin. If you want to replicate the permissions given through this role, use the new inheritedClusterRoles feature to create a custom global role. A custom global role, like the restricted-admin role, grants permissions on all downstream clusters. See #42462. Given its deprecation, the restricted-admin role will continue to be included in future builds of Rancher through the v2.8.x and v2.9.x release lines. However, in accordance with the CVSS standard, only security issues scored as critical will be backported and fixed in the restricted-admin role until it is completely removed from Rancher.
    • Reverse DNS server functionality has been removed. The associated rancher/rdns-server repository is now archived. Reverse DNS is already disabled by default.
    • The Rancher CLI configuration file ~/.rancher/cli2.json previously had permissions set to 0644. Although 0644 would usually indicate that all users have read access to the file, the parent directory would block users’ access. New Rancher CLI configuration files will only be readable by the owner (0600). Invoking the CLI will trigger a warning, in case old configuration files are world-readable or group-readable. See #42838.
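
The permission check behind the CLI warning can be approximated with the standard library. This is a sketch of the idea (warn when the file mode grants group or world read access), not the Rancher CLI's code; both function names are hypothetical.

```python
import os
import stat


def mode_is_group_or_world_readable(mode):
    """Return True if the permission bits grant read access to group or others."""
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))


def config_needs_warning(path):
    """Return True if the file at path is group-readable or world-readable."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode_is_group_or_world_readable(mode)


print(mode_is_group_or_world_readable(0o644))  # True
print(mode_is_group_or_world_readable(0o600))  # False
```

An existing file can be tightened with os.chmod(path, 0o600), which leaves read and write access for the owner only.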

Previous Rancher Behavior Changes - Rancher App (Helm Chart)

  • Rancher v2.7.0:
    • When installing or upgrading an official Rancher Helm chart app in a RKE2/K3s cluster, if a private registry exists in the cluster configuration, that registry will be used for pulling images. If no cluster-scoped registry is found, the global container registry will be used. A custom default registry can be specified during the Helm chart install and upgrade workflows. Previously, only the global container registry was used when installing or upgrading an official Rancher Helm chart app for RKE2/K3s node driver clusters.

Previous Rancher Behavior Changes - Pod Security Standard (PSS) & Pod Security Admission (PSA)

  • Rancher v2.7.2:
    • You must manually change the psp.enabled value in the chart install YAML when you install or upgrade v102.x.y charts on hardened RKE2 clusters. Instructions for updating the value are available. See #41018.

Previous Rancher Behavior Changes - Authentication

  • Rancher v2.8.3:
    • Rancher uses additional trusted CAs when establishing a secure connection to the Keycloak OIDC authentication provider. See #43217.
  • Rancher v2.8.0:
    • The kubeconfig-token-ttl-minutes setting has been replaced by the setting, kubeconfig-default-token-ttl-minutes, and is no longer available in the UI. See #38535.
    • API tokens now have default time periods after which they expire. Authentication tokens expire after 90 days, while kubeconfig tokens expire after 30 days. See #41919.
  • Rancher v2.7.2:
    • Rancher might retain resources from a disabled auth provider configuration in the local cluster, even after you configure another auth provider. To manually trigger cleanup for a disabled auth provider, add the management.cattle.io/auth-provider-cleanup annotation with the unlocked value to its auth config. See #40378.

Previous Rancher Behavior Changes - Rancher Webhook

  • Rancher v2.8.3:
    • The embedded Cluster API webhook is removed from the Rancher webhook and can no longer be installed from the webhook chart. It has not been used as of Rancher v2.7.7, where it was migrated to a separate Pod. See #44619.
  • Rancher v2.8.0:
    • Rancher’s webhook now honors the bind and escalate verbs for GlobalRoles. Users who have * set on GlobalRoles will now have both of these verbs, and could potentially use them to escalate privileges in Rancher v2.8.0 and later. You should review current custom GlobalRoles, especially cases where bind, escalate, or * are granted, before you upgrade.
  • Rancher v2.7.5:
    • Rancher installs the same pinned version of the rancher-webhook chart not only in the local cluster but also in all downstream clusters. Restoring Rancher from v2.7.5 to an earlier version will result in downstream clusters’ webhooks being at the version set by Rancher v2.7.5, which might cause incompatibility issues. Local and downstream webhook versions need to be in sync. See #41730 and #41917.
    • The mutating webhook configuration for secrets is no longer active in downstream clusters. See #41613.

Previous Rancher Behavior Changes - Apps & Marketplace

  • Rancher v2.8.0:
    • Legacy code for the following v1 charts is no longer available in the rancher/system-charts repository:

      • rancher-cis-benchmark
      • rancher-gatekeeper-operator
      • rancher-istio
      • rancher-logging
      • rancher-monitoring

      The code for these charts will remain available for previous versions of Rancher.

    • Helm v2 support is deprecated as of the Rancher v2.7 line and will be removed in Rancher v2.9.

  • Rancher v2.7.0:
    • Rancher no longer validates an app registration’s permissions to use Microsoft Graph on endpoint updates or initial setup. You should add Directory.Read.All permissions of type Application. If you configure a different set of permissions, Rancher may not have sufficient privileges to perform some necessary actions within Azure AD, causing errors.
    • The multi-cluster app legacy feature is no longer available. See #39525.

Previous Rancher Behavior Changes - OPA Gatekeeper

  • Rancher v2.8.0:
    • OPA Gatekeeper is now deprecated and will be removed in a future release. As a replacement for OPA Gatekeeper, consider switching to Kubewarden. See #42627.

Previous Rancher Behavior Changes - Feature Charts

  • Rancher v2.7.0:
    • A configurable priorityClass is available in the Rancher pod and its feature charts. Previously, pods critical to running Rancher didn’t use a priority class. This could cause a cluster with limited resources to evict Rancher pods before other noncritical pods. See #37927.

Previous Rancher Behavior Changes - Backup/Restore

  • Rancher v2.7.7:
    • If you use a version of backup-restore older than v102.0.2+up3.1.2 to take a backup of Rancher v2.7.7, the migration will encounter a capi-webhook error. Make sure that the chart version used for backups is v102.0.2+up3.1.2, which has cluster.x-k8s.io/v1alpha4 resources removed from the resourceSet. If you can’t use v102.0.2+up3.1.2 for backups, delete all cluster.x-k8s.io/v1alpha4 resources from the backup tar before using it. See #382.

Previous Rancher Behavior Changes - Logging

  • Rancher v2.7.0:
    • Rancher defaults to using the bci-micro image for sidecar audit logging. Previously, the default image was Busybox. See #35587.

Previous Rancher Behavior Changes - Monitoring

  • Rancher v2.7.2:
    • Rancher maintains a /v1/counts endpoint that the UI uses to display resource counts. The UI subscribes to changes to the counts for all resources through a websocket to receive the new counts for resources.
      • Rancher aggregates the changed counts and only sends a message every 5 seconds. This, in turn, requires the UI to update the counts at most once every 5 seconds, improving UI performance. Previously, Rancher would send a message each time the resource counts changed for a resource type. This led to the UI needing to constantly stop other areas of processing to update the resource counts. See #36682.
      • Rancher now only sends back a count for a resource type if the count has changed from the previously known number, improving UI performance. Previously, each message from this socket would include all counts for every resource type in the cluster, even if the counts only changed for one specific resource type. This would cause the UI to need to re-update resource counts for every resource type at a high frequency, with a significant performance impact. See #36681.
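The two changes above amount to batching plus delta reporting. The following Python sketch illustrates the idea only. It is not Rancher's implementation, and the CountAggregator class, its method names, and the injectable clock are hypothetical.

```python
import time
from typing import Callable, Dict, Optional


class CountAggregator:
    """Collects count changes and emits at most one delta message per interval."""

    def __init__(self, interval_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.interval_seconds = interval_seconds
        self.clock = clock
        self.last_sent: Dict[str, int] = {}  # counts the client already knows
        self.pending: Dict[str, int] = {}    # newest counts not yet sent
        self.last_flush = clock()

    def record(self, resource_type: str, count: int) -> None:
        """Remember the newest count; repeated updates overwrite each other."""
        self.pending[resource_type] = count

    def maybe_flush(self) -> Optional[Dict[str, int]]:
        """Return only the changed counts once the interval has elapsed, else None."""
        now = self.clock()
        if now - self.last_flush < self.interval_seconds:
            return None
        self.last_flush = now
        # Keep only resource types whose count differs from what was last sent.
        delta = {rt: c for rt, c in self.pending.items()
                 if self.last_sent.get(rt) != c}
        self.pending.clear()
        if not delta:
            return None
        self.last_sent.update(delta)
        return delta
```

A caller would invoke record on every change and maybe_flush on a timer, sending a message only when the result is not None. Passing the clock in makes the interval logic testable without waiting.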

Previous Rancher Behavior Changes - Project Monitoring

  • Rancher v2.7.2:
    • The Helm Controller in RKE2/K3s respects the managedBy annotation. In its initial release, Project Monitoring V2 required a workaround to set helmProjectOperator.helmController.enabled: false, since the Helm Controller operated on a cluster-wide level and ignored the managedBy annotation. See #39724.

Previous Rancher Behavior Changes - Security

  • Rancher v2.8.0:
    • TLS v1.0 and v1.1 are no longer supported for Rancher app ingresses. See #42027.

Previous Rancher Behavior Changes - Extensions

  • Rancher v2.8.4:
    • The Rancher dashboard fails to load an extension that uses backported Vue 3 features, and displays the console error object(...) is not a function. New extensions that use defineComponent are not backwards compatible with older versions of the dashboard. Existing extensions should continue to work moving forward. See #10568.

Long-standing Known Issues

Long-standing Known Issues - Cluster Provisioning

  • Not all cluster tools can be installed on a hardened cluster.

  • Rancher v2.8.1:

    • When you attempt to register a new etcd/controlplane node in a CAPR-managed cluster after a failed etcd snapshot restoration, the node can become stuck in a perpetual paused state, displaying the error message [ERROR] 000 received while downloading Rancher connection information. Sleeping for 5 seconds and trying again. As a workaround, you can unpause the cluster by running kubectl edit clusters.cluster clustername -n fleet-default and setting spec.unpaused to false. See #43735.
  • Rancher v2.7.2:

    • If you upgrade or update any hosted cluster, and go to Cluster Management > Clusters while the cluster is still provisioning, the Registration tab is visible. Registering a cluster that is already registered with Rancher can cause data corruption. See #8524.
    • When you upgrade your Kubernetes cluster, you might see the following error: Cluster health check failed. This is a benign error that occurs as part of the upgrade process, and will self-resolve. It’s caused by the Kubernetes API server becoming temporarily unavailable as it is being upgraded within your cluster. See #41012.
    • Once you configure a setting with an environment variable, it can’t be updated through the Rancher API or the UI. It can only be updated by changing the value of the environment variable. Setting the environment variable to "" (the empty string) changes the value in the Rancher API but not in Kubernetes. As a workaround, run kubectl edit setting <setting-name>, then set the value and source fields to "", and re-deploy Rancher. See #37998.
  • Rancher v2.6.1:

    • When using the Rancher UI to add a new port of type ClusterIP to an existing Deployment created using the legacy UI, the new port won’t be created upon your first attempt to save the new port. You must repeat the procedure to add the port again. The Service Type field will display Do not create a service during the second procedure. Change this to ClusterIP and save to create the new port. See #4280.

Long-standing Known Issues - RKE2 Provisioning

  • Rancher v2.7.7:
    • Due to the backoff logic in various components, downstream provisioned K3s and RKE2 clusters may take longer to re-achieve Active status after a migration. If you see that a downstream cluster is still updating or in an error state immediately after a migration, please let it attempt to resolve itself. This might take up to an hour to complete. See #34518 and #42834.
  • Rancher v2.7.6:
    • Provisioning RKE2/K3s clusters with added (not built-in) custom node drivers causes provisioning to fail. As a workaround, fix the added node drivers after activating. See #37074.
  • Rancher v2.7.4:
    • RKE2 clusters with invalid values for tolerations or affinity agent customizations don’t display an error message, and remain in an Updating state. This causes cluster creation to hang. See #41606.
  • Rancher v2.7.2:
    • When viewing or editing the YAML configuration of downstream RKE2 clusters through the UI, spec.rkeConfig.machineGlobalConfig.profile is set to null, which is an invalid configuration. See #8480.
    • Deleting nodes from custom RKE2/K3s clusters in Rancher v2.7.2 can cause unexpected behavior if the underlying infrastructure isn’t thoroughly cleaned. When deleting a custom node from your cluster, ensure that you delete the underlying infrastructure for it, or run the corresponding uninstall script for the Kubernetes distribution installed on the node. See #41034.
  • Rancher v2.6.9:
    • Deleting a control plane node results in worker nodes also reconciling. See #39021.
  • Rancher v2.6.4:
    • Communication between the ingress controller and the pods doesn’t work when you create an RKE2 cluster with Cilium as the CNI and activate project network isolation. See the documentation and #34275.
  • Rancher v2.6.3:
    • When provisioning clusters with an RKE2 cluster template, the rootSize for AWS EC2 provisioners doesn’t take an integer when it should, and an error is thrown. As a workaround, wrap the EC2 rootSize in quotes. See #40128.
  • Rancher v2.6.0:
    • Amazon ECR Private Registries don’t work from RKE2/K3s. See #33920.

Long-standing Known Issues - K3s Provisioning

  • Rancher v2.7.7:
    • Due to the backoff logic in various components, downstream provisioned K3s and RKE2 clusters may take longer to re-achieve Active status after a migration. If you see that a downstream cluster is still updating or in an error state immediately after a migration, please let it attempt to resolve itself. This might take up to an hour to complete. See #34518 and #42834.
  • Rancher v2.7.6:
    • Provisioning RKE2/K3s clusters with added (not built-in) custom node drivers causes provisioning to fail. As a workaround, fix the added node drivers after activating. See #37074.
  • Rancher v2.7.2:
    • Clusters remain in an Updating state even when they contain nodes in an Error state. See #39164.
    • Deleting nodes from custom RKE2/K3s clusters in Rancher v2.7.2 can cause unexpected behavior if the underlying infrastructure isn’t thoroughly cleaned. When deleting a custom node from your cluster, ensure that you delete the underlying infrastructure for it, or run the corresponding uninstall script for the Kubernetes distribution installed on the node. See #41034.
  • Rancher v2.6.0:
    • Amazon ECR Private Registries don’t work from RKE2/K3s. See #33920.
    • Deleting a control plane node results in worker nodes also reconciling. See #39021.

Long-standing Known Issues - Rancher App (Global UI)

  • Rancher v2.7.7:
    • The Rancher UI does not allow an underscore (_) in the Cluster Name field when you create a cluster. See #9416.
  • Rancher v2.7.2:
    • When you create a GKE cluster in the Rancher UI, provisioning fails because the clusterIpv4CidrBlock and clusterSecondaryRangeName fields conflict. See #8749.

Long-standing Known Issues - Hosted Rancher

  • Rancher v2.7.5:
    • The Cluster page shows the Registration tab when updating or upgrading a hosted cluster. See #8524.

Long-standing Known Issues - Docker Install

  • Rancher v2.6.4:
    • Single node Rancher won’t start on Apple M1 devices with Docker Desktop 4.3.0 or later. See #35930.
  • Rancher v2.6.3:
    • On a Docker install upgrade and rollback, Rancher logs repeatedly display the messages “Updating workload ingress-nginx/nginx-ingress-controller” and “Updating service frontend with public endpoints”. Ingresses and clusters are functional and active, and logs resolve eventually. See #35798 and #40257.
  • Rancher v2.5.0:
    • UI issues may occur due to longer startup times. When launching Docker for the first time, you’ll receive an error message stating, “Cannot read property endsWith of undefined”, as described in #28800. You’ll then be directed to a login screen. See #28798.

Long-standing Known Issues - Windows

  • Rancher v2.5.8:
    • Windows nodeAgents are not deleted when performing a helm upgrade after disabling Windows logging on a Windows cluster. See #32325.
    • If you deploy Monitoring V2 on a Windows cluster with win_prefix_path set, you must deploy Rancher Wins Upgrader to restart wins on the hosts. This will allow Rancher to start collecting metrics in Prometheus. See #32535.

Long-standing Known Issues - Windows Nodes in RKE2 Clusters

  • Rancher v2.6.4:
    • NodePorts do not work on Windows Server 2022 in RKE2 clusters due to a Windows kernel bug. See #159.

Long-standing Known Issues - AKS

  • Rancher v2.7.2:
    • Imported Azure Kubernetes Service (AKS) clusters don’t display workload level metrics. This bug affects Monitoring V1. A workaround is available. See #4658.
  • Rancher v2.6.x:
    • Windows node pools are not currently supported. See #32586.
  • Rancher v2.6.0:
    • When editing or upgrading an Azure Kubernetes Service (AKS) cluster, do not make changes from the Azure console or CLI at the same time. These actions must be done separately. See #33561.

Long-standing Known Issues - EKS

  • Rancher v2.7.0:
    • EKS clusters on Kubernetes v1.21 or below on Rancher v2.7 cannot be upgraded. See #39392.

Long-standing Known Issues - GKE

  • Rancher v2.5.8:
    • Basic authentication must be explicitly disabled in GCP before upgrading a GKE cluster to Kubernetes v1.19+ in Rancher. See #32312.

Long-standing Known Issues - Pod Security Standard (PSS) & Pod Security Admission (PSA)

  • Rancher v2.6.4:
    • The deployment’s securityContext section is missing when a new workload is created. This prevents pods from starting when Pod Security Policy (PSP) support is enabled. See #4815.

Long-standing Known Issues - Authentication

  • Rancher v2.7.7:
    • The SAML authentication pop-up throws a 404 error on high-availability RKE installations. Single node Docker installations aren’t affected. If you refresh the browser window and select Resend, the authentication request will succeed, and you will be able to log in. See #31163.
  • Rancher v2.6.2:
    • Users on certain LDAP setups don’t have permission to search LDAP. When they attempt to perform a search, they receive the error message, Result Code 32 "No Such Object". See #35259.

Long-standing Known Issues - Encryption

  • Rancher v2.5.4:
    • Rotating encryption keys with a custom encryption provider is not supported. See #30539.

Long-standing Known Issues - Rancher Webhook

  • Rancher v2.7.2:
    • A webhook is installed in all downstream clusters. There are several issues that users may encounter with this functionality:
      • If you roll back from Rancher v2.7.2 or later to a Rancher version earlier than v2.7.2, the webhooks will remain in downstream clusters. Since the webhook is designed to be 1:1 compatible with specific versions of Rancher, this can cause unexpected behaviors to occur downstream. The Rancher team has developed a script which should be used after rollback is complete (meaning after a Rancher version earlier than v2.7.2 is running). This removes the webhook from affected downstream clusters. See #40816.

Long-standing Known Issues - Harvester

  • Upgrades from Harvester v0.3.0 are not supported.
  • Rancher v2.8.4:
    • When provisioning a Harvester RKE1 cluster in Rancher, the vGPU field is not displayed under Cluster Management > Advanced Settings, as this is not a supported feature. However, the vGPU field is available when provisioning a Harvester RKE2 cluster. See #10909.
    • When provisioning a multi-node Harvester RKE2 cluster in Rancher, you need to allocate one vGPU more than the number of nodes you have or provisioning will fail. See #11009 and v2.9.0 back-port issue #10989.
  • Rancher v2.7.2:
    • If you’re using Rancher v2.7.2 with Harvester v1.1.1 clusters, you won’t be able to select the Harvester cloud provider when deploying or updating guest clusters. The Harvester release notes contain instructions on how to resolve this. See #3750.
  • Rancher v2.6.1:
    • Deploying Fleet to Harvester clusters is not yet supported. Clusters, whether Harvester or non-Harvester, imported using the Virtualization Management page will result in the cluster not being listed on the Continuous Delivery page. See #35049.

Long-standing Known Issues - Continuous Delivery

  • Rancher v2.7.6:
    • Target customization can produce custom resources that exceed the Rancher API’s maximum bundle size. This results in Request entity too large errors when attempting to add a GitHub repo. Only target customizations that modify the Helm chart URL or version are affected. As a workaround, use multiple paths or GitHub repos instead of target customization. See #1650.
  • Rancher v2.6.1:
    • Deploying Fleet to Harvester clusters is not yet supported. Clusters, whether Harvester or non-Harvester, imported using the Virtualization Management page will result in the cluster not being listed on the Continuous Delivery page. See #35049.
  • Rancher v2.6.0:
    • Multiple fleet-agent pods may be created and deleted during initial downstream agent deployment, rather than just one. This resolves itself quickly, but is unintentional behavior. See #33293.

Long-standing Known Issues - Feature Charts

  • Rancher v2.6.5:
    • After installing an app from a partner chart repo, the partner chart will upgrade to feature charts if the chart also exists in the feature charts default repo. See #5655.

Long-standing Known Issues - CIS Scan

  • Rancher v2.8.3:
    • Some CIS checks related to file permissions fail on RKE and RKE2 clusters with CIS v1.7 and CIS v1.8 profiles. See #42971.
  • Rancher v2.7.2:
    • When running CIS scans on RKE and RKE2 clusters on Kubernetes v1.25, some tests will fail if the rke-profile-hardened-1.23 or the rke2-profile-hardened-1.23 profile is used. These RKE and RKE2 test cases failing is expected as they rely on PSPs, which have been removed in Kubernetes v1.25. See #39851.

Long-standing Known Issues - Backup/Restore

  • When migrating to a cluster with the Rancher Backup feature, the server-url cannot be changed to a different location. It must continue to use the same URL.

  • Rancher v2.7.7:

    • Due to the backoff logic in various components, downstream provisioned K3s and RKE2 clusters may take longer to re-achieve Active status after a migration. If you see that a downstream cluster is still updating or in an error state immediately after a migration, please let it attempt to resolve itself. This might take up to an hour to complete. See #34518 and #42834.
  • Rancher v2.6.3:

    • Because Kubernetes v1.22 drops the apiVersion apiextensions.k8s.io/v1beta1, trying to restore an existing backup file into a v1.22+ cluster will fail. The backup file contains CRDs with the apiVersion v1beta1. There are two workarounds for this issue: update the default resourceSet to collect the CRDs with the apiVersion v1, or update the default resourceSet and the client to use the new APIs internally. See the documentation and #34154.

Long-standing Known Issues - Istio

  • Istio v1.12 and below do not work on Kubernetes v1.23 clusters. To use the Istio charts, please do not update to Kubernetes v1.23 until the next charts’ release.

  • Rancher v2.6.4:

    • Applications that inject Istio sidecars fail on SELinux-enabled RHEL 8.4 clusters. A temporary workaround for this issue is to run the following command on each cluster node before creating a cluster: mkdir -p /var/run/istio-cni && semanage fcontext -a -t container_file_t /var/run/istio-cni && restorecon -v /var/run/istio-cni. See #33291.
  • Rancher v2.6.1:

    • Deprecated resources are not automatically removed and will cause errors during upgrades. Manual steps must be taken to migrate and/or clean up resources before an upgrade is performed. See #34699.

Long-standing Known Issues - Logging

  • Rancher v2.5.8:
    • Windows nodeAgents are not deleted when performing a helm upgrade after disabling Windows logging on a Windows cluster. See #32325.

Long-standing Known Issues - Monitoring

  • Rancher v2.8.0:
    • Read-only project permissions and the View Monitoring role aren’t sufficient to view links on the Monitoring index page. Users won’t be able to see monitoring links. As a workaround, you can perform the following steps:

      1. If you haven’t already, install Monitoring on the project.
      2. Move the cattle-monitoring-system namespace into the project.
      3. Grant project users the View Monitoring (monitoring-ui-view) role, and read-only or higher permissions on at least one project in the cluster.

      See #4466.

Long-standing Known Issues - Project Monitoring

  • Rancher v2.5.8:
    • If you deploy Monitoring V2 on a Windows cluster with win_prefix_path set, you must deploy Rancher Wins Upgrader to restart wins on the hosts. This will allow Rancher to start collecting metrics in Prometheus. See #32535.