Rancher Release v2.7.4

Release v2.7.4

It is important to review the Install/Upgrade Notes below before upgrading to any Rancher version.

Rancher v2.7.4 is a security release to address the following issues:

Security Fixes for Rancher Vulnerabilities

This release addresses the following Rancher security issues:

  • Fixed an issue where users with update privileges on a namespace could move that namespace into a project they didn’t have access to. For more information, see CVE-2020-10676.
  • Fixed an issue where cross-site scripting (XSS) could allow a malicious user to inject code executed within another user’s browser. This would allow an attacker to steal sensitive information, manipulate web content, or perform other malicious activities on behalf of the victim. For more information, see CVE-2022-43760.
  • Fixed an issue that enabled Standard users or above to elevate their permissions to Administrator in the local cluster. For more information, see CVE-2023-22647.
  • Fixed an issue where changing a user’s permissions in Azure AD wasn’t reflected for users while they were logged in to the Rancher UI. This caused users to retain their previous permissions in Rancher, even if they changed groups on Azure AD. For more information, see CVE-2023-22648.

For more details, see the Security Advisories and CVEs page in Rancher’s documentation page or in Rancher’s GitHub repo.

Install/Upgrade Notes

Upgrade Requirements

  • Creating backups: We strongly recommend creating a backup before upgrading Rancher. To roll back Rancher after an upgrade, you must back up and restore Rancher to the previous Rancher version. Because Rancher will be restored to its state when a backup was created, any changes post upgrade will not be included after the restore. For more information, see the documentation on backing up Rancher.
  • Helm version: Rancher install or upgrade must occur with Helm 3.2.x+ due to the changes with the latest cert-manager release. See #29213.
  • CNI requirements:
    • For Kubernetes v1.19 and newer, we recommend disabling firewalld as it’s incompatible with various CNI plugins. See #28840.
    • If upgrading or installing a Linux distribution which uses nf_tables as the backend packet filter (such as SLES 15, RHEL 8, Ubuntu 20.10, Debian 10, or later), upgrade to RKE v1.19.2 or later to get Flannel version v0.13.0, which supports nf_tables. See Flannel #1317.
  • Requirements for air gapped environments:
    • When installing or upgrading Rancher in an air gapped environment, add the flag --no-hooks to the helm template command, to skip rendering files for Helm’s hooks. See #3226.
    • If using a proxy in front of an air-gapped Rancher instance, you must pass additional parameters to NO_PROXY. See the documentation and related issue #2725.
  • Requirements for Docker installs:
    • When starting the Rancher Docker container, you must use the privileged flag. See documentation.
    • When installing in an air-gapped environment, you must supply a custom registries.yaml file to the docker run command, as shown in the K3s documentation. If the registry has certificates, then you’ll also need to supply those. See #28969.
    • When upgrading a Docker installation, a panic may occur in the container, which causes it to restart. After restarting, the container will come up and work as expected. See #33685.

Rancher Behavior Changes

  • You must manually change the psp.enabled value in the chart install yaml when you install or upgrade v102.x.y charts on hardened RKE2 clusters. Instructions for updating the value are available. See #41018.
  • The Helm Controller in RKE2/K3s now respects the managedBy annotation. Project Monitoring V2 required a workaround in its initial release to set helmProjectOperator.helmController.enabled: false since the Helm Controller operated on a cluster-wide level and ignored the managedBy annotation. See #39724.
  • Rancher might retain resources from a disabled auth provider configuration in the local cluster, even after you configure another auth provider. To manually trigger cleanup for a disabled auth provider, add the management.cattle.io/auth-provider-cleanup annotation with the unlocked value to its auth config. See #40378.
  • Privilege escalation is now disabled by default when creating deployments. See #7165.
  • Rancher maintains a /v1/counts endpoint that the UI uses to display resource counts. The UI subscribes to changes to the counts for all resources through a websocket to receive the new counts for resources.
    • Rancher now aggregates the changed counts and only send a message every 5 seconds. This, in turn, requires the UI to update the counts at most once every 5 seconds, improving UI performance. Previously, Rancher would send a message each time the resource counts changed for a resource type. This lead to the UI needing to constantly stop other areas of processing to update the resource counts. See #36682.
    • Rancher now only sends back a count for a resource type if the count has changed from the previously known number, improving UI performance. Previously, each message from this socket would include all counts for every resource type in the cluster, even if the counts only changed for one specific resource type. This would cause the UI to need to re-update resource counts for every resource type at a high frequency, causing significant performance impact. See #36681.
  • When provisioning downstream clusters, the cluster name must now conform to RFC-1123. Previously, characters that did not follow the specification, such as ., were permitted and would result in clusters being provisioned without the necessary Fleet components. See #39248.
  • In previous versions, pods critical to running Rancher didn’t use a priority class. This could cause a cluster with limited resources to evict Rancher pods before other noncritical pods. A configurable priorityClass has been added to the Rancher pod and its feature charts. See #37927.
  • Rancher now defaults to using the bci-micro image for sidecar audit logging, instead of Busybox. See #35587.
  • Rancher no longer validates an app registration’s permissions to use Microsoft Graph on endpoint updates or initial setup. You should add Directory.Read.All permissions of type Application. If you configure a different set of permissions, Rancher may not have sufficient privileges to perform some necessary actions within Azure AD. This will cause errors.
  • Previously, only the global container registry was used when installing or upgrading an official Rancher Helm chart app for RKE2/K3s node driver clusters. The default behavior has been changed, so that if a private registry exists in the cluster configuration, that registry will be used for pulling images. If no cluster-scoped registry is found, the global container registry will be used. A custom default registry can be specified during the Helm chart install and upgrade workflows.

Versions

Please refer to the README for latest and stable versions.

Please review our version documentation for more details on versioning and tagging conventions.

Images

  • rancher/rancher:v2.7.4

Tools

Kubernetes Versions

  • v1.25.9 (Default)
  • v1.24.13
  • v1.23.17

Rancher Helm Chart Versions

Starting in 2.6.0, many of the Rancher Helm charts available in the Apps & Marketplace will start with a major version of 100. This was done to avoid simultaneous upstream changes and Rancher changes from causing conflicting version increments. This also brings us into compliance with semver, which is a requirement for newer versions of Helm. You can now see the upstream version of a chart in the build metadata, for example: 100.0.0+up2.1.0. See #32294.

Other Notes

Experimental Features

  • Dual-stack and IPv6-only support for RKE1 clusters using the Flannel CNI will be experimental starting in v1.23.x. See the upstream Kubernetes docs. Dual-stack is not currently supported on Windows. See #165.

Deprecated Upstream Projects

  • Microsoft has deprecated the Azure AD Graph API that Rancher had been using for authentication via Azure AD. A configuration update is necessary to make sure users can still use Rancher with Azure AD. See the docs and #29306 for details.

Removed Legacy Features

The following legacy features have been removed as of Rancher v2.7.0. The deprecation and removal of these features were announced in previous releases. See #6864.

UI and Backend

  • CIS Scans v1 (Cluster)
  • Pipelines (Project)
  • Istio v1 (Project)
  • Logging v1 (Project)
  • RancherD

UI

  • Multiclusterapps (Global) - Apps within Multicluster Apps section

Known Major Issues

  • Kubernetes Cluster Distributions:
    • Starting with Rancher v2.7.2, a webhook will now be installed in all downstream clusters. There are several issues that users may encounter with this functionality:
      • If you rollback from a version of Rancher >= v2.7.2 to a version < v2.7.2, you’ll experience an issue where the webhooks will remain in downstream clusters. Since the webhook is designed to be 1:1 compatible with specific versions of Rancher, this can cause unexpected behaviors to occur downstream. The Rancher team has developed a script which should be used after rollback is complete (meaning after Rancher version < v2.7.2 is running) to remove the webhook from affected downstream clusters. See #40816.
    • RKE:
      • Rotating encryption keys with a custom encryption provider is not supported. See #30539.
    • RKE and RKE2:
      • When running CIS scans on RKE and RKE2 clusters on Kubernetes v1.25, some tests will fail if the rke-profile-hardened-1.23 or the rke2-profile-hardened-1.23 profile is used. These RKE and RKE2 test cases failing is expected as they rely on PSPs, which have been removed in Kubernetes v1.25. See #39851.
    • RKE2:
      • Amazon ECR Private Registries are not functional. See #33920.
      • When provisioning using an RKE2 cluster template, the rootSize for AWS EC2 provisioners does not currently take an integer when it should, and an error is thrown. To work around this issue, wrap the EC2 rootSize in quotes. See #40128.
      • The communication between the ingress controller and the pods doesn’t work when you create an RKE2 cluster with Cilium as the CNI and activate project network isolation. See documentation and #34275.
      • The system-upgrade-controller Deployment may fail after Monitoring is enabled on an RKE2 v1.23 or v1.24 cluster with Windows nodes. See #38646.
    • RKE2 - Windows:
      • CSI Proxy for Windows will now work in an air-gapped environment.
      • NodePorts do not work on Windows Server 2022 in RKE2 clusters due to a Windows kernel bug. See #159.
      • When upgrading Windows nodes in RKE2 clusters via the Rancher UI, Windows worker nodes will require a reboot after the upgrade is completed. See #37645.
      • The fleet-agent pod fails to deploy on an upgraded RKE2 Windows Custom cluster. See #993.
    • RKE2 and K3s:
      • Deleting a control plane node results in worker nodes also reconciling. See #39021.
      • Deleting nodes from custom RKE2/K3s clusters in Rancher v2.7.2 can cause unexpected behavior, if the underlying infrastructure isn’t thoroughly cleaned. When deleting a custom node from your cluster, ensure that you delete the underlying infrastructure for it, or run the corresponding uninstall script for the Kubernetes distribution installed on the node. See #41034:
    • K3s:
      • Clusters are in an Updating state even when it contains nodes in an Error state. See #39164.
    • AKS:
      • Imported Azure Kubernetes Service (AKS) clusters don’t display workload level metrics. This bug affects Monitoring V1. A workaround is available. See #4658.
      • When editing or upgrading the AKS cluster, do not make changes from the Azure console or CLI at the same time. These actions must be done separately. See #33561.
      • Windows node pools are not currently supported. See #32586.
      • Azure Container Registry-based Helm charts cannot be added in Cluster Explorer, but do work in the Apps feature of Cluster Manager. Note that when using a Helm chart repository, the disableSameOriginCheck setting controls when credentials are attached to requests. See documentation and #34584 for more information.
    • GKE:
      • Basic authentication must be explicitly disabled in GCP before upgrading a GKE cluster to 1.19+ in Rancher. See #32312.
      • If you have downstream private GKE clusters, you might experience issues when interacting with the resources that the webhook validates, such as namespaces. This can cause problems with activities where Rancher needs to interact with those resources, such as when you install charts. As a workaround, add a firewall setting to allow traffic to the webhook. See #41142.
    • EKS:
      • EKS clusters on Kubernetes v1.21 or below on Rancher v2.7 cannot be upgraded. To see more detail about this issue and the workaround, please see this comment.
  • Infrastructures:
    • vSphere:
      • PersistentVolumes are unable to mount to custom vSphere hardened clusters using CSI charts. See #35173.
  • Harvester:
    • If you’re using Rancher v2.7.2 with Harvester v1.1.1 clusters, you won’t be able to select the Harvester cloud provider when deploying or updating guest clusters. The Harvester release notes contain instructions on how to resolve this. See #3750.
    • Upgrades from Harvester v0.3.0 are not supported.
    • Deploying Fleet to Harvester clusters is not yet supported. Clusters, whether Harvester or non-Harvester, imported using the Virtualization Management page will result in the cluster not being listed on the Continuous Delivery page. See #35049.
  • Cluster Tools:
    • Authentication:
      • Rancher might retain resources from a disabled auth provider configuration in the local cluster, even after configuring another auth provider. To manually trigger cleanup for a disabled auth provider, add the management.cattle.io/auth-provider-cleanup annotation with the unlocked value to its auth config. See #40378.
    • Fleet:
      • Multiple fleet-agent pods may be created and deleted during initial downstream agent deployment; rather than just one. This resolves itself quickly, but is unintentional behavior. See #33293.
    • Hardened clusters:
      • Not all cluster tools can currently be installed on a hardened cluster.
    • Rancher Backup:
      • When migrating to a cluster with the Rancher Backup feature, the server-url cannot be changed to a different location. It must continue to use the same URL.
      • Because Kubernetes v1.22 drops the apiVersion apiextensions.k8s.io/v1beta1, trying to restore an existing backup file into a v1.22+ cluster will fail because the backup file contains CRDs with the apiVersion v1beta1. There are two options to work around this issue: update the default resourceSet to collect the CRDs with the apiVersion v1, or update the default resourceSet and the client to use the new APIs internally. See documentation and #34154.
    • Monitoring:
      • Deploying Monitoring on a Windows cluster with win_prefix_path set requires users to deploy Rancher Wins Upgrader to restart wins on the hosts to start collecting metrics in Prometheus. See #32535.
    • Logging:
      • Windows nodeAgents are not deleted when performing helm upgrade after disabling Windows logging on a Windows cluster. See #32325.
    • Istio Versions:
      • Istio 1.12 and below do not work on Kubernetes 1.23 clusters. To use the Istio charts, please do not update to Kubernetes 1.23 until the next charts’ release.
      • Deprecated resources are not automatically removed and will cause errors during upgrades. Manual steps must be taken to migrate and/or cleanup resources before an upgrade is performed. See #34699.
      • Applications injecting Istio sidecars, fail on SELinux RHEL 8.4 enabled clusters. A temporary workaround for this issue is to run the following command on each cluster node before creating a cluster: mkdir -p /var/run/istio-cni && semanage fcontext -a -t container_file_t /var/run/istio-cni && restorecon -v /var/run/istio-cni. See #33291.
  • Docker Installations:
    • UI issues may occur due to a longer startup time. User will receive an error message when launching Docker for the first time #28800, and user is directed to username/password screen when accessing the UI after a Docker install of Rancher. See #28798.
    • On a Docker install upgrade and rollback, Rancher logs will repeatedly display the messages “Updating workload ingress-nginx/nginx-ingress-controller” and “Updating service frontend with public endpoints”. Ingresses and clusters are functional and active, and logs resolve eventually. See #35798.
    • Rancher single node wont start on Apple M1 devices with Docker Desktop 4.3.0 or newer. See #35930.
  • Rancher UI:
    • When you upgrade your Kubernetes cluster, you might see the following error: Cluster health check failed. During an upgrade, this is a benign error and will self-resolve. It’s caused by the Kubernetes API server becoming temporarily unavailable as it is being upgraded within your cluster. See #41012.
      • When enabling some custom node drivers, the Cloud Credential creation page does not show the correct default fields and has an uneditable foo key. See #8563.
    • You need to force-refresh the Rancher UI after initial Rancher setup, to trigger the prompt to accept the self-signed certificate. As a workaround, visit the Rancher portal, accept the self-signed certificate, and go through the setup process. Once done, go to the address bar of your browser and click the lock icon. Select the option to allow you to receive certificate errors for the Rancher website. You’ll then be prompted again to accept the new certificate. See #7867.
    • Once you configure a setting with an environmental variable, it can’t be updated through the Rancher API or the UI. It can only be updated through changing the value of the environmental variable. Setting the environmental variable to “” (the empty string) changes the value in the Rancher API but not in Kubernetes. As a workaround, run kubectl edit setting <setting-name>, then set the value and source fields to "", and re-deploy Rancher. See #37998.
    • After installing an app from a partner chart repo, the partner chart will upgrade to feature charts if the chart also exists in the feature charts default repo. See #5655.
    • In some instances under Users and Authentication, no users are listed and clicking Create to create a new user does not display the entire form. To work around this when encountered, perform a hard refresh to be able to log back in. See #37531.
    • Deployment securityContext section is missing when a new workload is created. This prevents pods from starting when Pod Security Policy Support is enabled. See #4815.
    • Remove legacy feature multi-cluster app. See #39525.
  • Legacy UI:
    • When using the Rancher UI to add a new port of type ClusterIP to an existing Deployment created using the legacy UI, the new port will not be created upon saving. To work around this issue, repeat the procedure to add the port again. Users will notice the Service Type field will display as Do not create a service. Change this to ClusterIP and upon saving, the new port will be created successfully during this subsequent attempt. See #4280.
1 Like