Rancher Release v2.5.16

Release v2.5.16

It is important to review the Install/Upgrade Notes below before upgrading to any Rancher version.

Security Fixes for Rancher Vulnerabilities

  • Increased entropy of CSRF (cross-site request forgery) token. See #15 and #419.
  • Fixed an issue where sensitive fields like passwords, API keys, and Rancher’s service account token were stored as plaintext on Kubernetes objects. Any user with read access to those objects in the Kubernetes API could retrieve the plaintext version of those sensitive data. For more information, see CVE-2021-36782.
  • Improved the sanitization (removal) of credentials from cluster template answers. Failure to sanitize data can lead to plaintext storage and exposure of credentials, passwords, and API tokens. For more information, see CVE-2021-36783.
  • Fixed an authorization logic flaw that allowed privilege escalation in downstream clusters through cluster role template binding (CRTB) and project role template binding (PRTB). For more information, see CVE-2022-31247.

Features and Enhancements

Azure Active Directory API Migration

Microsoft has deprecated the Azure AD Graph API that Rancher had been using for authentication via Azure AD. A configuration update is necessary to make sure users can still use Rancher with Azure AD. See the docs and #37228 for details.

  • Limitations
    • Attempts to log in will fail after rolling back a Docker install of Rancher if the following conditions have occurred:

      • Azure AD is enabled.
      • Before the rollback, admins committed to the Azure AD configuration update.

      This is because the Azure AD endpoints will not be rolled back if the rollback is not performed via the backup-restore operator. If you want to roll back Rancher to use the old Azure AD Graph API without using the backup-restore operator, follow this workaround to edit the AzureAD authconfig resource stored in the local cluster’s database. The old Azure AD Graph API endpoints will not be rolled back on a Rancher rollback. See #38025.

  • Other
    • Multi-factor authentication (MFA) now works with the Azure AD auth provider. Some Rancher setups might have had MFA enabled in Azure from before, but Rancher wasn’t working with it correctly. Be aware that on upgrade, if MFA is enabled for the Azure app, Rancher will require additional verification. See #38028.
    • Before starting the migration process or enabling Azure AD for the first time in v2.6.7+, ensure that you add the Azure app registration’s permissions of type Application and NOT Delegated for Microsoft Graph. Otherwise, you may not be able to login to Azure AD. This issue will persist even after you disable/re-enable Azure AD and will require an hour wait, or manual deletion of a cache value to resolve.

Major Bug Fixes

  • Prior to v2.5.16, if S3 or other kinds of credentials were added to a cluster after it was already created, the reference to the secret containing the credentials was lost because the cluster status cannot updated through the API. The references are now moved to the cluster Spec so that they can be updated added after creation. To repair a cluster after a upgrade to v2.5.16, edit the cluster and change the etcd snapshot configuration back to local and save it, then edit again to configure S3 snapshots again. See #38397.
  • Fixed a bug that overloaded the downstream Kubernetes API server when the Cluster Explorer dashboard was left open to a page for a downstream cluster for over 30 minutes and would start rapidly opening and closing watch requests perpetually. See #37986.
  • Fixed an issue where issues in a downstream cluster would cause a controller to frequently restart and eventually lead to a Goroutine leak. See #37965.
  • Updated an internal download link that was causing upgrades to fail. See #37859.

Install/Upgrade Notes

If you are installing Rancher for the first time, your environment must fulfill the installation requirements.

Upgrade Requirements

  • Creating backups:
    • We strongly recommend creating a backup before upgrading Rancher. To roll back Rancher after an upgrade, you must back up and restore Rancher to the previous Rancher version. Because Rancher will be restored to its state when a backup was created, any changes post upgrade will not be included after the restore. For more information, see the documentation on backing up Rancher.
  • Helm version:
    • Rancher install or upgrade must occur with Helm 3.2.x+ due to the changes with the latest cert-manager release. See #29213.
  • Kubernetes version:
    • The local Kubernetes cluster for the Rancher server should be upgraded to Kubernetes 1.17+ before installing Rancher 2.5+.
  • CNI requirements:
    • For Kubernetes v1.19 and newer, we recommend disabling firewalld as it has been found to be incompatible with various CNI plugins. See #28840.
    • If upgrading or installing to a Linux distribution that uses nf_tables as the backend packet filter, such as SLES 15, RHEL 8, Ubuntu 20.10, Debian 10, or newer, users should upgrade to RKE1 v1.19.2 or later to get Flannel version v0.13.0 that supports nf_tables. See Flannel #1317.
    • For users upgrading from >=v2.4.4 to v2.5.x with clusters where ACI CNI is enabled, note that upgrading Rancher will result in automatic cluster reconciliation. This is applicable for Kubernetes versions v1.17.16-rancher1-1, v1.17.17-rancher1-1, v1.17.17-rancher2-1, v1.18.14-rancher1-1, v1.18.15-rancher1-1, v1.18.16-rancher1-1, and v1.18.17-rancher1-1. Please refer to the workaround BEFORE upgrading to v2.5.x. See #32002.
  • Requirements for air-gapped environments:
    • For installing or upgrading Rancher in an air-gapped environment, please add the flag --no-hooks to the helm template command to skip rendering files for Helm’s hooks. See #3226.
    • If using a proxy in front of an air-gapped Rancher, you must pass additional parameters to NO_PROXY. See the documentation and #2725.
  • Cert-manager version requirements:
    • Recent changes to cert-manager require an upgrade if you have a high-availability install of Rancher using self-signed certificates. If you are using cert-manager older than v0.9.1, please see the documentation for information on how to upgrade cert-manager.
  • Requirements for Docker installs:
    • When starting the Rancher Docker container, the privileged flag must be used. See the documentation.
    • When installing in an air-gapped environment, you must supply a custom registries.yaml file to the docker run command as shown in the K3s documentation. If the registry has certs, then you will need to also supply those. See #28969.
    • When upgrading a Docker installation, a panic may occur in the container, which causes it to restart. After restarting, the container comes up and works as expected. See #33685.
  • RKE requirements:
    • For users upgrading from <=v2.4.8 (<= RKE v1.1.6) to v2.4.12+ (RKE v1.1.13+)/v2.5.0+ (RKE v1.2.0+), please note that Edit and Save cluster (even with no changes or a trivial change like cluster name) will result in cluster reconciliation and upgrading kube-proxy on all nodes because of a change in kube-proxy binds. This only happens on the first edit, and later edits shouldn’t affect the cluster. See #32216.
  • EKS requirements:
    • There was a setting for Rancher versions prior to 2.5.8 that allowed users to configure the length of refresh time in cron format: eks-refresh-cron. That setting is now deprecated and has been migrated to a standard seconds format in a new setting: eks-refresh. If previously set, the migration will happen automatically. See #31789.
  • Fleet-agent:
    • When upgrading <=v2.5.7 to >=v2.5.8, you may notice that in Apps & Marketplace, there is a fleet-agent release stuck at uninstalling. This is caused by migrating fleet-agent release name. It is safe to delete fleet-agent release as it is no longer used, and doing so should not delete the real fleet-agent deployment since it has been migrated. See #362.

Rancher Behavior Changes

  • Upgrades and Rollbacks:
    • Rancher supports both upgrade and rollback. Please note the version you would like to upgrade or roll back to change the Rancher version.
    • Please be aware when upgrading to v2.3.0+, any edits to a Rancher-launched Kubernetes cluster will cause all system components to restart due to added tolerations to Kubernetes system components. Plan accordingly.
    • Recent changes to cert-manager require an upgrade if you have an HA install of Rancher using self-signed certificates. If you are using cert-manager older than v0.9.1, please see the documentation on how to upgrade cert-manager.
    • Existing GKE clusters and imported clusters will continue to operate as is. Only new creations and registered clusters will use the new full lifecycle management.
    • The process to roll back Rancher has been updated for versions v2.5.0 and above. Refer to the documentation for the new instructions.
  • Important:
    • When rolling back, we are expecting you to roll back to the state at the time of your upgrade. Any changes post-upgrade would not be reflected.
  • The local cluster can no longer be turned off:
    • In older Rancher versions, the local cluster could be hidden to restrict admin access to the Rancher server’s local Kubernetes cluster, but that feature has been deprecated. The local Kubernetes cluster can no longer be hidden and all admins will have access to the local cluster. If you would like to restrict permissions to the local cluster, there is a new restricted-admin role that must be used. Access to the local cluster can now be disabled by setting hide_local_cluster to true from the v3/settings API. See the documentation and #29325. For more information on upgrading from Rancher with a hidden local cluster, see the documentation.

Versions

Please refer to the README for latest and stable versions.

Please review our version documentation for more details on versioning and tagging conventions.

Images

  • rancher/rancher:v2.5.16
  • rancher/rancher-agent:v2.5.16

Tools

Kubernetes Versions

  • 1.20.15 (Default)
  • 1.19.16
  • 1.18.20
  • 1.17.17

Other Notes

Deprecated Features

Feature Justification
Cluster Manager - Rancher Monitoring Monitoring in Cluster Manager UI has been replaced with a new monitoring chart available in the Apps & Marketplace in Cluster Explorer.
Cluster Manager - Rancher Alerts and Notifiers Alerting and notifiers functionality is now directly integrated with a new monitoring chart available in the Apps & Marketplace in Cluster Explorer.
Cluster Manager - Rancher Logging Functionality replaced with a new logging solution using a new logging chart available in the Apps & Marketplace in Cluster Explorer.
Cluster Manager - MultiCluster Apps Deploying to multiple clusters is now recommended to be handled with Rancher Continuous Delivery powered by Fleet available in Cluster Explorer.
Cluster Manager - Kubernetes CIS 1.4 Scanning Kubernetes CIS 1.5+ benchmark scanning is now replaced with a new scan tool deployed with a cis benchmarks chart available in the Apps & Marketplace in Cluster Explorer.
Cluster Manager - Rancher Pipelines Git-based deployment pipelines is now recommend to be handled with Rancher Continuous Delivery powered by Fleet available in Cluster Explorer.
Cluster Manager - Istio v1.5 The Istio project has ended support for Istio 1.5 and has recommended all users upgrade. Newer Istio versions are now available as a chart in the Apps & Marketplace in Cluster Explorer.
Cluster Manager - Provision Kubernetes v1.16 Clusters We have ended support for Kubernetes v1.16. Cluster Manager no longer provisions new v1.16 clusters. If you already have a v1.16 cluster, it is unaffected.

Experimental Features

RancherD was introduced as part of Rancher v2.5.4 through v2.5.10 as an experimental feature but is now deprecated. See #33423.

Duplicated Features in Cluster Manager and Cluster Explorer

  • Only one version of the feature may be installed at any given time due to potentially conflicting CRDs.
  • Each feature should only be managed by the UI that it was deployed from.
  • If you have installed a feature in Cluster Manager, you must uninstall it in Cluster Manager before attempting to install the new version in Cluster Explorer dashboard.

Cluster Explorer Feature Caveats and Upgrades

  • General:
    • Not all new features are currently installable on a hardened cluster.
    • New features are expected to be deployed using the Helm 3 CLI and not with the Rancher CLI.
  • UI Shell:
    • After closing the shell in the Rancher UI, be aware that the corresponding processes remain running indefinitely for each shell in the pod. See #16192.
  • Continuous Delivery:
    • Restricted admins are not able to create git repos from the Continuous Delivery option under Cluster Explorer; the screen will become stuck in a loading status. See #4909.
  • Rancher Backup:
    • When migrating to a cluster with the Rancher Backup feature, the server-url cannot be changed to a different location; it must continue to use the same URL.
  • Monitoring:
    • Monitoring sometimes errors on installation because it can’t identify CRDs. See #29171.
  • Istio:
    • Be aware that when upgrading from Istio v1.7.4 or earlier to any later version, there may be connectivity issues. See upgrade notes and #31811.
    • Starting in v1.8.x, DNS is supported natively. This means that the additional addon component istioCoreDNS is deprecated in v1.8.x and is not supported in v1.9x. If you are upgrading from v1.8.x to v1.9.x and you are using the istioCoreDNS addon, it is recommended that you disable it and switch to the natively supported DNS prior to upgrade. If you upgrade without disabling it, you will need to manually clean up your installation as it will not get removed automatically. See #31761 and #31265.
    • Istio v1.10 and earlier versions are now End-of-life but are required for the upgrade path in order to not skip a minor version. See #33824.

Cluster Manager Feature Caveats and Upgrades

  • GKE:
    • Basic authentication must be explicitly disabled in GCP before upgrading a GKE cluster to 1.19+ in Rancher. See #32312.
    • When creating GKE clusters in Terraform, the labels field cannot be empty: at least one label must be set. See #32553.
  • EKS & GKE:
    • When creating EKS and GKE clusters in Terraform, string fields cannot be set to empty. See #32440.

Known Major Issues

  • Kubernetes Cluster Distributions
    • RKE:
      • Rotating encryption keys with a custom encryption provider is not supported. See #30539.
      • After migrating from the in-tree vSphere cloud provider to the out-of-tree cloud provider, attempts to upgrade the cluster will not complete. This is due to nodes containing workloads with bound volumes before the migration failing to drain. Users will observe these nodes stuck in a draining state. Follow this workaround to continue with the upgrade. See #35102.
    • AKS:
      • AKS Kubernetes versions 1.20 and earlier have reached end of life. As Rancher v2.5 does not support Kubernetes greater than 1.20, it is not possible to provision new downstream AKS clusters. See #38406.
      • Azure Container Registry-based Helm charts cannot be added in Cluster Explorer but do work in the Apps feature of Cluster Manager. Note that when using a Helm chart repository, the disableSameOriginCheck setting controls when credentials are attached to requests. See documentation and #35940 for more information.
  • Cluster Tools
    • Hardened clusters:
      • Not all cluster tools can currently be installed on a hardened cluster.
    • Monitoring:
      • Deploying Monitoring V2 on a Windows cluster with win_prefix_path set requires users to deploy Rancher Wins Upgrader to restart wins on the hosts to start collecting metrics in Prometheus. See #32535.
      • Monitoring V2 fails to scrape ingress-nginx pods on any nodes except for the one Prometheus is deployed on, if the security group used by worker nodes blocks incoming requests to port 10254. The workaround for this issue is to open up port 10254 on all hosts. See #32563.
    • Logging:
      • Logging (Cluster Explorer): Windows nodeAgents are not deleted when performing Helm upgrade after disabling Windows logging on a Windows cluster. See #32325.
    • Istio versions:
      • Istio v1.5 is not supported in air-gapped environments. Please note that the Istio project has ended support for Istio v1.5.
      • Istio v1.10 support ended on January 7th, 2022.
    • Legacy Monitoring:
      • In air-gapped setups, the generated rancher-images.txt that is used to mirror images on private registries does not contain the images required to run Legacy Monitoring, also called Monitoring V1, which is compatible with Kubernetes 1.15 clusters. If you are running Kubernetes 1.15 clusters in an air-gapped environment, and you want to either install Monitoring V1 or upgrade Monitoring V1 to the latest that is offered by Rancher for Kubernetes 1.15 clusters, you will need to take one of the following actions:
        • Upgrade the Kubernetes version so that you can use v0.2.x of the Monitoring application Helm chart.
        • Manually import the necessary images into your private registry for the Monitoring application to use.
    • Installation requirements:
      • Importing a Kubernetes v1.21 cluster might not work properly and is unsupported.
    • Backup and Restore:
      • Reinstalling Rancher 2.5.x on the same cluster may fail due to a lingering rancher.cattle.io. MutatingWebhookConfiguration object from a previous installation. Manually deleting it will resolve the issue.
    • Docker installs:
      • UI issues may occur due to a longer startup time.
      • Users may receive an error message when logging in for the first time. See #28800.
      • Users may be redirected to the login screen before a password and default view have been set. See #28798.