Rancher Release v2.6.1

It is important to review the Install/Upgrade Notes below before upgrading to any Rancher version.

Features and Enhancements

Tech Preview - Harvester Integration with Rancher

Rancher 2.6.1 introduces Harvester v0.3.0 as a new integration. This integration only works with Harvester version 0.3.0 or higher. A feature flag, enabled by default, provides access to the integration. When enabled, users can manage Harvester clusters within Rancher. See #33605.

  • Import multiple Harvester clusters for multi-cluster/multi-site management.
  • Use Rancher RBAC and enterprise authentication to log into Harvester clusters.
  • Use Rancher Projects in Harvester so teams can keep their VMs separate and organized.
  • Provision Kubernetes clusters on Harvester. You can use Harvester’s built-in CSI driver and load balancer for your Kubernetes workloads (cloud provider). To set up the cloud provider with Harvester, refer here. See #4208.

Security Enhancements

  • RKE and RKE2 include new versions of Kubernetes that address an issue where symlink exchanges could allow host filesystem access. See CVE-2021-25741.
  • Role revisions are properly tracked for ClusterRoles associated to RoleBinding resources. This allows returning up-to-date schemas for users and not evaluating a user’s access from stale access states. See #34732.

Fleet

  • Fleet can now manage clusters running other instances of Fleet. This enables Rancher to manage Harvester clusters, other Rancher instances, or stand alone Fleet. Becasue of this change, it is possible two Fleet agents are in a cluster. See #34716.

Major Bug Fixes

  • AKS clusters now display applicable upgrade versions only. See #34161.
  • The AKS provisioner now correctly handles simultaneous Kubernetes version upgrades and node pool additions. See #33426.
  • Fixed conflict errors for the impersonation cluster role causing requests to occasionally fail with an unable to create impersonator account: error setting up impersonation for user u-xxxxx error. See #34671
  • The Istio chart now allows upgrades of Istio in an air gapped environment. Upgrading will be limited to versions available at time of build. See #30842.
  • The side navigation no longer shows an incorrect top-level “Istio” menu item after upgrading from Rancher v2.5.x with v1 Istio enabled. See #4072.
  • Resource deletion has been updated to address multiple cases. See #34761.
    • Nodes are correctly removed from the infrastructure provider when the cluster is deleted in Rancher. See #34542.
    • Nodes from a misconfigured node pool no longer remain in the cluster after being deleted. See #34713.
    • The deleted status of clusters are now synchronized across pages in Rancher. See #33774.
  • Fixed an issue where Fleet would infinitely create cluster management secrets when no system namespace exists. See #34776.
  • Migrating clusters to a different FleetWorkspace no longer results in the management cluster being deleted. See #34744.
  • The external IP address is now properly retrieved for Windows nodes in a custom RKE2 cluster. See Windows #91.

Install/Upgrade Notes

If you are installing Rancher for the first time, your environment must fulfill the installation requirements.

  • The namespace where the local Fleet agent runs has been changed to cattle-fleet-local-system. This change does not impact GitOps workflows.

Upgrade Requirements

  • Creating backups: We strongly recommend creating a backup before upgrading Rancher. To roll back Rancher after an upgrade, you must back up and restore Rancher to the previous Rancher version. Because Rancher will be restored to its state when a backup was created, any changes post upgrade will not be included after the restore. For more information, see the documentation on backing up Rancher.
  • Helm version: Rancher install or upgrade must occur with Helm 3.2.x+ due to the changes with the latest cert-manager release. See #29213.
  • Kubernetes version:
    • The local Kubernetes cluster for the Rancher server should be upgraded to Kubernetes 1.17+ before installing Rancher 2.5+.
    • When using Kubernetes v1.21 with Windows Server 20H2 Standard Core, the patch “2019-08 Servicing Stack Update for Windows Server” must be installed on the node. See #72.
  • CNI requirements:
    • For K8s 1.19 and newer, we recommend disabling firewalld as it has been found to be incompatible with various CNI plugins. See #28840.
    • If upgrading or installing to a Linux distribution which uses nf_tables as the backend packet filter, such as SLES 15, RHEL 8, Ubuntu 20.10, Debian 10, or newer, users should upgrade to RKE1 v1.19.2 or later to get Flannel version v0.13.0 that supports nf_tables. See Flannel #1317.
    • For users upgrading from >=v2.4.4 to v2.5.x with clusters where ACI CNI is enabled, note that upgrading Rancher will result in automatic cluster reconciliation. This is applicable for Kubernetes versions v1.17.16-rancher1-1, v1.17.17-rancher1-1, v1.17.17-rancher2-1, v1.18.14-rancher1-1, v1.18.15-rancher1-1, v1.18.16-rancher1-1, and v1.18.17-rancher1-1. Please refer to the workaround BEFORE upgrading to v2.5.x. See #32002.
  • Requirements for air gapped environments:
    • For installing or upgrading Rancher in an air gapped environment, please add the flag --no-hooks to the helm template command to skip rendering files for Helm’s hooks. See #3226.
    • If using a proxy in front of an air gapped Rancher, you must pass additional parameters to NO_PROXY. See the documentation and #2725.
  • Cert-manager version requirements: Recent changes to cert-manager require an upgrade if you have a high-availability install of Rancher using self-signed certificates. If you are using cert-manager older than v0.9.1, please see the documentation on how to upgrade cert-manager. See documentation.
  • Requirements for Docker installs:
    • When starting the Rancher Docker container, the privileged flag must be used. See the documentation.
    • When installing in an air gapped environment, you must supply a custom registries.yaml file to the docker run command as shown in the K3s documentation. If the registry has certs, then you will need to also supply those. See #28969.
    • When upgrading a Docker installation, a panic may occur in the container, which causes it to restart. After restarting, the container comes up and is working as expected. #33685

Rancher Behavior Changes

  • Legacy features are gated behind a feature flag. Users upgrading from Rancher <=v2.5.x will automatically have the --legacy feature flag enabled. New installations that require legacy features need to enable the flag on install or through the UI.
  • The local cluster can no longer be turned off. In older Rancher versions, the local cluster could be hidden to restrict admin access to the Rancher server’s local Kubernetes cluster, but that feature has been deprecated. The local Kubernetes cluster can no longer be hidden and all admins will have access to the local cluster. If you would like to restrict permissions to the local cluster, there is a new restricted-admin role that must be used. The access to local cluster can now be disabled by setting hide_local_cluster to true from the v3/settings API. See the documentation and #29325. For more information on upgrading from Rancher with a hidden local cluster, see the documentation.
  • Users must log in again. After upgrading to v2.6+, users will be automatically logged out of the old Rancher UI and must log in again to access Rancher and the new UI. See #34004.
  • Fleet is now always enabled. For users upgrading from v2.5.x to v2.6.x, note that Fleet will be enabled by default as it is required for operation in v2.6+. This will occur even if Fleet was disabled in v2.5.x. During the upgrade process, users will observe restarts of the rancher pods, which is expected. See #31044 and #32688.
  • The Fleet agent in the local cluster now lives in cattle-fleet-local-system. Starting with Rancher v2.6.1, Fleet allows for two agents in the local cluster for scenarios where “Fleet is managing Fleet”. The true local agent runs in the new cattle-fleet-local-system namespace. The agent downstream from another Fleet management cluster runs in cattle-fleet-system, similar to the agent pure downstream clusters. See #34716 and #531.
  • Editing and saving clusters can result in cluster reconciliation. For users upgrading from <=v2.4.8 (<= RKE v1.1.6) to v2.4.12+ (RKE v1.1.13+)/v2.5.0+ (RKE v1.2.0+) , please note that Edit and save cluster (even with no changes or a trivial change like cluster name) will result in cluster reconciliation and upgrading kube-proxy on all nodes because of a change in kube-proxy binds. This only happens on the first edit and later edits shouldn’t affect the cluster. See #32216.
  • The EKS cluster refresh interval setting changed. There is currently a setting allowing users to configure the length of refresh time in cron format: eks-refresh-cron. That setting is now deprecated and has been migrated to a standard seconds format in a new setting: eks-refresh. If previously set, the migration will happen automatically. See #31789.
  • System components will restart. Please be aware that upon an upgrade to v2.3.0+, any edits to a Rancher launched Kubernetes cluster will cause all system components to restart due to added tolerations to Kubernetes system components. Plan accordingly.
  • New GKE and AKS clusters will use Rancher’s new lifecycle management features. Existing GKE and AKS clusters and imported clusters will continue to operate as-is. Only new creations and registered clusters will use the new full lifecycle management.
  • New steps for rolling back Rancher. The process to roll back Rancher has been updated for versions v2.5.0 and above. New steps require scaling Rancher down to 0 replica before restoring the backup. Please refer to the documentation for the new instructions.
  • RBAC differences around Manage Nodes for RKE2 clusters. Due to the change of the provisioning framework, the Manage Nodes role will no longer be able to scale up/down machine pools. The user would need the ability to edit the cluster to manage the machine pools #34474.
  • New procedure to set up Azure cloud provider for RKE2. For RKE2, the process to set up an Azure cloud provider is different than for RKE1 clusters. Users should refer to the documentation for the new instructions. See #34367 for original issue.
  • Machines vs Kube Nodes. In previous versions, Rancher only displayed Nodes, but with v2.6, there are the concepts of machines and kube nodes. Kube nodes are the Kubernetes node objects and are only accessible if the Kubernetes API server is running and the cluster is active. Machines are the cluster’s machine object which defines what the cluster should be running.
  • Requirements for Docker installs.
    • When starting the Rancher Docker container, the privileged flag must be used. See the documentation.
    • When installing in an air gapped environment, you must supply a custom registries.yaml file to the docker run command as shown in the K3s documentation. If the registry has certs, then you will need to also supply those. See #28969.
  • Increased memory limit for Legacy Monitoring. The default value of the Prometheus memory limit in the legacy Rancher UI is now 2000Mi to prevent the pod from restarting due to a OOMKill. See #34850.
  • Increased memory limit for Monitoring. The default value of the Prometheus memory limit in the new Rancher UI is now 3000Mi to prevent the pod from restarting due to a OOMKill. See #34850.

Versions

Please refer to the README for latest and stable versions.

Please review our version documentation for more details on versioning and tagging conventions.

Images

  • rancher/rancher:v2.6.1

Tools

Rancher Helm Chart Versions

Starting in 2.6.0 many of the Rancher Helm charts available in the Apps & Marketplace will start with a major version of 100. This was done to avoid simultaneous upstream changes and Rancher changes from causing conflicting version increments. This also brings us into compliance with semver, which is a requirement for newer versions of Helm. You can now see the upstream version of a chart in the build metadata, for example: 100.0.0+up2.1.0. See #32294.

Other Notes

Feature Flags

Please review the new Harvester feature flag introduced into 2.6.1 and its default status. Additional feature flags introduced in 2.6.0 are shown also for reference.

Feature Flag Default Value Description
harvester true Used to manage access to the Harvester list page where users can navigate directly to Harvester host clusters and have the ability to import them.
fleet true The previous fleet feature flag is now required to be enabled as the fleet capabilities are leveraged within the new provisioning framework. If you had this feature flag disabled in earlier versions, upon upgrading to Rancher, the flag will automatically be enabled.
gitops true If you want to hide the “Continuous Delivery” feature from your users, then please use the newly introduced gitops feature flag, which hides the ability to leverage Continuous Delivery.
rke2 true We have introduced the ability to provision RKE2 clusters as tech preview. By default, this feature flag is enabled, which allows users to attempt to provision these type of clusters.
legacy false for new installs, true for upgrades There are a set of features from previous versions that are slowly being phased out of Rancher for newer iterations of the feature. This is a mix of deprecated features as well as features that will eventually be moved to newer variations in Rancher. By default, this feature flag is disabled for new installations. If you are upgrading from a previous version, this feature flag would be enabled.
token-hashing false for new installs, true for upgrades Used to enable new token-hashing feature. Once enabled, existing tokens will be hashed and all new tokens will be hashed automatically using the SHA256 algorithm. Once a token is hashed it cannot be undone. Once this feature flag is enabled it cannot be disabled.

Experimental Features

RancherD was introduced in 2.5 as an easy-to-use installation binary. With the introduction of RKE2 provisioning, this project is being re-written and will be available at a later time. See #33423.

Legacy Features

Legacy features are features hidden behind the legacy feature flag, which are various features/functionality of Rancher that was available in previous releases. These are features that Rancher doesn’t intend for new users to consume, but if you have been using past versions of Rancher, you’ll still want to use this functionality.

When you first start 2.6, there is a card in the Home page that outlines the location of where these features are now located.

The deprecated features from v2.5 are now behind the legacy feature flag. Please review our deprecation policy for questions.

The following legacy features are no longer supported on Kubernetes 1.21+ clusters:

  • Logging
  • CIS Scans
  • Istio 1.5
  • Pipelines

Known Major Issues

  • Kubernetes Cluster Distributions:
    • RKE
      • Rotating encryption keys with a custom encryption provider is not supported. See #30539.
    • RKE2 - Tech Preview: There are several known issues as this feature is in tech preview, but here are some major issues to consider before using RKE2.
      • There are issues around removing etcd nodes from a cluster. See #34488.
      • Amazon ECR Private Registries are not functional. See #33920.
      • Proxy behind individual downstream clusters is not working. See #32633.
      • Windows clusters do not yet support Rancher Apps & Marketplace. See #34405.
      • Windows clusters do not yet support upgrading RKE2 versions. See Windows #76.
    • AKS:
      • When editing or upgrading the AKS cluster, do not make changes from the Azure console or CLI at the same time. These actions must be done separately. See #33561
      • Windows node pools are not currently supported. See #32586.
    • GKE
      • Basic authentication must be explicitly disabled in GCP before upgrading a GKE cluster to 1.19+ in Rancher. See #32312.
  • Harvester - Tech Preview:
    • Harvester v0.3.0 doesn’t support air gap environment installation. See #34908.
    • Harvester v0.3.0 doesn’t support upgrade from v0.2.0 nor upgrade to future v1.0.0.
    • Deploying Fleet to Harvester clusters is not yet supported. Clusters, whether Harvester or non-Harvester, imported using the Virtualization Management page will result in the cluster not being listed on the Continuous Delivery page. See #35049.
  • Cluster Tools:
    • Fleet:
      • Multiple fleet-agent pods may be created and deleted during initial downstream agent deployment; rather than just one. This resolves itself quickly, but is unintentional behavior. See #33293.
    • Hardened clusters: Not all cluster tools can currently be installed on a hardened cluster.
    • Rancher Backup:
      • When migrating to a cluster with the Rancher Backup feature, the server-url cannot be changed to a different location. It must continue to use the same URL.
      • When running a newer version of the rancher-backup app to restore a backup made with an older version of the app, the resourceSet rancher-resource-set will be restored to an older version that might be different from the one defined in the current running rancher-backup app. The workaround is to edit the rancher-backup app to trigger a reconciliation. See #34495.
    • Monitoring:
      • Deploying Monitoring on a Windows cluster with win_prefix_path set requires users to deploy Rancher Wins Upgrader to restart wins on the hosts to start collecting metrics in Prometheus. See #32535.
    • Logging:
      • Windows nodeAgents are not deleted when performing helm upgrade after disabling Windows logging on a Windows cluster. See #32325.
    • Istio versions:
      • Istio v1.10.4 will be available with the Rancher Istio Helm chart v100.0.0+up1.10.4. A previous version of Istio, 1.10.2, included a security fix for CVE-2021-34824. See #32795.
      • Istio 1.9 support ended on October 8th, 2021.
      • Istio 1.5 is not supported in air gapped environments. Please note that the Istio project has ended support for Istio 1.5.
      • The Kiali dashboard bundled with 100.0.0+up1.10.2 errors on a page refresh. Instead of refreshing the page when needed, simply access Kiali using the dashboard link again. Everything else works in Kiali as expected, including the graph auto-fresh. See #33739.
      • Deprecated resources are not automatically removed and will cause errors during upgrades. Manual steps must be taken to migrate and/or cleanup resources before an upgrade is performed. See #34699.
    • Legacy Monitoring:
      • The Grafana instance inside Cluster Manager’s Monitoring is not compatible with Kubernetes 1.21. See #33465.
      • In air gapped setups, the generated rancher-images.txt that is used to mirror images on private registries does not contain the images required to run Legacy Monitoring, also called Monitoring V1, which is compatible with Kubernetes 1.15 clusters. If you are running Kubernetes 1.15 clusters in an air gapped environment, and you want to either install Monitoring V1 or upgrade Monitoring V1 to the latest that is offered by Rancher for Kubernetes 1.15 clusters, you will need to take one of the following actions:
        • Upgrade the Kubernetes version so that you can use v0.2.x of the Monitoring application Helm chart
        • Manually import the necessary images into your private registry for the Monitoring application to use
      • When deploying any downstream cluster, Rancher logs errors that seem to be related to Monitoring even when Monitoring is not installed onto either cluster; specifically, Rancher logs that it failed on subscribe to the Prometheus CRs in the cluster because it is unable to get the resource prometheus.meta.k8s.io. These logs appear in a similar fashion for other Prometheus CRs (namely Alertmanager, ServiceMonitors, and PrometheusRules), but do not seem to cause any other major impact in functionality. See #32978.
  • Docker installs: There are UI issues around startup time. See #28800 and #28798.
  • Login to Rancher using ActiveDirectory with TLS: See #34325.
    • Upon an upgrade to v2.6.0, authenticating via Rancher against an ActiveDirectory server using TLS can fail if the certificates on the AD server do not support SAN attributes. This is a check enabled by default in Golang 1.15.
    • The error received is “Error creating ssl connection: LDAP Result Code 200 “Network Error”: x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0”
    • To resolve this, the certificates on the AD server should be updated or replaced with new ones that support the SAN attribute. Alternatively this error can be ignored by setting “GODEBUG=x509ignoreCN=0” as an environment variable to Rancher server container.
  • Legacy UI:
    • When using the Rancher v2.6 UI to add a new port of type ClusterIP to an existing Deployment created using the legacy UI, the new port will not be created upon saving. To work around this issue, repeat the procedure to add the port again. Users will notice the Service Type field will display as Do not create a service. Change this to ClusterIP and upon saving, the new port will be created successfully during this subsequent attempt. See #4280.
    • When using the Rancher v2.6 UI to add a port to a Deployment that was created initially without any ports using the legacy UI, users may encounter a services <NAME> already exists error. To work around this issue, the legacy UI should be used to make updates to the configuration of Deployments created using the legacy UI. See #4159.
    • When workloads created using the legacy UI are deleted, the corresponding services are not automatically deleted. Users will need to manually remove these services. A message will be displayed notifying the user to manually delete the associated services when such a workload is deleted. See #34639.