import UpdateOverview from "../partials/embedded-cluster/_update-overview.mdx"
import EcConfig from "../partials/embedded-cluster/_ec-config.mdx"
import ShellCommand from "../partials/embedded-cluster/_shell-command.mdx"
This topic provides information about using Replicated Embedded Cluster, including how to get started, configure Embedded Cluster, access the cluster using kubectl, and more. For an introduction to Embedded Cluster, see Embedded Cluster Overview.
You can use the following steps to get started quickly with Embedded Cluster. More detailed documentation is available below.
1. Create a new customer or edit an existing customer and select the Embedded Cluster Enabled license option. Save the customer.

2. Create a new release that includes your application. In that release, create an Embedded Cluster Config that includes, at minimum, the Embedded Cluster version you want to use. See the Embedded Cluster GitHub repo to find the latest version.

   For an example Embedded Cluster Config, see the minimal sketch after these steps.

3. Save the release and promote it to the channel the customer is assigned to.

4. Return to the customer page where you enabled Embedded Cluster. At the top right, click Install instructions and choose Embedded Cluster. A dialog appears with instructions on how to download the Embedded Cluster installation assets and install your application.

5. On your VM, run the commands in the Embedded Cluster install instructions dialog.

6. Enter an Admin Console password when prompted.

   The Admin Console URL is printed when the installation finishes. Access the Admin Console to begin installing your application. During the installation process in the Admin Console, you have the opportunity to add nodes if you want a multi-node cluster. Then you can provide application config, run preflights, and deploy your application.
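For reference, a minimal Embedded Cluster Config looks like the following sketch. The version shown is only a placeholder; use the latest version listed in the Embedded Cluster GitHub repo.

```yaml
# Minimal Embedded Cluster Config (version value is a placeholder)
apiVersion: embeddedcluster.replicated.com/v1beta1
kind: Config
spec:
  version: 2.1.3+k8s-1.30
```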
To install an application with Embedded Cluster, an Embedded Cluster Config must be present in the application release. The Embedded Cluster Config lets you define several characteristics about the cluster that will be created.
For more information, see Embedded Cluster Config.
This section provides an overview of installing applications with Embedded Cluster.
The following diagram demonstrates how Kubernetes and an application are installed into a customer environment using Embedded Cluster:
As shown in the diagram above, the Embedded Cluster Config is included in the application release in the Replicated Vendor Portal and is used to generate the Embedded Cluster installation assets. Users can download these installation assets from the Replicated app service (`replicated.app`) on the command line, then run the Embedded Cluster installation command to install Kubernetes and the KOTS Admin Console. Finally, users access the Admin Console to optionally add nodes to the cluster and to configure and install the application.
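For illustration, the download and install commands follow this general shape. The exact commands are provided in the install instructions dialog in the Vendor Portal; the app slug, channel, and license ID below are placeholders.

```bash
# Download and extract the Embedded Cluster installation assets (placeholder values)
curl -f "https://replicated.app/embedded/APP_SLUG/CHANNEL" -H "Authorization: LICENSE_ID" -o APP_SLUG-CHANNEL.tgz
tar -xvzf APP_SLUG-CHANNEL.tgz

# Install Kubernetes and the KOTS Admin Console
sudo ./APP_SLUG install --license license.yaml
```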
Embedded Cluster supports installations in online (internet-connected) environments and air gap environments with no outbound internet access.
For online installations, Embedded Cluster also supports installing behind a proxy server.
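As a sketch, proxy settings can be passed to the install command with proxy flags. The proxy addresses below are placeholders, and the available flags can vary by Embedded Cluster version, so confirm them against the install command reference.

```bash
# Install behind a proxy server (placeholder proxy addresses)
sudo ./APP_SLUG install --license license.yaml \
  --http-proxy http://10.0.0.5:3128 \
  --https-proxy http://10.0.0.5:3128 \
  --no-proxy 10.0.0.0/8,localhost,127.0.0.1
```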
For more information about how to install with Embedded Cluster, see the installation documentation for online and air gap environments.
To install with Embedded Cluster, you can follow the customer-specific instructions provided on the Customer page in the Vendor Portal. For example:
To install with Embedded Cluster, you need to download the Embedded Cluster installer binary and a license. Air gap installations also require an air gap bundle. Some vendors already have a portal where their customers can log in to access documentation or download artifacts. In cases like this, you can serve the Embedded Cluster installation assets yourself using the Replicated Vendor API, rather than having customers download the assets from the Replicated app service with a curl command during installation.
To serve Embedded Cluster installation assets with the Vendor API:
1. If you have not done so already, create an API token for the Vendor API. See Using the Vendor API v3.

2. Call the Get an Embedded Cluster release endpoint to download the assets needed to install your application with Embedded Cluster. Your customers must take this binary and their license and copy them to the machine where they will install your application. An example request is shown after these steps.

   Note the following:

   - (Recommended) Provide the `customerId` query parameter so that the customer's license is included in the downloaded tarball. This mirrors what is returned when a customer downloads the binary directly from the Replicated app service and is the most useful option. Excluding the `customerId` is useful if you plan to distribute the license separately.

   - If you do not provide any query parameters, this endpoint downloads the Embedded Cluster binary for the latest release on the specified channel. You can provide the `channelSequence` query parameter to download the binary for a particular release.
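As a rough sketch, a request to this endpoint might look like the following. Treat the URL as illustrative and confirm the exact path and query parameters in the Vendor API v3 reference; the app ID, channel ID, and customer ID are placeholders.

```bash
# Download the Embedded Cluster release tarball with the Vendor API (illustrative URL, placeholder IDs)
curl -fSL \
  -H "Authorization: $REPLICATED_API_TOKEN" \
  "https://api.replicated.com/vendor/v3/app/APP_ID/channel/CHANNEL_ID/embedded-cluster?customerId=CUSTOMER_ID" \
  -o embedded-cluster.tgz
```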
During installation, Embedded Cluster automatically runs a default set of host preflight checks. The default host preflight checks are designed to verify that the installation environment meets the requirements for Embedded Cluster, such as:
- The system has sufficient disk space
- The system has at least 2 GB of memory and 2 CPU cores
- The system clock is synchronized
For Embedded Cluster requirements, see Embedded Cluster Installation Requirements. For the full default host preflight spec for Embedded Cluster, see `host-preflight.yaml` in the `embedded-cluster` repository in GitHub.
If any of the host preflight checks fail, installation is blocked and a message describing the failure is displayed. For more information about host preflight checks for installations on VMs or bare metal servers, see About Host Preflights.
Embedded Cluster host preflight checks have the following limitations:
- The default host preflight checks for Embedded Cluster cannot be modified, and vendors cannot provide their own custom host preflight spec for Embedded Cluster.
- Host preflight checks do not check that any application-specific requirements are met. For more information about defining preflight checks for your application, see Defining Preflight Checks.
You can skip host preflight checks by passing the `--skip-host-preflights` flag with the Embedded Cluster `install` command. For example:

```bash
sudo ./my-app install --license license.yaml --skip-host-preflights
```
When you skip host preflight checks, the Admin Console still runs any application-specific preflight checks that are defined in the release before the application is deployed.
:::note
Skipping host preflight checks is not recommended for production installations.
:::
This section describes managing nodes in multi-node clusters created with Embedded Cluster.
You can optionally define node roles in the Embedded Cluster Config. For multi-node clusters, roles can be useful for assigning specific application workloads to nodes. If node roles are defined, users access the Admin Console to assign one or more roles to a node when it is joined to the cluster.
For more information, see roles in Embedded Cluster Config.
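A sketch of how roles might be defined in the Embedded Cluster Config, assuming placeholder role names and labels:

```yaml
# Example role definitions (role names and labels are placeholders)
apiVersion: embeddedcluster.replicated.com/v1beta1
kind: Config
spec:
  roles:
    controller:
      name: management
      labels:
        management: "true"
    custom:
      - name: app
        labels:
          app: "true"
```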
Users can add nodes to a cluster with Embedded Cluster from the Admin Console. The Admin Console provides the join command used to add nodes to the cluster.
For more information, see Managing Multi-Node Clusters with Embedded Cluster.
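The join command is generated by the Admin Console for the specific cluster. It generally has the following shape, where the node address and token below are placeholders:

```bash
# Run on the new node to join it to the cluster (placeholder address and token)
sudo ./APP_SLUG join 10.128.0.80:30000 TxXboDstBAamXaPdleSK7Lid
```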
Multi-node clusters are not highly available by default. Enabling high availability (HA) requires that at least three controller nodes are present in the cluster. Users can enable HA when joining the third node.
For more information about creating HA multi-node clusters with Embedded Cluster, see Enable High Availability for Multi-Node Clusters (Alpha) in Managing Multi-Node Clusters with Embedded Cluster.
For more information about updating, see Performing Updates with Embedded Cluster.
With Embedded Cluster, end users rarely need to use the CLI. Typical workflows, like updating the application and the cluster, can be done through the Admin Console. Nonetheless, there are times when vendors or their customers need to use the CLI for development or troubleshooting.
:::note
If you encounter a typical workflow where your customers have to use the Embedded Cluster shell, reach out to Alex Parker at [email protected]. These workflows might be candidates for additional Admin Console functionality.
:::
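To access the cluster with kubectl, you can open a shell with the Embedded Cluster binary. A minimal sketch, assuming `APP_SLUG` is your application slug:

```bash
# Open a shell configured to talk to the embedded cluster
sudo ./APP_SLUG shell

# Within that shell, kubectl targets the embedded cluster
kubectl get nodes
```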
Resetting a node removes the cluster and your application from that node. This is useful for iteration, development, and when mistakes are made, so you can reset a machine and reuse it instead of having to procure another machine.
If you want to completely remove a cluster, you need to reset each node individually.
When resetting a node, OpenEBS PVCs on the node are deleted. Only PVCs created as part of a StatefulSet will be recreated automatically on another node. To recreate other PVCs, the application will need to be redeployed.
To reset a node:
1. SSH onto the machine. Ensure that the Embedded Cluster binary is still available on that machine.

2. Run the following command to reset the node and automatically reboot the machine to ensure that transient configuration is also reset:

   ```bash
   sudo ./APP_SLUG reset
   ```

   Where `APP_SLUG` is the unique slug for the application.

   :::note
   Pass the `--no-prompt` flag to disable interactive prompts. Pass the `--force` flag to ignore any errors encountered during the reset.
   :::
This section outlines some additional use cases for Embedded Cluster. These are not officially supported features from Replicated, but are ways of using Embedded Cluster that we or our customers have experimented with that might be useful to you.
The NVIDIA GPU Operator uses the operator framework within Kubernetes to automate the management of all NVIDIA software components needed to provision GPUs. For more information about this operator, see the NVIDIA GPU Operator documentation.
You can include the NVIDIA GPU Operator in your release as an additional Helm chart, or using Embedded Cluster Helm extensions. For information about adding Helm extensions, see extensions in Embedded Cluster Config.
Using the NVIDIA GPU Operator with Embedded Cluster requires configuring the containerd options in the operator as follows:
```yaml
# Embedded Cluster Config
extensions:
  helm:
    repositories:
      - name: nvidia
        url: https://nvidia.github.io/gpu-operator
    charts:
      - name: gpu-operator
        chartname: nvidia/gpu-operator
        namespace: gpu-operator
        version: "v24.9.1"
        values: |
          # configure the containerd options
          toolkit:
            env:
              - name: CONTAINERD_CONFIG
                value: /etc/k0s/containerd.d/nvidia.toml
              - name: CONTAINERD_SOCKET
                value: /run/k0s/containerd.sock
```
When the containerd options are configured as shown above, the NVIDIA GPU Operator automatically creates the required configurations in the `/etc/k0s/containerd.d/nvidia.toml` file. It is not necessary to create this file manually or to modify any other configuration on the hosts.
:::note
If you include the NVIDIA GPU Operator as a Helm extension, remove any existing containerd services that are running on the host (such as those deployed by Docker) before attempting to install the release with Embedded Cluster. If there are any containerd services on the host, the NVIDIA GPU Operator will generate an invalid containerd config, causing the installation to fail.
:::