How to Build Your Own Kubernetes Distribution I

From the author

k8s-release is a tool for building reproducible DEB/RPM packages of Kubernetes components (kubelet, kube-apiserver, etcd, Flannel, Istio, and others) in a hermetic Docker environment. It produces signed apt/yum repositories, SBOMs, checksums, and air-gap bundles for closed, regulated, and isolated environments. The repository also contains a Terraform module for deploying HA Kubernetes on Firecracker microVMs: 3 control-plane nodes, 1+ workers, Flannel, Istio, SELinux enforcing.

k8s-release builds any combination of components via the --component flag:

./k8s-release build v1.36.1 --component kubelet,kubectl --format deb,rpm
./k8s-release build v1.36.1 --component all --format deb
./k8s-release build 3.5.22 --component etcd --format rpm
./k8s-release build 1.30.0 --component istio --format deb,rpm

You can name individual components (kubelet, etcd, flannel, istio) or use all to build everything at once, and choose the package format: deb, rpm, or both.

How to use

Preparation

git clone https://github.com/ingresslabs/k8s-release.git
cd k8s-release/terraform/k2vm-ha-lab
cp terraform.tfvars.example terraform.tfvars

The remote host must already have Firecracker, Docker, a kernel, an initrd, kernel modules, and a prepared rootfs.ext4 at the paths given in the variables.
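A quick pre-flight check on the remote host can catch a missing artifact before Terraform runs. The sketch below is only an illustration: `missing_paths` is a hypothetical helper, and the paths are the defaults from the module's variable declarations, so adjust them if your terraform.tfvars overrides any of them.

```shell
# Hypothetical helper: print every argument that is not an existing regular file.
missing_paths() {
  for p in "$@"; do
    [ -f "$p" ] || printf '%s\n' "$p"
  done
}

# Default locations taken from the module's variable declarations.
missing="$(missing_paths \
  /usr/local/bin/firecracker \
  /opt/fc-lab/vmlinux-5.15-generic \
  /opt/fc-lab/initrd-5.15-generic.img \
  /opt/fc-lab/modules-5.15-generic.tar.gz \
  /opt/fc-lab/rootfs.ext4)"

if [ -n "$missing" ]; then
  printf 'missing artifacts:\n%s\n' "$missing" >&2
fi
```

The helper only reports; it does not abort, so it is safe to paste into an interactive session.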

terraform init
terraform plan
terraform apply
terraform destroy

variable "name" {
  type    = string
  default = "k8s-ha-lab"
}
variable "target_host" {
  type    = string
  default = ""
}
variable "target_user" {
  type    = string
  default = "root"
}
variable "github_repo" {
  type    = string
  default = "ingresslabs/k8s-release"
}
variable "github_run_id" {
  type    = number
  default = 26680027183
}
variable "control_plane_count" {
  type    = number
  default = 3
}
variable "worker_count" {
  type    = number
  default = 2
}
variable "kubernetes_version" {
  type    = string
  default = "v1.36.1"
}
variable "subnet_prefix" {
  type    = string
  default = "198.19.2"
}
variable "network_plugin" {
  type    = string
  default = "flannel"
}
variable "firecracker_binary" {
  type    = string
  default = "/usr/local/bin/firecracker"
}
variable "kernel_path" {
  type    = string
  default = "/opt/fc-lab/vmlinux-5.15-generic"
}
variable "initrd_path" {
  type    = string
  default = "/opt/fc-lab/initrd-5.15-generic.img"
}
variable "kernel_modules_tar_path" {
  type    = string
  default = "/opt/fc-lab/modules-5.15-generic.tar.gz"
}
variable "base_rootfs_path" {
  type    = string
  default = "/opt/fc-lab/rootfs.ext4"
}
variable "vcpu_count" {
  type    = number
  default = 2
}
variable "guest_selinux_mode" {
  type    = string
  default = "permissive"
}
variable "enable_istio" {
  type    = bool
  default = false
}
variable "redeploy_token" {
  type    = string
  default = ""
}

# main.tf: deploys HA Kubernetes on Firecracker via k8s-release
terraform {
  required_version = ">= 1.5.0"
  required_providers {
    local = { source = "hashicorp/local", version = "~> 2.5" }
    null  = { source = "hashicorp/null", version = "~> 3.2" }
  }
}
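The variable defaults can be overridden from a tfvars file. The following is a minimal sketch that writes one from the shell. Every value is an assumption to adapt: the host name is a placeholder, and "enforcing" is assumed to be an accepted value for guest_selinux_mode based only on the module description. The file name lab.auto.tfvars is chosen so that the copied terraform.tfvars is not overwritten; Terraform loads *.auto.tfvars files automatically.

```shell
# Write an illustrative variables file; every value below is an assumption.
tfvars_file="${TFVARS_FILE:-lab.auto.tfvars}"
cat > "$tfvars_file" <<'EOF'
target_host        = "fc-host.example.internal" # placeholder host name
target_user        = "root"
worker_count       = 2
kubernetes_version = "v1.36.1"
guest_selinux_mode = "enforcing"                # assumed to be an accepted value
enable_istio       = true
EOF
```

Running terraform plan afterwards shows which of these overrides actually change the deployment.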

Both projects belong to the same ecosystem. The first article describes a home-built Firecracker CI platform (10 ms boot, caching, snapshots). The second project, k8s-release, uses the same platform to build and deploy Kubernetes on Firecracker microVMs. The Terraform module in k8s-release automates the deployment of an HA cluster following the principles described in that article.


This guide runs directly on a Linux host. It builds one reusable guest rootfs, clones it onto several Firecracker microVM disks, and boots:

  • 3 kubeadm control-plane nodes

  • stacked etcd, one member per control-plane node

  • 2 worker nodes by default

It then exposes the Kubernetes API through an HAProxy proxy running on the host.

Prerequisites

Run everything below as root on the Linux host that will run Firecracker. The host needs:

  • Docker

  • Firecracker

  • curl, ip, iptables, ssh, scp, ssh-keygen

  • unsquashfs from squashfs-tools

  • mkfs.ext4, e2fsck, resize2fs from e2fsprogs

  • mount, umount, truncate, chroot

  • outbound internet access for:

      • pulling the LinuxKit kernel image

      • downloading the Ubuntu rootfs from Firecracker CI

      • Kubernetes packages from pkgs.k8s.io

      • the Docker containerd repository

      • downloading the Flannel manifest

      • pulling the HAProxy container image

Choose a subnet that does not overlap with other bridges on the host.

Exporting the lab variables

This guide creates 3 control-plane nodes and 2 worker nodes by default. Change WORKER_COUNT if you need fewer or more workers.

export RUN_ROOT=/var/lib/firecracker-kubeadm-ha
export CACHE_ROOT=/var/cache/firecracker-kubeadm-ha
export CONTROL_PLANE_COUNT=3
export WORKER_COUNT=2
export NODE_COUNT="$((CONTROL_PLANE_COUNT + WORKER_COUNT))"
export SUBNET_PREFIX=198.19.0
export GATEWAY="${SUBNET_PREFIX}.1"
export CIDR="${SUBNET_PREFIX}.0/24"
export API_LB_IP="${SUBNET_PREFIX}.5"
export API_LB_PORT=6443
export PRIMARY_CONTROL_PLANE_IP="${SUBNET_PREFIX}.10"
export BRIDGE_NAME=k8sha198
export TAP_PREFIX=k8sha198
export FIRECRACKER_BIN=/usr/local/bin/firecracker
export FIRECRACKER_ARCH="$(uname -m)"
export LINUXKIT_KERNEL_IMAGE=linuxkit/kernel:6.12.59
export ROOTFS_SIZE_GIB=12
export CONTROL_PLANE_MEM_MIB=2048
export WORKER_MEM_MIB=1536
export VCPU_COUNT=2
export KUBERNETES_MINOR=v1.36
export KUBERNETES_VERSION=v1.36.1
export POD_CIDR=10.244.0.0/16
export SERVICE_CIDR=10.96.0.0/12
export CNI_PLUGINS_VERSION=v1.3.0
export NETWORK_PLUGIN=flannel
export FLANNEL_MANIFEST_URL="https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml"
export HAPROXY_IMAGE=haproxy:3.2.19-alpine
export API_LB_CONTAINER_NAME="kubeadm-ha-api-lb-${BRIDGE_NAME}"
export GUEST_SSH_KEY="${CACHE_ROOT}/lab_ssh_key"
export GUEST_SSH_PUB="${GUEST_SSH_KEY}.pub"
export ARTIFACT_PREFIX=/var/log/kubeadm-ha-lab
export KERNEL_BOOT_ARGS="console=ttyS0 reboot=k panic=1 pci=off root=/dev/vda rw random.trust_cpu=on systemd.mask=serial-getty@ttyS0.service systemd.mask=systemd-random-seed.service"

Checking host dependencies and creating the guest SSH key

command -v awk chroot curl docker e2fsck grep ip iptables mkfs.ext4 mount \
  resize2fs sha256sum sort ssh scp ssh-keygen truncate umount unsquashfs >/dev/null
test -x "${FIRECRACKER_BIN}"
mkdir -p "${RUN_ROOT}" "${CACHE_ROOT}"
if [[ ! -f "${GUEST_SSH_KEY}" || ! -f "${GUEST_SSH_PUB}" ]]; then
  ssh-keygen -q -t ed25519 -N "" -f "${GUEST_SSH_KEY}"
fi
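A single `command -v` call with many names does not necessarily fail when just one of them is missing, and it does not say which one. A per-tool loop gives a clearer signal. The sketch below uses the same tool list as the check above; `missing_tools` is a hypothetical helper, not part of the original guide.

```shell
# Hypothetical helper: print each required tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing="$(missing_tools awk chroot curl docker e2fsck grep ip iptables mkfs.ext4 mount \
  resize2fs sha256sum sort ssh scp ssh-keygen truncate umount unsquashfs)"
if [ -n "$missing" ]; then
  # Join the names onto one line for a compact error message.
  printf 'missing host tools: %s\n' "$(printf '%s' "$missing" | tr '\n' ' ')" >&2
fi
```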

Defining helper functions

# Node indices 0..CONTROL_PLANE_COUNT-1 are control-plane nodes; the rest are workers.
node_role() {
  if (( $1 < CONTROL_PLANE_COUNT )); then
    echo control-plane
  else
    echo worker
  fi
}
node_ip() {
  printf '%s.%d\n' "${SUBNET_PREFIX}" "$((10 + $1))"
}
node_name() {
  printf 'k8s-%02d\n' "$1"
}
node_tap() {
  printf '%s%d\n' "${TAP_PREFIX}" "$1"
}
node_mac() {
  printf '06:36:00:00:00:%02x\n' "$((16 + $1))"
}
node_mem() {
  if [[ "$(node_role "$1")" == "control-plane" ]]; then
    echo "${CONTROL_PLANE_MEM_MIB}"
  else
    echo "${WORKER_MEM_MIB}"
  fi
}
SSH_OPTS=(-i "${GUEST_SSH_KEY}" -o BatchMode=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=ERROR -o ConnectTimeout=5)
node_ssh() {
  local idx="$1"
  shift
  ssh "${SSH_OPTS[@]}" "root@$(node_ip "${idx}")" "$@"
}
# Node 0 is the primary control-plane node.
server_ssh() {
  node_ssh 0 "$@"
}
copy_guest_file() {
  local remote_path="$1"
  local local_path="$2"
  if server_ssh "test -f $(printf '%q' "${remote_path}")" >/dev/null 2>&1; then
    server_ssh "cat $(printf '%q' "${remote_path}")" > "${local_path}"
  fi
}
wait_for_ssh() {
  local ip="$1"
  for _ in $(seq 1 120); do
    if ssh "${SSH_OPTS[@]}" "root@${ip}" true >/dev/null 2>&1; then
      return 0
    fi
    sleep 2
  done
  return 1
}
wait_for_node_ready() {
  local name="$1"
  for _ in $(seq 1 240); do
    if server_ssh "kubectl --kubeconfig /etc/kubernetes/admin.conf get node ${name} --no-headers 2>/dev/null | awk '\$2==\"Ready\"{ok=1} END{exit(ok?0:1)}'"; then
      return 0
    fi
    sleep 2
  done
  return 1
}
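To see what the naming helpers produce, it can help to print the whole address plan before booting anything. The sketch below repeats the default values and the same naming rules in POSIX form so that it runs on its own; `print_plan` is a hypothetical addition for illustration.

```shell
# Same defaults and naming rules as the helpers above, repeated so the snippet is self-contained.
SUBNET_PREFIX=198.19.0
TAP_PREFIX=k8sha198
CONTROL_PLANE_COUNT=3
WORKER_COUNT=2
NODE_COUNT=$((CONTROL_PLANE_COUNT + WORKER_COUNT))

node_role() { if [ "$1" -lt "$CONTROL_PLANE_COUNT" ]; then echo control-plane; else echo worker; fi; }
node_ip()   { printf '%s.%d\n' "$SUBNET_PREFIX" "$((10 + $1))"; }
node_name() { printf 'k8s-%02d\n' "$1"; }
node_tap()  { printf '%s%d\n' "$TAP_PREFIX" "$1"; }
node_mac()  { printf '06:36:00:00:00:%02x\n' "$((16 + $1))"; }

# Hypothetical helper: one line per node with name, role, IP, TAP device, and MAC.
print_plan() {
  i=0
  while [ "$i" -lt "$NODE_COUNT" ]; do
    printf '%s %s %s %s %s\n' "$(node_name "$i")" "$(node_role "$i")" \
      "$(node_ip "$i")" "$(node_tap "$i")" "$(node_mac "$i")"
    i=$((i + 1))
  done
}

print_plan
```

With the defaults, the first line is `k8s-00 control-plane 198.19.0.10 k8sha1980 06:36:00:00:00:10`, and the workers start at 198.19.0.13.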

Downloading the guest kernel and rootfs sources

This guide uses:

  • a LinuxKit kernel image

  • the latest Ubuntu squashfs from Firecracker CI for your Firecracker major/minor version

Pulling the LinuxKit kernel and extracting vmlinux

export LINUXKIT_KEY="$(printf '%s\n' "${LINUXKIT_KERNEL_IMAGE}" | sha256sum | awk '{print substr($1,1,16)}')"
export LINUXKIT_CACHE_DIR="${CACHE_ROOT}/downloads/linuxkit-${LINUXKIT_KEY}"
export KERNEL_PATH="${LINUXKIT_CACHE_DIR}/vmlinux"
export KERNEL_BZIMAGE_PATH="${LINUXKIT_CACHE_DIR}/kernel"
export KERNEL_MODULES_TAR_PATH="${LINUXKIT_CACHE_DIR}/kernel.tar"
export KERNEL_DEV_TAR_PATH="${LINUXKIT_CACHE_DIR}/kernel-dev.tar"
mkdir -p "${LINUXKIT_CACHE_DIR}"
if [[ ! -f "${KERNEL_PATH}" || ! -f "${KERNEL_MODULES_TAR_PATH}" ]]; then
  rm -rf "${LINUXKIT_CACHE_DIR}.tmp"
  mkdir -p "${LINUXKIT_CACHE_DIR}.tmp"
  docker pull "${LINUXKIT_KERNEL_IMAGE}" >/dev/null
  # Create a stopped container only to copy files out of the image.
  cid="$(docker create "${LINUXKIT_KERNEL_IMAGE}" /bin/sh)"
  docker cp "${cid}:/kernel" "${LINUXKIT_CACHE_DIR}.tmp/kernel"
  docker cp "${cid}:/kernel.tar" "${LINUXKIT_CACHE_DIR}.tmp/kernel.tar"
  docker cp "${cid}:/kernel-dev.tar" "${LINUXKIT_CACHE_DIR}.tmp/kernel-dev.tar"
  docker rm -f "${cid}" >/dev/null
  # Locate the headers directory that ships the extract-vmlinux script.
  export LINUXKIT_HEADERS_DIR="$(
    tar -tf "${LINUXKIT_CACHE_DIR}.tmp/kernel-dev.tar" \
      | sed -n 's#^\(usr/src/linux-headers-[^/]*\)/scripts/extract-vmlinux$#\1#p' \
      | head -n 1
  )"
  tar -xOf "${LINUXKIT_CACHE_DIR}.tmp/kernel-dev.tar" "${LINUXKIT_HEADERS_DIR}/scripts/extract-vmlinux" > "${LINUXKIT_CACHE_DIR}.tmp/extract-vmlinux"
  chmod +x "${LINUXKIT_CACHE_DIR}.tmp/extract-vmlinux"
  "${LINUXKIT_CACHE_DIR}.tmp/extract-vmlinux" "${LINUXKIT_CACHE_DIR}.tmp/kernel" > "${LINUXKIT_CACHE_DIR}.tmp/vmlinux"
  mv "${LINUXKIT_CACHE_DIR}.tmp/kernel" "${KERNEL_BZIMAGE_PATH}"
  mv "${LINUXKIT_CACHE_DIR}.tmp/vmlinux" "${KERNEL_PATH}"
  mv "${LINUXKIT_CACHE_DIR}.tmp/kernel.tar" "${KERNEL_MODULES_TAR_PATH}"
  mv "${LINUXKIT_CACHE_DIR}.tmp/kernel-dev.tar" "${KERNEL_DEV_TAR_PATH}"
  rm -rf "${LINUXKIT_CACHE_DIR}.tmp"
fi

Downloading the Ubuntu rootfs squashfs from Firecracker CI

export FIRECRACKER_CI_VERSION="$("${FIRECRACKER_BIN}" --version | awk '{print $2}' | sed 's/\.[0-9]*$//')"
export UBUNTU_KEY="$(
  curl -fsSL "https://s3.amazonaws.com/spec.ccfc.min?prefix=firecracker-ci/${FIRECRACKER_CI_VERSION}/${FIRECRACKER_ARCH}/ubuntu-&list-type=2" \
    | grep -oP "(?<=<Key>)(firecracker-ci/${FIRECRACKER_CI_VERSION}/${FIRECRACKER_ARCH}/ubuntu-[0-9]+\.[0-9]+\.squashfs)(?=</Key>)" \
    | sort -V | tail -1
)"
export ROOTFS_SQUASHFS_PATH="${CACHE_ROOT}/downloads/$(basename "${UBUNTU_KEY}")"
mkdir -p "${CACHE_ROOT}/downloads"
if [[ ! -f "${ROOTFS_SQUASHFS_PATH}" ]]; then
  curl -fsSL "https://s3.amazonaws.com/spec.ccfc.min/${UBUNTU_KEY}" -o "${ROOTFS_SQUASHFS_PATH}"
fi
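The two text-processing steps above (trimming the Firecracker version to major.minor, and picking the newest squashfs key) can be tried offline. The sketch below isolates them as two hypothetical helpers. It reads only the first line of the version output, which is an assumption about the `--version` format, and the sample keys in the comments are illustrative rather than a real bucket listing.

```shell
# Hypothetical helper: reduce a "Firecracker vX.Y.Z" line on stdin to "vX.Y".
ci_version() {
  awk 'NR == 1 { print $2 }' | sed 's/\.[0-9]*$//'
}

# Hypothetical helper: pick the highest ubuntu-X.Y.squashfs key from a list on stdin.
latest_rootfs_key() {
  grep -E 'ubuntu-[0-9]+\.[0-9]+\.squashfs$' | sort -V | tail -n 1
}

# Example with illustrative input:
#   printf 'Firecracker v1.10.1\n' | ci_version
#   printf '%s\n' a/ubuntu-22.04.squashfs a/ubuntu-24.04.squashfs | latest_rootfs_key
```

`sort -V` compares version numbers field by field, so 24.04 sorts after 22.04 regardless of listing order.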

Creating the base ext4 rootfs

This step converts the squashfs into an ext4 image and injects the guest SSH key.

export BASE_KEY="$(
  {
    sha256sum "${ROOTFS_SQUASHFS_PATH}"
    sha256sum "${GUEST_SSH_PUB}"
  } | sha256sum | awk '{print substr($1,1,16)}'
)"
export BASE_ROOTFS_PATH="${CACHE_ROOT}/base/ubuntu-${BASE_KEY}.ext4"
mkdir -p "${CACHE_ROOT}/base"
if [[ ! -s "${BASE_ROOTFS_PATH}" ]]; then
  rm -rf "${CACHE_ROOT}/base/ubuntu-${BASE_KEY}.rootfs"
  rm -f "${BASE_ROOTFS_PATH}.tmp"
  mkdir -p "${CACHE_ROOT}/base/ubuntu-${BASE_KEY}.rootfs"
  unsquashfs -d "${CACHE_ROOT}/base/ubuntu-${BASE_KEY}.rootfs/rootfs" "${ROOTFS_SQUASHFS_PATH}" >/dev/null
  mkdir -p "${CACHE_ROOT}/base/ubuntu-${BASE_KEY}.rootfs/rootfs/root/.ssh"
  chmod 700 "${CACHE_ROOT}/base/ubuntu-${BASE_KEY}.rootfs/rootfs/root/.ssh"
  cp "${GUEST_SSH_PUB}" "${CACHE_ROOT}/base/ubuntu-${BASE_KEY}.rootfs/rootfs/root/.ssh/authorized_keys"
  chmod 600 "${CACHE_ROOT}/base/ubuntu-${BASE_KEY}.rootfs/rootfs/root/.ssh/authorized_keys"
  if [[ -f "${CACHE_ROOT}/base/ubuntu-${BASE_KEY}.rootfs/rootfs/etc/ssh/sshd_config" ]]; then
    sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin yes/' "${CACHE_ROOT}/base/ubuntu-${BASE_KEY}.rootfs/rootfs/etc/ssh/sshd_config" || true
    sed -i 's/^#\?PubkeyAuthentication.*/PubkeyAuthentication yes/' "${CACHE_ROOT}/base/ubuntu-${BASE_KEY}.rootfs/rootfs/etc/ssh/sshd_config" || true
    sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' "${CACHE_ROOT}/base/ubuntu-${BASE_KEY}.rootfs/rootfs/etc/ssh/sshd_config" || true
  fi
  chown -R root:root "${CACHE_ROOT}/base/ubuntu-${BASE_KEY}.rootfs/rootfs"
  truncate -s 4G "${BASE_ROOTFS_PATH}.tmp"
  mkfs.ext4 -d "${CACHE_ROOT}/base/ubuntu-${BASE_KEY}.rootfs/rootfs" -F "${BASE_ROOTFS_PATH}.tmp" >/dev/null
  mv "${BASE_ROOTFS_PATH}.tmp" "${BASE_ROOTFS_PATH}"
  rm -rf "${CACHE_ROOT}/base/ubuntu-${BASE_KEY}.rootfs"
fi

Preparing a reusable kubeadm-ready guest image

This step installs:

  • kernel modules from the LinuxKit kernel archive

  • containerd

  • kubelet, kubeadm, kubectl

  • CNI plugins

  • Kubernetes sysctl settings and basic service configuration

export PREPARED_KEY="$(
  {
    sha256sum "${BASE_ROOTFS_PATH}" "${KERNEL_PATH}" "${GUEST_SSH_PUB}" "${KERNEL_MODULES_TAR_PATH}"
    printf 'kubernetes_minor=%s\n' "${KUBERNETES_MINOR}"
    printf 'kubernetes_version=%s\n' "${KUBERNETES_VERSION}"
    printf 'cni_plugins=%s\n' "${CNI_PLUGINS_VERSION}"
    printf 'rootfs_size_gib=%s\n' "${ROOTFS_SIZE_GIB}"
    printf 'generation=manual-kubeadm-firecracker-ha-v1\n'
  } | sha256sum | awk '{print substr($1,1,16)}'
)"
export PREPARED_ROOTFS_PATH="${CACHE_ROOT}/prepared-${PREPARED_KEY}.ext4"
if [[ ! -s "${PREPARED_ROOTFS_PATH}" ]]; then
  cp --reflink=auto "${BASE_ROOTFS_PATH}" "${PREPARED_ROOTFS_PATH}.tmp" 2>/dev/null || cp "${BASE_ROOTFS_PATH}" "${PREPARED_ROOTFS_PATH}.tmp"
  e2fsck -fy "${PREPARED_ROOTFS_PATH}.tmp" >/dev/null 2>&1 || true
  truncate -s "${ROOTFS_SIZE_GIB}G" "${PREPARED_ROOTFS_PATH}.tmp"
  resize2fs "${PREPARED_ROOTFS_PATH}.tmp" >/dev/null
  export MNT="${CACHE_ROOT}/mnt-${PREPARED_KEY}"
  mkdir -p "${MNT}"
  mount -o loop "${PREPARED_ROOTFS_PATH}.tmp" "${MNT}"
  mount -t proc proc "${MNT}/proc"
  mount -t sysfs sysfs "${MNT}/sys"
  mount --bind /dev "${MNT}/dev"
  mount -t devpts devpts "${MNT}/dev/pts"
  mount --bind /run "${MNT}/run"
  rm -f "${MNT}/etc/resolv.conf"
  printf 'nameserver 1.1.1.1\nnameserver 8.8.8.8\noptions timeout:2 attempts:3\n' > "${MNT}/etc/resolv.conf"
  # Replace the guest kernel modules with the ones matching the LinuxKit kernel.
  rm -rf "${MNT}/lib/modules"
  mkdir -p "${MNT}/lib"
  tar -xf "${KERNEL_MODULES_TAR_PATH}" -C "${MNT}" ./lib/modules
  chmod 1777 "${MNT}/tmp"
  mkdir -p "${MNT}/dev/pts" "${MNT}/var/cache/apt/archives/partial" "${MNT}/var/lib/apt/lists/partial" "${MNT}/var/log/apt"
  touch "${MNT}/var/log/dpkg.log"
  mkdir -p "${MNT}/root/.ssh"
  chmod 700 "${MNT}/root/.ssh"
  cp "${GUEST_SSH_PUB}" "${MNT}/root/.ssh/authorized_keys"
  chmod 600 "${MNT}/root/.ssh/authorized_keys"
  if [[ -f "${MNT}/etc/ssh/sshd_config" ]]; then
    sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin yes/' "${MNT}/etc/ssh/sshd_config" || true
    sed -i 's/^#\?PubkeyAuthentication.*/PubkeyAuthentication yes/' "${MNT}/etc/ssh/sshd_config" || true
    sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' "${MNT}/etc/ssh/sshd_config" || true
  fi
  chroot "${MNT}" apt-get -o Acquire::Retries=5 -o Acquire::http::Timeout=20 -o Acquire::https::Timeout=20 update
  DEBIAN_FRONTEND=noninteractive chroot "${MNT}" apt-get -o Acquire::Retries=5 -o Acquire::http::Timeout=20 -o Acquire::https::Timeout=20 \
    -o Dpkg::Options::=--force-confdef -o Dpkg::Options::=--force-confold install -y --no-install-recommends \
    apt-transport-https ca-certificates conntrack curl ebtables ethtool gpg iptables ipset jq openssh-server socat tar xz-utils
  mkdir -p "${MNT}/etc/apt/keyrings" "${MNT}/etc/apt/sources.list.d"
  docker_arch="$(chroot "${MNT}" dpkg --print-architecture)"
  docker_codename="$(awk -F= '/^VERSION_CODENAME=/{print $2}' "${MNT}/etc/os-release")"
  chroot "${MNT}" curl -fsSL "https://download.docker.com/linux/ubuntu/gpg" | chroot "${MNT}" gpg --dearmor -o /etc/apt/keyrings/docker.gpg
  printf 'deb [arch=%s signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu %s stable\n' "${docker_arch}" "${docker_codename}" > "${MNT}/etc/apt/sources.list.d/docker.list"
  chroot "${MNT}" curl -fsSL "https://pkgs.k8s.io/core:/stable:/${KUBERNETES_MINOR}/deb/Release.key" | chroot "${MNT}" gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
  printf 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/%s/deb/ /\n' "${KUBERNETES_MINOR}" > "${MNT}/etc/apt/sources.list.d/kubernetes.list"
  chroot "${MNT}" apt-get -o Acquire::Retries=5 -o Acquire::http::Timeout=20 -o Acquire::https::Timeout=20 update
  DEBIAN_FRONTEND=noninteractive chroot "${MNT}" apt-get -o Acquire::Retries=5 -o Acquire::http::Timeout=20 -o Acquire::https::Timeout=20 \
    -o Dpkg::Options::=--force-confdef -o Dpkg::Options::=--force-confold install -y containerd.io \
      "kubelet=${KUBERNETES_VERSION#v}-*" "kubeadm=${KUBERNETES_VERSION#v}-*" "kubectl=${KUBERNETES_VERSION#v}-*"
  chroot "${MNT}" apt-mark hold kubelet kubeadm kubectl >/dev/null 2>&1 || true
  case "$(uname -m)" in
    x86_64) cni_arch=amd64 ;;
    aarch64) cni_arch=arm64 ;;
    *) echo "unsupported CNI architecture: $(uname -m)" >&2; exit 1 ;;
  esac
  export CNI_ARCHIVE="${CACHE_ROOT}/downloads/cni-plugins-linux-${cni_arch}-${CNI_PLUGINS_VERSION}.tgz"
  if [[ ! -f "${CNI_ARCHIVE}" ]]; then
    curl -fsSL "https://github.com/containernetworking/plugins/releases/download/${CNI_PLUGINS_VERSION}/cni-plugins-linux-${cni_arch}-${CNI_PLUGINS_VERSION}.tgz" -o "${CNI_ARCHIVE}"
  fi
  mkdir -p "${MNT}/opt/cni/bin"
  tar -C "${MNT}/opt/cni/bin" -xzf "${CNI_ARCHIVE}"
  # Align containerd's sandbox image with the pause image kubeadm expects.
  pause_image="$(chroot "${MNT}" kubeadm config images list 2>/dev/null | awk '/pause/{print $1; exit}')"
  mkdir -p "${MNT}/etc/containerd"
  chroot "${MNT}" containerd config default > "${MNT}/etc/containerd/config.toml"
  sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' "${MNT}/etc/containerd/config.toml"
  if [[ -n "${pause_image}" ]]; then
    sed -i "s#sandbox_image = \".*\"#sandbox_image = \"${pause_image}\"#" "${MNT}/etc/containerd/config.toml"
  fi
  mkdir -p "${MNT}/etc/modules-load.d" "${MNT}/etc/sysctl.d" "${MNT}/etc/systemd/system/multi-user.target.wants"
  cat > "${MNT}/etc/modules-load.d/kubernetes.conf" <<'EOF'
overlay
br_netfilter
EOF
  cat > "${MNT}/etc/sysctl.d/99-kubernetes.conf" <<'EOF'
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
EOF
  touch "${MNT}/etc/cloud/cloud-init.disabled" 2>/dev/null || true
  chroot "${MNT}" update-alternatives --set iptables /usr/sbin/iptables-legacy >/dev/null 2>&1 || true
  chroot "${MNT}" update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy >/dev/null 2>&1 || true
  ln -sf /lib/systemd/system/ssh.service "${MNT}/etc/systemd/system/multi-user.target.wants/ssh.service"
  ln -sf /lib/systemd/system/systemd-networkd.service "${MNT}/etc/systemd/system/multi-user.target.wants/systemd-networkd.service"
  umount "${MNT}/dev/pts" || true
  umount "${MNT}/proc" || true
  umount "${MNT}/sys" || true
  umount "${MNT}/dev" || true
  umount "${MNT}/run" || true
  umount "${MNT}" || true
  mv "${PREPARED_ROOTFS_PATH}.tmp" "${PREPARED_ROOTFS_PATH}"
fi
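The prepared image is cached under a name derived from a hash of its inputs, so changing any input (a version, the disk size) yields a new file name and triggers a rebuild, while unchanged inputs reuse the cached image. The sketch below shows that idea in isolation; `cache_key` is a hypothetical helper, and the inputs are plain strings rather than the file checksums the guide also mixes in.

```shell
# Hypothetical helper: hash all arguments (one per line) and keep the first 16 hex digits.
cache_key() {
  printf '%s\n' "$@" | sha256sum | awk '{ print substr($1, 1, 16) }'
}

# Two input sets that differ only in the rootfs size.
key_a="$(cache_key kubernetes_version=v1.36.1 cni_plugins=v1.3.0 rootfs_size_gib=12)"
key_b="$(cache_key kubernetes_version=v1.36.1 cni_plugins=v1.3.0 rootfs_size_gib=16)"

# The two file names differ, so the second configuration would not reuse the first image.
printf 'prepared-%s.ext4\nprepared-%s.ext4\n' "$key_a" "$key_b"
```

The fixed `generation=...` line in the guide's key plays the same role for changes to the build steps themselves: bumping it invalidates every previously prepared image.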

Preparing the host-side bridge, NAT, and API load balancer

Explicit FORWARD rules matter on hosts where the default FORWARD policy is DROP.

docker rm -f "${API_LB_CONTAINER_NAME}" >/dev/null 2>&1 || true
ip link add name "${BRIDGE_NAME}" type bridge 2>/dev/null || true
ip addr add "${GATEWAY}/24" dev "${BRIDGE_NAME}" 2>/dev/null || true
ip addr add "${API_LB_IP}/24" dev "${BRIDGE_NAME}" 2>/dev/null || true
ip link set "${BRIDGE_NAME}" up
sysctl -w net.ipv4.ip_forward=1 >/dev/null
# Add each rule only if an identical one is not already present (-C checks, -A appends).
iptables -C FORWARD -i "${BRIDGE_NAME}" -j ACCEPT 2>/dev/null || iptables -A FORWARD -i "${BRIDGE_NAME}" -j ACCEPT
iptables -C FORWARD -o "${BRIDGE_NAME}" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT 2>/dev/null || \
  iptables -A FORWARD -o "${BRIDGE_NAME}" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
iptables -t nat -C POSTROUTING -s "${CIDR}" ! -o "${BRIDGE_NAME}" -j MASQUERADE 2>/dev/null || \
  iptables -t nat -A POSTROUTING -s "${CIDR}" ! -o "${BRIDGE_NAME}" -j MASQUERADE
cat > "${RUN_ROOT}/haproxy.cfg" <<EOF
global
  maxconn 2048
defaults
  mode tcp
  timeout connect 5s
  timeout client 60s
  timeout server 60s
frontend kube_api
  bind ${API_LB_IP}:${API_LB_PORT}
  default_backend kube_apis
backend kube_apis
  balance roundrobin
  option tcp-check
  default-server inter 2s fall 3 rise 2
EOF
for i in $(seq 0 "$((CONTROL_PLANE_COUNT - 1))"); do
  printf '  server cp%s %s:6443 check\n' "${i}" "$(node_ip "${i}")" >> "${RUN_ROOT}/haproxy.cfg"
done
docker run -d --name "${API_LB_CONTAINER_NAME}" --network host \
  -v "${RUN_ROOT}/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro" \
  "${HAPROXY_IMAGE}" >/dev/null
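The backend section of haproxy.cfg gets one `server` line per control-plane node, each pointing at port 6443 of that node. The sketch below reproduces just that rendering step so the result can be inspected without touching the host network; `render_backends` is a hypothetical helper that takes the subnet prefix and node count as arguments.

```shell
# Hypothetical helper: print one HAProxy `server` line per control-plane node.
# Arguments: three-octet subnet prefix, number of control-plane nodes.
render_backends() {
  i=0
  while [ "$i" -lt "$2" ]; do
    # Node addresses start at .10, matching node_ip in the helper section.
    printf '  server cp%s %s.%d:6443 check\n' "$i" "$1" "$((10 + i))"
    i=$((i + 1))
  done
}

render_backends 198.19.0 3
```

With the default values this prints lines for cp0 through cp2 at 198.19.0.10 through 198.19.0.12. Before starting the container, the finished file can also be validated with HAProxy's check mode (`haproxy -c -f <file>`), for example by running that command inside the same image.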

Creating and booting the microVMs

Each VM gets its own cloned ext4 disk, hostname, /etc/hosts, and systemd-networkd configuration.

for i in $(seq 0 "$((NODE_COUNT - 1))"); do
  vm_dir="${RUN_ROOT}/nodes/node${i}"
  mkdir -p "${vm_dir}"
  cp --reflink=auto "${PREPARED_ROOTFS_PATH}" "${vm_dir}/rootfs.ext4" 2>/dev/null || cp "${PREPARED_ROOTFS_PATH}" "${vm_dir}/rootfs.ext4"
  e2fsck -fy "${vm_dir}/rootfs.ext4" >/dev/null 2>&1 || true
  mkdir -p "${vm_dir}/mnt"
  mount -o loop "${vm_dir}/rootfs.ext4" "${vm_dir}/mnt"
  printf '%s\n' "$(node_name "${i}")" > "${vm_dir}/mnt/etc/hostname"
  {
    printf '127.0.0.1 localhost\n'
    printf '127.0.1.1 %s\n' "$(node_name "${i}")"
    printf '%s api-lb\n' "${API_LB_IP}"
    for j in $(seq 0 "$((NODE_COUNT - 1))"); do
      printf '%s %s\n' "$(node_ip "${j}")" "$(node_name "${j}")"
    done
  } > "${vm_dir}/mnt/etc/hosts"
  mkdir -p "${vm_dir}/mnt/etc/systemd/network" "${vm_dir}/mnt/etc/systemd/system/multi-user.target.wants"
  cat > "${vm_dir}/mnt/etc/systemd/network/20-eth0.network" <<EOF
[Match]
Name=eth0
[Network]
Address=$(node_ip "${i}")/24
Gateway=${GATEWAY}
DNS=1.1.1.1
DNS=8.8.8.8
EOF
  rm -f "${vm_dir}/mnt/etc/resolv.conf"
  printf 'nameserver 1.1.1.1\nnameserver 8.8.8.8\n' > "${vm_dir}/mnt/etc/resolv.conf"
  # An empty machine-id makes systemd generate a unique one on first boot.
  rm -f "${vm_dir}/mnt/etc/machine-id" "${vm_dir}/mnt/var/lib/dbus/machine-id" 2>/dev/null || true
  touch "${vm_dir}/mnt/etc/machine-id"
  rm -rf "${vm_dir}/mnt/etc/kubernetes" "${vm_dir}/mnt/var/lib/etcd" "${vm_dir}/mnt/var/lib/cni" "${vm_dir}/mnt/var/lib/kubelet" "${vm_dir}/mnt/var/lib/containerd" "${vm_dir}/mnt/etc/cni/net.d"
  mkdir -p "${vm_dir}/mnt/var/lib/containerd" "${vm_dir}/mnt/etc/cni/net.d"
  ln -sf /lib/systemd/system/ssh.service "${vm_dir}/mnt/etc/systemd/system/multi-user.target.wants/ssh.service"
  ln -sf /lib/systemd/system/systemd-networkd.service "${vm_dir}/mnt/etc/systemd/system/multi-user.target.wants/systemd-networkd.service"
  umount "${vm_dir}/mnt"
  tap="$(node_tap "${i}")"
  mac="$(node_mac "${i}")"
  mem="$(node_mem "${i}")"
  ip tuntap add dev "${tap}" mode tap 2>/dev/null || true
  ip link set "${tap}" master "${BRIDGE_NAME}"
  ip link set "${tap}" up
  cat > "${vm_dir}/vm.json" <<EOF
{"boot-source":{"kernel_image_path":"${KERNEL_PATH}","boot_args":"${KERNEL_BOOT_ARGS}"},"drives":[{"drive_id":"rootfs","path_on_host":"${vm_dir}/rootfs.ext4","is_root_device":true,"is_read_only":false}],"machine-config":{"vcpu_count":${VCPU_COUNT},"mem_size_mib":${mem}},"network-interfaces":[{"iface_id":"eth0","host_dev_name":"${tap}","guest_mac":"${mac}"}],"logger":{"log_path":"${vm_dir}/firecracker.log","level":"Info","show_level":true,"show_log_origin":true}}
EOF
  "${FIRECRACKER_BIN}" --api-sock "${vm_dir}/fc.sock" --config-file "${vm_dir}/vm.json" > "${vm_dir}/console.log" 2>&1 &
  echo $! > "${vm_dir}/pid"
done
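Every guest receives the same static name table, so nodes can resolve each other and the api-lb name without DNS. The sketch below renders the /etc/hosts content for one node to standard output so it can be reviewed before any disk is mounted; `render_hosts` is a hypothetical helper whose arguments stand in for the exported lab variables.

```shell
# Hypothetical helper: print the /etc/hosts content for one node.
# Arguments: node index, three-octet subnet prefix, total node count, API load-balancer IP.
render_hosts() {
  printf '127.0.0.1 localhost\n'
  printf '127.0.1.1 k8s-%02d\n' "$1"
  printf '%s api-lb\n' "$4"
  j=0
  while [ "$j" -lt "$3" ]; do
    printf '%s.%d k8s-%02d\n' "$2" "$((10 + j))" "$j"
    j=$((j + 1))
  done
}

render_hosts 1 198.19.0 5 198.19.0.5
```

Only the second line differs between nodes; the remaining entries are identical across the cluster.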

Waiting for SSH and preparing the nodes

for i in $(seq 0 "$((NODE_COUNT - 1))"); do
  wait_for_ssh "$(node_ip "${i}")"
done
for i in $(seq 0 "$((NODE_COUNT - 1))"); do
  node_ssh "${i}" "cat >/root/prepare-node.sh" <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
swapoff -a || true
sed -i '/ swap / s/^/#/' /etc/fstab 2>/dev/null || true
modprobe overlay || true
modprobe br_netfilter || true
sysctl --system >/dev/null
systemctl daemon-reload || true
systemctl enable --now ssh containerd kubelet >/dev/null 2>&1 || true
systemctl restart containerd
kubeadm reset -f >/dev/null 2>&1 || true
rm -rf /etc/kubernetes /var/lib/etcd /var/lib/cni /var/lib/kubelet/* /etc/cni/net.d/*
mkdir -p /etc/cni/net.d
kubeadm config images pull --kubernetes-version="$(kubeadm version -o short)" >/dev/null 2>&1 || true
EOF
  node_ssh "${i}" "chmod +x /root/prepare-node.sh && /root/prepare-node.sh"
done
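The wait loops in this guide all follow the same pattern: try a command, sleep, and give up after a fixed number of attempts. The sketch below factors that pattern into a generic helper. `retry` is a hypothetical function, not part of the original guide, and is shown only to make the polling logic explicit.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while :; do
    "$@" && return 0
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example (bash, assumes SSH_OPTS and node_ip from the helper section are defined):
#   retry 120 2 ssh "${SSH_OPTS[@]}" "root@$(node_ip 0)" true
```

With 120 attempts and a 2-second delay this gives each guest roughly four minutes to boot, the same budget as wait_for_ssh.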

Initializing the first control-plane node

server_ssh "cat >/root/init-primary.sh" <<EOF
#!/usr/bin/env bash
set -euo pipefail
version="${KUBERNETES_VERSION}"
kubeadm init \
  --kubernetes-version "\${version}" \
  --control-plane-endpoint "${API_LB_IP}:${API_LB_PORT}" \
  --apiserver-advertise-address "${PRIMARY_CONTROL_PLANE_IP}" \
  --pod-network-cidr "${POD_CIDR}" \
  --service-cidr "${SERVICE_CIDR}" \
  --upload-certs \
  --ignore-preflight-errors=all
mkdir -p /root/.kube
cp /etc/kubernetes/admin.conf /root/.kube/config
certificate_key="\$(kubeadm init phase upload-certs --upload-certs 2>/dev/null | tail -n 1)"
join_cmd="\$(kubeadm token create --print-join-command)"
printf '%s\n' "\${certificate_key}" >/root/certificate-key.txt
printf '%s\n' "\${join_cmd}" >/root/join-command.txt
EOF
server_ssh "chmod +x /root/init-primary.sh && /root/init-primary.sh"

Installing Flannel

server_ssh "kubectl --kubeconfig /etc/kubernetes/admin.conf apply -f $(printf '%q' "${FLANNEL_MANIFEST_URL}")"
for _ in $(seq 1 180); do
  flannel_ns="$(server_ssh "kubectl --kubeconfig /etc/kubernetes/admin.conf get daemonset -A --no-headers 2>/dev/null | awk '\$2==\"kube-flannel-ds\"{print \$1; exit}'" || true)"
  if [[ -n "${flannel_ns}" ]]; then
    if server_ssh "kubectl --kubeconfig /etc/kubernetes/admin.conf -n ${flannel_ns} rollout status daemonset/kube-flannel-ds --timeout=5s" >/dev/null 2>&1; then
      break
    fi
  fi
  sleep 2
done

Joining the other two control-plane nodes

join_cmd="$(server_ssh "cat /root/join-command.txt")"
certificate_key="$(server_ssh "cat /root/certificate-key.txt")"
for i in 1 2; do
  ip="$(node_ip "${i}")"
  node_ssh "${i}" "cat >/root/join-control-plane.sh" <<EOF
#!/usr/bin/env bash
set -euo pipefail
${join_cmd} --control-plane --certificate-key ${certificate_key} --apiserver-advertise-address ${ip} --ignore-preflight-errors=all
EOF
  node_ssh "${i}" "chmod +x /root/join-control-plane.sh && /root/join-control-plane.sh"
  wait_for_node_ready "$(node_name "${i}")"
done
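A control-plane join is the ordinary worker join command with three extra flags appended. The sketch below shows that composition with placeholder values only; `control_plane_join` is a hypothetical helper, and the real token, CA hash, and certificate key come from /root/join-command.txt and /root/certificate-key.txt on the first node.

```shell
# Hypothetical helper: extend a worker join command into a control-plane join command.
# Arguments: base join command, certificate key, address this node should advertise.
control_plane_join() {
  printf '%s --control-plane --certificate-key %s --apiserver-advertise-address %s\n' "$1" "$2" "$3"
}

# Placeholder values; nothing here is a real token or key.
base='kubeadm join 198.19.0.5:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>'
control_plane_join "$base" '<certificate-key>' 198.19.0.11
```

Because the join goes through the load-balancer address, the new node never needs to know which control-plane node answers it.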

Joining the worker nodes

for i in $(seq "${CONTROL_PLANE_COUNT}" "$((NODE_COUNT - 1))"); do
  node_ssh "${i}" "cat >/root/join-worker.sh" <<EOF
#!/usr/bin/env bash
set -euo pipefail
${join_cmd} --ignore-preflight-errors=all
EOF
  node_ssh "${i}" "chmod +x /root/join-worker.sh && /root/join-worker.sh"
  wait_for_node_ready "$(node_name "${i}")"
done

Waiting for the cluster to become fully ready

for _ in $(seq 1 240); do
  server_ssh "kubectl --kubeconfig /etc/kubernetes/admin.conf get nodes -o wide > ${ARTIFACT_PREFIX}.nodes 2>&1 || true
kubectl --kubeconfig /etc/kubernetes/admin.conf get nodes --no-headers > ${ARTIFACT_PREFIX}.nodes-plain 2>&1 || true
kubectl --kubeconfig /etc/kubernetes/admin.conf get endpoints kubernetes -o wide > ${ARTIFACT_PREFIX}.apiserver-endpoints 2>&1 || true" >/dev/null 2>&1 || true
  ready_count="$(server_ssh "awk '\$2==\"Ready\"{c++} END{print c+0}' ${ARTIFACT_PREFIX}.nodes-plain 2>/dev/null || echo 0")"
  control_plane_ready="$(server_ssh "awk '\$2==\"Ready\" && \$3 ~ /control-plane/ {c++} END{print c+0}' ${ARTIFACT_PREFIX}.nodes-plain 2>/dev/null || echo 0")"
  endpoint_count="$(server_ssh "kubectl --kubeconfig /etc/kubernetes/admin.conf get endpoints kubernetes -o jsonpath='{range .subsets[*].addresses[*]}{.ip}{\"\\n\"}{end}' 2>/dev/null | awk 'NF{c++} END{print c+0}' || echo 0")"
  if [[ "${ready_count}" -ge "${NODE_COUNT}" && "${control_plane_ready}" -ge "${CONTROL_PLANE_COUNT}" && "${endpoint_count}" -ge "${CONTROL_PLANE_COUNT}" ]]; then
    break
  fi
  sleep 3
done
server_ssh "cat ${ARTIFACT_PREFIX}.nodes"

Expected result:

k8s-00   Ready   control-plane
k8s-01   Ready   control-plane
k8s-02   Ready   control-plane
k8s-03   Ready   <none>
k8s-04   Ready   <none>
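The readiness check counts rows whose second column is Ready, and separately those whose third column mentions control-plane. The sketch below runs the same awk filters on text from standard input, so they can be tried on saved output. The helper names are hypothetical.

```shell
# Count Ready nodes in `kubectl get nodes --no-headers` output read from stdin.
count_ready() {
  awk '$2 == "Ready" { c++ } END { print c + 0 }'
}

# Count Ready nodes whose ROLES column mentions control-plane.
count_ready_control_plane() {
  awk '$2 == "Ready" && $3 ~ /control-plane/ { c++ } END { print c + 0 }'
}

# Example on the artifact written by the wait loop (run on the first node):
#   count_ready < /var/log/kubeadm-ha-lab.nodes-plain
```

The `c + 0` in the END block makes awk print 0 rather than an empty string when no row matches, which keeps the numeric comparison in the wait loop valid.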

Confirming etcd membership

etcd_pod="etcd-$(node_name 0)"
server_ssh "kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-system exec ${etcd_pod} -- etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key member list -w table > ${ARTIFACT_PREFIX}.etcd-members 2>&1
kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-system exec ${etcd_pod} -- etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key endpoint status -w table > ${ARTIFACT_PREFIX}.etcd-endpoints 2>&1"
server_ssh "cat ${ARTIFACT_PREFIX}.etcd-members"

Three etcd members should be listed.
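The member count can also be checked mechanically instead of by eye. The sketch below counts rows whose STATUS column reads started. It assumes the table layout puts STATUS in the second column, as in the usual `etcdctl member list -w table` output; `count_started_members` is a hypothetical helper.

```shell
# Hypothetical helper: count etcd members reported as "started" in table output on stdin.
# With "|" as the field separator, field 1 is empty, field 2 is ID, and field 3 is STATUS.
count_started_members() {
  awk -F'|' '$3 ~ /started/ { c++ } END { print c + 0 }'
}

# Example on the saved artifact (run on the first node):
#   count_started_members < /var/log/kubeadm-ha-lab.etcd-members
```

A result lower than the number of control-plane nodes suggests that a join did not complete.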

Running the smoke test

server_ssh "cat >/root/run-smoke.sh" <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
kubectl --kubeconfig /etc/kubernetes/admin.conf create namespace smoke --dry-run=client -o yaml | kubectl --kubeconfig /etc/kubernetes/admin.conf apply -f -
cat <<'YAML' | kubectl --kubeconfig /etc/kubernetes/admin.conf apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-smoke
  namespace: smoke
spec:
  selector:
    matchLabels:
      app: node-smoke
  template:
    metadata:
      labels:
        app: node-smoke
    spec:
      tolerations:
        - operator: Exists
      containers:
        - name: node-smoke
          image: busybox:1.36
          command: ["sh", "-lc", "sleep 3600"]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo
  namespace: smoke
spec:
  replicas: 2
  selector:
    matchLabels:
      app: echo
  template:
    metadata:
      labels:
        app: echo
    spec:
      tolerations:
        - operator: Exists
      containers:
        - name: echo
          image: nginx:1.27-alpine
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: echo
  namespace: smoke
spec:
  selector:
    app: echo
  ports:
    - port: 80
      targetPort: 80
YAML
kubectl --kubeconfig /etc/kubernetes/admin.conf -n smoke rollout status daemonset/node-smoke --timeout=240s
kubectl --kubeconfig /etc/kubernetes/admin.conf -n smoke rollout status deployment/echo --timeout=240s
kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-system rollout status deployment/coredns --timeout=240s
cat <<'YAML' | kubectl --kubeconfig /etc/kubernetes/admin.conf apply -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: dns-http
  namespace: smoke
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      tolerations:
        - operator: Exists
      containers:
        - name: dns-http
          image: busybox:1.36
          command:
            - sh
            - -lc
            - |
              for _ in $(seq 1 30); do
                nslookup echo.smoke.svc.cluster.local && \
                wget -qO- http://echo.smoke.svc.cluster.local >/tmp/echo.html && \
                test -s /tmp/echo.html && exit 0
                sleep 2
              done
              exit 1
YAML
kubectl --kubeconfig /etc/kubernetes/admin.conf -n smoke wait --for=condition=complete job/dns-http --timeout=240s
kubectl --kubeconfig /etc/kubernetes/admin.conf get nodes -o wide > /var/log/kubeadm-ha-lab.nodes 2>&1
kubectl --kubeconfig /etc/kubernetes/admin.conf get pods -A -o wide > /var/log/kubeadm-ha-lab.pods 2>&1
kubectl --kubeconfig /etc/kubernetes/admin.conf get svc -A -o wide > /var/log/kubeadm-ha-lab.services 2>&1
kubectl --kubeconfig /etc/kubernetes/admin.conf -n smoke get pods -o wide > /var/log/kubeadm-ha-lab.smoke-pods 2>&1
kubectl --kubeconfig /etc/kubernetes/admin.conf -n smoke get svc echo -o wide > /var/log/kubeadm-ha-lab.smoke-service 2>&1
kubectl --kubeconfig /etc/kubernetes/admin.conf -n smoke logs job/dns-http > /var/log/kubeadm-ha-lab.smoke-job.log 2>&1
kubectl --kubeconfig /etc/kubernetes/admin.conf config view --raw > /var/log/kubeadm-ha-lab.kubeconfig 2>&1
EOF
server_ssh "chmod +x /root/run-smoke.sh && /root/run-smoke.sh"
server_ssh "cat ${ARTIFACT_PREFIX}.smoke-pods"
server_ssh "cat ${ARTIFACT_PREFIX}.smoke-job.log"

Saving the artifacts

mkdir -p "${RUN_ROOT}/artifacts"
copy_guest_file "${ARTIFACT_PREFIX}.nodes" "${RUN_ROOT}/artifacts/nodes.txt"
copy_guest_file "${ARTIFACT_PREFIX}.pods" "${RUN_ROOT}/artifacts/pods.txt"
copy_guest_file "${ARTIFACT_PREFIX}.services" "${RUN_ROOT}/artifacts/services.txt"
copy_guest_file "${ARTIFACT_PREFIX}.apiserver-endpoints" "${RUN_ROOT}/artifacts/apiserver-endpoints.txt"
copy_guest_file "${ARTIFACT_PREFIX}.etcd-members" "${RUN_ROOT}/artifacts/etcd-members.txt"
copy_guest_file "${ARTIFACT_PREFIX}.etcd-endpoints" "${RUN_ROOT}/artifacts/etcd-endpoints.txt"
copy_guest_file "${ARTIFACT_PREFIX}.smoke-pods" "${RUN_ROOT}/artifacts/smoke-pods.txt"
copy_guest_file "${ARTIFACT_PREFIX}.smoke-service" "${RUN_ROOT}/artifacts/smoke-service.txt"
copy_guest_file "${ARTIFACT_PREFIX}.smoke-job.log" "${RUN_ROOT}/artifacts/smoke-job.log"
copy_guest_file "${ARTIFACT_PREFIX}.kubeconfig" "${RUN_ROOT}/artifacts/kubeconfig.yaml"
curl -ksSf --max-time 5 "https://${API_LB_IP}:${API_LB_PORT}/version" > "${RUN_ROOT}/artifacts/api-lb-version.json"

After that:

export KUBECONFIG="${RUN_ROOT}/artifacts/kubeconfig.yaml"
kubectl get nodes -o wide
kubectl get pods -A -o wide
kubectl -n smoke get pods -o wide

Teardown

docker rm -f "${API_LB_CONTAINER_NAME}" >/dev/null 2>&1 || true
for pid_file in "${RUN_ROOT}"/nodes/*/pid; do
  [[ -f "${pid_file}" ]] || continue
  kill "$(cat "${pid_file}")" 2>/dev/null || true
done
sleep 1
for pid_file in "${RUN_ROOT}"/nodes/*/pid; do
  [[ -f "${pid_file}" ]] || continue
  kill -9 "$(cat "${pid_file}")" 2>/dev/null || true
done
for i in $(seq 0 "$((NODE_COUNT - 1))"); do
  ip link del "$(node_tap "${i}")" 2>/dev/null || true
done
iptables -D FORWARD -i "${BRIDGE_NAME}" -j ACCEPT 2>/dev/null || true
iptables -D FORWARD -o "${BRIDGE_NAME}" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT 2>/dev/null || true
iptables -t nat -D POSTROUTING -s "${CIDR}" ! -o "${BRIDGE_NAME}" -j MASQUERADE 2>/dev/null || true
ip link set "${BRIDGE_NAME}" down 2>/dev/null || true
ip link del "${BRIDGE_NAME}" type bridge 2>/dev/null || true
rm -rf "${RUN_ROOT}"

CACHE_ROOT is intentionally left in place so that the downloaded kernel, the Ubuntu squashfs, the base ext4 image, and the prepared guest image can be reused on the next run.
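The teardown sends TERM to every Firecracker process, sleeps one second, and then sends KILL regardless. A variant that waits for each process to exit before escalating is sketched below. `stop_pid_file` is a hypothetical helper, and the timeout in the usage example is an arbitrary choice.

```shell
# Hypothetical helper: send TERM to the PID stored in a file, wait up to $2 seconds
# for the process to exit, then send KILL if it is still running.
stop_pid_file() {
  [ -f "$1" ] || return 0
  pid="$(cat "$1")"
  # If TERM cannot be delivered, the process is already gone.
  kill "$pid" 2>/dev/null || return 0
  waited=0
  while kill -0 "$pid" 2>/dev/null && [ "$waited" -lt "$2" ]; do
    sleep 1
    waited=$((waited + 1))
  done
  kill -9 "$pid" 2>/dev/null || true
}

# Example (assumes RUN_ROOT is exported as in the variables section):
#   for f in "${RUN_ROOT}"/nodes/*/pid; do stop_pid_file "$f" 5; done
```

Waiting per process avoids sending KILL to a PID that has already exited and might have been reused by an unrelated process.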

Link to the original article: https://habr.com/ru/articles/1046307/