Managed SSL for a TCP Load Balancer in GKE

From the author

At Altenar, we often use RabbitMQ as an entry point for our products. In this specific case, we will discuss a system that calculates sports data and provides it to other systems both within the organization and outside the perimeter. When designing the system, the RabbitMQ component had to satisfy a number of requirements, most notably secure access for external clients.

For this project, we are using Google Cloud. Google Cloud offers Google-managed SSL certificates which, much like Let’s Encrypt, are issued and renewed automatically and can be attached to any public HTTP load balancer. However, they have several limitations, the key ones being:

  • They only work for HTTP(S) load balancers.

  • They only work for public (external) load balancers.

According to the RabbitMQ documentation, there are two common approaches for client connections:

  • Configure RabbitMQ to handle TLS connections itself.

  • Use a proxy or load balancer (such as HAProxy) to perform TLS termination of client connections and use plain TCP connections to the RabbitMQ nodes.
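The rest of this article follows the second approach: TLS is terminated at the load balancer, and RabbitMQ itself listens for plain AMQP. As a minimal sketch, the relevant rabbitmq.conf lines could look like this (the values are illustrative; the PROXY protocol line only applies if you enable the matching option on the load balancer in front):

```ini
# Plain (non-TLS) AMQP listener; TLS is terminated at the load balancer
listeners.tcp.default = 5672

# Optional: accept PROXY protocol headers so RabbitMQ sees real client IPs.
# Enable this only if the proxy in front is configured to send them.
proxy_protocol = true
```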

Our idea was to use a Google-managed certificate for a TCP/SSL proxy load balancer in Google Cloud. However, due to the limitations above, a managed certificate is not supported on TCP load balancers. On the other hand, Google lets you attach the same SSL certificate to several load balancers, which we decided to exploit. Overall, the blueprint looks like this:

  • We would run a default dummy HTTP service exposed on port 443, used solely so that Google can provision and automatically renew the SSL certificate.

  • We would expose a separate TCP endpoint for RabbitMQ, reusing the same IP address and SSL certificate.

Let’s examine each part of the solution in detail.


GKE Cluster

As the main hosting platform, we use GKE. We create a private GKE cluster with the Google Terraform module, configure Workload Identity, and install the AutoNEG controller into the cluster.

In this article, we won’t delve into the specifics of how Workload Identity works with AutoNEG, as they are used as described in the official documentation.
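For reference, the Workload Identity binding that lets the AutoNEG controller act as a Google service account can be sketched in Terraform. This is a minimal sketch, not the full setup from the official docs: the Google service account name and the autoneg-system/autoneg namespace/service-account pair are assumptions, so check them against your AutoNEG installation.

```hcl
# Google service account the AutoNEG controller will impersonate
# (the account_id is an assumption)
resource "google_service_account" "autoneg" {
  project    = var.project
  account_id = "autoneg"
}

# Allow the Kubernetes service account autoneg-system/autoneg (assumed names)
# to impersonate the Google service account via Workload Identity
resource "google_service_account_iam_member" "autoneg_wi" {
  service_account_id = google_service_account.autoneg.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:${var.project}.svc.id.goog[autoneg-system/autoneg]"
}
```

The Google service account additionally needs permission to modify backend services; the official AutoNEG docs ship a dedicated IAM role for that.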

With all preparations in place, we deploy a dummy “Hello World” service to the cluster and attach it to the Google Load balancer via AutoNEG. Below is the YAML configuration for that:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gcp-tls-certificate-issuer
  labels:
    app: gcp-tls-certificate-issuer
  annotations:
    deployment.kubernetes.io/revision: '1'
spec:
  replicas: 2
  selector:
    matchLabels:
      app: gcp-tls-certificate-issuer
  template:
    metadata:
      labels:
        app: gcp-tls-certificate-issuer
    spec:
      containers:
        - name: ok
          image: assemblyline/ok:latest
          ports:
            - containerPort: 8888
              protocol: TCP
          imagePullPolicy: Always
          securityContext:
            capabilities:
              drop:
                - ALL
            runAsUser: 1000
            runAsGroup: 3000
            runAsNonRoot: true
            readOnlyRootFilesystem: true
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 25%
  revisionHistoryLimit: 10
  progressDeadlineSeconds: 600
---
apiVersion: v1
kind: Service
metadata:
  name: gcp-tls-certificate-issuer
  labels:
    app: gcp-tls-certificate-issuer
  annotations:
    cloud.google.com/neg: '{"exposed_ports": {"8888":{"name": "gcp-tls-certificate-issuer"}}}'
    controller.autoneg.dev/neg: '{"backend_services":{"8888":[{"name":"envcode-rabbit-https-backend-service","max_connections_per_endpoint":10000}]}}'
spec:
  ports:
    - name: http
      protocol: TCP
      port: 8888
      targetPort: 8888
  selector:
    app: gcp-tls-certificate-issuer
  clusterIP: 10.10.12.130
  clusterIPs:
    - 10.10.12.130
  type: ClusterIP

Notice the service’s annotations. This is how AutoNEG attaches it to the load balancer as a backend.

Google load balancer

The next part is managed outside of GKE and created separately. A Google load balancer is not a single object but rather a set of different objects combined. Below is the Terraform code with comments:

resource "google_compute_managed_ssl_certificate" "rabbitmq" {
  project = var.project
  name    = "${var.environment_name}-google-managed-certificate-rabbitmq"

  managed {
    domains = ["rabbitmq.example.com."]  # Replace with your domain
  }
}

# reserved IP address
resource "google_compute_global_address" "default" {
  project = var.project
  name    = "tcp-proxy-xlb-ip"
}

output "rabbitmq-ip" {
  value = google_compute_global_address.default.address
}

# forwarding rule for the TCP load balancer
resource "google_compute_global_forwarding_rule" "default" {
  project               = var.project
  name                  = "${var.environment_name}-tcp-global-loadbalancer"
  provider              = google
  ip_protocol           = "TCP"
  load_balancing_scheme = "EXTERNAL"
  port_range            = "5671"
  target                = google_compute_target_ssl_proxy.default.id
  ip_address            = google_compute_global_address.default.id
}

# https://cloud.google.com/load-balancing/docs/ssl
# When you use Google-managed SSL certificates with SSL Proxy Load Balancing,
# the frontend port for traffic must be 443 to enable the Google-managed SSL
# certificates to be provisioned and renewed.

# forwarding rule for the HTTPS load balancer
resource "google_compute_global_forwarding_rule" "https" {
  project               = var.project
  name                  = "${var.environment_name}-https-global-loadbalancer"
  provider              = google
  ip_protocol           = "TCP"
  load_balancing_scheme = "EXTERNAL"
  port_range            = "443"
  target                = google_compute_target_ssl_proxy.https.id
  ip_address            = google_compute_global_address.default.id
}

resource "google_compute_target_ssl_proxy" "default" {
  project          = var.project
  name             = "${var.environment_name}-global-loadbalancer-tcp-proxy"
  backend_service  = google_compute_backend_service.default.id
  ssl_certificates = [google_compute_managed_ssl_certificate.rabbitmq.id]
}

resource "google_compute_target_ssl_proxy" "https" {
  project          = var.project
  name             = "${var.environment_name}-global-loadbalancer-https-proxy"
  backend_service  = google_compute_backend_service.https.id
  ssl_certificates = [google_compute_managed_ssl_certificate.rabbitmq.id]
}

# backend service for RabbitMQ (populated by AutoNEG)
resource "google_compute_backend_service" "default" {
  project               = var.project
  name                  = "${var.environment_name}-tcp-backend-service"
  protocol              = "TCP"
  port_name             = "tcp"
  load_balancing_scheme = "EXTERNAL"
  timeout_sec           = 10
  health_checks         = [google_compute_health_check.default.id]
  session_affinity      = "CLIENT_IP"

  # We don't want TF to remove whatever was configured by AutoNEG
  lifecycle {
    ignore_changes = [backend]
  }
}

# backend service for the dummy HTTPS service (populated by AutoNEG)
resource "google_compute_backend_service" "https" {
  project               = var.project
  # that's the name you use in the service annotations
  name                  = "${var.environment_name}-https-backend-service"
  protocol              = "TCP"
  port_name             = "tcp"
  load_balancing_scheme = "EXTERNAL"
  timeout_sec           = 10
  health_checks         = [google_compute_health_check.https.id]

  # We don't want TF to remove whatever was configured by AutoNEG
  lifecycle {
    ignore_changes = [backend]
  }
}

resource "google_compute_health_check" "default" {
  project            = var.project
  name               = "tcp-proxy-health-check"
  description        = "Backend service for AutoNEG"
  timeout_sec        = 1
  check_interval_sec = 1

  tcp_health_check {
    port = "5672"  # use the container port
  }
}

resource "google_compute_health_check" "https" {
  project            = var.project
  name               = "https-proxy-health-check"
  description        = "Backend service for AutoNEG"
  timeout_sec        = 1
  check_interval_sec = 1

  tcp_health_check {
    port = "8888"  # use the container port
  }
}


As you can see, we created one IP address and one SSL certificate and then used them in two forwarding rules. This is what lets a Google-managed SSL certificate serve a TCP load balancer.

Don’t forget to configure DNS and point the hostname at the reserved IP address for the whole thing to work.
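If you use Cloud DNS, the DNS record can live in the same Terraform as everything else. A minimal sketch, assuming an existing managed zone (the zone name "example-zone" here is hypothetical):

```hcl
# A record pointing the RabbitMQ hostname at the reserved global IP
# ("example-zone" is a hypothetical Cloud DNS managed zone name)
resource "google_dns_record_set" "rabbitmq" {
  project      = var.project
  managed_zone = "example-zone"
  name         = "rabbitmq.example.com."
  type         = "A"
  ttl          = 300
  rrdatas      = [google_compute_global_address.default.address]
}
```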

Tip: GKE ships with the l7-default-backend deployment. Perhaps it would’ve been enough to create a service with the AutoNEG annotations pointing at that deployment’s pods. Try that and let me know in the comments if it works.


Link to the original article: https://habr.com/ru/articles/863864/

