
amazon-web-services - Kubernetes cluster migration


I currently have multiple AWS accounts, each with its own Kubernetes cluster. Unfortunately, when the clusters were originally deployed with kops, the VPCs were created with overlapping CIDR blocks. This normally wouldn't be a problem, since each cluster essentially lives in its own universe.

Things have changed a bit, and now we would like to set up cross-account VPC peering. The idea is that users connected over the VPN can reach all resources through that peering connection. My understanding is that the overlapping CIDR blocks will become a major problem once peering is in place.

It doesn't seem possible to simply change the CIDR block of an existing cluster. Is backing up and restoring the cluster into a new VPC with something like ark my only option? Has anyone gone through a full cluster migration? I'm curious whether there is a better answer.

Best Answer

Your understanding is correct: with kops you cannot change the CIDR block of an existing cluster; it is stuck in the VPC in which it was created, and you can't change the CIDR block of a VPC:

The IP address range of a VPC is made up of the CIDR blocks associated with it. You select one CIDR block when you create the VPC, and you can add or remove secondary CIDR blocks later. The CIDR block that you add when you create the VPC cannot be changed, but you can add and remove secondary CIDR blocks to change the IP address range of the VPC. (emphasis mine)
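For completeness, the "secondary CIDR block" operation those docs describe is a single CLI call, sketched below with a hypothetical VPC ID and range. Note that it doesn't help in this situation, because the primary block your kops cluster was built against stays fixed:

    # Hypothetical VPC ID and range; associates a secondary CIDR block with an existing VPC.
    aws ec2 associate-vpc-cidr-block --vpc-id vpc-0abc123 --cidr-block 10.100.0.0/16

    # Verify the association; the primary block remains unchanged.
    aws ec2 describe-vpcs --vpc-ids vpc-0abc123 \
        --query 'Vpcs[].CidrBlockAssociationSet'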



That brings us to the second point: migrating your cluster. This can be broken into two phases:
  • Migrate the infrastructure managed by kops
  • Migrate the workloads on the cluster
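Before either phase, it is worth confirming which CIDR ranges each account already uses, so the replacement VPC (and the vpcCidr/subnet values further below) get a non-overlapping block. A minimal sketch, assuming two hypothetical AWS CLI profile names:

    # account-a / account-b are placeholder profile names.
    for profile in account-a account-b; do
        echo "== ${profile} =="
        aws ec2 describe-vpcs --profile "$profile" \
            --query 'Vpcs[].[VpcId,CidrBlock]' --output table
    done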

  • 1. Migrate the infrastructure managed by kops

    You will need to migrate (that is, re-create) the kops cluster itself: the EC2 instances, the kops InstanceGroup and Cluster objects, the various pieces of AWS infrastructure, and so on. For that you can use the kops toolbox template command:
    kops toolbox template --values /path/to/values.yaml --template /path/to/cluster/template.yaml > /path/to/output/cluster.yaml
    kops create -f /path/to/output/cluster.yaml

    This is a Helm-like tool that lets you template your kops cluster configuration and pass in different values.yaml files. You may want to wrap this command in a small shell script or Makefile to get one-command cluster deployments, so that setting up your k8s cluster infrastructure is easy and repeatable (a sketch of such a wrapper follows the example files below).

    An example cluster template.yaml and values.yaml file might look like the following, containing the specs for the Cluster and for the master, worker, and autoscaling InstanceGroups.
    # template.yaml
    {{ $clusterSubdomain := (env "CLUSTER_SUBDOMAIN") }}
    {{ $subnetCidr := (env "SUBNET_CIDR") }}

    apiVersion: kops/v1alpha2
    kind: Cluster
    metadata:
      name: {{ $clusterSubdomain }}.k8s.example.io
    spec:
      hooks:
      - manifest: |
          [Unit]
          Description=Create example user
          ConditionPathExists=!/home/example/.ssh/authorized_keys

          [Service]
          Type=oneshot
          ExecStart=/bin/sh -c 'useradd example && echo "{{ .examplePublicKey }}" > /home/example/.ssh/authorized_keys'
        name: useradd-example.service
        roles:
        - Node
        - Master
      - manifest: |
          Type=oneshot
          ExecStart=/usr/bin/coreos-cloudinit --from-file=/home/core/cloud-config.yaml
        name: reboot-window.service
        roles:
        - Node
        - Master
      kubeAPIServer:
        authorizationRbacSuperUser: admin
        featureGates:
          TaintBasedEvictions: "true"
      kubeControllerManager:
        featureGates:
          TaintBasedEvictions: "true"
        horizontalPodAutoscalerUseRestClients: false
      kubeScheduler:
        featureGates:
          TaintBasedEvictions: "true"
      kubelet:
        featureGates:
          TaintBasedEvictions: "true"
      fileAssets:
      - content: |
          yes
        name: docker-1.12
        path: /etc/coreos/docker-1.12
        roles:
        - Node
        - Master
      - content: |
          #cloud-config
          coreos:
            update:
              reboot-strategy: "etcd-lock"
            locksmith:
              window-start: {{ .locksmith.windowStart }}
              window-length: {{ .locksmith.windowLength }}
        name: cloud-config.yaml
        path: /home/core/cloud-config.yaml
        roles:
        - Node
        - Master
      api:
        dns: {}
      authorization:
        rbac: {}
      channel: stable
      cloudProvider: aws
      configBase: s3://my-bucket.example.io/{{ $clusterSubdomain }}.k8s.example.io
      etcdClusters:
      - etcdMembers:
        - instanceGroup: master-{{ .zone }}
          name: a
        name: main
      - etcdMembers:
        - instanceGroup: master-{{ .zone }}
          name: a
        name: events
      iam:
        allowContainerRegistry: true
        legacy: false
      kubernetesApiAccess:
      - {{ .apiAccessCidr }}
      kubernetesVersion: {{ .k8sVersion }}
      masterPublicName: api.{{ $clusterSubdomain }}.k8s.example.io
      networkCIDR: {{ .vpcCidr }}
      networkID: {{ .vpcId }}
      networking:
        canal: {}
      nonMasqueradeCIDR: 100.64.0.0/10
      sshAccess:
      - {{ .sshAccessCidr }}
      subnets:
      - cidr: {{ $subnetCidr }}
        name: {{ .zone }}
        type: Public
        zone: {{ .zone }}
      topology:
        dns:
          type: Public
        masters: public
        nodes: public
    ---
    apiVersion: kops/v1alpha2
    kind: InstanceGroup
    metadata:
      labels:
        kops.k8s.io/cluster: {{ $clusterSubdomain }}.k8s.example.io
      name: master-{{ .zone }}
    spec:
    {{- if .additionalSecurityGroups }}
      additionalSecurityGroups:
      {{- range .additionalSecurityGroups }}
      - {{ . }}
      {{- end }}
    {{- end }}
      image: {{ .image }}
      machineType: {{ .awsMachineTypeMaster }}
      maxSize: 1
      minSize: 1
      nodeLabels:
        kops.k8s.io/instancegroup: master-{{ .zone }}
      role: Master
      subnets:
      - {{ .zone }}
    ---
    apiVersion: kops/v1alpha2
    kind: InstanceGroup
    metadata:
      labels:
        kops.k8s.io/cluster: {{ $clusterSubdomain }}.k8s.example.io
      name: nodes
    spec:
    {{- if .additionalSecurityGroups }}
      additionalSecurityGroups:
      {{- range .additionalSecurityGroups }}
      - {{ . }}
      {{- end }}
    {{- end }}
      image: {{ .image }}
      machineType: {{ .awsMachineTypeNode }}
      maxSize: {{ .nodeCount }}
      minSize: {{ .nodeCount }}
      nodeLabels:
        kops.k8s.io/instancegroup: nodes
      role: Node
      subnets:
      - {{ .zone }}
    ---
    apiVersion: kops/v1alpha2
    kind: InstanceGroup
    metadata:
      name: ag.{{ $clusterSubdomain }}.k8s.example.io
      labels:
        kops.k8s.io/cluster: {{ $clusterSubdomain }}.k8s.example.io
    spec:
    {{- if .additionalSecurityGroups }}
      additionalSecurityGroups:
      {{- range .additionalSecurityGroups }}
      - {{ . }}
      {{- end }}
    {{- end }}
      image: {{ .image }}
      machineType: {{ .awsMachineTypeAg }}
      maxSize: 10
      minSize: 1
      nodeLabels:
        kops.k8s.io/instancegroup: ag.{{ $clusterSubdomain }}.k8s.example.io
      role: Node
      subnets:
      - {{ .zone }}

    And the values.yaml file:
    # values.yaml:

    region: us-west-2
    zone: us-west-2a
    environment: staging
    image: ami-abc123
    awsMachineTypeNode: c5.large
    awsMachineTypeMaster: m5.xlarge
    awsMachineTypeAg: c5.large
    nodeCount: "2"
    k8sVersion: "1.9.3"
    vpcId: vpc-abc123
    vpcCidr: 172.23.0.0/16
    apiAccessCidr: <e.g. office ip>
    sshAccessCidr: <e.g. office ip>
    additionalSecurityGroups:
    - sg-def234 # kubernetes-standard
    - sg-abc123 # example scan engine targets
    examplePublicKey: "ssh-rsa ..."
    locksmith:
      windowStart: Mon 16:00 # 8am Monday PST
      windowLength: 4h
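    Tying the two files together, the wrapper mentioned above might look roughly like the sketch below; the subdomain, subnet, state-store bucket, and SSH key path are placeholders chosen to match the example values above:

    #!/usr/bin/env bash
    # deploy-cluster.sh -- hypothetical one-shot wrapper around `kops toolbox template`
    set -euo pipefail

    # The template reads these via (env "CLUSTER_SUBDOMAIN") / (env "SUBNET_CIDR").
    export CLUSTER_SUBDOMAIN="staging"
    export SUBNET_CIDR="172.23.16.0/20"          # must sit inside vpcCidr above
    export KOPS_STATE_STORE="s3://my-bucket.example.io"

    CLUSTER_NAME="${CLUSTER_SUBDOMAIN}.k8s.example.io"

    # Render the templated cluster spec with the values file.
    kops toolbox template \
        --values values.yaml \
        --template template.yaml \
        > "cluster-${CLUSTER_SUBDOMAIN}.yaml"

    # `kops create -f` only registers the spec in the state store; pushing the
    # SSH key and running `kops update cluster --yes` is what builds the AWS resources.
    kops create -f "cluster-${CLUSTER_SUBDOMAIN}.yaml"
    kops create secret --name "${CLUSTER_NAME}" sshpublickey admin -i ~/.ssh/id_rsa.pub
    kops update cluster "${CLUSTER_NAME}" --yes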

    2. Migrate the workloads on the cluster

    While I don't have any hands-on experience with Ark, it does seem to fit your use case well:

    Cluster migration

    Using Backups and Restores

    Heptio Ark can help you port your resources from one cluster to another, as long as you point each Ark Config to the same cloud object storage. In this scenario, we are also assuming that your clusters are hosted by the same cloud provider. Note that Heptio Ark does not support the migration of persistent volumes across cloud providers.

    (Cluster 1) Assuming you haven't already been checkpointing your data with the Ark schedule operation, you need to first back up your entire cluster (replacing <BACKUP-NAME> as desired):

    ark backup create <BACKUP-NAME>

    The default TTL is 30 days (720 hours); you can use the --ttl flag to change this as necessary.

    (Cluster 2) Make sure that the persistentVolumeProvider and backupStorageProvider fields in the Ark Config match the ones from Cluster 1, so that your new Ark server instance is pointing to the same bucket.

    (Cluster 2) Make sure that the Ark Backup object has been created. Ark resources are synced with the backup files available in cloud storage.

    (Cluster 2) Once you have confirmed that the right Backup (<BACKUP-NAME>) is now present, you can restore everything with:

    ark restore create --from-backup <BACKUP-NAME>


    Configuring Ark on an AWS cluster looks straightforward: https://github.com/heptio/ark/blob/master/docs/aws-config.md.
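    Putting the quoted steps together, the end-to-end workload migration might look roughly like the sketch below, assuming kubectl contexts named old-cluster and new-cluster and an Ark server already deployed in each cluster against the same S3 bucket (the names are placeholders, and exact subcommands may vary with the Ark version):

    # Source cluster: take a full backup (720h is the default 30-day TTL, shown explicitly).
    kubectl config use-context old-cluster
    ark backup create full-cluster-backup --ttl 720h

    # Destination cluster: wait for the backup to sync from the bucket, then restore.
    kubectl config use-context new-cluster
    ark backup get                                  # confirm full-cluster-backup appears
    ark restore create --from-backup full-cluster-backup
    ark restore get                                 # check for completion and warnings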

    With a bit of initial setup with the kops toolbox script and the Ark configuration, you should have a clean, repeatable way to migrate your clusters and turn your pets into cattle, as the meme goes.

Regarding "amazon-web-services - Kubernetes cluster migration", this is based on a similar question on Stack Overflow: https://stackoverflow.com/questions/51143672/
