Disaster Recovery
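This feature lets you back up a Kubernetes cluster and restore it after a disaster. A backup captures the state of the cluster at the time the backup is created, including the package service configuration and any existing Kubernetes resources. The feature is exposed through two dcos kubernetes subcommands: restore and backup.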
Backup artifacts are currently stored in an AWS S3 bucket, so you must install the AWS CLI and complete the following steps:
- Create an IAM user:
aws iam create-user --user-name heptio-ark
- Attach policies that grant the heptio-ark user the required permissions:
aws iam attach-user-policy \
--policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess \
--user-name heptio-ark
aws iam attach-user-policy \
--policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess \
--user-name heptio-ark
- Create an access key for the user:
aws iam create-access-key --user-name heptio-ark
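The access key created in the last step has two parts, both of which are needed later as flag values. The following is a minimal sketch of one way to capture them without copying by hand, assuming the AWS CLI is installed and configured with credentials that are allowed to manage IAM. The function names (parse_access_key, create_ark_access_key) and variable names (ARK_ACCESS_KEY_ID, ARK_SECRET_ACCESS_KEY) are hypothetical and chosen only for illustration.

```shell
# Split the "ID<TAB>SECRET" line produced by --output text into two variables.
# ARK_ACCESS_KEY_ID and ARK_SECRET_ACCESS_KEY are names chosen for this sketch.
parse_access_key() {
  read -r ARK_ACCESS_KEY_ID ARK_SECRET_ACCESS_KEY <<EOF
$1
EOF
}

# Ask IAM for a new key and print only the two fields needed later.
# --query and --output are global AWS CLI options.
create_ark_access_key() {
  aws iam create-access-key --user-name heptio-ark \
    --query 'AccessKey.[AccessKeyId,SecretAccessKey]' \
    --output text
}
```

A call such as parse_access_key "$(create_ark_access_key)" would then leave both values in the current shell. AWS returns the secret access key only at creation time, so it should be stored somewhere safe.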
Backing up the cluster
To back up your cluster, use the dcos kubernetes backup command:
usage: dcos kubernetes backup [<flags>]
Flags:
-h, --help Show context-sensitive help.
-v, --verbose Enable extra logging of requests/responses
--name="kubernetes" Name of the service instance to query
--aws-secret-access-key="" AWS secret access key
--aws-access-key-id="" AWS access key id
--aws-region="" AWS S3 region
--aws-bucket="" AWS bucket name
--backup-ttl=720h How long before backup can be garbage collected
--timeout=1200s Maximum time to wait for the backup process to complete
The --aws-region, --aws-bucket, --aws-access-key-id, and --aws-secret-access-key flags are required.
$ dcos kubernetes backup --aws-region=us-east-1 --aws-bucket=my_bucket --aws-access-key-id=ABC --aws-secret-access-key=XYZ
Backup creation: [COMPLETE]
Backup has been successfully created!
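Because four flags are required on every invocation, a small wrapper can catch a missing value before the CLI is called. This is only a sketch: it assumes the values are held in shell variables named DR_AWS_REGION, DR_AWS_BUCKET, DR_AWS_ACCESS_KEY_ID, and DR_AWS_SECRET_ACCESS_KEY, which are hypothetical names that the dcos CLI itself does not read, and the helper functions are likewise invented for illustration.

```shell
# Fail early if any of the four required values is empty.
# The DR_* variable names are assumptions of this sketch, not read by dcos.
check_backup_settings() {
  for name in DR_AWS_REGION DR_AWS_BUCKET DR_AWS_ACCESS_KEY_ID DR_AWS_SECRET_ACCESS_KEY; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required setting: $name" >&2
      return 1
    fi
  done
}

# Run the backup with the required flags; extra arguments (for example
# --backup-ttl=240h) are passed through unchanged.
run_backup() {
  check_backup_settings || return 1
  dcos kubernetes backup \
    --aws-region="$DR_AWS_REGION" \
    --aws-bucket="$DR_AWS_BUCKET" \
    --aws-access-key-id="$DR_AWS_ACCESS_KEY_ID" \
    --aws-secret-access-key="$DR_AWS_SECRET_ACCESS_KEY" \
    "$@"
}
```

With the four variables set, run_backup --backup-ttl=240h would issue the same command as the example above with a shorter retention period.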
Restoring the cluster
The restore subcommand retrieves the backup artifacts from S3 and imports the saved state into a newly provisioned cluster.
usage: dcos kubernetes restore [<flags>]
Flags:
-h, --help Show context-sensitive help.
-v, --verbose Enable extra logging of requests/responses
--name="kubernetes" Name of the service instance to query
--aws-secret-access-key="" AWS secret access key
--aws-access-key-id="" AWS access key id
--aws-region="" AWS S3 region
--aws-bucket="" AWS bucket name
--backup-name="kubernetes-backup" The name of the backup
--timeout=1200s Maximum time to wait for the restore process completion
--yes Disable interactive mode and assume "yes" is the answer to all prompts
The --aws-region, --aws-bucket, --aws-access-key-id, and --aws-secret-access-key flags are required.
$ dcos kubernetes restore --aws-region=us-east-1 --aws-bucket=my_bucket --aws-access-key-id=ABC --aws-secret-access-key=XYZ
Backup restore: [COMPLETE]
Backup successfully restored!
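For unattended use, the --backup-name and --yes flags listed in the help text can be combined so that a specific backup is restored without prompts. The sketch below assumes the four required values are held in shell variables named DR_AWS_REGION, DR_AWS_BUCKET, DR_AWS_ACCESS_KEY_ID, and DR_AWS_SECRET_ACCESS_KEY; these names and the run_restore function are hypothetical and are not read or provided by the dcos CLI.

```shell
# Restore a named backup without interactive prompts.
# $1 is the backup name; it falls back to the default shown in the help text.
# The DR_* variables are assumptions of this sketch, not read by dcos itself.
run_restore() {
  backup_name=${1:-kubernetes-backup}
  dcos kubernetes restore \
    --aws-region="$DR_AWS_REGION" \
    --aws-bucket="$DR_AWS_BUCKET" \
    --aws-access-key-id="$DR_AWS_ACCESS_KEY_ID" \
    --aws-secret-access-key="$DR_AWS_SECRET_ACCESS_KEY" \
    --backup-name="$backup_name" \
    --yes
}
```

Calling run_restore with no argument targets the default backup name. Because --yes skips every confirmation, this form is best reserved for scripts where the target cluster is known to be the intended one.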
How to test it
- On a running Kubernetes cluster, deploy a few pods:
$ kubectl create -f ./artifacts/nginx/nginx-deployment.yaml
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default nginx-6c54bd5869-pt62l 1/1 Running 0 39s
default nginx-6c54bd5869-xt82y 1/1 Running 0 39s
- Create a backup of the cluster:
$ dcos kubernetes backup --aws-region=us-east-1 --aws-bucket=my_bucket --aws-access-key-id=ABC --aws-secret-access-key=XYZ
- Delete the deployment created earlier:
$ kubectl delete -f ./artifacts/nginx/nginx-deployment.yaml
- Restore the backup and verify that the pods are running again:
$ dcos kubernetes restore --aws-region=us-east-1 --aws-bucket=my_bucket --aws-access-key-id=ABC --aws-secret-access-key=XYZ
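The verification in the final step can be scripted by polling kubectl until the expected pods report Running. The following is a rough sketch under stated assumptions: the function names count_running and wait_for_pods are hypothetical, the column positions match the kubectl get pods --all-namespaces output shown earlier in this test, and the 10-second interval and default of 30 attempts are arbitrary choices.

```shell
# Read `kubectl get pods --all-namespaces` output on stdin and print how many
# pods have a NAME starting with $1 and a STATUS of Running.
count_running() {
  awk -v prefix="$1" \
    'index($2, prefix) == 1 && $4 == "Running" { n++ } END { print n + 0 }'
}

# Poll until at least $2 pods whose names start with $1 are Running.
# $3 is the maximum number of attempts (default 30, chosen arbitrarily).
wait_for_pods() {
  prefix=$1
  expected=$2
  attempts=${3:-30}
  while [ "$attempts" -gt 0 ]; do
    running=$(kubectl get pods --all-namespaces | count_running "$prefix")
    if [ "$running" -ge "$expected" ]; then
      return 0
    fi
    attempts=$((attempts - 1))
    if [ "$attempts" -gt 0 ]; then sleep 10; fi
  done
  return 1
}
```

After the restore completes, wait_for_pods nginx 2 would return success once both nginx pods from the example deployment are running again, and a non-zero status if they do not appear within the allotted attempts.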