ARM64 Adaptation Series
Chapter 1: Deploying KubeSphere and Kubernetes (k8s) on arm64
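Contents
- ARM64 Adaptation Series
- Preface
- 1. Gathering machine information
- 1.1 CPU information
- 1.2 Operating system version
- 1.3 Disk partitions
- 1.4 Checking the kernel
- 2. Upgrading the kernel
- 2.1 Using the Alibaba Cloud arm mirror
- 2.2 Checking BPF support after the upgrade
- 3. Installing the base packages
- 4. Preparing the installation tools
- 4.1 Downloading the kk tool
- 4.2 Preparing config-sample.yaml
- 4.3 Starting the deployment
- 4.4 Replacing the backend image after deployment
- 5. Deployment complete: accessing the web console
- 6. Problems encountered during deployment
- 6.1 Calico fails to start because of BPF
- 6.2 default-http-backend fails to start
- Summary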
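Preface
The business platform I operate has to be deployed in a customer environment that runs on Huawei 910B machines. We don't have any of those in house yet, only older arm64 machines, so I'm doing the adaptation on those first to avoid being caught flat-footed later.
1. Gathering machine information
1.1 CPU information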
lscpu
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 40
On-line CPU(s) list: 0-39
Thread(s) per core: 1
Core(s) per socket: 40
Socket(s): 1
NUMA node(s): 1
Model: 1
CPU max MHz: 2500.0000
CPU min MHz: 600.0000
BogoMIPS: 40.00
L1d cache: unknown size
L1i cache: unknown size
L2 cache: unknown size
L3 cache: unknown size
NUMA node0 CPU(s): 0-39
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid asimdrdm
1.2 Operating system version
hostnamectl
Static hostname: datax3
Icon name: computer-server
Chassis: server
Machine ID: 570e6fdcda17439886d6364f7a3ba217
Boot ID: c6b431eb288d4de4b62a823a7f383e7b
Operating System: CentOS Linux 7 (AltArch)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 4.14.0-115.el7a.0.1.aarch64
Architecture: arm64
1.3 Disk partitions
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.8T 0 disk
├─sda2 8:2 0 1G 0 part /boot
├─sda3 8:3 0 1.8T 0 part
│ ├─centos-swap 253:1 0 15.9G 0 lvm
│ ├─centos-home 253:2 0 1.8T 0 lvm /home
│ └─centos-root 253:0 0 50G 0 lvm /
└─sda1 8:1 0 200M 0 part /boot/efi
1.4 Checking the kernel
The main goal here is to check whether the running kernel supports BPF.
cat /boot/config-$(uname -r) |grep BPF
CONFIG_BPF=y
# CONFIG_BPF_SYSCALL is not set
CONFIG_NETFILTER_XT_MATCH_BPF=m
CONFIG_NET_CLS_BPF=m
# CONFIG_NET_ACT_BPF is not set
CONFIG_BPF_JIT=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_HAVE_EBPF_JIT=y
# CONFIG_TEST_BPF is not set
Problem found: this kernel was built without CONFIG_BPF_SYSCALL, so it needs to be upgraded.
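The manual grep above can be wrapped in a small helper so the same check is repeatable on other machines. This is only a minimal sketch: the function name check_bpf_config is hypothetical, and the default option list (CONFIG_BPF and CONFIG_BPF_SYSCALL) is an assumption based on what this article ran into, not an official requirement list.

```shell
# check_bpf_config <config-file> [option ...]
# Prints "missing: <option>" for every option that is not set to "y"
# and returns non-zero if at least one option is missing.
check_bpf_config() {
  config_file="$1"
  shift
  # Assumption: with no explicit options, check the two named in this article.
  [ "$#" -gt 0 ] || set -- CONFIG_BPF CONFIG_BPF_SYSCALL
  missing=0
  for opt in "$@"; do
    # Match the whole line so CONFIG_BPF does not match CONFIG_BPF_JIT.
    if ! grep -q "^${opt}=y\$" "$config_file"; then
      echo "missing: ${opt}"
      missing=1
    fi
  done
  return "$missing"
}

# Example on the target machine:
#   check_bpf_config "/boot/config-$(uname -r)"
```

On the 4.14 kernel shown above, this would be expected to report CONFIG_BPF_SYSCALL, because that line reads "is not set" rather than "=y".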
2. Upgrading the kernel
2.1 Using the Alibaba Cloud arm mirror
Alibaba Cloud arm mirror: https://developer.aliyun.com/mirror/centos-altarch/?spm=a2c6h.13651104.d-2001.3.40cd320cKIvAMX
# Fetch the repo file
wget http://mirrors.aliyun.com/repo/Centos-altarch-7.repo -O /etc/yum.repos.d/CentOS-Base.repo
# Upgrade the kernel
yum clean all
yum makecache
yum list kernel
yum update -y kernel
reboot
2.2 Checking BPF support after the upgrade
hostnamectl
Static hostname: datax3
Icon name: computer-server
Chassis: server
Machine ID: 570e6fdcda17439886d6364f7a3ba217
Boot ID: c6b431eb288d4de4b62a823a7f383e7b
Operating System: CentOS Linux 7 (AltArch)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 4.18.0-348.20.1.el7.aarch64
Architecture: arm64
cat /boot/config-$(uname -r) |grep BPF
CONFIG_CGROUP_BPF=y
CONFIG_BPF=y
CONFIG_BPF_LSM=y
CONFIG_BPF_SYSCALL=y
CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_BPF_JIT_DEFAULT_ON=y
# CONFIG_BPF_PRELOAD is not set
CONFIG_NETFILTER_XT_MATCH_BPF=m
# CONFIG_BPFILTER is not set
CONFIG_NET_CLS_BPF=m
CONFIG_NET_ACT_BPF=m
CONFIG_BPF_JIT=y
CONFIG_BPF_STREAM_PARSER=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_EVENTS=y
CONFIG_TEST_BPF=m
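CONFIG_BPF_SYSCALL=y is now present, so the kernel is ready.
3. Installing the base packages
Docker's DNS settings need to be configured at this step.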
# Add the Docker CE repo from the Alibaba Cloud mirror
yum install -y yum-utils
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Keep Docker's data on the large /home volume: /var/lib/docker becomes a symlink
mkdir -p /home/data/docker_data/docker/
ln -s /home/data/docker_data/docker/ /var/lib/
yum install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
# Configure DNS, the systemd cgroup driver, and the log size limit
mkdir -p /etc/docker
cat > /etc/docker/daemon.json <<EOF
{
  "dns": ["8.8.8.8", "114.114.114.114"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {"max-size": "100m"}
}
EOF
systemctl start docker
systemctl enable docker
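If the same daemon.json has to be produced on several nodes with different DNS servers, it could be generated by a helper instead of being pasted by hand. The following is a sketch under assumptions: write_daemon_json is a hypothetical name, it emits the same four keys used in this article, and the output path is a parameter so the result can be inspected in a scratch file before anything under /etc/docker is touched.

```shell
# write_daemon_json <output-file> <dns-server> [dns-server ...]
# Writes a Docker daemon.json with the given DNS servers, the systemd
# cgroup driver, and a 100m limit on json-file logs.
write_daemon_json() {
  out="$1"
  shift
  dns_list=""
  for ip in "$@"; do
    # Build a comma-separated list of quoted addresses.
    dns_list="${dns_list:+${dns_list}, }\"${ip}\""
  done
  cat > "$out" <<EOF
{
  "dns": [${dns_list}],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {"max-size": "100m"}
}
EOF
}

# Example: write to a scratch file first, review it, then copy it into place.
#   write_daemon_json /tmp/daemon.json 8.8.8.8 114.114.114.114
```

After Docker restarts with the new file, `docker info --format '{{.CgroupDriver}}'` should print systemd if the setting took effect.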
4. Preparing the installation tools
4.1 Downloading the kk tool
kk release page: https://github.com/kubesphere/kubekey/releases/tag/v3.1.8
wget https://github.com/kubesphere/kubekey/releases/download/v3.1.8/kubekey-v3.1.8-linux-arm64.tar.gz
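When the same steps are scripted for both arm64 and x86 hosts, the architecture part of the file name can be derived from `uname -m` rather than hard-coded. This is a sketch: kk_arch and kk_url are hypothetical helpers, and the URL pattern is simply copied from the arm64 link above. That an amd64 archive follows the same naming is an assumption, so check the release page before relying on it.

```shell
# kk_arch <machine>: map `uname -m` output to the architecture label
# used in the release file name.
kk_arch() {
  case "$1" in
    aarch64|arm64) echo arm64 ;;
    x86_64|amd64)  echo amd64 ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# kk_url <version> <machine>: build a download URL that follows the
# pattern of the arm64 link used in this article.
kk_url() {
  arch=$(kk_arch "$2") || return 1
  echo "https://github.com/kubesphere/kubekey/releases/download/$1/kubekey-$1-linux-${arch}.tar.gz"
}

# Example:
#   wget "$(kk_url v3.1.8 "$(uname -m)")"
```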
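4.2 Preparing config-sample.yaml
# Unpack the archive downloaded above
tar -xvf kubekey-v3.1.8-linux-arm64.tar.gz
chmod a+x ./kk
# Create the config file
./kk create config --with-kubernetes v1.23.17 --with-kubesphere
Edit the machine information in the config file and add the architecture setting arch: arm64 to each host entry.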
apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  - {name: datax3, address: xxx.xxx.103.6, internalAddress: xxx.xxx.103.6, user: root, arch: arm64, password: "smartcore"}
  roleGroups:
    etcd:
    - datax3
    control-plane:
    - datax3
    worker:
    - datax3
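Forgetting arch: arm64 on one host entry is an easy mistake once the config has more than one node, so a quick pre-flight check may be worth having. The sketch below is a line-based text scan, not a YAML parser: hosts_without_arch is a hypothetical helper that assumes each host is written in the single-line `- {name: ..., ...}` form that kk generates, as in the config above.

```shell
# hosts_without_arch <config-file> <arch>
# Prints the name of every single-line host entry that does not
# declare the expected architecture.
hosts_without_arch() {
  awk -v want="$2" '
    /^[ \t]*-[ \t]*[{]name:/ {
      if ($0 !~ ("arch:[ \t]*" want)) {
        name = $0
        sub(/^[^:]*:[ \t]*/, "", name)   # drop everything up to "name: "
        sub(/[,}].*$/, "", name)         # keep only the host name
        print name
      }
    }' "$1"
}

# Example: an empty result means every host entry carries arch: arm64.
#   hosts_without_arch config-sample.yaml arm64
```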
Official documentation on the architecture setting: https://kubesphere.io/zh/docs/v4.1/03-installation-and-upgrade/02-install-kubesphere/02-install-kubernetes-and-kubesphere/
4.3 Starting the deployment
export KKZONE=cn
./kk create cluster -f /home/k8s-one-node/config-sample.yaml -y --debug
4.4 Replacing the backend image after deployment
# Pull the arm64 image through a mirror reachable from China
sudo docker pull hub.fast360.xyz/mirrorgooglecontainers/defaultbackend-arm64:1.4
# Retag it after the pull
docker tag hub.fast360.xyz/mirrorgooglecontainers/defaultbackend-arm64:1.4 mirrorgooglecontainers/defaultbackend-arm64:1.4
# Point the deployment at the arm64 image
kubectl set image deployment/default-http-backend default-http-backend=mirrorgooglecontainers/defaultbackend-arm64:1.4 -n kubesphere-controls-system
kubectl rollout restart deployment/default-http-backend -n kubesphere-controls-system
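5. Deployment complete: accessing the web console
I confirmed that the console is reachable and works without problems.
6. Problems encountered during deployment
6.1 Calico fails to start because of BPF
Error message: Error from server (BadRequest): pod ks-installer-ddbcf44f8-8zhb5 does not have a host assigned
Tracing this down showed that Calico was the cause.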
kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-6f996c8485-7f6rf 0/1 Pending 0 20m
kube-system calico-node-q82bk 0/1 Init:0/3 0 20m
kube-system coredns-5667b47695-qsd6f 0/1 Pending 0 20m
kube-system coredns-5667b47695-rttmr 0/1 Pending 0 20m
kube-system kube-apiserver-datax3 1/1 Running 0 21m
kube-system kube-controller-manager-datax3 1/1 Running 0 21m
kube-system kube-proxy-2h4xf 1/1 Running 0 20m
kube-system kube-scheduler-datax3 1/1 Running 0 21m
kube-system nodelocaldns-bjfm7 1/1 Running 0 20m
kube-system openebs-localpv-provisioner-7bbcf865cd-pmk7s 0/1 Pending 0 20m
kubesphere-system ks-installer-ddbcf44f8-8zhb5 0/1 Pending 0 20m
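With a longer pod list it is easier to filter out the healthy pods and look only at the rest. The following is a minimal sketch: not_running_pods is a hypothetical helper that reads the plain-text output of `kubectl get pod -A` on stdin and assumes the default column order shown above (STATUS in the fourth column).

```shell
# not_running_pods: read `kubectl get pod -A` output on stdin and print
# "namespace/name status" for every pod whose STATUS is neither
# Running nor Completed. The header line is skipped.
not_running_pods() {
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# Example:
#   kubectl get pod -A | not_running_pods
```

Applied to the listing above, this would leave the two calico pods, the two coredns pods, openebs-localpv-provisioner, and ks-installer, which points at the network plugin as the common blocker.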
Inspect the pod calico-kube-controllers-6f996c8485-7f6rf:
kubectl describe pods calico-kube-controllers-6f996c8485-7f6rf -n kube-system
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 31s (x22 over 22m) default-scheduler 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
Inspect the pod calico-node-q82bk:
kubectl describe pods calico-node-q82bk -n kube-system
Warning FailedMount 7s (x2 over 2m25s) kubelet (combined from similar events): Unable to attach or mount volumes: unmounted volumes=[bpffs], unattached volumes=[var-run-calico bpffs kube-api-access-f8tww xtables-lock policysync host-local-net-dir cni-bin-dir cni-log-dir sys-fs nodeproc lib-modules cni-net-dir var-lib-calico]: timed out waiting for the condition
Warning FailedMount 2s (x19 over 22m) kubelet MountVolume.SetUp failed for volume "bpffs" : hostPath type check failed: /sys/fs/bpf is not a directory
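Check the kernel's BPF support and make sure both CONFIG_BPF and CONFIG_BPF_SYSCALL are set to y.
eBPF has been available since Linux 3.18, but the 4.14 kernel on this machine was built without CONFIG_BPF_SYSCALL.
The fix is a kernel upgrade. The Alibaba Cloud mirror carries a 4.18 kernel, so I skipped a manual build and upgraded directly with yum. Fortunately, that was enough to resolve the problem.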
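The FailedMount event complains that /sys/fs/bpf is not a directory, so that path is worth checking directly before and after the kernel upgrade. The sketch below rests on assumptions: bpffs_status is a hypothetical helper, and it reads a mounts table in the /proc/mounts format, where the second field is the mount point and the third is the filesystem type. Both paths are parameters so the function can also be exercised against test files.

```shell
# bpffs_status [dir] [mounts-file]
# Prints one of:
#   not-a-directory  the path does not exist as a directory (returns 1)
#   directory-only   the directory exists but no bpf filesystem is mounted on it
#   mounted          a filesystem of type "bpf" is mounted on the directory
bpffs_status() {
  dir="${1:-/sys/fs/bpf}"
  mounts="${2:-/proc/mounts}"
  if [ ! -d "$dir" ]; then
    echo "not-a-directory"
    return 1
  fi
  # Field 2 is the mount point, field 3 the filesystem type.
  if awk -v d="$dir" '$2 == d && $3 == "bpf" { found = 1 } END { exit !found }' "$mounts"; then
    echo "mounted"
  else
    echo "directory-only"
  fi
}

# Example on the node:
#   bpffs_status
```

On the original 4.14 kernel this would be expected to print not-a-directory, which matches the hostPath type check failure reported by the kubelet.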
6.2 default-http-backend fails to start
Error message:
kubesphere-controls-system default-http-backend-659cc67b6b-652n7 0/1 CrashLoopBackOff 5 (87s ago) 6m6s
To trace it down, inspect the pod default-http-backend-659cc67b6b-652n7:
kubectl describe pods default-http-backend-659cc67b6b-652n7 -n kubesphere-controls-system
Normal Scheduled 8m26s default-scheduler Successfully assigned kubesphere-controls-system/default-http-backend-659cc67b6b-652n7 to datax3
Warning FailedCreatePodSandBox 8m8s (x2 over 8m16s) kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "default-http-backend-659cc67b6b-652n7": Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: unable to freeze: unknown
Normal SandboxChanged 8m7s (x2 over 8m15s) kubelet Pod sandbox changed, it will be killed and re-created.
Normal Pulling 7m56s kubelet Pulling image "registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4"
Normal Pulled 7m19s kubelet Successfully pulled image "registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4" in 5.626141615s (37.059010311s including waiting)
Normal Pulled 6m23s (x3 over 7m12s) kubelet Container image "registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4" already present on machine
Normal Created 6m22s (x4 over 7m18s) kubelet Created container default-http-backend
Normal Started 6m18s (x4 over 7m13s) kubelet Started container default-http-backend
Warning BackOff 3m14s (x23 over 7m5s) kubelet Back-off restarting failed container
The image is wrong: the deployment pulls defaultbackend-amd64 onto an arm64 node. I searched the forum for a matching thread.
Thread: https://ask.kubesphere.com.cn/forum/d/8874-arm-default-http-backend-elasticsearch-logging-curator/11
I followed the fix described in that thread.
# Pull the arm64 image through a mirror reachable from China
sudo docker pull hub.fast360.xyz/mirrorgooglecontainers/defaultbackend-arm64:1.4
# Retag it after the pull
docker tag hub.fast360.xyz/mirrorgooglecontainers/defaultbackend-arm64:1.4 mirrorgooglecontainers/defaultbackend-arm64:1.4
# Replace the image in the deployment
kubectl set image deployment/default-http-backend default-http-backend=mirrorgooglecontainers/defaultbackend-arm64:1.4 -n kubesphere-controls-system
kubectl rollout restart deployment/default-http-backend -n kubesphere-controls-system
Check the cluster state after the replacement:
kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-6f996c8485-7b7cw 1/1 Running 0 24m
kube-system calico-node-qljdf 1/1 Running 0 24m
kube-system coredns-7bfd7cb54c-ctcps 1/1 Running 0 24m
kube-system coredns-7bfd7cb54c-nb7xz 1/1 Running 0 24m
kube-system kube-apiserver-datax3 1/1 Running 0 24m
kube-system kube-controller-manager-datax3 1/1 Running 0 24m
kube-system kube-proxy-s4scz 1/1 Running 0 24m
kube-system kube-scheduler-datax3 1/1 Running 0 24m
kube-system nodelocaldns-pxmfx 1/1 Running 0 24m
kube-system openebs-localpv-provisioner-7bbcf865cd-qr8qq 1/1 Running 0 24m
kube-system snapshot-controller-0 1/1 Running 0 19m
kubesphere-controls-system default-http-backend-658d66d59f-mvxmf 1/1 Running 0 2m23s
kubesphere-controls-system kubectl-admin-7966644f4b-9rdj6 1/1 Running 0 7m
kubesphere-monitoring-system alertmanager-main-0 2/2 Running 0 12m
kubesphere-monitoring-system kube-state-metrics-856b7b8fdd-f4ltb 3/3 Running 0 13m
kubesphere-monitoring-system node-exporter-h9dgm 2/2 Running 0 13m
kubesphere-monitoring-system notification-manager-deployment-6cd86468dc-f99jx 2/2 Running 0 10m
kubesphere-monitoring-system notification-manager-operator-b9d6bf9d4-4n8wx 2/2 Running 0 12m
kubesphere-monitoring-system prometheus-k8s-0 2/2 Running 0 13m
kubesphere-monitoring-system prometheus-operator-684988fc5c-c6dbn 2/2 Running 0 13m
kubesphere-system ks-apiserver-68648cb47c-9sg6w 1/1 Running 0 16m
kubesphere-system ks-console-777b56767b-vl8sp 1/1 Running 0 16m
kubesphere-system ks-controller-manager-86f56844c-jwnzb 1/1 Running 0 16m
kubesphere-system ks-installer-ddbcf44f8-6scmx 1/1 Running 0 23m
All pods are healthy, and the web console loads normally.
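The crash in this section came from an image whose name carries the wrong architecture suffix, and other components may have the same problem. A rough way to look for them is to scan the image names in use. This is a heuristic sketch only: foreign_arch_images is a hypothetical helper that matches on the "-amd64" or "-arm64" suffix in the image name, so it will miss images that do not encode the architecture in their name.

```shell
# foreign_arch_images <node-arch>
# Reads image references on stdin (one per line) and prints, without
# duplicates, those whose name ends in the other architecture's suffix.
foreign_arch_images() {
  node_arch="$1"
  case "$node_arch" in
    arm64) other="amd64" ;;
    amd64) other="arm64" ;;
    *) echo "unsupported architecture: $node_arch" >&2; return 1 ;;
  esac
  # The suffix must be followed by a tag (:), a digest (@), or end of line.
  grep -E -- "-${other}(:|@|\$)" | sort -u
}

# Example: list every container image in the cluster and filter it.
#   kubectl get pods -A -o jsonpath='{range .items[*]}{range .spec.containers[*]}{.image}{"\n"}{end}{end}' \
#     | foreign_arch_images arm64
```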
Summary
It works well: the experience is almost identical to deploying on x86.