Complete Guide to Deploying a Multi-Master, Multi-Node Kubernetes Cluster
Long time no see! Today I'm serving up some solid hands-on material (really my own working notes) — no filler from start to finish.
1. Machine List
PS: The master, lb, and nfs machines all run CentOS 7; the rest run Ubuntu 22.04 LTS.
Machine | IP Address | Notes |
---|---|---|
lb1 | 192.168.1.120 | Load balancer 1 |
lb2 | 192.168.1.119 | Load balancer 2 |
master-a | 192.168.1.74 | Master node 1 |
master-b | 192.168.1.93 | Master node 2 |
master-c | 192.168.1.107 | Master node 3 |
node01 | 192.168.1.13 | Worker node 1 |
master01 | 192.168.1.53 | Node from the original single-master cluster (now a worker) |
vip | 192.168.1.150 | Virtual IP address |
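Before any configuration, it helps to confirm every machine is reachable. A quick sketch (not part of the original procedure, and assuming ICMP is permitted; the VIP 192.168.1.150 will only answer once Keepalived is up):
# Quick reachability check across the machine list
for ip in 192.168.1.120 192.168.1.119 192.168.1.74 192.168.1.93 \
          192.168.1.107 192.168.1.13 192.168.1.53; do
    ping -c 1 -W 1 "$ip" >/dev/null && echo "$ip OK" || echo "$ip UNREACHABLE"
done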
2. Load Balancer Configuration
2.1 lb1 Configuration
2.1.1 Base Environment Setup
# Back up the existing yum repo file
cp /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bak
# Switch to the Aliyun yum mirror
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
# Clean and rebuild the yum cache
yum clean all
yum makecache
# Set the hostname
hostnamectl set-hostname lb1
# Set the timezone
timedatectl set-timezone Asia/Shanghai
# Stop and disable the firewall
systemctl stop firewalld
systemctl disable firewalld
# Disable SELinux
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config
2.1.2 Install Required Components
# Install base tools
yum install -y curl socat conntrack ebtables ipset ipvsadm
# Install the load-balancing components
yum install -y keepalived haproxy psmisc
2.1.3 Configure HAProxy
Edit the configuration file /etc/haproxy/haproxy.cfg:
global
    log /dev/log local0 warning
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    log global
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend kube-apiserver
    bind *:6443
    mode tcp
    option tcplog
    default_backend kube-apiserver

backend kube-apiserver
    mode tcp
    option tcplog
    option tcp-check
    balance roundrobin
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
    server kube-apiserver-1 192.168.1.74:6443 check
    server kube-apiserver-2 192.168.1.93:6443 check
    server kube-apiserver-3 192.168.1.107:6443 check
Start the HAProxy service:
systemctl restart haproxy
systemctl enable haproxy
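Before (or right after) restarting, it's worth syntax-checking the config and confirming the frontend is listening. A small sanity check, not in the original steps:
# Validate the config file (exit code 0 on success)
haproxy -c -f /etc/haproxy/haproxy.cfg
# Confirm HAProxy is listening on 6443
ss -lnt | grep 6443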
2.1.4 Configure Keepalived
Edit the configuration file /etc/keepalived/keepalived.conf:
global_defs {
    notification_email {}
    router_id LVS_DEVEL
    vrrp_skip_check_adv_addr
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}

vrrp_instance haproxy-vip {
    state BACKUP
    priority 100
    interface ens192                # adjust to the actual NIC name
    virtual_router_id 60
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    unicast_src_ip 192.168.1.120    # this machine's IP
    unicast_peer {
        192.168.1.119               # peer lb2's IP
    }
    virtual_ipaddress {
        192.168.1.150/24            # the VIP
    }
    track_script {
        chk_haproxy
    }
}
Start the Keepalived service:
systemctl restart keepalived
systemctl enable keepalived
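Once Keepalived is up, the VIP should be attached to the NIC on whichever LB holds MASTER state. A quick check (the interface name ens192 is assumed, as in the config above):
# The VIP should appear on the current MASTER
ip addr show ens192 | grep 192.168.1.150
# VRRP state transitions are logged via syslog
grep -i vrrp /var/log/messages | tail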
2.2 lb2 Configuration
lb2 is configured almost identically to lb1; the main differences are:
- The hostname is set to lb2
- unicast_src_ip in the Keepalived configuration becomes 192.168.1.119
- unicast_peer in the Keepalived configuration becomes 192.168.1.120
Refer to the lb1 steps for the full procedure.
2.2.1 Base Environment Setup
# Back up the existing yum repo file
cp /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bak
# Switch to the Aliyun yum mirror
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
# Clean and rebuild the yum cache
yum clean all
yum makecache
# Set the hostname
hostnamectl set-hostname lb2
# Set the timezone
timedatectl set-timezone Asia/Shanghai
# Stop and disable the firewall
systemctl stop firewalld
systemctl disable firewalld
# Disable SELinux
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config
2.2.2 Install Required Components
# Install base tools
yum install -y curl socat conntrack ebtables ipset ipvsadm
# Install the load-balancing components
yum install -y keepalived haproxy psmisc
2.2.3 Configure HAProxy
Edit the configuration file /etc/haproxy/haproxy.cfg:
global
    log /dev/log local0 warning
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    log global
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend kube-apiserver
    bind *:6443
    mode tcp
    option tcplog
    default_backend kube-apiserver

backend kube-apiserver
    mode tcp
    option tcplog
    option tcp-check
    balance roundrobin
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
    server kube-apiserver-1 192.168.1.74:6443 check
    server kube-apiserver-2 192.168.1.93:6443 check
    server kube-apiserver-3 192.168.1.107:6443 check
Start the HAProxy service:
systemctl restart haproxy
systemctl enable haproxy
2.2.4 Configure Keepalived
Edit the configuration file /etc/keepalived/keepalived.conf:
global_defs {
    notification_email {}
    router_id LVS_DEVEL
    vrrp_skip_check_adv_addr
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}

vrrp_instance haproxy-vip {
    state BACKUP
    priority 100
    interface ens192                # adjust to the actual NIC name
    virtual_router_id 60
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    unicast_src_ip 192.168.1.119    # this machine's IP
    unicast_peer {
        192.168.1.120               # peer lb1's IP
    }
    virtual_ipaddress {
        192.168.1.150/24            # the VIP
    }
    track_script {
        chk_haproxy
    }
}
Start the Keepalived service:
systemctl restart keepalived
systemctl enable keepalived
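With both load balancers running, you can exercise a failover by killing HAProxy on the VIP holder. A hedged test sketch (run on whichever LB currently owns 192.168.1.150):
# chk_haproxy runs "killall -0 haproxy", so stopping HAProxy drops this
# node's effective VRRP priority and the VIP should move to the peer
systemctl stop haproxy
sleep 5
ip addr show ens192 | grep 192.168.1.150 || echo "VIP released, check the peer"
systemctl start haproxy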
3. NFS Server Configuration
# Extract the NFS install package
tar zxf nfs/nfs.tar.gz
# Install the NFS service
yum -y localinstall nfs-rpm/*.rpm
# Configure the NFS export
cat > /etc/exports << EOF
/nfs-data/data *(rw,sync,no_root_squash,no_subtree_check)
EOF
# Start the NFS service
systemctl start nfs
systemctl enable nfs
systemctl restart nfs
# Install the NFS client tools on the other nodes
yum install -y nfs-utils
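To verify the export from a client, a hedged test mount (the NFS server's own IP isn't listed in the machine table above, so <nfs-server-ip> is a placeholder):
# List the exports published by the server
showmount -e <nfs-server-ip>
# Test mount, write, and clean up
mkdir -p /mnt/nfs-test
mount -t nfs <nfs-server-ip>:/nfs-data/data /mnt/nfs-test
touch /mnt/nfs-test/.write-test && rm /mnt/nfs-test/.write-test
umount /mnt/nfs-test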
4. Master Node Configuration
4.1 master-a Configuration
4.1.1 Base Environment Setup
# Switch to the Aliyun yum mirror
cp /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bak
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
yum clean all
yum makecache
# Set the hostname
hostnamectl set-hostname master-a
# Set the timezone
timedatectl set-timezone Asia/Shanghai
# Disable the firewall and SELinux
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config
# Configure the hosts file
cat >> /etc/hosts << EOF
192.168.1.53 master01
192.168.1.13 node01
192.168.1.81 ai3
192.168.1.74 master-a
192.168.1.93 master-b
192.168.1.107 master-c
EOF
# Configure time synchronization
yum install -y ntpdate
ntpdate cn.pool.ntp.org
echo "*/5 * * * * root /usr/sbin/ntpdate cn.pool.ntp.org &>/dev/null" >> /etc/crontab
4.1.2 Install Docker
# Extract the Docker install package
tar xf docker-20.10.23.tgz -C /usr/local
# Copy the Docker binaries
cp /usr/local/docker/* /usr/bin/
# Create the Docker systemd service file
cat > /usr/lib/systemd/system/docker.service <<EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP \$MAINPID
LimitNOFILE=infinity
LimitNPROC=infinity
TimeoutStartSec=0
Delegate=yes
KillMode=process
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target
EOF
# Configure Docker
mkdir -p /etc/docker
cat > /etc/docker/daemon.json << EOF
{ "insecure-registries":["192.168.1.13:5000"],"exec-opts": ["native.cgroupdriver=systemd"],"data-root": "/home/docker","log-opts": {"max-size": "10m","max-file": "3"}
}
EOF
# Start the Docker service
systemctl daemon-reload
systemctl start docker
systemctl enable docker
docker -v
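Since kubelet here relies on the systemd cgroup driver (set via exec-opts above), it's worth confirming Docker picked it up. A quick check, not in the original steps:
# Should report "Cgroup Driver: systemd"; a mismatch with kubelet's driver
# is a common cause of kubeadm init/join failures
docker info 2>/dev/null | grep -i 'cgroup driver'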
4.1.3 Install Kubernetes Components
# Extract the Kubernetes install package
tar -xvf k8s-v1.23.16.tar
# Disable swap
swapoff -a
sed -i 's/.*swap.*/#&/g' /etc/fstab
# Configure kernel parameters
cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
net.ipv6.conf.all.disable_ipv6 = 0
net.ipv6.conf.default.disable_ipv6 = 0
net.ipv6.conf.lo.disable_ipv6 = 0
net.ipv6.conf.all.forwarding = 1
EOF
sysctl -p /etc/sysctl.d/k8s.conf
# Allow forwarding
iptables -P FORWARD ACCEPT
# Install the Kubernetes RPM packages
yum -y localinstall k8s-rpm/*.rpm
Install NFS:
mkdir tmp
tar zxf nfs/nfs.tar.gz -C tmp
yum -y localinstall tmp/nfs-rpm/*.rpm
systemctl start nfs
systemctl enable nfs
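One hedged addition not in the original steps: the net.bridge.* keys in k8s.conf only exist once the br_netfilter kernel module is loaded, so if sysctl -p complains about missing keys, load it first and persist it across reboots:
# Load br_netfilter now and on every boot; without it the
# net.bridge.bridge-nf-call-* sysctls are not available
modprobe br_netfilter
echo br_netfilter > /etc/modules-load.d/br_netfilter.conf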
4.1.4 Initialize the Kubernetes Cluster
Create the kubeadm configuration file kubeadm-config.yaml:
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.1.74
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  imagePullPolicy: IfNotPresent
  name: master-a
  taints: null
---
apiServer:
  certSANs:
  - master-a
  - master-b
  - master-c
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: 192.168.1.150:6443
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: 192.168.1.13:5000
kind: ClusterConfiguration
kubernetesVersion: 1.23.17
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
scheduler: {}
Initialize the cluster:
kubeadm init --config kubeadm-config.yaml --upload-certs
Configure kubectl:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Install the network plugin:
kubectl apply -f flannel/kube-flannel.yml
Package and distribute the cluster configuration files:
cd /etc/kubernetes
tar zcf /root/k8s_conf.tar.gz pki/ca.crt pki/ca.key pki/sa.key pki/sa.pub pki/front-proxy-ca.crt pki/front-proxy-ca.key pki/etcd/ca.crt pki/etcd/ca.key admin.conf
scp /root/k8s_conf.tar.gz root@192.168.1.93:~/
scp /root/k8s_conf.tar.gz root@192.168.1.107:~/
Generate the cluster join command:
kubeadm token create --print-join-command
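The printed command joins worker nodes as-is; for the remaining masters, --control-plane must be appended (the certificates were already distributed via scp above, so --certificate-key is not needed here). Illustrative shape only, with placeholder token and hash:
# Worker join (as printed by the command above):
kubeadm join 192.168.1.150:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>
# Control-plane join (used on master-b and master-c below):
kubeadm join 192.168.1.150:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> --control-plane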
4.2 master-b Configuration
master-b is configured much like master-a; the main differences are:
- The hostname is set to master-b
- The certificate bundle fetched from master-a is extracted before joining
- The join command is used instead of the init command
Detailed steps:
4.2.1 Base Environment Setup
# Switch to the Aliyun yum mirror
cp /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bak
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
yum clean all
yum makecache
# Set the hostname
hostnamectl set-hostname master-b
# Set the timezone
timedatectl set-timezone Asia/Shanghai
# Disable the firewall and SELinux
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config
# Configure the hosts file
cat >> /etc/hosts << EOF
192.168.1.53 master01
192.168.1.13 node01
192.168.1.81 ai3
192.168.1.74 master-a
192.168.1.93 master-b
192.168.1.107 master-c
EOF
# Configure time synchronization
yum install -y ntpdate
ntpdate time.windows.com
4.2.2 Install Docker
# Extract the Docker install package
tar xf docker-20.10.23.tgz -C /usr/local
# Copy the Docker binaries
cp /usr/local/docker/* /usr/bin/
# Create the Docker systemd service file
cat > /usr/lib/systemd/system/docker.service <<EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP \$MAINPID
LimitNOFILE=infinity
LimitNPROC=infinity
TimeoutStartSec=0
Delegate=yes
KillMode=process
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target
EOF
# Configure Docker
mkdir -p /etc/docker
cat > /etc/docker/daemon.json << EOF
{ "insecure-registries":["192.168.1.13:5000"],"exec-opts": ["native.cgroupdriver=systemd"],"data-root": "/home/docker","log-opts": {"max-size": "10m","max-file": "3"}
}
EOF
# Start the Docker service
systemctl daemon-reload
systemctl start docker
systemctl enable docker
docker -v
4.2.3 Install Kubernetes Components
# Extract the Kubernetes install package
tar -xvf k8s-v1.23.16.tar
# Extract the configuration bundle from master-a into the kubernetes directory
mkdir -p /etc/kubernetes
tar zxf /root/k8s_conf.tar.gz -C /etc/kubernetes/
# Disable swap
swapoff -a
sed -i 's/.*swap.*/#&/g' /etc/fstab
# Configure kernel parameters
cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
net.ipv6.conf.all.disable_ipv6 = 0
net.ipv6.conf.default.disable_ipv6 = 0
net.ipv6.conf.lo.disable_ipv6 = 0
net.ipv6.conf.all.forwarding = 1
EOF
sysctl -p /etc/sysctl.d/k8s.conf
# Allow forwarding
iptables -P FORWARD ACCEPT
# Install the Kubernetes RPM packages
yum -y localinstall k8s-rpm/*.rpm
# Join the cluster as an additional control-plane node
kubeadm join 192.168.1.150:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:8e13ce8a9e6ce68c4ba9b6b01ca98cff62a61d4d6c9b6063bd6b37aca19f7890 --control-plane
systemctl restart kubelet
systemctl status kubelet
Check port usage:
lsof -i:10250
Configure kubectl:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Install NFS:
mkdir tmp
tar zxf nfs/nfs.tar.gz -C tmp
yum -y localinstall tmp/nfs-rpm/*.rpm
systemctl start nfs
systemctl enable nfs
Or, if the node has Internet access:
yum install nfs-utils
4.3 master-c Configuration
master-c is configured exactly like master-b, substituting the hostname master-c.
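After all three masters have joined, a quick sanity check from any of them (a hedged sketch; the component=etcd label below is the one kubeadm applies to its static pods):
# All three masters should show up, eventually in Ready state
kubectl get nodes
# One etcd pod per master, all Running
kubectl get pods -n kube-system -l component=etcd -o wide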
5. Worker Node Configuration
5.1 Ubuntu Node Configuration
# Update the package index and install prerequisites
apt-get update && apt-get install -y apt-transport-https ca-certificates curl software-properties-common
# Add Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
# Add the Docker repository
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
# Install Docker
apt-get update && apt-get install -y docker-ce=5:20.10.23~3-0~ubuntu-$(lsb_release -cs) docker-ce-cli=5:20.10.23~3-0~ubuntu-$(lsb_release -cs) containerd.io
# Configure Docker
mkdir -p /etc/docker
cat > /etc/docker/daemon.json <<EOF
{"exec-opts": ["native.cgroupdriver=systemd"],"log-driver": "json-file","log-opts": {"max-size": "100m"},"storage-driver": "overlay2"
}
EOF# 重启Docker
systemctl daemon-reload
systemctl restart docker
systemctl enable docker
# Install kubeadm, kubelet, and kubectl
apt-get update && apt-get install -y apt-transport-https
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y kubelet=1.23.16-00 kubeadm=1.23.16-00 kubectl=1.23.16-00
# Disable swap
swapoff -a
sed -ri 's/.*swap.*/#&/' /etc/fstab
# Disable the firewall
ufw disable
# Join the cluster
kubeadm join 192.168.1.150:6443 --token 661ic1.vgfsbtnxte96nldg \
    --discovery-token-ca-cert-hash sha256:8e13ce8a9e6ce68c4ba9b6b01ca98cff62a61d4d6c9b6063bd6b37aca19f7890
Install NFS:
apt-get install nfs-common
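To confirm the node actually joined, check from both sides. A minimal sketch:
# On the node: kubelet should be active after the join succeeds
systemctl is-active kubelet
# On any master: the node appears and goes Ready once flannel is up
kubectl get nodes -o wide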
6. Troubleshooting Common Issues
6.1 kubelet Start Timeout
Symptom:
error execution phase kubelet-start: error uploading crisocket: timed out waiting for the condition To see the stack trace of this error execute with --v=5 or higher
Fix: reset the node, clean up leftover state, and restart kubelet:
kubeadm reset -f
docker rm -f $(docker ps -a -q)
rm -rf /var/lib/cni/
systemctl daemon-reload
systemctl restart kubelet
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
6.2 CNI Network Plugin Issue
Symptom:
(combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "f4aac82f1a810b98057c8bb838deec809eb0750d703abcfb4a505ddcfb8406cd" network for pod "eip-nfs-nfs-client-6478c978c9-tqxld": networkPlugin cni failed to set up pod "eip-nfs-nfs-client-6478c978c9-tqxld_kube-system" network: failed to delegate add: failed to set bridge addr: "cni0" already has an IP address different from 10.244.4.1/24
Fix: remove the stale CNI state and bridge on the affected node:
rm -rf /etc/cni
ip link set cni0 down
ip link delete cni0
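After deleting the bridge, restart kubelet (and optionally the flannel pod on that node) so CNI recreates cni0 with the correct subnet. A hedged follow-up; the flannel namespace and label depend on the version of kube-flannel.yml applied earlier:
systemctl restart kubelet
# Optionally bounce the flannel pod on this node (adjust -n kube-system to
# -n kube-flannel for newer manifests)
kubectl -n kube-system delete pod -l app=flannel \
    --field-selector spec.nodeName=$(hostname)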
6.3 HAProxy Port Binding Failure
If HAProxy on the HA cluster fails to start with "Starting frontend api: cannot bind socket [0.0.0.0:6443]", SELinux is blocking the bind; run:
setsebool -P haproxy_connect_any=1
systemctl restart haproxy
6.4 Expired Kubernetes Certificates
6.4.1 Check Certificate Expiry
kubeadm certs check-expiration
6.4.2 Renew All Certificates
kubeadm certs renew all
6.4.3 Update the kubeconfig File
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
6.4.4 Restart the Affected Services
systemctl restart kubelet
docker ps | grep -E 'k8s_kube-apiserver|k8s_kube-controller-manager|k8s_kube-scheduler|k8s_etcd_etcd' | awk '{print $1}' | xargs docker restart
7. Verify Cluster Status
Run on every master node:
kubectl get nodes
kubectl get pods --all-namespaces
kubectl get cs
The expected output shows all nodes in the Ready state, all system Pods running, and every component reporting Healthy.
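For a scriptable variant of the same check, something like this (a sketch, not from the original runbook) flags any node that is not Ready:
# Print any node whose STATUS column is not exactly "Ready"
kubectl get nodes --no-headers | awk '$2 != "Ready" { print "Not ready:", $1 }'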
8. H3C Router Configuration for the VIP
Configuration path
Advanced Options --->> Policy Routing
Detailed parameters
Item | Value | Description |
---|---|---|
Interface | VLAN1 | VLAN interface the policy applies to |
Protocol | IP | IP protocol |
Source IP range | 192.168.1.150-192.168.1.150 | Exact match on a single source IP |
Destination IP range | 192.168.1.119-192.168.1.120 | Matches the destination IP range |
Source port | (empty) | No source-port restriction |
Destination port | (empty) | No destination-port restriction |
Effective time | (empty) | Active at all times |
Priority | Automatic | Priority assigned by the system |
Outbound interface | WAN1 | Traffic egress interface |
Enabled | Enabled | Enable the policy |
Description | LB-VIP, proxies the kube-api port 6443 | Purpose of the policy |
Configuration notes
- This policy forces traffic from 192.168.1.150 to 192.168.1.119-120 out through the WAN1 interface
- It covers the VIP-proxy scenario for the k8s API Server (port 6443)
- Leaving the source/destination ports empty matches traffic on all ports
- Automatic priority assignment keeps the policies ordered correctly
Caveats
- Make sure the WAN1 interface is configured correctly
- Check that the VLAN1 interface is UP
- Test connectivity after applying the configuration (see the check below)
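A concrete connectivity test from a machine behind this policy (hedged: on kubeadm clusters /healthz is readable anonymously by default, so the curl should return ok):
# TCP path to the VIP through the load balancers
nc -vz 192.168.1.150 6443
# API server health through the VIP (expects: ok)
curl -k https://192.168.1.150:6443/healthz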