Technology choices

Why a binary installation?

It makes it easy to understand how each module actually runs, and it works across OS versions.

Why cfssl as the certificate tool?

It produces consistently named output files, and its JSON configs are easy to read.

Why systemd for process management instead of supervisord?

Mainstream distributions all ship systemd as the process manager, so it is universally available.

supervisord is a Python-based process manager and would have to be installed separately.

Version notes

Kubernetes no longer supports Docker as a runtime out of the box (dockershim was deprecated in 1.20 and removed in 1.24). Kubernetes follows the Open Container Initiative (OCI); lxc, runc and rkt are three mainstream container runtimes:

lxc is the veteran container runtime on Linux; Docker originally used lxc as its runtime.
runc is the OCI-compliant runtime developed by Docker and is Docker's default runtime today.
rkt was developed by CoreOS; being OCI-compliant, it can run Docker containers.

Server preparation

IP           Node           Spec   Components                                                               Role
10.0.26.180  k8s-master-01  2C2G   kube-apiserver, kube-controller-manager, kube-scheduler, kubectl, etcd   control-plane node
10.0.26.190  k8s-node-01    2C2G   flannel, docker, kubelet, kube-proxy                                     worker node
10.0.26.191  k8s-node-02    2C2G   flannel, docker, kubelet, kube-proxy                                     worker node
  • Client package: kubernetes-client-linux-amd64.tar.gz (kubectl and kubectl-convert)
  • Node package: kubernetes-node-linux-amd64.tar.gz (client package + kubeadm + kubelet + kube-proxy)
  • Server package: kubernetes-server-linux-amd64.tar.gz (client and node packages + kube-apiserver + kube-controller-manager + kube-scheduler + mounter + image files)

Certificate preparation

Install the certificate tooling on a suitable host, such as a dedicated CA server; for this walkthrough it lives on the master.

Installing the tools

cfssl is really just three binaries:

wget -O /usr/bin/cfssl https://github.com/cloudflare/cfssl/releases/download/v1.5.0/cfssl_1.5.0_linux_amd64
wget -O /usr/bin/cfssljson https://github.com/cloudflare/cfssl/releases/download/v1.5.0/cfssljson_1.5.0_linux_amd64
wget -O /usr/bin/cfssl-certinfo https://github.com/cloudflare/cfssl/releases/download/v1.5.0/cfssl-certinfo_1.5.0_linux_amd64
chmod +x /usr/bin/cfssl*
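
A quick sanity check that the tools landed correctly (the version output will vary):

# Confirm the binaries run
cfssl version
ls -l /usr/bin/cfssl*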

Generating the root CA

# Generate default configs (optional references)
# cfssl print-defaults config > ca-config.json
# cfssl print-defaults csr > ca-csr.json
mkdir -pv /etc/kubernetes/pki
cd /etc/kubernetes/pki

# JSON config used when signing certificates with the CA
cat > ca-config.json << 'EOF'
{
 "signing": {
     "default": {
         "expiry": "175200h"
     },
     "profiles": {
         "server": {
             "expiry": "175200h",
             "usages": [
                 "signing",
                 "key encipherment",
                 "server auth"
             ]
         },
         "client": {
             "expiry": "175200h",
             "usages": [
                 "signing",
                 "key encipherment",
                 "client auth"
             ]
         },
         "peer": {
             "expiry": "175200h",
             "usages": [
                 "signing",
                 "key encipherment",
                 "server auth",
                 "client auth"
             ]
         }
     }
 }
}
EOF

# JSON config for the CA certificate signing request (CSR)
cat > ca-csr.json << 'EOF'
{
    "CN": "Zaza Root CA",
    "hosts": [
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "ST": "SiChuan",
            "L": "ChengDu",
            "O": "Internet Zaza Ltd",
            "OU": "IT"
        }
    ],
    "ca": {
        "expiry": "175200h"
    }
}
EOF

# Generate the CA private key and root certificate
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
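
If the command succeeds, cfssljson writes three files into the current directory:

# Expected outputs of the step above: root certificate, private key and CSR
ls -l ca.pem ca-key.pem ca.csr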

Generating the per-service certificates

cd /etc/kubernetes/pki

#################################################################
# etcd certificate request file
cat > etcd-peer-csr.json << 'EOF'
{
    "CN": "etcd-peer",
    "hosts": [
        "10.0.26.180"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "ST": "SiChuan",
            "L": "ChengDu",
            "O": "Internet Zaza Ltd",
            "OU": "IT"
        }
    ]
}
EOF

# Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer etcd-peer-csr.json | cfssljson -bare etcd-peer

##################################################################
# client certificate request file
cat > client-csr.json << 'EOF'
{
    "CN": "k8s-node",
    "hosts": [
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "ST": "SiChuan",
            "L": "ChengDu",
            "O": "Internet Zaza Ltd",
            "OU": "CN"
        }
    ]
}
EOF

# Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client client-csr.json | cfssljson -bare client

##################################################################
# apiserver certificate request file
cat > apiserver-csr.json << 'EOF'
{
    "CN": "k8s-apiserver",
    "hosts": [
        "127.0.0.1",
        "10.26.0.1",
        "10.0.26.180",
        "kubernetes.default",
        "kubernetes.default.svc",
        "kubernetes.default.svc.cluster",
        "kubernetes.default.svc.cluster.local"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "ST": "SiChuan",
            "L": "ChengDu",
            "O": "Internet Zaza Ltd",
            "OU": "system"
        }
    ]
}
EOF

# Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server apiserver-csr.json | cfssljson -bare apiserver
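
Optionally verify the SAN list of the freshly issued certificate with cfssl-certinfo; its JSON output contains a sans field that must include every name and IP used to reach the apiserver:

# Print the subject alternative names of the apiserver certificate
cfssl-certinfo -cert apiserver.pem | grep -A 8 '"sans"'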

##################################################################
# kube-controller-manager certificate request file
# Note: hosts must list every kube-controller-manager node IP. CN is system:kube-controller-manager and O is system:kube-controller-manager; the built-in ClusterRoleBinding system:kube-controller-manager grants kube-controller-manager the permissions it needs.
cat > kube-controller-manager-csr.json << 'EOF'
{
    "CN": "system:kube-controller-manager",
    "hosts": [
        "127.0.0.1",
        "10.26.0.1",
        "10.0.26.180",
        "kubernetes.default",
        "kubernetes.default.svc",
        "kubernetes.default.svc.cluster",
        "kubernetes.default.svc.cluster.local"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "ST": "SiChuan",
            "L": "ChengDu",
            "O": "system:kube-controller-manager",
            "OU": "system"
        }
    ]
}
EOF

# Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager

##################################################################
# kube-scheduler certificate request file
cat > kube-scheduler-csr.json << 'EOF'
{
    "CN": "system:kube-scheduler",
    "hosts": [
        "127.0.0.1",
        "10.26.0.1",
        "10.0.26.180",
        "kubernetes.default",
        "kubernetes.default.svc",
        "kubernetes.default.svc.cluster",
        "kubernetes.default.svc.cluster.local"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "ST": "SiChuan",
            "L": "ChengDu",
            "O": "system:kube-scheduler",
            "OU": "system"
        }
    ]
}
EOF

# Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer kube-scheduler-csr.json | cfssljson -bare kube-scheduler

##################################################################
# kubectl (admin) certificate request file (copy to every node)
cat > admin-csr.json << 'EOF'
{
    "CN": "admin",
    "hosts": [],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "ST": "SiChuan",
            "L": "ChengDu",
            "O": "system:masters",
            "OU": "system"
        }
    ]
}
EOF

# Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer admin-csr.json | cfssljson -bare admin

##################################################################
# kubelet certificate request file
# Watch the host list: whenever a new IP is added, simply regenerate
cat > kubelet-csr.json << 'EOF'
{
    "CN": "k8s-kubelet",
    "hosts": [
        "127.0.0.1",
        "10.0.26.180",
        "10.0.26.190",
        "10.0.26.191"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "ST": "SiChuan",
            "L": "ChengDu",
            "O": "Internet Zaza Ltd",
            "OU": "IT"
        }
    ]
}
EOF

# Generate the certificate (copy to every node)
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server kubelet-csr.json | cfssljson -bare kubelet

##################################################################
# kube-proxy certificate request file
# Note: the CN here maps to a role name inside k8s
cat > kube-proxy-csr.json << 'EOF'
{
    "CN": "system:kube-proxy",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "ST": "SiChuan",
            "L": "ChengDu",
            "O": "Internet Zaza Ltd",
            "OU": "IT"
        }
    ]
}
EOF

# Generate the certificate (copy to every node)
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client kube-proxy-csr.json | cfssljson -bare kube-proxy-client
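
With all certificates issued, a small loop can confirm each one's validity window (cfssl-certinfo reports not_before/not_after in its JSON output):

cd /etc/kubernetes/pki
for cert in ca admin apiserver client etcd-peer kubelet \
            kube-controller-manager kube-scheduler kube-proxy-client; do
  echo "== ${cert}"
  # not_before / not_after bound the certificate's validity
  cfssl-certinfo -cert ${cert}.pem | grep -E '"not_(before|after)"'
done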

Certificate inventory

# 18 certificate files
root@k8s-master-01:/etc/kubernetes/pki# ll *pem
-rw------- 1 root root 1675 May 24 09:22 admin-key.pem
-rw-r--r-- 1 root root 1675 May 24 09:22 admin.pem
-rw------- 1 root root 1679 May 24 08:16 apiserver-key.pem
-rw-r--r-- 1 root root 1614 May 24 08:16 apiserver.pem
-rw------- 1 root root 1679 May 24 08:10 ca-key.pem
-rw-r--r-- 1 root root 1342 May 24 08:10 ca.pem
-rw------- 1 root root 1679 May 24 08:16 client-key.pem
-rw-r--r-- 1 root root 1407 May 24 08:16 client.pem
-rw------- 1 etcd etcd 1675 May 24 08:13 etcd-peer-key.pem
-rw-r--r-- 1 root root 1480 May 24 08:13 etcd-peer.pem
-rw------- 1 root root 1675 May 24 09:10 kube-controller-manager-key.pem
-rw-r--r-- 1 root root 1675 May 24 09:10 kube-controller-manager.pem
-rw------- 1 root root 1675 May 24 08:21 kubelet-key.pem
-rw-r--r-- 1 root root 1436 May 24 08:21 kubelet.pem
-rw------- 1 root root 1679 May 24 08:23 kube-proxy-client-key.pem
-rw-r--r-- 1 root root 1419 May 24 08:23 kube-proxy-client.pem
-rw------- 1 root root 1679 May 24 09:12 kube-scheduler-key.pem
-rw-r--r-- 1 root root 1651 May 24 09:12 kube-scheduler.pem

Installing the master components

etcd、kube-apiserver、kube-controller-manager、kube-scheduler、kubectl

Environment preparation

hostnamectl set-hostname k8s-master-01

grep -q k8s-master-01 /etc/hosts || cat >> /etc/hosts << 'EOF'
10.0.26.180 k8s-master-01
10.0.26.190 k8s-node-01
10.0.26.191 k8s-node-02
EOF

etcd

Installation

# Download
cd /usr/local/src/
wget https://github.com/etcd-io/etcd/releases/download/v3.4.16/etcd-v3.4.16-linux-amd64.tar.gz

# Install
[ -d /usr/local/etcd ] || (tar zxf etcd-v3.4.16-linux-amd64.tar.gz && mv -v etcd-v3.4.16-linux-amd64 /usr/local/etcd)

# Create symlinks
[ -L /usr/bin/etcd ] || ln -sv /usr/local/etcd/etcd /usr/bin/etcd
[ -L /usr/bin/etcdctl ] || ln -sv /usr/local/etcd/etcdctl /usr/bin/etcdctl

systemd unit configuration

cat > /etc/systemd/system/etcd.service << 'EOF'
[Unit]
Description=etcd - highly-available key value store
Documentation=https://github.com/coreos/etcd
Documentation=man:etcd
After=network.target
Wants=network-online.target

[Service]
EnvironmentFile=-/usr/local/etcd/etc/etcd.conf
Type=notify
User=etcd
PermissionsStartOnly=true
ExecStart=/usr/bin/etcd $ETCD_ARGS
Restart=on-abnormal
#RestartSec=10s
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

Configuring startup dependencies

Environment parameter file (do not put double quotes inside the values, or the data will be truncated)

enable-v2=true is mainly for flannel

mkdir /usr/local/etcd/etc
cd /usr/local/etcd/etc && touch etcd.conf
grep -q ETCD_ARGS etcd.conf || cat >> etcd.conf << 'EOF'
ETCD_ARGS="
    --name=etcd-server-5-180
    --data-dir=/data/etcd/etcd-server
    --listen-client-urls=https://10.0.26.180:2379,http://127.0.0.1:2379
    --advertise-client-urls=https://10.0.26.180:2379,http://127.0.0.1:2379
    --listen-peer-urls=https://10.0.26.180:2380
    --initial-advertise-peer-urls=https://10.0.26.180:2380
    --initial-cluster=etcd-server-5-180=https://10.0.26.180:2380
    --quota-backend-bytes=8000000000
    --cert-file=/etc/kubernetes/pki/etcd-peer.pem
    --key-file=/etc/kubernetes/pki/etcd-peer-key.pem
    --peer-cert-file=/etc/kubernetes/pki/etcd-peer.pem
    --peer-key-file=/etc/kubernetes/pki/etcd-peer-key.pem
    --trusted-ca-file=/etc/kubernetes/pki/ca.pem
    --peer-trusted-ca-file=/etc/kubernetes/pki/ca.pem
    --log-outputs=stdout
    --logger=zap
    --enable-v2=true
"
EOF

Create the user and directories, and grant permissions

useradd -M -s /usr/sbin/nologin etcd
mkdir -p /data/etcd /data/logs/etcd-server /data/etcd/etcd-server
chown -R etcd: /data/etcd /data/logs/etcd-server /data/etcd/etcd-server
chown etcd: /etc/kubernetes/pki/etcd-peer-key.pem

Configure aliases for convenience

grep -q etcdctl ~/.bashrc || cat >> ~/.bashrc << 'EOF'
export ENDPOINTS=https://10.0.26.180:2379
# alias etcdctl='etcdctl --endpoints=${ENDPOINTS} --user=root:123456'
alias etcdctl='ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/ca.pem --cert=/etc/kubernetes/pki/etcd-peer.pem --key=/etc/kubernetes/pki/etcd-peer-key.pem --endpoints=${ENDPOINTS}'
# A full path is needed here, otherwise etcdctl would resolve to the etcdctl alias above
alias etcdctl2='ETCDCTL_API=2 /usr/bin/etcdctl --ca-file=/etc/kubernetes/pki/ca.pem --cert-file=/etc/kubernetes/pki/etcd-peer.pem --key-file=/etc/kubernetes/pki/etcd-peer-key.pem --endpoints=${ENDPOINTS}'
EOF
source ~/.bashrc

Startup

# Enable at boot
systemctl enable etcd.service

# Start
systemctl start etcd.service

# Check status
systemctl status etcd.service

Status checks

# Show endpoint status for the cluster
etcdctl --write-out=table endpoint status

# Check endpoint health
etcdctl endpoint health
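
Healthy output looks roughly like the following (the timing will differ):

# https://10.0.26.180:2379 is healthy: successfully committed proposal: took = 9.77ms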

kubernetes-server

Download and install

cd /usr/local/src/
wget https://dl.k8s.io/v1.21.1/kubernetes-server-linux-amd64.tar.gz
[ -d /usr/local/kubernetes ] || tar xzvf kubernetes-server-linux-amd64.tar.gz -C /usr/local/

# Create symlink
[ -L /usr/bin/kubectl ] || ln -sv /usr/local/kubernetes/server/bin/kubectl /usr/bin/kubectl

kube-apiserver configuration

systemd unit configuration
cat > /etc/systemd/system/kube-apiserver.service << 'EOF'
[Unit]
Description=Kubernetes API Service
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target
After=etcd.service

[Service]
EnvironmentFile=-/etc/kubernetes/apiserver
ExecStart=/usr/local/kubernetes/server/bin/kube-apiserver $KUBE_API_ARGS
Restart=on-failure
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
Configuring startup dependencies

Environment parameter file (do not put double quotes inside the values, or the data will be truncated)

touch /etc/kubernetes/apiserver
grep -q KUBE_API_ARGS /etc/kubernetes/apiserver || cat >> /etc/kubernetes/apiserver << 'EOF'
KUBE_API_ARGS="
  --audit-log-path /data/logs/kubernetes/kube-apiserver/audit.log
  --audit-policy-file /etc/kubernetes/audit.yaml
  --token-auth-file /etc/kubernetes/token.csv
  --authorization-mode RBAC
  --client-ca-file /etc/kubernetes/pki/ca.pem
  --requestheader-client-ca-file /etc/kubernetes/pki/ca.pem
  --enable-admission-plugins NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota
  --etcd-cafile /etc/kubernetes/pki/ca.pem
  --etcd-certfile /etc/kubernetes/pki/client.pem
  --etcd-keyfile /etc/kubernetes/pki/client-key.pem
  --etcd-servers https://10.0.26.180:2379
  --service-account-key-file /etc/kubernetes/pki/ca-key.pem
  --service-account-signing-key-file /etc/kubernetes/pki/ca-key.pem
  --service-account-issuer=https://kubernetes.default.svc.cluster.local
  --service-cluster-ip-range 10.26.0.0/16
  --service-node-port-range 3000-29999
  --target-ram-mb=1024
  --kubelet-client-certificate /etc/kubernetes/pki/client.pem
  --kubelet-client-key /etc/kubernetes/pki/client-key.pem
  --log-dir  /data/logs/kubernetes/kube-apiserver
  --tls-cert-file /etc/kubernetes/pki/apiserver.pem
  --tls-private-key-file /etc/kubernetes/pki/apiserver-key.pem
  --v 2
"
EOF

Creating the audit policy

mkdir -pv /etc/kubernetes

cat > /etc/kubernetes/audit.yaml << 'EOF'
apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      # Resource "pods" doesn't match requests to any subresource of pods,
      # which is consistent with the RBAC policy.
      resources: ["pods"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]
  # Don't log requests to a configmap called "controller-leader"
  - level: None
    resources:
    - group: ""
      resources: ["configmaps"]
      resourceNames: ["controller-leader"]
  # Don't log watch requests by the "system:kube-proxy" on endpoints or services
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: "" # core API group
      resources: ["endpoints", "services"]
  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*" # Wildcard matching.
    - "/version"
  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
    - group: "" # core API group
      resources: ["configmaps"]
    # This rule only applies to resources in the "kube-system" namespace.
    # The empty string "" can be used to select non-namespaced resources.
    namespaces: ["kube-system"]
  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
    - group: "" # core API group
      resources: ["secrets", "configmaps"]
  # Log all other resources in core and extensions at the Request level.
  - level: Request
    resources:
    - group: "" # core API group
    - group: "extensions" # Version of group should NOT be included.
  # A catch-all rule to log all other requests at the Metadata level.
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"
EOF

Creating the token file

cat > /etc/kubernetes/token.csv << EOF
$(head -c 16 /dev/urandom | od -An -t x | tr -d ' '),kubelet-bootstrap,10001,"system:kubelet-bootstrap"
EOF
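
Each line of token.csv has the form token,user,uid,"group"; the command above binds a random 32-hex-character token to the kubelet-bootstrap user:

# Expected shape: <32-hex token>,kubelet-bootstrap,10001,"system:kubelet-bootstrap"
cat /etc/kubernetes/token.csv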

Creating required directories

mkdir -pv /data/logs/kubernetes/kube-apiserver
Startup
# Enable at boot
systemctl enable kube-apiserver.service

# Start
systemctl start kube-apiserver.service

# Check status
systemctl status kube-apiserver.service

kube-controller-manager configuration

Generating the kubeconfig (used for TLS authentication)

A kubeconfig is just a configuration file.

cd /etc/kubernetes/pki

# Set the cluster parameters: initialize the cluster section with the kube-apiserver address and CA certificate
kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://10.0.26.180:6443 --kubeconfig=kube-controller-manager.kubeconfig

# Set the client credentials: the users section, containing the client certificate
kubectl config set-credentials system:kube-controller-manager --client-certificate=kube-controller-manager.pem --client-key=kube-controller-manager-key.pem --embed-certs=true --kubeconfig=kube-controller-manager.kubeconfig

# Set the context: the contexts section
kubectl config set-context system:kube-controller-manager --cluster=kubernetes --user=system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig

# Set the default context: the current-context value
kubectl config use-context system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
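
To confirm the result, kubectl can print the assembled file (embedded certificates appear as DATA+OMITTED/REDACTED):

kubectl config view --kubeconfig=kube-controller-manager.kubeconfig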
systemd unit configuration
cat > /etc/systemd/system/kube-controller-manager.service << 'EOF'
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/controller-manager
ExecStart=/usr/local/kubernetes/server/bin/kube-controller-manager $DAEMON_ARGS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
Configuring startup dependencies

Environment parameter file (do not put double quotes inside the values, or the data will be truncated)

cat > /etc/kubernetes/controller-manager << 'EOF'
DAEMON_ARGS="
  --bind-address=127.0.0.1
  --cluster-cidr=172.26.0.0/16
  --leader-elect=true
  --log-dir=/data/logs/kubernetes/kube-controller-manager
  --kubeconfig=/etc/kubernetes/pki/kube-controller-manager.kubeconfig
  --cluster-name=kubernetes
  --cluster-signing-cert-file=/etc/kubernetes/pki/ca.pem
  --service-account-private-key-file=/etc/kubernetes/pki/ca-key.pem
  --service-cluster-ip-range=10.26.0.0/16
  --root-ca-file=/etc/kubernetes/pki/ca.pem
  --cluster-signing-key-file=/etc/kubernetes/pki/ca-key.pem 
  --tls-cert-file=/etc/kubernetes/pki/kube-controller-manager.pem
  --tls-private-key-file=/etc/kubernetes/pki/kube-controller-manager-key.pem
  --controllers=*,bootstrapsigner,tokencleaner
  --use-service-account-credentials=true
  --alsologtostderr=true
  --logtostderr=false
  --cluster-signing-duration=87600h0m0s
  --v=2
"
EOF

Create directories

mkdir -pv /data/logs/kubernetes/kube-controller-manager
Startup
# Enable at boot
systemctl enable kube-controller-manager.service

# Start
systemctl start kube-controller-manager.service

# Check status
systemctl status kube-controller-manager.service

kube-scheduler configuration

Generating the kubeconfig (used for TLS authentication)
cd /etc/kubernetes/pki

# Set the cluster parameters
kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://10.0.26.180:6443 --kubeconfig=kube-scheduler.kubeconfig

# Set the client credentials
kubectl config set-credentials system:kube-scheduler --client-certificate=kube-scheduler.pem --client-key=kube-scheduler-key.pem --embed-certs=true --kubeconfig=kube-scheduler.kubeconfig

# Set the context
kubectl config set-context system:kube-scheduler --cluster=kubernetes --user=system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig

# Set the default context
kubectl config use-context system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig
systemd unit configuration
cat > /etc/systemd/system/kube-scheduler.service << 'EOF'
[Unit]
Description=Kubernetes Scheduler Plugin
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/scheduler
ExecStart=/usr/local/kubernetes/server/bin/kube-scheduler $DAEMON_ARGS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
Configuring startup dependencies

Environment parameter file (do not put double quotes inside the values, or the data will be truncated)

cat > /etc/kubernetes/scheduler << 'EOF'
DAEMON_ARGS="
  --bind-address=127.0.0.1
  --leader-elect=true
  --log-dir=/data/logs/kubernetes/kube-scheduler
  --kubeconfig=/etc/kubernetes/pki/kube-scheduler.kubeconfig
  --tls-cert-file=/etc/kubernetes/pki/kube-scheduler.pem
  --tls-private-key-file=/etc/kubernetes/pki/kube-scheduler-key.pem
  --v=2
"
EOF

Create directories

mkdir -pv /data/logs/kubernetes/kube-scheduler
Startup
# Enable at boot
systemctl enable kube-scheduler.service

# Start
systemctl start kube-scheduler.service

# Check status
systemctl status kube-scheduler.service

kubectl configuration

Generating the kubeconfig (used for TLS authentication)
cd /etc/kubernetes/pki

# Set the cluster parameters
kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://10.0.26.180:6443 --kubeconfig=kube.config

# Set the client credentials
kubectl config set-credentials admin --client-certificate=admin.pem --client-key=admin-key.pem --embed-certs=true --kubeconfig=kube.config

# Set the context
kubectl config set-context kubernetes --cluster=kubernetes --user=admin --kubeconfig=kube.config

# Set the default context
kubectl config use-context kubernetes --kubeconfig=kube.config

# Copy the config file into place
mkdir ~/.kube
cp kube.config ~/.kube/config

# Grant the k8s-node user access to the kubelet API
# If kubectl exec reports the error below, delete and recreate the binding
# error: unable to upgrade connection: Forbidden (user=kubernetes, verb=create, resource=nodes, subresource=proxy)
# kubectl delete clusterrolebinding kube-apiserver:kubelet-apis
kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user k8s-node
kubectl auto-completion
# yum install -y bash-completion
grep -q kubectl ~/.bashrc || cat >> ~/.bashrc << 'EOF'
# .bashrc
source <(kubectl completion bash)
EOF

# Load it
source ~/.bashrc
Checking control-plane status
kubectl get cs
kubectl cluster-info

dashboard

# Deploy the Dashboard UI
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.2.0/aio/deploy/recommended.yaml

# Expose the port: change type: ClusterIP to type: NodePort
# kubectl -n kubernetes-dashboard edit service kubernetes-dashboard

# Verify and obtain the assigned port
# kubectl -n kubernetes-dashboard get service kubernetes-dashboard
kubectl get pods,svc -n kubernetes-dashboard

# Grant permissions
kubectl create serviceaccount dashboard-admin -n kube-system
kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin

# Retrieve the token
kubectl describe secrets -n kube-system $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}')

# Access (the port is the NodePort assigned above)
https://10.0.26.180:31709/

Installing the node components

flannel、docker、 kubelet、kube-proxy

Environment preparation

hostnamectl set-hostname k8s-node-01

grep -q k8s-master-01 /etc/hosts || cat >> /etc/hosts << 'EOF'
10.0.26.180 k8s-master-01
10.0.26.190 k8s-node-01
10.0.26.191 k8s-node-02
EOF

# Enable bridge netfilter
modprobe br_netfilter
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

sysctl -p /etc/sysctl.d/k8s.conf

Certificate preparation

# Download from the certificate host (master)
# cd /etc/kubernetes/pki && sz -b ca.pem kubelet*.pem client*.pem

# On the node
mkdir -pv /etc/kubernetes/pki
cd /etc/kubernetes/pki
rz -b

Installing flannel

Gives Pods connectivity across hosts.

Download and install

cd /usr/local/src/
wget https://github.com/flannel-io/flannel/releases/download/v0.13.0/flannel-v0.13.0-linux-amd64.tar.gz

[ -f /usr/local/flannel/flanneld ] || (mkdir -pv /usr/local/flannel && tar xzvf flannel-v0.13.0-linux-amd64.tar.gz -C /usr/local/flannel/)

How it works

flanneld first reads the config file named by --subnet-file; if it does not exist, it reads the network configuration from etcd and writes a randomly assigned subnet to --subnet-file=/etc/kubernetes/flanneld_subnet.env, with contents like:

FLANNEL_NETWORK=172.26.0.0/16
FLANNEL_SUBNET=172.26.25.1/24
FLANNEL_MTU=1500
FLANNEL_IPMASQ=false

To pin FLANNEL_SUBNET you can write flanneld_subnet.env by hand, but the big caveat is that a conflicting subnet displaces the other node's registration, so letting flannel assign it automatically is arguably the safer choice.

mk-docker-opts.sh reads /run/flannel/subnet.env and generates /run/docker_opts.env; it is really just parameter translation. The -k DOCKER_NETWORK_OPTIONS flag names the output variable, ultimately producing DOCKER_NETWORK_OPTIONS=" --bip=172.33.68.1/24 --ip-masq=true --mtu=1500", which dockerd then picks up by appending $DOCKER_NETWORK_OPTIONS.
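
Assuming the 10.0.26.190 node configured below, the generated env file should look roughly like this (the values track FLANNEL_SUBNET):

# /etc/kubernetes/flanneld_docker.env -- written by mk-docker-opts.sh
DOCKER_NETWORK_OPTIONS=" --bip=172.26.190.1/24 --ip-masq=true --mtu=1500"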

# /usr/local/flannel/flanneld --help|less
# --etcd-prefix: etcd prefix (default "/coreos.com/network"), the default registration path
# --iface: the host's outward-facing network interface
# --public-ip: this host's IP
# --subnet-file: the subnet configuration file

systemd unit configuration

RequiredBy=docker.service means docker depends on this service.

cat > /etc/systemd/system/flanneld.service << 'EOF'
[Unit]
Description=flannel - Network fabric for containers (System Application Container)
Documentation=https://github.com/coreos/flannel
After=network.target
After=network-online.target
Wants=network-online.target
After=etcd.service
Before=docker.service

[Service]
Type=notify
Restart=always
RestartSec=10s
TimeoutStartSec=300
LimitNOFILE=40000
LimitNPROC=1048576
EnvironmentFile=-/etc/kubernetes/flanneld
ExecStart=/usr/local/flannel/flanneld $FLANNEL_OPTIONS
ExecStartPost=/usr/local/flannel/mk-docker-opts.sh -f /etc/kubernetes/flanneld_subnet.env -k DOCKER_NETWORK_OPTIONS -d /etc/kubernetes/flanneld_docker.env

[Install]
WantedBy=multi-user.target
RequiredBy=docker.service
EOF

Configuration notes

Note: the bip subnet differs on every machine; its middle two octets mirror the last two octets of the host IP, which makes it easy to trace a container back to its host. The bip therefore follows the host IP:

bip 172.26.180.1/24 maps to 10.0.26.180

bip 172.26.190.1/24 maps to 10.0.26.190

bip 172.26.191.1/24 maps to 10.0.26.191

Configuring startup dependencies

Environment parameter file (do not put double quotes inside the values, or the data will be truncated)

# Remember to adjust public-ip and iface per host
cat > /etc/kubernetes/flanneld << 'EOF'
FLANNEL_OPTIONS="
  --public-ip=10.0.26.190
  --iface=enp0s3
  --healthz-port=2401
  --etcd-endpoints=https://10.0.26.180:2379
  --etcd-keyfile=/etc/kubernetes/pki/client-key.pem
  --etcd-certfile=/etc/kubernetes/pki/client.pem
  --etcd-cafile=/etc/kubernetes/pki/ca.pem
  --subnet-file=/etc/kubernetes/flanneld_subnet.env
"
EOF

flanneld subnet file: to pin a custom subnet, write this file by hand.

If the subnet conflicts, the registered PublicIP gets displaced, so auto-assignment may well be the better choice. To inspect a registration:

etcdctl2 get /coreos.com/network/subnets/172.26.190.0-24

The host IP's last two octets map to the middle two octets of the 172 subnet: 10.0.26.190 ==> 172.26.190.1/24

FLANNEL_NETWORK spans the whole cluster: 172.26.0.0/16

# Recommended: let flanneld read etcd and auto-assign the subnet instead
cat > /etc/kubernetes/flanneld_subnet.env << 'EOF'
FLANNEL_NETWORK=172.26.0.0/16
FLANNEL_SUBNET=172.26.190.1/24
FLANNEL_MTU=1500
FLANNEL_IPMASQ=false
EOF

Seeding flannel's etcd data (on the master)

Run once only.

# A full path is needed here, otherwise etcdctl resolves to the etcdctl alias
# alias etcdctl2='ETCDCTL_API=2 /usr/bin/etcdctl --ca-file=/etc/kubernetes/pki/ca.pem --cert-file=/etc/kubernetes/pki/etcd-peer.pem --key-file=/etc/kubernetes/pki/etcd-peer-key.pem --endpoints=https://10.0.26.180:2379'
# Set the network and backend type
etcdctl2 set /coreos.com/network/config '{"Network": "172.26.0.0/16", "Backend": {"Type": "host-gw"}}'

# Inspect the data
etcdctl2 get /coreos.com/network/config
etcdctl2 ls /coreos.com/network/subnets
# Delete (example)
# etcdctl2 rm /coreos.com/network/subnets/172.26.25.0-24

Startup

At startup flanneld really only adds local FORWARD firewall rules and generates /etc/kubernetes/flanneld_docker.env.

# Enable at boot
systemctl enable flanneld.service

# Start
systemctl start flanneld.service

# Check status
systemctl status flanneld.service
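
Since --healthz-port=2401 is set above, flanneld also exposes a simple health endpoint that can be probed locally (an optional convenience check):

curl -s http://127.0.0.1:2401/healthz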

Installing docker

Download and install

# https://download.docker.com/linux/static/stable/x86_64/
cd /usr/local/src
wget https://download.docker.com/linux/static/stable/x86_64/docker-20.10.6.tgz
[ -f /usr/bin/docker ] || (tar xzf docker-20.10.6.tgz && cp -v docker/* /usr/bin/)

systemd unit configuration

cat > /usr/lib/systemd/system/docker.service << 'EOF'
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target
[Service]
Type=notify
EnvironmentFile=-/etc/kubernetes/flanneld_docker.env
ExecStart=/usr/bin/dockerd $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TimeoutStartSec=0
Delegate=yes
KillMode=process
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
[Install]
WantedBy=multi-user.target
EOF

Configuring startup dependencies

Configuration file

mkdir -pv /etc/docker /data/docker

# bip is supplied via $DOCKER_NETWORK_OPTIONS
cat > /etc/docker/daemon.json << 'EOF'
{
  "graph": "/data/docker",
  "storage-driver": "overlay2",
  "registry-mirrors": ["https://registry.docker-cn.com"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "live-restore": true
}
EOF

Startup

After changing the unit file: systemctl daemon-reload

# Enable at boot
systemctl enable docker

# Start
systemctl start docker

# Check status
systemctl status docker

Status checks

# Check the installed version
docker -v

# Check the network
ip a |grep docker0

# Verify a container picks up the configured network
docker pull busybox
docker run -it --rm busybox /bin/sh

/ # ip add
# The container IP is 172.26.190.2, as configured; a container's IP therefore immediately identifies its host node.

kubernetes-node

Download and install

cd /usr/local/src/
wget https://dl.k8s.io/v1.21.1/kubernetes-node-linux-amd64.tar.gz
[ -d /usr/local/kubernetes ] || tar xzvf kubernetes-node-linux-amd64.tar.gz -C /usr/local

# Create symlinks
[ -f /usr/bin/kubectl ] || ln -sv /usr/local/kubernetes/node/bin/kube* /usr/bin/

kubelet configuration

Creating the kubeconfig (on the master)

Generated on the master so it can be synced to every node.

# set-cluster: define the cluster to connect to; multiple k8s clusters can be registered
cd /etc/kubernetes/pki
kubectl config set-cluster myk8s \
  --certificate-authority=/etc/kubernetes/pki/ca.pem \
  --embed-certs=true \
  --server=https://10.0.26.180:6443 \
  --kubeconfig=/etc/kubernetes/pki/kubelet.kubeconfig
  
# set-credentials: create the user account, i.e. the client key and certificate used to authenticate; multiple credentials are possible
kubectl config set-credentials k8s-node \
  --client-certificate=/etc/kubernetes/pki/client.pem \
  --client-key=/etc/kubernetes/pki/client-key.pem \
  --embed-certs=true \
  --kubeconfig=/etc/kubernetes/pki/kubelet.kubeconfig

# set-context: bind an account to a cluster
kubectl config set-context myk8s-context --cluster=myk8s --user=k8s-node  --kubeconfig=/etc/kubernetes/pki/kubelet.kubeconfig
 
# use-context: choose the current context
kubectl config use-context myk8s-context --kubeconfig=/etc/kubernetes/pki/kubelet.kubeconfig

# Copy kubelet.kubeconfig to the node's directory
sz -b kubelet.kubeconfig

# node
cd /etc/kubernetes/pki && rz -b
Authorizing the k8s-node user (on the master)

Bind the k8s-node user to the cluster role system:node so that k8s-node carries a worker node's permissions.

cd /etc/kubernetes
cat > k8s-node.yaml << 'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: k8s-node
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:node
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: k8s-node
EOF

Apply the resource manifest and verify

# Apply
kubectl apply -f k8s-node.yaml
# clusterrolebinding.rbac.authorization.k8s.io/k8s-node created

# Inspect
kubectl get clusterrolebinding k8s-node
# NAME       ROLE                      AGE
# k8s-node   ClusterRole/system:node   17s

# Inspect
kubectl get clusterrolebinding k8s-node -o yaml
systemd unit configuration
cat > /etc/systemd/system/kubelet.service << 'EOF'
[Unit]
Description=Kubernetes Kubelet Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=/data/kubelet
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart=/usr/bin/kubelet $KUBELET_ARGS
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF
Configuring startup dependencies

Environment parameter file (do not put double quotes inside the values, or the data will be truncated)

cat > /etc/kubernetes/kubelet << 'EOF'
KUBELET_ARGS="
  --anonymous-auth=false
  --cgroup-driver systemd
  --cluster-dns 10.26.0.2
  --cluster-domain cluster.local
  --runtime-cgroups=/systemd/system.slice
  --kubelet-cgroups=/systemd/system.slice
  --fail-swap-on=false
  --client-ca-file /etc/kubernetes/pki/ca.pem
  --tls-cert-file /etc/kubernetes/pki/kubelet.pem
  --tls-private-key-file /etc/kubernetes/pki/kubelet-key.pem
  --hostname-override 10.0.26.190
  --image-gc-high-threshold 20
  --image-gc-low-threshold 10
  --kubeconfig /etc/kubernetes/pki/kubelet.kubeconfig
  --log-dir /data/logs/kubernetes/kube-kubelet
  --pod-infra-container-image kubernetes/pause:latest
  --root-dir /data/kubelet
  --authorization-mode Webhook
"
EOF

Prepare the pause base image kubelet depends on; every pod's containers run on top of it (without it, kubelet cannot start pods properly).

docker pull kubernetes/pause

# Tag it as k8s.gcr.io/pause:3.4.1
# Pod creation pulls the k8s.gcr.io/pause:3.4.1 image
docker tag kubernetes/pause:latest k8s.gcr.io/pause:3.4.1
docker images

Create directories

mkdir -pv /data/logs/kubernetes/kube-kubelet /data/kubelet
Startup
# Enable at boot
systemctl enable kubelet.service

# Start
systemctl start kubelet.service

# Check status
systemctl status kubelet.service

# View logs
# journalctl -xeu kubelet
Verifying the node joined (on the master)
# Check whether the node has registered (joining takes a little while)
kubectl get nodes
kubectl get nodes -o wide

# Label the node (successful output: node/k8s-node-01 labeled)
kubectl label nodes k8s-node-01 node-role.kubernetes.io/node=
kubectl label nodes k8s-node-01 node-role.kubernetes.io/worker=
kubectl get nodes

kube-proxy configuration

Creating the kubeconfig (generated on the master)

Generated on the master so it can be synced to every node.

cd /etc/kubernetes/pki
kubectl config set-cluster myk8s \
--certificate-authority=/etc/kubernetes/pki/ca.pem \
--embed-certs=true \
--server=https://10.0.26.180:6443 \
--kubeconfig=kube-proxy.kubeconfig

kubectl config set-credentials kube-proxy \
  --client-certificate=/etc/kubernetes/pki/kube-proxy-client.pem \
  --client-key=/etc/kubernetes/pki/kube-proxy-client-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-proxy.kubeconfig

kubectl config set-context myk8s-context \
--cluster=myk8s \
--user=kube-proxy \
--kubeconfig=kube-proxy.kubeconfig

kubectl config use-context myk8s-context --kubeconfig=kube-proxy.kubeconfig

# Copy kube-proxy.kubeconfig to the node's directory
sz -b kube-proxy.kubeconfig

# node
cd /etc/kubernetes/pki && rz -b
systemd unit configuration
cat > /etc/systemd/system/kube-proxy.service << 'EOF'
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
EnvironmentFile=-/etc/kubernetes/proxy
ExecStartPre=/bin/bash /etc/kubernetes/ipvs.sh
ExecStart=/usr/local/kubernetes/node/bin/kube-proxy $KUBE_PROXY_ARGS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
Configuring startup dependencies

Note: kube-proxy offers three proxy modes: userspace, iptables and ipvs, of which ipvs performs best. We configure ipvs here; if unset, iptables is used.

Install dependencies

apt-get -y install ipset ipvsadm

Environment parameter file (do not put double quotes inside the values, or the data will be truncated)

cat > /etc/kubernetes/proxy << 'EOF'
KUBE_PROXY_ARGS="
  --cluster-cidr 172.26.0.0/16
  --bind-address 10.0.26.190
  --hostname-override 10.0.26.190
  --proxy-mode=ipvs
  --ipvs-scheduler=nq
  --kubeconfig /etc/kubernetes/pki/kube-proxy.kubeconfig
"
EOF

Loading the ipvs modules

lsmod | grep ip_vs_nq

# Loader script (the key module is ip_vs_nq)
cat > /etc/kubernetes/ipvs.sh << 'EOF'
#!/bin/bash
ipvs_mods_dir="/usr/lib/modules/$(uname -r)/kernel/net/netfilter/ipvs"
for i in $(ls $ipvs_mods_dir|grep -o "^[^.]*")
do
  /sbin/modinfo -F filename $i &>/dev/null
  if [ $? -eq 0 ];then
    /sbin/modprobe $i
  fi
done
EOF

# Load now (wired into kube-proxy.service so it also loads automatically)
bash /etc/kubernetes/ipvs.sh

# Check again (many more modules are loaded now)
# If startup fails with: Can't use the IPVS proxier: error getting ipset version, error: executable file not found in $PATH
# apt-get install ipset ipvsadm
lsmod | grep ip_vs_nq
Startup
# Enable at boot
systemctl enable kube-proxy.service

# Start
systemctl start kube-proxy.service

# Check status
systemctl status kube-proxy.service
Status check
root@k8s-node-01:~# ipvsadm -Ln
# IP Virtual Server version 1.2.1 (size=4096)
# Prot LocalAddress:Port Scheduler Flags
#   -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
# TCP  10.26.0.1:443 nq
#   -> 10.0.26.180:6443              Masq    1      0          0      

# Key line in the kube-proxy startup log: Using ipvs Proxier.
# I0520 14:43:34.790409    9233 server_others.go:274] Using ipvs Proxier.

# On the master, list services
kubectl get svc

Verification

Check that the node has joined:

kubectl describe nodes k8s-node-01

Create a pod and service for testing (on the master)

mkdir ~/k8s-test
cd ~/k8s-test

# Generate the manifest
kubectl run pod-ng --image=nginx:stable --dry-run=client -o yaml > pod-ng.yaml

# Add an image pull policy: imagePullPolicy: IfNotPresent
vim pod-ng.yaml

#  - image: nginx:stable
    imagePullPolicy: IfNotPresent
#    name: pod-ng

# Apply the manifest
# The required images are pulled automatically
kubectl apply -f pod-ng.yaml

# Check pod status
kubectl get pod

# For ContainerCreating/Pending states, inspect the details
# The trailing Events section shows the entire creation flow
kubectl describe pod pod-ng

# Restarting a pod
# https://segmentfault.com/a/1190000020675199
# kubectl get pod {podname} -n {namespace} -o yaml | kubectl replace --force -f -
# kubectl get pod pod-ng -n default -o yaml | kubectl replace --force -f -

# View pod IP information
kubectl get pod -o wide

# svc-ng (quickly create a service)
kubectl expose pod pod-ng --name=svc-ng --port=80

# Get the CLUSTER-IP
kubectl get svc svc-ng

# Test from a node; obtain the IP via kubectl get pod -o wide
curl -I 172.26.190.2
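
A working pod answers with nginx response headers, roughly (the exact version will vary):

# HTTP/1.1 200 OK
# Server: nginx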

Adding a new node

Components: flannel, docker, kubelet, kube-proxy

Using 10.0.26.191 as the example.

Refreshing the kubelet certificate

Add 10.0.26.191 to the hosts list and regenerate the certificate.

cd /etc/kubernetes/pki

# kubelet certificate request file
# Watch the host list: whenever a new IP is added, simply regenerate
cat > kubelet-csr.json << 'EOF'
{
    "CN": "k8s-kubelet",
    "hosts": [
        "127.0.0.1",
        "10.0.26.180",
        "10.0.26.190",
        "10.0.26.191"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "ST": "SiChuan",
            "L": "ChengDu",
            "O": "Internet Zaza Ltd",
            "OU": "IT"
        }
    ]
}
EOF

# Generate the certificate (copy to every node)
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server kubelet-csr.json | cfssljson -bare kubelet

# Push the updated kubelet certificate files
sz -b kubelet.pem kubelet-key.pem

Syncing the pki directory

The resulting file list:

# Download from the certificate host (master)
# cd /etc/kubernetes/pki && sz -b ca.pem kubelet*.pem client*.pem

# Upload on the node
mkdir -pv /etc/kubernetes/pki
cd /etc/kubernetes/pki
rz -b

# root@k8s-node-02:/etc/kubernetes/pki# ll
# total 44
# drwxr-xr-x 2 root root 4096 May 26 09:06 ./
# drwxr-xr-x 3 root root 4096 May 26 08:59 ../
# -rw-r--r-- 1 root root 1342 May 25 08:29 ca.pem
# -rw-r--r-- 1 root root 1679 May 25 08:29 client-key.pem
# -rw-r--r-- 1 root root 1407 May 25 08:29 client.pem
# -rw-r--r-- 1 root root 1675 May 26 08:50 kubelet-key.pem
# -rw-r--r-- 1 root root 6252 May 25 08:43 kubelet.kubeconfig
# -rw-r--r-- 1 root root 1476 May 26 08:50 kubelet.pem
# -rw-r--r-- 1 root root 6272 May 25 08:45 kube-proxy.kubeconfig

Environment preparation

hostnamectl set-hostname k8s-node-02

grep -q k8s-master-01 /etc/hosts || cat >> /etc/hosts << 'EOF'
10.0.26.180 k8s-master-01
10.0.26.190 k8s-node-01
10.0.26.191 k8s-node-02
EOF

# Enable bridge netfilter
modprobe br_netfilter
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

sysctl -p /etc/sysctl.d/k8s.conf

Installing flannel

flannel: gives Pods connectivity across hosts.

Download and install

cd /usr/local/src/
wget https://github.com/flannel-io/flannel/releases/download/v0.13.0/flannel-v0.13.0-linux-amd64.tar.gz

[ -f /usr/local/flannel/flanneld ] || (mkdir -pv /usr/local/flannel && tar xzvf flannel-v0.13.0-linux-amd64.tar.gz -C /usr/local/flannel/)

systemd unit configuration

RequiredBy=docker.service means docker depends on this service.

cat > /etc/systemd/system/flanneld.service << 'EOF'
[Unit]
Description=flannel - Network fabric for containers (System Application Container)
Documentation=https://github.com/coreos/flannel
After=network.target
After=network-online.target
Wants=network-online.target
After=etcd.service
Before=docker.service

[Service]
Type=notify
Restart=always
RestartSec=10s
TimeoutStartSec=300
LimitNOFILE=40000
LimitNPROC=1048576
EnvironmentFile=-/etc/kubernetes/flanneld
ExecStart=/usr/local/flannel/flanneld $FLANNEL_OPTIONS
ExecStartPost=/usr/local/flannel/mk-docker-opts.sh -f /etc/kubernetes/flanneld_subnet.env -k DOCKER_NETWORK_OPTIONS -d /etc/kubernetes/flanneld_docker.env

[Install]
WantedBy=multi-user.target
RequiredBy=docker.service
EOF

Configuration notes

Note: the bip subnet differs on every machine; its middle two octets mirror the last two octets of the host IP, making containers easy to trace back to their host. Here:

bip 172.26.191.1/24 maps to 10.0.26.191

Configuring startup dependencies

Environment parameter file (do not put double quotes inside the values, or the data will be truncated)

# Remember to adjust public-ip and iface per host
cat > /etc/kubernetes/flanneld << 'EOF'
FLANNEL_OPTIONS="
  --public-ip=10.0.26.191
  --iface=enp0s3
  --healthz-port=2401
  --etcd-endpoints=https://10.0.26.180:2379
  --etcd-keyfile=/etc/kubernetes/pki/client-key.pem
  --etcd-certfile=/etc/kubernetes/pki/client.pem
  --etcd-cafile=/etc/kubernetes/pki/ca.pem
  --subnet-file=/etc/kubernetes/flanneld_subnet.env
"
EOF

flanneld subnet file: to pin a custom subnet, write this file by hand.

# Recommended: let flanneld read etcd and auto-assign the subnet instead
cat > /etc/kubernetes/flanneld_subnet.env << 'EOF'
FLANNEL_NETWORK=172.26.0.0/16
FLANNEL_SUBNET=172.26.191.1/24
FLANNEL_MTU=1500
FLANNEL_IPMASQ=false
EOF

Startup

At startup flanneld really only adds local FORWARD firewall rules and generates /etc/kubernetes/flanneld_docker.env. Routes to other nodes are added automatically:

root@k8s-node-02:/usr/local/src# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.26.1       0.0.0.0         UG    0      0        0 enp0s3
10.0.26.0       0.0.0.0         255.255.255.0   U     0      0        0 enp0s3
172.26.190.0    10.0.26.190     255.255.255.0   UG    0      0        0 enp0s3

root@k8s-node-01:/etc/kubernetes/pki# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.26.1       0.0.0.0         UG    0      0        0 enp0s3
10.0.26.0       0.0.0.0         255.255.255.0   U     0      0        0 enp0s3
172.26.190.0    0.0.0.0         255.255.255.0   U     0      0        0 docker0
172.26.191.0    10.0.26.191     255.255.255.0   UG    0      0        0 enp0s3

# Enable at boot
systemctl enable flanneld.service

# Start
systemctl start flanneld.service

# Check status
systemctl status flanneld.service

Installing docker

Download and install

# https://download.docker.com/linux/static/stable/x86_64/
cd /usr/local/src
wget https://download.docker.com/linux/static/stable/x86_64/docker-20.10.6.tgz
[ -f /usr/bin/docker ] || (tar xzf docker-20.10.6.tgz && cp -v docker/* /usr/bin/)

Configuration

Note: the bip subnet differs on every machine, mirroring the host IP as described earlier:

bip 172.26.191.1/24 maps to 10.0.26.191

mkdir -pv /etc/docker /data/docker

# Note: the bip differs per host (supplied via $DOCKER_NETWORK_OPTIONS)
# dockerd reads /etc/docker/daemon.json by default
cat > /etc/docker/daemon.json << 'EOF'
{
  "graph": "/data/docker",
  "storage-driver": "overlay2",
  "registry-mirrors": ["https://registry.docker-cn.com"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "live-restore": true
}
EOF

systemd unit configuration

cat > /usr/lib/systemd/system/docker.service << 'EOF'
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target
[Service]
Type=notify
EnvironmentFile=-/etc/kubernetes/flanneld_docker.env
ExecStart=/usr/bin/dockerd $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TimeoutStartSec=0
Delegate=yes
KillMode=process
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
[Install]
WantedBy=multi-user.target
EOF

Startup

After changing the unit file: systemctl daemon-reload

# Enable at boot
systemctl enable docker

# Start
systemctl start docker

# Check status
systemctl status docker

Status checks

# Check the installed version
docker -v

# Check the network
ip a |grep docker0

# Verify a container picks up the configured network
docker pull busybox
docker run -it --rm busybox /bin/sh

/ # ip add
# The container IP is 172.26.191.2, as configured; a container's IP therefore immediately identifies its host node.

kubernetes-node

Download and install

cd /usr/local/src/
wget https://dl.k8s.io/v1.21.1/kubernetes-node-linux-amd64.tar.gz
[ -d /usr/local/kubernetes ] || tar xzvf kubernetes-node-linux-amd64.tar.gz -C /usr/local

# Create symlinks
[ -f /usr/bin/kubectl ] || ln -sv /usr/local/kubernetes/node/bin/kube* /usr/bin/

kubelet configuration

systemd unit configuration
cat > /etc/systemd/system/kubelet.service << 'EOF'
[Unit]
Description=Kubernetes Kubelet Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=/data/kubelet
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart=/usr/bin/kubelet $KUBELET_ARGS
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF
Configuring startup dependencies

Environment parameter file (do not put double quotes inside the values, or the data will be truncated)

# Remember to adjust the hostname-override IP
cat > /etc/kubernetes/kubelet << 'EOF'
KUBELET_ARGS="
  --anonymous-auth=false
  --cgroup-driver systemd
  --cluster-dns 10.26.0.2
  --cluster-domain cluster.local
  --runtime-cgroups=/systemd/system.slice
  --kubelet-cgroups=/systemd/system.slice
  --fail-swap-on=false
  --client-ca-file /etc/kubernetes/pki/ca.pem
  --tls-cert-file /etc/kubernetes/pki/kubelet.pem
  --tls-private-key-file /etc/kubernetes/pki/kubelet-key.pem
  --hostname-override 10.0.26.191
  --image-gc-high-threshold 20
  --image-gc-low-threshold 10
  --kubeconfig /etc/kubernetes/pki/kubelet.kubeconfig
  --log-dir /data/logs/kubernetes/kube-kubelet
  --pod-infra-container-image kubernetes/pause:latest
  --root-dir /data/kubelet
  --authorization-mode Webhook
"
EOF

Prepare the pause base image kubelet depends on; every pod's containers run on top of it (without it, kubelet cannot start pods properly).

docker pull kubernetes/pause

# Tag it as k8s.gcr.io/pause:3.4.1
# Pod creation pulls the k8s.gcr.io/pause:3.4.1 image
docker tag kubernetes/pause:latest k8s.gcr.io/pause:3.4.1
docker images

Create directories

mkdir -pv /data/logs/kubernetes/kube-kubelet /data/kubelet
Startup
# Enable at boot
systemctl enable kubelet.service

# Start
systemctl start kubelet.service

# Check status
systemctl status kubelet.service
Verifying the node joined (on the master)
# Check whether the node has registered (joining takes a little while)
kubectl get nodes
kubectl get nodes -o wide

# Label the node (successful output: node/k8s-node-02 labeled)
kubectl label nodes k8s-node-02 node-role.kubernetes.io/node=
kubectl label nodes k8s-node-02 node-role.kubernetes.io/worker=
kubectl get nodes

kube-proxy configuration

systemd unit configuration
cat > /etc/systemd/system/kube-proxy.service << 'EOF'
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
EnvironmentFile=-/etc/kubernetes/proxy
ExecStartPre=/bin/bash /etc/kubernetes/ipvs.sh
ExecStart=/usr/local/kubernetes/node/bin/kube-proxy $KUBE_PROXY_ARGS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
Configuring startup dependencies

Install dependencies

apt-get -y install ipset ipvsadm

Environment parameter file (do not put double quotes inside the values, or the data will be truncated)

# Remember to adjust bind-address and hostname-override
cat > /etc/kubernetes/proxy << 'EOF'
KUBE_PROXY_ARGS="
  --cluster-cidr 172.26.0.0/16
  --bind-address 10.0.26.191
  --hostname-override 10.0.26.191
  --proxy-mode=ipvs
  --ipvs-scheduler=nq
  --kubeconfig /etc/kubernetes/pki/kube-proxy.kubeconfig
"
EOF

Loading the ipvs modules

lsmod | grep ip_vs_nq

# Loader script (the key module is ip_vs_nq)
cat > /etc/kubernetes/ipvs.sh << 'EOF'
#!/bin/bash
ipvs_mods_dir="/usr/lib/modules/$(uname -r)/kernel/net/netfilter/ipvs"
for i in $(ls $ipvs_mods_dir|grep -o "^[^.]*")
do
  /sbin/modinfo -F filename $i &>/dev/null
  if [ $? -eq 0 ];then
    /sbin/modprobe $i
  fi
done
EOF

# Load now (also run from ExecStartPre)
bash /etc/kubernetes/ipvs.sh

# Check again (many more modules are loaded now)
lsmod | grep ip_vs_nq
Startup
# Enable at boot
systemctl enable kube-proxy.service

# Start
systemctl start kube-proxy.service

# Check status
systemctl status kube-proxy.service
Verification (on the master)
# Check that the node shows Ready; this status takes a short while to sync
kubectl get nodes

# Create the manifest
cat > ~/k8s-test/nginx-deployment.yaml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:stable
        ports:
        - containerPort: 80
EOF

# Create a deployment
kubectl apply -f ~/k8s-test/nginx-deployment.yaml

# Watch progress
kubectl get pods
kubectl get rs
kubectl get deployment

# Create a service
kubectl expose deployments.apps/nginx-deployment --port=80

# Get the CLUSTER-IP
kubectl get svc nginx-deployment

# Test from any node; obtain pod IPs via kubectl get pod -o wide
curl -I 172.26.190.2
curl -I 172.26.191.2

# If the tests pass from any node, flannel routing is working
etcdctl2 ls /coreos.com/network/subnets
etcdctl2 get /coreos.com/network/subnets/172.26.190.0-24
etcdctl2 get /coreos.com/network/subnets/172.26.191.0-24


# Expose a host-level port on the nodes (NodePort)
cat > ~/k8s-test/nginx-deployment-svc.yaml << 'EOF'
apiVersion: v1
kind: Service
metadata:
  name: nginx-deployment
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - nodePort: 8888
    port: 80
    targetPort: 80
EOF
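
Note that nodePort: 8888 is only accepted because kube-apiserver was started with --service-node-port-range 3000-29999; with the default range (30000-32767) this manifest would be rejected.

# On the master: confirm the configured range allows 8888
grep service-node-port-range /etc/kubernetes/apiserver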

# After applying, the port is exposed as follows:
# kubectl apply -f ~/k8s-test/nginx-deployment-svc.yaml
# root@k8s-node-01:~# netstat -ntlp|grep 8888
# tcp        0      0 0.0.0.0:8888            0.0.0.0:*               LISTEN      736/kube-proxy

# root@k8s-node-02:~# netstat -ntlp|grep 8888
# tcp        0      0 0.0.0.0:8888            0.0.0.0:*               LISTEN      739/kube-proxy

# The service can be reached directly via the host (VM) IPs
# http://10.0.26.190:8888/
# http://10.0.26.191:8888/

Common issues

kubelet fails to start

Modify the systemd unit as shown below; at present kubelet does not appear to read the kubeconfig setting from the EnvironmentFile.

Environment="KUBELET_KUBECONFIG_ARGS=--kubeconfig=/etc/kubernetes/pki/kubelet.kubeconfig"
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_ARGS

# Startup via systemctl fails
# systemctl start kubelet.service
May 25 06:42:33 k8s-node-01 systemd[1]: Started Kubernetes systemd probe.
May 25 06:42:33 k8s-node-01 kubelet[2344]: I0525 06:42:33.115999    2344 server.go:440] "Kubelet version" kubeletVersion="v1.21.1"
May 25 06:42:33 k8s-node-01 kubelet[2344]: I0525 06:42:33.116379    2344 server.go:573] "Standalone mode, no API client"
May 25 06:42:33 k8s-node-01 systemd[1]: run-rb0e177d4fef94a8eb15734c5cc9e09cc.scope: Succeeded.
May 25 06:42:33 k8s-node-01 kubelet[2344]: I0525 06:42:33.184752    2344 server.go:488] "No api server defined - no events will be sent to API server"
May 25 06:42:33 k8s-node-01 kubelet[2344]: I0525 06:42:33.185061    2344 server.go:660] "--cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /"
May 25 06:42:33 k8s-node-01 kubelet[2344]: I0525 06:42:33.185330    2344 container_manager_linux.go:278] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]

# Invoking the command directly works fine
# /usr/bin/kubelet $KUBELET_ARGS
I0525 06:42:54.713844    2589 server.go:440] "Kubelet version" kubeletVersion="v1.21.1"
I0525 06:42:54.733089    2589 dynamic_cafile_content.go:167] Starting client-ca-bundle::/etc/kubernetes/pki/ca.pem
I0525 06:42:54.799593    2589 server.go:660] "--cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /"
I0525 06:42:54.800012    2589 container_manager_linux.go:278] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]

kube-proxy warnings

A benign deprecation warning for this version:

W0521 01:54:07.352575 266283 warnings.go:70] discovery.k8s.io/v1beta1 EndpointSlice is deprecated in v1.21+, unavailable in v1.25+; use discovery.k8s.io/v1 EndpointSlice

# List API resources
kubectl api-resources -o wide

Core add-ons

CNI network plugin: Flannel

Quick install (for Kubernetes v1.17+):

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Flannel reads the network configuration from etcd and claims an unused subnet (which docker then uses at startup); once assigned, all nodes update their routes.

root@k8s-node-01:~# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.26.1        0.0.0.0         UG    0      0        0 enp0s3
10.0.26.0        0.0.0.0         255.255.255.0   U     0      0        0 enp0s3
172.26.190.0     0.0.0.0         255.255.255.0   U     0      0        0 docker0
172.26.191.0     10.0.26.191      255.255.255.0   UG    0      0        0 enp0s3 # routes to other nodes are added automatically

root@k8s-node-02:/usr/local/src# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.26.1        0.0.0.0         UG    0      0        0 enp0s3
10.0.26.0        0.0.0.0         255.255.255.0   U     0      0        0 enp0s3
172.26.190.0     10.0.26.190      255.255.255.0   UG    0      0        0 enp0s3 # routes to other nodes are added automatically
172.26.191.0     0.0.0.0         255.255.255.0   U     0      0        0 docker0

# Explanation
# k8s-node-01:
# traffic from docker on 190 (172.26.190.0) destined for the 172.26.191.0 network uses gateway 10.0.26.191, so the data flows to 191
# k8s-node-02:
# data arriving from 190 (docker 172.26.190.0) for the 172.26.191.0 network is routed onto docker0, completing pod-to-pod connectivity

CoreDNS service discovery

Environment preparation

# On Ubuntu, do the following
# Stop the resolver, otherwise /etc/resolv.conf gets rewritten to nameserver 127.0.0.53
systemctl stop systemd-resolved.service

vim /etc/resolv.conf

nameserver 114.114.114.114

Pulling the image

# The default k8s.gcr.io registry is hard to reach
docker pull coredns/coredns:1.7.0
# docker tag coredns/coredns:1.7.0 k8s.gcr.io/coredns:1.7.0
docker images

The coredns manifest

https://github.com/kubernetes/kubernetes/blob/v1.20.2/cluster/addons/dns/coredns/coredns.yaml.base

# Download the manifest
wget -O coredns.yaml https://raw.githubusercontent.com/kubernetes/kubernetes/v1.20.2/cluster/addons/dns/coredns/coredns.yaml.base

Edit 1:

kubernetes __DNS__DOMAIN__ in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }

Change to: kubernetes cluster.local 10.26.0.0/16

Edit 2:

memory: __DNS__MEMORY__LIMIT__

Change to: memory: 150Mi

Edit 3:

clusterIP: __DNS__SERVER__

Change to: clusterIP: 10.26.0.2

Edit 4:

Change k8s.gcr.io/coredns:1.7.0 to coredns/coredns:1.7.0
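
The placeholder substitutions (edits 2-4, plus the domain part of edit 1) can also be applied non-interactively; a sketch with sed, to be reviewed against the manual edits above before applying:

sed -i \
  -e 's/__DNS__DOMAIN__/cluster.local/' \
  -e 's/__DNS__MEMORY__LIMIT__/150Mi/' \
  -e 's/__DNS__SERVER__/10.26.0.2/' \
  -e 's#k8s.gcr.io/coredns:1.7.0#coredns/coredns:1.7.0#' \
  coredns.yaml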

Applying the coredns manifest

kubectl apply -f coredns.yaml
kubectl get pod -n kube-system -o wide
kubectl get svc -A

# Inspect
# kubectl describe pod -n kube-system
# Delete
# kubectl delete deployment coredns -n kube-system

Testing

kubectl run pod1 --image=busybox --  sh -c "sleep 10000"
kubectl get pod -o wide

# Test resolution
kubectl exec pod1 -- ping -c 5 kubernetes
kubectl exec pod1 -- nslookup kube-dns.kube-system

# Shell into the pod
# kubectl exec -it pod1 -- sh

# Or log into the docker host and enter the container
# docker exec -it 98b8521c809a /bin/sh

Ingress (service exposure) controller: traefik

dashboard (a general-purpose web UI for Kubernetes clusters)

References