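Installing Production-Grade Distributed Storage (Rook+Ceph) on K8s with CentOS 7.9
1. Introduction
On the k8s cloud-native platform, storage is the other core component besides networking, because it governs data persistence, disaster recovery, and related concerns. A production-grade deployment needs multi-node distribution, timely disaster recovery, and smooth data migration. Rook+Ceph is a k8s storage solution we commonly use in production. Below, we install this storage system on k8s.
2. Install K8s
Install K8s by following this article: <<Installing kubernetes 1.19.9 (K8s) via yum on CentOS 7.9>>
3. Upgrade the kernel
Because CephFS requires kernel 4.17 or later, we need to upgrade the operating system kernel on every K8s node. A default CentOS 7 installation ships kernel 3.10, so that is the kernel a freshly installed k8s cluster runs on. Each node also has a second disk that serves as rook-ceph storage.
Download the kernel packages for the upgrade. If the download is too slow, they can also be fetched from the cloud drive here. Extraction code: 6bqe
# Download the kernel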
wget http://mirrors.coreix.net/elrepo-archive-archive/kernel/el7/x86_64/RPMS/kernel-ml-5.19.9-1.el7.elrepo.x86_64.rpm
wget http://mirrors.coreix.net/elrepo-archive-archive/kernel/el7/x86_64/RPMS/kernel-ml-devel-5.19.9-1.el7.elrepo.x86_64.rpm
wget http://mirrors.coreix.net/elrepo-archive-archive/kernel/el7/x86_64/RPMS/kernel-ml-headers-5.19.9-1.el7.elrepo.x86_64.rpm
yum -y install perl.x86_64
# Install the kernel
rpm -ivh kernel-ml-5.19.9-1.el7.elrepo.x86_64.rpm
rpm -ivh kernel-ml-devel-5.19.9-1.el7.elrepo.x86_64.rpm
rpm -ivh kernel-ml-headers-5.19.9-1.el7.elrepo.x86_64.rpm
# Select the default boot entry by index; 0 is the first menu entry, here the newly installed 5.19 kernel
grub2-set-default 0
# Regenerate the boot configuration
grub2-mkconfig -o /boot/grub2/grub.cfg
# Reboot the server
reboot now
After the upgrade, every node in the cluster runs kernel 5.19.
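A quick check on each node can confirm that the running kernel meets the 4.17 minimum mentioned above. The sketch below uses a hypothetical helper, `kernel_at_least`, which is not part of Rook or CentOS; it assumes GNU `sort -V` (version sort) is available, as it is on CentOS 7.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Relies on GNU sort -V (version sort).
kernel_at_least() {
  local have="${1%%-*}"   # drop the release suffix, e.g. 5.19.9-1.el7.elrepo.x86_64 -> 5.19.9
  local need="$2"
  # If the required version sorts first (or ties), the running one is new enough.
  [ "$(printf '%s\n%s\n' "$need" "$have" | sort -V | head -n1)" = "$need" ]
}

# Report whether the running kernel satisfies the minimum from this guide.
if kernel_at_least "$(uname -r)" 4.17; then
  echo "kernel $(uname -r) is new enough"
else
  echo "kernel $(uname -r) is older than 4.17"
fi
```

Running this on every node before continuing avoids discovering a node that booted the old kernel only after Ceph is deployed.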
4. Enable RBD and load the Ceph images
# Install lvm
yum -y install lvm2
# Enable the rbd module
modprobe rbd
# Generate the script that runs the module loaders at boot
cat > /etc/rc.sysinit << EOF
#!/bin/bash
for file in /etc/sysconfig/modules/*.modules
do
[ -x \$file ] && \$file
done
EOF
# Module script that loads rbd
cat > /etc/sysconfig/modules/rbd.modules << EOF
modprobe rbd
EOF
chmod 755 /etc/sysconfig/modules/rbd.modules
lsmod |grep rbd
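On systemd-based systems such as CentOS 7, another way to load a module at every boot is a one-line file under /etc/modules-load.d/, which systemd-modules-load reads. The sketch below is an alternative to the scripts above, not what this guide originally used; `persist_module` is a hypothetical helper name, and its optional second argument exists only so the function can be tried against a scratch directory.

```shell
#!/bin/bash
# Hypothetical helper: write a modules-load.d entry so that
# systemd-modules-load loads the module at boot.
# $1 = module name, $2 = target directory (defaults to /etc/modules-load.d).
persist_module() {
  local module="$1"
  local dir="${2:-/etc/modules-load.d}"
  mkdir -p "$dir" && printf '%s\n' "$module" > "$dir/$module.conf"
}
```

Running `persist_module rbd` as root would create /etc/modules-load.d/rbd.conf containing the single line `rbd`.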
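Load the rook-ceph images. If they cannot be pulled from the official registries, download them from here (extraction code: 6bqe) and load them on every node.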
docker load < ./cephcephv1428.tar
docker load < ./cephcsiv122.tar
docker load < ./csi-attacherv120.tar
docker load < ./csi-node-driver-registrarv120.tar
docker load < ./csi-provisionerv140.tar
docker load < ./csi-snapshotterv122.tar
docker load < ./rookcephV1.2.6.tar
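Since the same set of archives has to be loaded on every node, the individual commands above can be folded into a loop. This is a minimal sketch: `load_images` and the `LOAD_CMD` override are hypothetical names introduced here for illustration, and the default command is `docker load -i`.

```shell
#!/bin/bash
# Hypothetical helper: load every *.tar archive found in directory $1.
# LOAD_CMD is an illustrative override; it defaults to "docker load -i"
# and can be set to "echo" for a dry run that only lists the archives.
load_images() {
  local dir="${1:-.}"
  local tarball
  for tarball in "$dir"/*.tar; do
    [ -e "$tarball" ] || continue   # skip when the glob matched nothing
    ${LOAD_CMD:-docker load -i} "$tarball" || return 1
  done
}
```

For example, `LOAD_CMD=echo load_images .` lists the archives that would be loaded, and `load_images .` loads them.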
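After loading the images, download the rook 1.2 package. It can be downloaded from the official site or from the cloud drive here.
5. Install the Ceph cluster
# Extract the source package; a rook directory appears in the current directory afterwards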
tar -xzvf rook1.2.tar.gz
# Create the permissions and other base resources
kubectl create -f rook/cluster/examples/kubernetes/ceph/common.yaml
Change the image in operator.yaml to rook/ceph:v1.2.6, the version that was already loaded.
kubectl create -f rook/cluster/examples/kubernetes/ceph/operator.yaml
kubectl -n rook-ceph get pod -o wide
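Instead of re-running the pod listing until the operator is up, `kubectl wait` can block until the pod is Ready. This is a sketch under assumptions: `wait_for_operator` is a hypothetical helper, the 300s default timeout is an arbitrary choice, and the label selector is the `app: rook-ceph-operator` label from operator.yaml. Note that `kubectl wait` reports an error if no pod matches the selector yet, so it may need a retry right after the create command.

```shell
#!/bin/bash
# Hypothetical helper: block until the operator pod reports Ready.
# $1 = timeout passed to kubectl wait (defaults to 300s, an arbitrary choice).
wait_for_operator() {
  local timeout="${1:-300s}"
  kubectl -n rook-ceph wait --for=condition=Ready pod \
    -l app=rook-ceph-operator --timeout="$timeout"
}
```

Calling `wait_for_operator 120s` returns zero once the operator pod is Ready, or non-zero when the timeout expires.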
The final operator.yaml after the change is as follows:
#################################################################################################################
# The deployment for the rook operator
# Contains the common settings for most Kubernetes deployments.
# For example, to create the rook-ceph cluster:
# kubectl create -f common.yaml
# kubectl create -f operator.yaml
# kubectl create -f cluster.yaml
#
# Also see other operator sample files for variations of operator.yaml:
# - operator-openshift.yaml: Common settings for running in OpenShift
#################################################################################################################
# Rook Ceph Operator Config
# Use this ConfigMap to override operator configurations
# Precedence will be given to this config in case
# Env Var also exists for the same
#
kind: ConfigMap
apiVersion: v1
metadata:
name: rook-ceph-operator-config
# should be in the namespace of the operator
namespace: rook-ceph
data:
# # (Optional) Ceph Provisioner NodeAffinity.
# CSI_PROVISIONER_NODE_AFFINITY: "role=storage-node; storage=rook, ceph"
# # (Optional) CEPH CSI provisioner tolerations list. Put here list of taints you want to tolerate in YAML format.
# # CSI provisioner would be best to start on the same nodes as other ceph daemons.
# CSI_PROVISIONER_TOLERATIONS: |
# - effect: NoSchedule
# key: node-role.kubernetes.io/controlplane
# operator: Exists
# - effect: NoExecute
# key: node-role.kubernetes.io/etcd
# operator: Exists
# # (Optional) Ceph CSI plugin NodeAffinity.
# CSI_PLUGIN_NODE_AFFINITY: "role=storage-node; storage=rook, ceph"
# # (Optional) CEPH CSI plugin tolerations list. Put here list of taints you want to tolerate in YAML format.
# # CSI plugins need to be started on all the nodes where the clients need to mount the storage.
# CSI_PLUGIN_TOLERATIONS: |
# - effect: NoSchedule
# key: node-role.kubernetes.io/controlplane
# operator: Exists
# - effect: NoExecute
# key: node-role.kubernetes.io/etcd
# operator: Exists
---
# OLM: BEGIN OPERATOR DEPLOYMENT
apiVersion: apps/v1
kind: Deployment
metadata:
name: rook-ceph-operator
namespace: rook-ceph
labels:
operator: rook
storage-backend: ceph
spec:
selector:
matchLabels:
app: rook-ceph-operator
replicas: 1
template:
metadata:
labels:
app: rook-ceph-operator
spec:
serviceAccountName: rook-ceph-system
containers:
- name: rook-ceph-operator
image: rook/ceph:v1.2.6
args: ["ceph", "operator"]
volumeMounts:
- mountPath: /var/lib/rook
name: rook-config
- mountPath: /etc/ceph
name: default-config-dir
env:
# If the operator should only watch for cluster CRDs in the same namespace, set this to "true".
# If this is not set to true, the operator will watch for cluster CRDs in all namespaces.
- name: ROOK_CURRENT_NAMESPACE_ONLY
value: "false"
# To disable RBAC, uncomment the following:
# - name: RBAC_ENABLED
# value: "false"
# Rook Agent toleration. Will tolerate all taints with all keys.
# Choose between NoSchedule, PreferNoSchedule and NoExecute:
# - name: AGENT_TOLERATION
# value: "NoSchedule"
# (Optional) Rook Agent toleration key. Set this to the key of the taint you want to tolerate
# - name: AGENT_TOLERATION_KEY
# value: "<KeyOfTheTaintToTolerate>"
# (Optional) Rook Agent tolerations list. Put here list of taints you want to tolerate in YAML format.
# - name: AGENT_TOLERATIONS
# value: |
# - effect: NoSchedule
# key: node-role.kubernetes.io/controlplane
# operator: Exists
# - effect: NoExecute
# key: node-role.kubernetes.io/etcd
# operator: Exists
# (Optional) Rook Agent priority class name to set on the pod(s)
# - name: AGENT_PRIORITY_CLASS_NAME
# value: "<PriorityClassName>"
# (Optional) Rook Agent NodeAffinity.
# - name: AGENT_NODE_AFFINITY
# value: "role=storage-node; storage=rook,ceph"
# (Optional) Rook Agent mount security mode. Can by `Any` or `Restricted`.
# `Any` uses Ceph admin credentials by default/fallback.
# For using `Restricted` you must have a Ceph secret in each namespace storage should be consumed from and
# set `mountUser` to the Ceph user, `mountSecret` to the Kubernetes secret name.
# to the namespace in which the `mountSecret` Kubernetes secret namespace.
# - name: AGENT_MOUNT_SECURITY_MODE
# value: "Any"
# Set the path where the Rook agent can find the flex volumes
# - name: FLEXVOLUME_DIR_PATH
# value: "<PathToFlexVolumes>"
# Set the path where kernel modules can be found
# - name: LIB_MODULES_DIR_PATH
# value: "<PathToLibModules>"
# Mount any extra directories into the agent container
# - name: AGENT_MOUNTS
# value: "somemount=/host/path:/container/path,someothermount=/host/path2:/container/path2"
# Rook Discover toleration. Will tolerate all taints with all keys.
# Choose between NoSchedule, PreferNoSchedule and NoExecute:
# - name: DISCOVER_TOLERATION
# value: "NoSchedule"
# (Optional) Rook Discover toleration key. Set this to the key of the taint you want to tolerate
# - name: DISCOVER_TOLERATION_KEY
# value: "<KeyOfTheTaintToTolerate>"
# (Optional) Rook Discover tolerations list. Put here list of taints you want to tolerate in YAML format.
# - name: DISCOVER_TOLERATIONS
# value: |
# - effect: NoSchedule
# key: node-role.kubernetes.io/controlplane
# operator: Exists
# - effect: NoExecute
# key: node-role.kubernetes.io/etcd
# operator: Exists
# (Optional) Rook Discover priority class name to set on the pod(s)
# - name: DISCOVER_PRIORITY_CLASS_NAME
# value: "<PriorityClassName>"
# (Optional) Discover Agent NodeAffinity.
# - name: DISCOVER_AGENT_NODE_AFFINITY
# value: "role=storage-node; storage=rook, ceph"
# Allow rook to create multiple file systems. Note: This is considered
# an experimental feature in Ceph as described at
# http://docs.ceph.com/docs/master/cephfs/experimental-features/#multiple-filesystems-within-a-ceph-cluster
# which might cause mons to crash as seen in https://github.com/rook/rook/issues/1027
- name: ROOK_ALLOW_MULTIPLE_FILESYSTEMS
value: "false"
# The logging level for the operator: INFO | DEBUG
- name: ROOK_LOG_LEVEL
value: "INFO"
# The interval to check the health of the ceph cluster and update the status in the custom resource.
- name: ROOK_CEPH_STATUS_CHECK_INTERVAL
value: "60s"
# The interval to check if every mon is in the quorum.
- name: ROOK_MON_HEALTHCHECK_INTERVAL
value: "45s"
# The duration to wait before trying to failover or remove/replace the
# current mon with a new mon (useful for compensating flapping network).
- name: ROOK_MON_OUT_TIMEOUT
value: "600s"
# The duration between discovering devices in the rook-discover daemonset.
- name: ROOK_DISCOVER_DEVICES_INTERVAL
value: "60m"
# Whether to start pods as privileged that mount a host path, which includes the Ceph mon and osd pods.
# This is necessary to workaround the anyuid issues when running on OpenShift.
# For more details see https://github.com/rook/rook/issues/1314#issuecomment-355799641
- name: ROOK_HOSTPATH_REQUIRES_PRIVILEGED
value: "false"
# In some situations SELinux relabelling breaks (times out) on large filesystems, and doesn't work with cephfs ReadWriteMany volumes (last relabel wins).
# Disable it here if you have similar issues.
# For more details see https://github.com/rook/rook/issues/2417
- name: ROOK_ENABLE_SELINUX_RELABELING
value: "true"
# In large volumes it will take some time to chown all the files. Disable it here if you have performance issues.
# For more details see https://github.com/rook/rook/issues/2254
- name: ROOK_ENABLE_FSGROUP
value: "true"
# Disable automatic orchestration when new devices are discovered
- name: ROOK_DISABLE_DEVICE_HOTPLUG
value: "false"
# Provide customised regex as the values using comma. For eg. regex for rbd based volume, value will be like "(?i)rbd[0-9]+".
# In case of more than one regex, use comma to seperate between them.
# Default regex will be "(?i)dm-[0-9]+,(?i)rbd[0-9]+,(?i)nbd[0-9]+"
# Add regex expression after putting a comma to blacklist a disk
# If value is empty, the default regex will be used.
- name: DISCOVER_DAEMON_UDEV_BLACKLIST
value: "(?i)dm-[0-9]+,(?i)rbd[0-9]+,(?i)nbd[0-9]+"
# Whether to enable the flex driver. By default it is enabled and is fully supported, but will be deprecated in some future release
# in favor of the CSI driver.
- name: ROOK_ENABLE_FLEX_DRIVER
value: "false"
# Whether to start the discovery daemon to watch for raw storage devices on nodes in the cluster.
# This daemon does not need to run if you are only going to create your OSDs based on StorageClassDeviceSets with PVCs.
- name: ROOK_ENABLE_DISCOVERY_DAEMON
value: "true"
# Enable the default version of the CSI CephFS driver. To start another version of the CSI driver, see image properties below.
- name: ROOK_CSI_ENABLE_CEPHFS
value: "true"
# Enable the default version of the CSI RBD driver. To start another version of the CSI driver, see image properties below.
- name: ROOK_CSI_ENABLE_RBD
value: "true"
- name: ROOK_CSI_ENABLE_GRPC_METRICS
value: "true"
# Enable deployment of snapshotter container in ceph-csi provisioner.
- name: CSI_ENABLE_SNAPSHOTTER
value: "true"
# Enable Ceph Kernel clients on kernel < 4.17 which support quotas for Cephfs
# If you disable the kernel client, your application may be disrupted during upgrade.
# See the upgrade guide: https://rook.io/docs/rook/v1.2/ceph-upgrade.html
- name: CSI_FORCE_CEPHFS_KERNEL_CLIENT
value: "true"
# CSI CephFS plugin daemonset update strategy, supported values are OnDelete and RollingUpdate.
# Default value is RollingUpdate.
#- name: CSI_CEPHFS_PLUGIN_UPDATE_STRATEGY
# value: "OnDelete"
# CSI Rbd plugin daemonset update strategy, supported values are OnDelete and RollingUpdate.
# Default value is RollingUpdate.
#- name: CSI_RBD_PLUGIN_UPDATE_STRATEGY
# value: "OnDelete"
# The default version of CSI supported by Rook will be started. To change the version
# of the CSI driver to something other than what is officially supported, change
# these images to the desired release of the CSI driver.
#- name: ROOK_CSI_CEPH_IMAGE
# value: "quay.io/cephcsi/cephcsi:v2.0.0"
#- name: ROOK_CSI_REGISTRAR_IMAGE
# value: "quay.io/k8scsi/csi-node-driver-registrar:v1.2.0"
#- name: ROOK_CSI_RESIZER_IMAGE
# value: "quay.io/k8scsi/csi-resizer:v0.4.0"
#- name: ROOK_CSI_PROVISIONER_IMAGE
# value: "quay.io/k8scsi/csi-provisioner:v1.4.0"
#- name: ROOK_CSI_SNAPSHOTTER_IMAGE
# value: "quay.io/k8scsi/csi-snapshotter:v1.2.2"
#- name: ROOK_CSI_ATTACHER_IMAGE
# value: "quay.io/k8scsi/csi-attacher:v2.1.0"
# kubelet directory path, if kubelet configured to use other than /var/lib/kubelet path.
#- name: ROOK_CSI_KUBELET_DIR_PATH
# value: "/var/lib/kubelet"
# (Optional) Ceph Provisioner NodeAffinity.
# - name: CSI_PROVISIONER_NODE_AFFINITY
# value: "role=storage-node; storage=rook, ceph"
# (Optional) CEPH CSI provisioner tolerations list. Put here list of taints you want to tolerate in YAML format.
# CSI provisioner would be best to start on the same nodes as other ceph daemons.
# - name: CSI_PROVISIONER_TOLERATIONS
# value: |
# - effect: NoSchedule
# key: node-role.kubernetes.io/controlplane
# operator: Exists
# - effect: NoExecute
# key: node-role.kubernetes.io/etcd
# operator: Exists
# (Optional) Ceph CSI plugin NodeAffinity.
# - name: CSI_PLUGIN_NODE_AFFINITY
# value: "role=storage-node; storage=rook, ceph"
# (Optional) CEPH CSI plugin tolerations list. Put here list of taints you want to tolerate in YAML format.
# CSI plugins need to be started on all the nodes where the clients need to mount the storage.
# - name: CSI_PLUGIN_TOLERATIONS
# value: |
# - effect: NoSchedule
# key: node-role.kubernetes.io/controlplane
# operator: Exists
# - effect: NoExecute
# key: node-role.kubernetes.io/etcd
# operator: Exists
# Configure CSI cephfs grpc and liveness metrics port
#- name: CSI_CEPHFS_GRPC_METRICS_PORT
# value: "9091"
#- name: CSI_CEPHFS_LIVENESS_METRICS_PORT
# value: "9081"
# Configure CSI rbd grpc and liveness metrics port
#- name: CSI_RBD_GRPC_METRICS_PORT
# value: "9090"
#- name: CSI_RBD_LIVENESS_METRICS_PORT
# value: "9080"
# Time to wait until the node controller will move Rook pods to other
# nodes after detecting an unreachable node.
# Pods affected by this setting are:
# mgr, rbd, mds, rgw, nfs, PVC based mons and osds, and ceph toolbox
# The value used in this variable replaces the default value of 300 secs
# added automatically by k8s as Toleration for
# <node.kubernetes.io/unreachable>
# The total amount of time to reschedule Rook pods in healthy nodes
# before detecting a <not ready node> condition will be the sum of:
# --> node-monitor-grace-period: 40 seconds (k8s kube-controller-manager flag)
# --> ROOK_UNREACHABLE_NODE_TOLERATION_SECONDS: 5 seconds
- name: ROOK_UNREACHABLE_NODE_TOLERATION_SECONDS
value: "5"
# The name of the node to pass with the downward API
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
# The pod name to pass with the downward API
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
# The pod namespace to pass with the downward API
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
# Uncomment it to run rook operator on the host network
#hostNetwork: true
volumes:
- name: rook-config
emptyDir: {}
- name: default-config-dir
emptyDir: {}
# OLM: END OPERATOR DEPLOYMENT
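Once installation completes, the rook-ceph-operator pod should be listed as Running by the pod listing command above.
Next, prepare the rook-ceph cluster configuration, cluster.yaml. A few places need to change:
# Edit the cluster configuration file and replace the image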
sed -i 's|ceph/ceph:v14.2.9|ceph/ceph:v14.2.8|g' rook/cluster/examples/kubernetes/ceph/cluster.yaml
# Turn off automatic selection of all nodes and all devices; nodes and devices are specified manually
sed -i 's|useAllNodes: true|useAllNodes: false|g' rook/cluster/examples/kubernetes/ceph/cluster.yaml
sed -i 's|useAllDevices: true|useAllDevices: false|g' rook/cluster/examples/kubernetes/ceph/cluster.yaml
Under config: in the storage section, add the following configuration so that the second disk on each node is used as Ceph storage:
      metadataDevice:
      databaseSizeMB: "1024"
      journalSizeMB: "1024"
    nodes:
    - name: "node1"
      devices:
      - name: "sdb"
      config:
        storeType: bluestore
    - name: "node2"
      devices:
      - name: "sdb"
      config:
        storeType: bluestore
    - name: "node3"
      devices:
      - name: "sdb"
      config:
        storeType: bluestore
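With more than a few storage nodes, the repeated per-node entries in the storage section are easy to mistype. The sketch below prints such a nodes block from a list of names; `gen_storage_nodes` and the `DEVICE` variable are hypothetical names used only for illustration, and the indentation assumes the block sits directly under `storage:` in cluster.yaml. As the comments in cluster.yaml note, each name should match the node's `kubernetes.io/hostname` label.

```shell
#!/bin/bash
# Hypothetical helper: print a "nodes:" block for the given node names,
# each using the device named by DEVICE (defaults to sdb, as in this guide).
gen_storage_nodes() {
  local device="${DEVICE:-sdb}"
  local node
  echo "    nodes:"
  for node in "$@"; do
    printf '    - name: "%s"\n' "$node"
    printf '      devices:\n'
    printf '      - name: "%s"\n' "$device"
    printf '      config:\n'
    printf '        storeType: bluestore\n'
  done
}

gen_storage_nodes node1 node2 node3
```

The output can be pasted into cluster.yaml in place of a hand-written list.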
kubectl apply -f rook/cluster/examples/kubernetes/ceph/cluster.yaml
kubectl -n rook-ceph get pod -o wide
The complete cluster.yaml is as follows:
#################################################################################################################
# Define the settings for the rook-ceph cluster with common settings for a production cluster.
# All nodes with available raw devices will be used for the Ceph cluster. At least three nodes are required
# in this example. See the documentation for more details on storage settings available.
# For example, to create the cluster:
# kubectl create -f common.yaml
# kubectl create -f operator.yaml
# kubectl create -f cluster.yaml
#################################################################################################################
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
name: rook-ceph
namespace: rook-ceph
spec:
cephVersion:
# The container image used to launch the Ceph daemon pods (mon, mgr, osd, mds, rgw).
# v13 is mimic, v14 is nautilus, and v15 is octopus.
# RECOMMENDATION: In production, use a specific version tag instead of the general v14 flag, which pulls the latest release and could result in different
# versions running within the cluster. See tags available at https://hub.docker.com/r/ceph/ceph/tags/.
# If you want to be more precise, you can always use a timestamp tag such ceph/ceph:v14.2.5-20190917
# This tag might not contain a new Ceph version, just security fixes from the underlying operating system, which will reduce vulnerabilities
image: ceph/ceph:v14.2.8
# Whether to allow unsupported versions of Ceph. Currently mimic and nautilus are supported, with the recommendation to upgrade to nautilus.
# Octopus is the version allowed when this is set to true.
# Do not set to true in production.
allowUnsupported: false
# The path on the host where configuration files will be persisted. Must be specified.
# Important: if you reinstall the cluster, make sure you delete this directory from each host or else the mons will fail to start on the new cluster.
# In Minikube, the '/data' directory is configured to persist across reboots. Use "/data/rook" in Minikube environment.
dataDirHostPath: /var/lib/rook
# Whether or not upgrade should continue even if a check fails
# This means Ceph's status could be degraded and we don't recommend upgrading but you might decide otherwise
# Use at your OWN risk
# To understand Rook's upgrade process of Ceph, read https://rook.io/docs/rook/master/ceph-upgrade.html#ceph-version-upgrades
skipUpgradeChecks: false
# Whether or not continue if PGs are not clean during an upgrade
continueUpgradeAfterChecksEvenIfNotHealthy: false
# set the amount of mons to be started
mon:
count: 3
allowMultiplePerNode: false
# mgr:
# modules:
# Several modules should not need to be included in this list. The "dashboard" and "monitoring" modules
# are already enabled by other settings in the cluster CR and the "rook" module is always enabled.
# - name: pg_autoscaler
# enabled: true
# enable the ceph dashboard for viewing cluster status
dashboard:
enabled: true
# serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
# urlPrefix: /ceph-dashboard
# serve the dashboard at the given port.
# port: 8443
# serve the dashboard using SSL
ssl: true
# enable prometheus alerting for cluster
monitoring:
# requires Prometheus to be pre-installed
enabled: false
# namespace to deploy prometheusRule in. If empty, namespace of the cluster will be used.
# Recommended:
# If you have a single rook-ceph cluster, set the rulesNamespace to the same namespace as the cluster or keep it empty.
# If you have multiple rook-ceph clusters in the same k8s cluster, choose the same namespace (ideally, namespace with prometheus
# deployed) to set rulesNamespace for all the clusters. Otherwise, you will get duplicate alerts with multiple alert definitions.
rulesNamespace: rook-ceph
network:
# toggle to use hostNetwork
hostNetwork: false
rbdMirroring:
# The number of daemons that will perform the rbd mirroring.
# rbd mirroring must be configured with "rbd mirror" from the rook toolbox.
workers: 0
# enable the crash collector for ceph daemon crash collection
crashCollector:
disable: false
# To control where various services will be scheduled by kubernetes, use the placement configuration sections below.
# The example under 'all' would have all services scheduled on kubernetes nodes labeled with 'role=storage-node' and
# tolerate taints with a key of 'storage-node'.
# placement:
# all:
# nodeAffinity:
# requiredDuringSchedulingIgnoredDuringExecution:
# nodeSelectorTerms:
# - matchExpressions:
# - key: role
# operator: In
# values:
# - storage-node
# podAffinity:
# podAntiAffinity:
# tolerations:
# - key: storage-node
# operator: Exists
# The above placement information can also be specified for mon, osd, and mgr components
# mon:
# Monitor deployments may contain an anti-affinity rule for avoiding monitor
# collocation on the same node. This is a required rule when host network is used
# or when AllowMultiplePerNode is false. Otherwise this anti-affinity rule is a
# preferred rule with weight: 50.
# osd:
# mgr:
annotations:
# all:
# mon:
# osd:
# If no mgr annotations are set, prometheus scrape annotations will be set by default.
# mgr:
resources:
# The requests and limits set here, allow the mgr pod to use half of one CPU core and 1 gigabyte of memory
# mgr:
# limits:
# cpu: "500m"
# memory: "1024Mi"
# requests:
# cpu: "500m"
# memory: "1024Mi"
# The above example requests/limits can also be added to the mon and osd components
# mon:
# osd:
# prepareosd:
# crashcollector:
# The option to automatically remove OSDs that are out and are safe to destroy.
removeOSDsIfOutAndSafeToRemove: false
# priorityClassNames:
# all: rook-ceph-default-priority-class
# mon: rook-ceph-mon-priority-class
# osd: rook-ceph-osd-priority-class
# mgr: rook-ceph-mgr-priority-class
storage: # cluster level storage configuration and selection
useAllNodes: false
useAllDevices: false
#deviceFilter:
config:
# The default and recommended storeType is dynamically set to bluestore for devices and filestore for directories.
# Set the storeType explicitly only if it is required not to use the default.
# storeType: bluestore
# metadataDevice: "md0" # specify a non-rotational storage so ceph-volume will use it as block db device of bluestore.
# databaseSizeMB: "1024" # uncomment if the disks are smaller than 100 GB
# journalSizeMB: "1024" # uncomment if the disks are 20 GB or smaller
# osdsPerDevice: "1" # this value can be overridden at the node or device level
# encryptedDevice: "true" # the default value for this option is "false"
metadataDevice:
databaseSizeMB: "1024"
journalSizeMB: "1024"
nodes:
- name: "node1"
devices:
- name: "sdb"
config:
storeType: bluestore
- name: "node2"
devices:
- name: "sdb"
config:
storeType: bluestore
- name: "node3"
devices:
- name: "sdb"
config:
storeType: bluestore
# Cluster level list of directories to use for filestore-based OSD storage. If uncomment, this example would create an OSD under the dataDirHostPath.
#directories:
#- path: /var/lib/rook
# Individual nodes and their config can be specified as well, but 'useAllNodes' above must be set to false. Then, only the named
# nodes below will be used as storage resources. Each node's 'name' field should match their 'kubernetes.io/hostname' label.
# nodes:
# - name: "172.17.4.101"
# directories: # specific directories to use for storage can be specified for each node
# - path: "/rook/storage-dir"
# resources:
# limits:
# cpu: "500m"
# memory: "1024Mi"
# requests:
# cpu: "500m"
# memory: "1024Mi"
# - name: "172.17.4.201"
# devices: # specific devices to use for storage can be specified for each node
# - name: "sdb"
# - name: "nvme01" # multiple osds can be created on high performance devices
# config:
# osdsPerDevice: "5"
# config: # configuration can be specified at the node level which overrides the cluster level config
# storeType: filestore
# - name: "172.17.4.301"
# deviceFilter: "^sd."
# The section for configuring management of daemon disruptions during upgrade or fencing.
disruptionManagement:
# If true, the operator will create and manage PodDisruptionBudgets for OSD, Mon, RGW, and MDS daemons. OSD PDBs are managed dynamically
# via the strategy outlined in the [design](https://github.com/rook/rook/blob/master/design/ceph/ceph-managed-disruptionbudgets.md). The operator will
# block eviction of OSDs by default and unblock them safely when drains are detected.
managePodBudgets: false
# A duration in minutes that determines how long an entire failureDomain like `region/zone/host` will be held in `noout` (in addition to the
# default DOWN/OUT interval) when it is draining. This is only relevant when `managePodBudgets` is `true`. The default value is `30` minutes.
osdMaintenanceTimeout: 30
# If true, the operator will create and manage MachineDisruptionBudgets to ensure OSDs are only fenced when the cluster is healthy.
# Only available on OpenShift.
manageMachineDisruptionBudgets: false
# Namespace in which to watch for the MachineDisruptionBudgets.
machineDisruptionBudgetNamespace: openshift-machine-api
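After a period of initialization, the rook-ceph cluster is fully created.
Each node now shows the Ceph LVM information.
Install the ceph tools to check the state of the whole cluster: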
sed -i 's|rook/ceph:v1.2.7|rook/ceph:v1.2.6|g' rook/cluster/examples/kubernetes/ceph/toolbox.yaml
kubectl apply -f rook/cluster/examples/kubernetes/ceph/toolbox.yaml
kubectl -n rook-ceph get pod -l "app=rook-ceph-tools"
# Check the cluster status
NAME=$(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}')
kubectl -n rook-ceph exec -it ${NAME} -- sh
ceph status
ceph osd status
ceph osd df
ceph osd utilization
ceph osd pool stats
ceph osd tree
ceph pg stat
ceph df
rados df
exit
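The same checks can be run without an interactive shell, which is handy in scripts. This is a minimal sketch: `toolbox_ceph` and `is_health_ok` are hypothetical helper names, and the second one assumes that the first line printed by `ceph health` begins with the health status, such as HEALTH_OK or HEALTH_WARN.

```shell
#!/bin/bash
# Hypothetical helper: run a ceph command inside the toolbox pod without
# opening an interactive shell. Arguments are passed through to ceph.
toolbox_ceph() {
  local pod
  pod=$(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" \
    -o jsonpath='{.items[0].metadata.name}') || return 1
  kubectl -n rook-ceph exec "$pod" -- ceph "$@"
}

# Hypothetical helper: succeed only when the health line read from stdin
# starts with HEALTH_OK.
is_health_ok() {
  local line
  read -r line
  case "$line" in
    HEALTH_OK*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `toolbox_ceph health | is_health_ok && echo "cluster healthy"` prints the message only when Ceph reports HEALTH_OK.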
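Log in to the Ceph Dashboard to look at the cluster status. Change the rook-ceph-mgr dashboard service to a NodePort so that it can be reached from outside the cluster. The updated NodePort_update.yaml is as follows: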
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2024-09-23T08:34:18Z"
  labels:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
  name: rook-ceph-mgr-dashboard
  namespace: rook-ceph
  ownerReferences:
  - apiVersion: ceph.rook.io/v1
    blockOwnerDeletion: true
    kind: CephCluster
    name: rook-ceph
    uid: 12207a89-a6e7-48d0-8499-03a4e96604f5
  resourceVersion: "53342"
  selfLink: /api/v1/namespaces/rook-ceph/services/rook-ceph-mgr-dashboard
  uid: f8154ff1-87f3-49e0-8561-e3d928560e68
spec:
  clusterIP: 10.1.98.80
  ports:
  - name: https-dashboard
    port: 8443
    protocol: TCP
    targetPort: 8443
    nodePort: 31112
  selector:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}
kubectl apply -f NodePort_update.yaml
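An exported manifest like the one above carries server-populated fields (resourceVersion, uid, clusterIP) that are specific to one cluster. As an alternative sketch, the same change can be expressed as a small patch with `kubectl patch`. The helper names `dashboard_patch` and `expose_dashboard` are hypothetical, and the patch assumes the Service's existing port entry is 8443, as shown in the manifest.

```shell
#!/bin/bash
# Hypothetical helper: print a strategic-merge patch that switches the
# dashboard Service to NodePort. $1 = node port (defaults to 31112).
dashboard_patch() {
  local node_port="${1:-31112}"
  printf '{"spec":{"type":"NodePort","ports":[{"port":8443,"nodePort":%s}]}}\n' "$node_port"
}

# Hypothetical helper: apply the patch to the rook-ceph-mgr-dashboard Service.
expose_dashboard() {
  kubectl -n rook-ceph patch service rook-ceph-mgr-dashboard -p "$(dashboard_patch "$1")"
}
```

Calling `expose_dashboard 31112` would then have the same effect as applying NodePort_update.yaml, without copying cluster-specific metadata.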
Port 31112 is now listening.
Open https://xxx.xxx.xxx.xxx:31112 in a browser.
The username is admin; the password is obtained as follows:
Ciphertext=$(kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}")
Pass=$(echo ${Ciphertext}|base64 --decode)
echo ${Pass}
Because the server clocks here were not synchronized, the cluster reported HEALTH_WARN. Synchronize the time on all servers:
yum -y install chrony
systemctl enable chronyd && systemctl start chronyd
timedatectl status
timedatectl set-local-rtc 0
systemctl restart rsyslog && systemctl restart crond
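To confirm that the warning really came from unsynchronized clocks, and that it clears after chrony is running, the health detail output can be checked for Ceph's clock skew health check. This is a sketch: `has_clock_skew` is a hypothetical helper, and it assumes the warning in question is the one Ceph labels MON_CLOCK_SKEW.

```shell
#!/bin/bash
# Hypothetical helper: succeed when the "ceph health detail" text on stdin
# mentions the MON_CLOCK_SKEW health check.
has_clock_skew() {
  grep -q 'MON_CLOCK_SKEW'
}
```

Inside the toolbox pod, `ceph health detail | has_clock_skew && echo "clocks still skewed"` would print the message for as long as the monitors disagree on the time.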
Ceph can provide block devices to pods. Next, we create the block storage:
sed -i 's/failureDomain: host/failureDomain: osd/g' rook/cluster/examples/kubernetes/ceph/csi/rbd/storageclass.yaml
kubectl apply -f rook/cluster/examples/kubernetes/ceph/csi/rbd/storageclass.yaml
After it is created, rook-ceph-block is visible in the k8s dashboard.
Now I install a sample mysql to check whether the PV and PVC are created automatically.
kubectl apply -f rook/cluster/examples/kubernetes/mysql.yaml
The PV and PVC are created automatically.
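The same dynamic provisioning can be exercised with a standalone claim instead of the mysql sample. The sketch below prints a PersistentVolumeClaim manifest bound to the rook-ceph-block StorageClass; `pvc_manifest`, the claim name `demo-claim`, and the 1Gi default size are all hypothetical values chosen for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print a PersistentVolumeClaim manifest that uses the
# rook-ceph-block StorageClass. $1 = claim name, $2 = size (defaults to 1Gi).
pvc_manifest() {
  local name="$1"
  local size="${2:-1Gi}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
spec:
  storageClassName: rook-ceph-block
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: ${size}
EOF
}
```

For example, `pvc_manifest demo-claim 2Gi | kubectl apply -f -` submits the claim, and `kubectl get pvc demo-claim` shows whether a volume was provisioned for it.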