当前位置: 首页 > article >正文

ck集群数据迁移

一,迁移背景
1,原始集群中只有两个节点,带副本集群,但由于其中一台集群故障,没有维护,为了功能正常,临时将损坏节点从集群配置中注释掉,集群变成了无副本的,数据继续正常入库一段时间了
2,现在要将数据进行迁移,将原始ck数据迁移到新集群 2分片2副本中,要保证数据正常
二,迁移步骤
1,保证新集群兼容老集群的版本
2,将原始集群数据目录下的clickhouse数据复制到新集群的数据目录下,确保两个环境配置data_path的路径一致
3,尽量保证新集群和旧集群的配置一致
三、问题
四、由于原始数据只有一个节点,所以需要将数据和表结构同步到其中一个副本
但是这个新集群的另外一个副本没有数据,1,需要将没数据的副本数据恢复2,让另外一个分片的表结构恢复

方案一:将原始数据scp到复位副本的两个节点上,表数据同步一份到另外一个分片的两个副本的对应目录下

方案二:将原始数据scp到副本的一个节点上,表结构同步一份到另外一个分片的两个副本的对应目录下

方案三:将原始数据scp到副本的一个节点上,表结构通过建表语句刷进去

由于现网中小文件太多,传一份文件花费1-2个小时,已经将文件同步到其中一个分片的一个副本节点,再同步到其他节点耗时太长。同时建表语句有保存,所以采用方案三,建表语句直接刷一下,通过on cluster 创建表

五、同步完数据后,副本如果路径没有,表都是出于readonly中,通过附录一中的脚本进行readonly恢复.恢复完后,从旧集群上传文件到新集群的那个分片的节点对应的另外一个副本数据会自动同步过去,另外一个新的分片是没数据

附录一`#!/bin/bash

Script: repair_clickhouse_readonly_tables.sh

Author: admin

Version: 3.5

Date: 20241204

Description: This script automates the repair of ClickHouse readonly tables by detaching, dropping replica,

attaching, restoring replica, and syncing the replica for each readonly table found in the system.replicas table.

History :

Default values

ch_server=“localhost”
ch_port=“9000”
ch_user=“default”
ch_password=“”

Prompt the user for ClickHouse server information

read -p “Enter ClickHouse server hostname or IP (default is c h s e r v e r ) : " u s e r s e r v e r c h s e r v e r = " ch_server): " user_server ch_server=" chserver):"userserverchserver="{user_server:-$ch_server}”
read -p “Enter ClickHouse server port (default is c h p o r t ) : " u s e r p o r t c h p o r t = " ch_port): " user_port ch_port=" chport):"userportchport="{user_port:-$ch_port}”
read -p “Enter ClickHouse username (default is c h u s e r ) : " u s e r u s e r c h u s e r = " ch_user): " user_user ch_user=" chuser):"useruserchuser="{user_user:-$ch_user}”

Only prompt for password if not empty

if [ -z “ c h p a s s w o r d " ] ; t h e n r e a d − s − p " E n t e r C l i c k H o u s e p a s s w o r d : " u s e r p a s s w o r d c h p a s s w o r d = " ch_password" ]; then read -s -p "Enter ClickHouse password: " user_password ch_password=" chpassword"];thenreadsp"EnterClickHousepassword:"userpasswordchpassword="user_password”
fi

Timeout settings

max_execution_time=3600
receive_timeout=7200

Get ClickHouse version info

ch_version= ( c l i c k h o u s e − c l i e n t − − m a x e x e c u t i o n t i m e " (clickhouse-client --max_execution_time " (clickhouseclientmaxexecutiontime"max_execution_time" --receive_timeout “ r e c e i v e t i m e o u t " − − h o s t " receive_timeout" --host " receivetimeout"host"ch_server” --port “ c h p o r t " − − u s e r " ch_port" --user " chport"user"ch_user” --password “$ch_password” --query “SELECT version()” --format TSV)

Step 1: Get readonly table info

readonly_info= ( c l i c k h o u s e − c l i e n t − − m a x e x e c u t i o n t i m e " (clickhouse-client --max_execution_time " (clickhouseclientmaxexecutiontime"max_execution_time" --receive_timeout “ r e c e i v e t i m e o u t " − − h o s t " receive_timeout" --host " receivetimeout"host"ch_server” --port “ c h p o r t " − − u s e r " ch_port" --user " chport"user"ch_user” --password “$ch_password” --query “SELECT hostname() AS hostname, database, table, zookeeper_path, replica_path FROM system.replicas WHERE is_readonly = 1” --format TSV)

Step 2: Split the result into lines

IFS=KaTeX parse error: Undefined control sequence: \n at position 2: '\̲n̲' readonly_tabl…readonly_info)

Loop through the readonly tables

for row in “${readonly_tables[@]}”; do

Extract values from the row

hostname= ( e c h o " (echo " (echo"row" | awk '{print KaTeX parse error: Expected 'EOF', got '}' at position 2: 1}̲') database=(echo “$row” | awk '{print KaTeX parse error: Expected 'EOF', got '}' at position 2: 2}̲') table=(echo “$row” | awk '{print KaTeX parse error: Expected 'EOF', got '}' at position 2: 3}̲') replica_pa…(echo “$row” | awk ‘{print $5}’)

Skip tables containing ‘inner’

if [[ “$table” == “inner” ]]; then
echo “Skipping table d a t a b a s e . database. database.table as it contains ‘inner’ and is not supported for materialized views.”
continue
fi

echo “Processing readonly table: d a t a b a s e . database. database.table on $hostname”

Step 3: Detach the table

clickhouse-client --max_execution_time “ m a x e x e c u t i o n t i m e " − − r e c e i v e t i m e o u t " max_execution_time" --receive_timeout " maxexecutiontime"receivetimeout"receive_timeout” --host “ c h s e r v e r " − − p o r t " ch_server" --port " chserver"port"ch_port” --user “ c h u s e r " − − p a s s w o r d " ch_user" --password " chuser"password"ch_password” --query “DETACH TABLE d a t a b a s e . database. database.table”

Step 4: Drop the replica

clickhouse-client --max_execution_time “ m a x e x e c u t i o n t i m e " − − r e c e i v e t i m e o u t " max_execution_time" --receive_timeout " maxexecutiontime"receivetimeout"receive_timeout” --host “ c h s e r v e r " − − p o r t " ch_server" --port " chserver"port"ch_port” --user “ c h u s e r " − − p a s s w o r d " ch_user" --password " chuser"password"ch_password” --query “SYSTEM DROP REPLICA ‘ h o s t n a m e ′ F R O M Z K P A T H ′ hostname' FROM ZKPATH ' hostnameFROMZKPATHreplica_path’”

Step 5: Attach the table

clickhouse-client --max_execution_time “ m a x e x e c u t i o n t i m e " − − r e c e i v e t i m e o u t " max_execution_time" --receive_timeout " maxexecutiontime"receivetimeout"receive_timeout” --host “ c h s e r v e r " − − p o r t " ch_server" --port " chserver"port"ch_port” --user “ c h u s e r " − − p a s s w o r d " ch_user" --password " chuser"password"ch_password” --query “ATTACH TABLE d a t a b a s e . database. database.table”

Step 6: Restore the replica

clickhouse-client --max_execution_time “ m a x e x e c u t i o n t i m e " − − r e c e i v e t i m e o u t " max_execution_time" --receive_timeout " maxexecutiontime"receivetimeout"receive_timeout” --host “ c h s e r v e r " − − p o r t " ch_server" --port " chserver"port"ch_port” --user “ c h u s e r " − − p a s s w o r d " ch_user" --password " chuser"password"ch_password” --query “SYSTEM RESTORE REPLICA d a t a b a s e . database. database.table”

Step 7: Sync the replica

clickhouse-client --max_execution_time “ m a x e x e c u t i o n t i m e " − − r e c e i v e t i m e o u t " max_execution_time" --receive_timeout " maxexecutiontime"receivetimeout"receive_timeout” --host “ c h s e r v e r " − − p o r t " ch_server" --port " chserver"port"ch_port” --user “ c h u s e r " − − p a s s w o r d " ch_user" --password " chuser"password"ch_password” --query “SYSTEM SYNC REPLICA d a t a b a s e . database. database.table”

echo “Completed processing readonly table: d a t a b a s e . database. database.table on $hostname”
done

`


http://www.kler.cn/a/450554.html

相关文章:

  • 后端-redis
  • Mono里运行C#脚本3—mono_jit_init
  • 24/12/24 力扣每日一题 # LeetCode 524. 通过删除字母匹配到字典里最长单词 全网最详细解释
  • Linux 常见用例汇总
  • Odoo:免费开源ERP的AI技术赋能出海企业电子商务应用介绍
  • 贪心算法求解跳跃游戏
  • SMMU软件指南SMMU编程之虚拟机结构和缓存
  • Spring篇--基于注解的Spring应用
  • SSM书影音社区前端设计与实现u15u5--(程序+源码+数据库+调试部署+开发环境)
  • 深入理解IQ混频(基于ad9361架构)
  • 数据分析-54-时间序列分析之滑动窗口处理及连续触发的判断逻辑
  • Python(二)str、list、tuple、dict、set
  • 深入理解CAN协议
  • 【人工智能设计模式:迪米特法则的智能实践】
  • 十四、从0开始卷出一个新项目之瑞萨RZN2L之栈回溯(Default_Handler/hartfault)
  • 图为科技与奥比中光强强联手,打造机器人专用的全景全域AI视觉解决方案
  • 计算机网络:IP地址相关知识总结
  • 神经网络-LeNet
  • OBIEE 12C 功能测试:运行总和的运用
  • 概率论 期末 笔记
  • JVM系列(十三) -常用调优工具介绍
  • 14-C语言多文件编程
  • CES Asia(赛逸展)有哪些科技创新产品?
  • 静态路由配置包括DHCP
  • 华为ensp--BGP路由反射器
  • 数据库操作【JDBC HIbernate Mybatis】