
HBase Source Code Reading (Part 1)

article 2025/2/6 1:07:49

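1. The HMaster main method

In the previous post, "Debugging HBase 2.2.2 locally in IDEA on a macOS M1", we started the HMaster process by running HMaster's main function with "start" as its argument.

Here we look more closely at how HMaster actually runs.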

  public static void main(String [] args) {
    LOG.info("STARTING service " + HMaster.class.getSimpleName());
    VersionInfo.logVersion();
    new HMasterCommandLine(HMaster.class).doMain(args);

  }

This constructs a new HMasterCommandLine object and calls its doMain method.

public class HMasterCommandLine extends ServerCommandLine

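HMasterCommandLine extends the ServerCommandLine class, which handles command-line arguments and starts the corresponding service.

doMain is defined in ServerCommandLine; the doMain(args) call in HMaster's main method is dispatched through it: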

public void doMain(String args[]) {
  try {
    int ret = ToolRunner.run(HBaseConfiguration.create(), this, args);
    if (ret != 0) {
      System.exit(ret);
    }
  } catch (Exception e) {
    LOG.error("Failed to run", e);
    System.exit(-1);
  }
}

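ToolRunner.run then calls the run method of HMasterCommandLine (HMasterCommandLine inherits doMain from this class, so `this` here refers to the HMasterCommandLine instance).

The HBase configuration is also loaded at this point: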

HBaseConfiguration.create()

So the following line creates an HMasterCommandLine instance and, through doMain, ends up calling its run method:

 new HMasterCommandLine(HMaster.class).doMain(args);

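A common pattern is to invoke, via reflection, the constructor of the class that was passed in (here HMaster.class), and then call the corresponding run() or start() method once construction has finished. Because HMasterCommandLine carries the HMaster class object, doMain eventually creates an HMaster instance, which means the HMaster constructor is called.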
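The reflection pattern described above can be sketched in plain Java. This is only an illustration under stated assumptions: `ReflectiveStartSketch`, `DemoServer` and `construct` are hypothetical names that do not exist in HBase, and the real code path goes through HBase's own helpers rather than this method.

```java
import java.lang.reflect.Constructor;

class ReflectiveStartSketch {
    // Hypothetical stand-in for a server class such as HMaster.
    static class DemoServer {
        final String name;

        public DemoServer(String name) {
            this.name = name;
        }
    }

    // The caller only holds a Class object; the instance is created later,
    // by looking up the (String) constructor and invoking it reflectively.
    static DemoServer construct(Class<? extends DemoServer> serverClass, String name) {
        try {
            Constructor<? extends DemoServer> ctor = serverClass.getConstructor(String.class);
            return ctor.newInstance(name);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Failed to construct " + serverClass.getName(), e);
        }
    }

    public static void main(String[] args) {
        DemoServer server = construct(DemoServer.class, "master-0");
        System.out.println("constructed " + server.name);
    }
}
```

Holding the class rather than an instance lets the command-line wrapper decide at run time whether and how many servers to build.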

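2. The run method of HMasterCommandLine

It loads the Options.

It determines how many master processes and how many RegionServer processes need to be created.

Finally it inspects the command the user passed on the command line; for "start" it calls HMasterCommandLine's startMaster method.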

    if ("start".equals(command)) {
      System.out.println("calling the start function");
      return startMaster();
    } else if ("stop".equals(command)) {
      return stopMaster();
    } else if ("clear".equals(command)) {
      return (ZNodeClearer.clear(getConf()) ? 0 : 1);
    } else {
      usage("Invalid command: " + command);
      return 1;
    }

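3. The startMaster method of HMasterCommandLine

It reads the configuration.

In our setup the master and the region server are started together inside a single JVM; otherwise only a Master would be created in this JVM.

Because this is a mini cluster, a MiniZooKeeperCluster object is created as well.

The zk cluster also has to be shut down when the master shuts down,
so a subclass is run that shuts down the zk cluster on its way out.

// This line is what ends up calling the HMaster constructor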
        LocalHBaseCluster cluster = new LocalHBaseCluster(conf, mastersCount, regionServersCount,
          LocalHMaster.class, HRegionServer.class);


        ((LocalHMaster)cluster.getMaster(0)).setZKCluster(zooKeeperCluster);
        cluster.startup();
        waitOnMasterThreads(cluster);

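This creates the local HBase cluster, passing the number of masters, the number of RegionServers, and the LocalHMaster class (which extends HMaster). The Mini ZooKeeper object is set afterwards, so that shutting down the cluster also shuts down zooKeeperCluster.

In its constructor, LocalHBaseCluster uses the LocalHMaster.class and HRegionServer.class it was given to create the corresponding HMaster and HRegionServer instances (that is, it calls their constructors).

Finally it executes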

cluster.startup();

and JVMClusterUtil starts each of the threads (that is, HMaster's run method):

  public void startup() throws IOException {
    JVMClusterUtil.startup(this.masterThreads, this.regionThreads);
  }
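The split between construction and startup described above can be sketched without any HBase classes. This is a minimal sketch, assuming a simplified cluster object; `LocalClusterSketch`, `DemoServer` and `awaitAll` are hypothetical names, not the LocalHBaseCluster or JVMClusterUtil API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LocalClusterSketch {
    // Lifecycle events, recorded so the ordering can be inspected.
    final List<String> events = Collections.synchronizedList(new ArrayList<>());
    private final List<Thread> threads = new ArrayList<>();

    // Hypothetical server: the constructor runs immediately,
    // run() only once the owning thread has been started.
    class DemoServer implements Runnable {
        private final String name;

        DemoServer(String name) {
            this.name = name;
            events.add("constructed " + name);
        }

        @Override
        public void run() {
            events.add("running " + name);
        }
    }

    // Build every server object and wrap it in a thread, but start nothing yet.
    LocalClusterSketch(int masters, int regionServers) {
        for (int i = 0; i < masters; i++) {
            threads.add(new Thread(new DemoServer("master-" + i)));
        }
        for (int i = 0; i < regionServers; i++) {
            threads.add(new Thread(new DemoServer("regionserver-" + i)));
        }
    }

    // Only here do the run() methods begin executing.
    void startup() {
        for (Thread t : threads) {
            t.start();
        }
    }

    void awaitAll() {
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) {
        LocalClusterSketch cluster = new LocalClusterSketch(1, 1);
        System.out.println("after construction: " + cluster.events);
        cluster.startup();
        cluster.awaitAll();
        System.out.println("after startup: " + cluster.events);
    }
}
```

Separating the two phases is what allows code such as setZKCluster to configure a fully constructed master before any of its threads run.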

4. HMaster

public class HMaster extends HRegionServer implements MasterServices

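HMaster extends the HRegionServer class and implements the MasterServices interface.

HRegionServer makes a set of HRegions available to clients. It checks in with the HMaster. A single HBase deployment contains many HRegionServers.

MasterServices is a curated subset of the services provided by HMaster, for internal use only. It is passed to Managers, Services and Chores, so that something less than a full HMaster can be passed in at test time.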

5. HMaster Constructor

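The HMaster constructor goes through these steps:

Initialize the local HRegionServer
Start the ActiveMasterManager

After the Master becomes active, the remaining initialization steps take place in the finishActiveMasterInitialization(MonitoredTask) method.

Initializing the local HRegionServer is done through super(conf):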

  public HMaster(final Configuration conf)
      throws IOException, KeeperException {
    // Initialize the local HRegionServer
    super(conf);

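6. The HRegionServer class extends HasThread

The HasThread class

An abstract class that contains a Thread and delegates the common Thread methods to that instance. Its purpose is to work around Sun JVM bug #6915621, in which something internal to the JDK uses Thread.currentThread() as a monitor lock. This can produce deadlocks such as HBASE-4367 and HBASE-4101.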
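The "contain a thread and delegate to it" idea can be shown with a small sketch. It is not the HasThread source: `ThreadHolder`, `Worker` and the particular methods delegated here are assumptions chosen for illustration.

```java
class HasThreadSketch {
    // Holds a Thread instead of extending Thread, and forwards a few common calls to it.
    abstract static class ThreadHolder implements Runnable {
        private final Thread thread;

        protected ThreadHolder(String name) {
            // The held thread executes this object's run() method.
            this.thread = new Thread(this, name);
        }

        public void start() {
            thread.start();
        }

        public void join() {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        public String getName() {
            return thread.getName();
        }

        public boolean isAlive() {
            return thread.isAlive();
        }
    }

    // Hypothetical worker: code that locks on Thread.currentThread() inside the JDK
    // now locks the held Thread object, not this Worker object.
    static class Worker extends ThreadHolder {
        volatile boolean ran;

        Worker(String name) {
            super(name);
        }

        @Override
        public void run() {
            ran = true;
        }
    }

    public static void main(String[] args) {
        Worker worker = new Worker("demo-worker");
        worker.start();
        worker.join();
        System.out.println(worker.getName() + " ran: " + worker.ran);
    }
}
```

Because the server object and the thread object are distinct, synchronizing on the server cannot collide with a monitor the JDK takes on the thread.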

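7. A class that shows up frequently here: TraceUtil

This wrapper class provides functions for accessing htrace 4+ functionality in a simplified way.

For details, see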

https://www.cnblogs.com/itsoku123/p/11377738.html

An HBaseHTraceConfiguration object is initialized.

8. Back to the HRegionServer constructor

It initializes a number of HRegionServer parameters.

It creates the RPC services:

      rpcServices = createRpcServices();

It calls

getUseThisHostnameInstead(conf)

HMaster should override this method to load the master-specific configuration.

It builds the ServerName.

It builds

rpcControllerFactory
rpcRetryingCallerFactory

If security is in use (secure ZooKeeper, secure Hadoop and so on), the server logs in here:

      // login the zookeeper client principal (if using security)
      ZKUtil.loginClient(this.conf, HConstants.ZK_CLIENT_KEYTAB_FILE,
          HConstants.ZK_CLIENT_KERBEROS_PRINCIPAL, hostName);
      // login the server principal (if using secure Hadoop)
      login(userProvider, hostName);
      // init superusers and add the server principal (if using security)
      // or process owner as default super user.
      Superusers.initialize(conf);
      regionServerAccounting = new RegionServerAccounting(conf);

It checks whether the Master carries tables; a Master that does not carry tables has no need to instantiate the block cache and the MOB file cache.

initializeFileSystem()

// If the system is running on Windows, apply Windows-specific setup
setupWindows(getConfiguration(), getConfigurationManager());

The part that follows deals with starting up ZooKeeper.

9. The HRegionServer initializeFileSystem() function

The constructor uses this function to set up HBase's underlying file storage layer.

private void initializeFileSystem() throws IOException {
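// Get the fs instance used by this RS (HRegionServer). Do we use checksum verification in
// hbase? If hbase checksum verification is enabled, hdfs checksum verification is switched
// off automatically.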

boolean useHBaseChecksum = conf.getBoolean(HConstants.HBASE_CHECKSUM_VERIFICATION, true);

FSUtils.setFsDefault(this.conf, FSUtils.getWALRootDir(this.conf));
this.walFs = new HFileSystem(this.conf, useHBaseChecksum);
this.walRootDir = FSUtils.getWALRootDir(this.conf);

// fs.defaultFS here is configured through hbase-site.xml.
// Set 'fs.defaultFS' to match the filesystem on hbase.rootdir, otherwise the
// underlying hadoop hdfs accessors will go against the wrong filesystem
// (unless everything is left at the defaults).
FSUtils.setFsDefault(this.conf, FSUtils.getRootDir(this.conf));
this.fs = new HFileSystem(this.conf, useHBaseChecksum);
this.rootDir = FSUtils.getRootDir(this.conf);
this.tableDescriptors = getFsTableDescriptors();
}

This obtains the HBase table descriptors stored on the underlying HDFS:

this.tableDescriptors = getFsTableDescriptors();

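10. Back to the HMaster constructor

At this point the first step of HMaster is complete, and the next step is to start the ActiveMasterManager.

The HMaster constructor contains no entry point to finishActiveMasterInitialization, so we go straight to the finishActiveMasterInitialization function in HMaster.java and trace its callers upwards.

finishActiveMasterInitialization is called by the startActiveMasterManager method, and startActiveMasterManager is in turn called from the run() method.

In Java, run() is not executed automatically after the constructor. A run() method normally belongs to the Thread class or to a class that implements the Runnable interface, and it only runs once the thread is started.

For an introduction to Runnable, see

"Java Concurrent Programming (Part 2): The Underlying Principles of Thread and Runnable"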

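11. The run method of HMaster

It checks whether a full HBase cluster should be started (relevant when testing locally).

If so, it starts a thread through the Threads class,

and that thread calls the startActiveMasterManager function.

Finally it executes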

 super.run();

which calls HRegionServer's run method.
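The shape of this run method (start a helper thread, then fall through to the superclass loop) can be sketched as follows. This is a simplified illustration, not the HMaster code: `BaseServer`, `MasterLike` and `awaitHelper` are hypothetical names, and the helper thread here only records an event instead of performing a real election.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class RunChainSketch {
    // Stand-in for the superclass whose run() holds the main server loop.
    static class BaseServer implements Runnable {
        final List<String> log = Collections.synchronizedList(new ArrayList<>());

        @Override
        public void run() {
            log.add("base run loop");
        }
    }

    // Stand-in for the subclass: extra work goes to a daemon thread, then super.run() is called.
    static class MasterLike extends BaseServer {
        private final CountDownLatch helperDone = new CountDownLatch(1);

        @Override
        public void run() {
            Thread helper = new Thread(() -> {
                log.add("becomeActiveMaster");
                helperDone.countDown();
            }, "becomeActiveMaster");
            helper.setDaemon(true);
            helper.start();
            // The caller's thread continues into the superclass loop without waiting.
            super.run();
        }

        // Wait for the helper thread to finish; returns false on timeout or interruption.
        boolean awaitHelper(long timeoutMillis) {
            try {
                return helperDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        MasterLike master = new MasterLike();
        master.run();
        master.awaitHelper(2000);
        System.out.println(master.log);
    }
}
```

The two log entries can appear in either order, which reflects the fact that the helper thread and the superclass loop run concurrently.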

12. The startActiveMasterManager method of HMaster

    // Build the znode path that represents this HMaster as a backup master.
    // ZNodePaths.joinZNode joins the backup-master directory path in ZooKeeper with this
    // HMaster's serverName to form a complete znode path.
    String backupZNode = ZNodePaths.joinZNode(
      zooKeeper.getZNodePaths().backupMasterAddressesZNode, serverName.toString());
    /*
    * Add a ZNode for ourselves in the backup master directory since we
    * may not become the active master. If so, we want the actual active
    * master to know we are backup masters, so that it won't assign
    * regions to us if so configured.
    *
    * If we become the active master later, ActiveMasterManager will delete
    * this node explicitly.  If we crash before then, ZooKeeper will delete
    * this node for us since it is ephemeral.
    */
    LOG.info("Adding backup master ZNode " + backupZNode);
    // Create an ephemeral znode; setMasterAddress writes this HMaster's information into it
    if (!MasterAddressTracker.setMasterAddress(zooKeeper, backupZNode, serverName, infoPort)) {
      LOG.warn("Failed create of " + backupZNode + " by " + serverName);
    }
    // Set the info port
    this.activeMasterManager.setInfoPort(infoPort);
    int timeout = conf.getInt(HConstants.ZK_SESSION_TIMEOUT, HConstants.DEFAULT_ZK_SESSION_TIMEOUT);
    // If we're a backup master, stall until a primary to write this address

    // Check whether we are in backup mode. If so, wait until an active master has
    // registered in ZooKeeper. The wait is implemented by polling until
    // activeMasterManager reports an active master.
    if (conf.getBoolean(HConstants.MASTER_TYPE_BACKUP, HConstants.DEFAULT_MASTER_TYPE_BACKUP)) {
      LOG.debug("HMaster started in backup mode. Stalling until master znode is written.");
      // This will only be a minute or so while the cluster starts up,
      // so don't worry about setting watches on the parent znode
      while (!activeMasterManager.hasActiveMaster()) {
        LOG.debug("Waiting for master address and cluster state znode to be written.");
        Threads.sleep(timeout);
      }
    }
    // Create the monitored task status
    MonitoredTask status = TaskMonitor.get().createStatus("Master startup");
    status.setDescription("Master startup");
    try {
      // Try to become the active master
      if (activeMasterManager.blockUntilBecomingActiveMaster(timeout, status)) {
        finishActiveMasterInitialization(status);
      }
      // What follows is error handling
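To make the two behaviours in this method concrete (a backup master polling until someone is active, and a candidate trying to claim the active role), here is a minimal in-memory sketch. It assumes an atomic reference as a stand-in for the active-master znode; `Znode`, `tryBecomeActive` and `stallUntilActiveMasterExists` are hypothetical names, and the sketch makes a single claim attempt rather than reproducing how ActiveMasterManager blocks and retries.

```java
import java.util.concurrent.atomic.AtomicReference;

class ActiveMasterSketch {
    // In-memory stand-in for the active-master znode: empty until some master claims it.
    static class Znode {
        private final AtomicReference<String> owner = new AtomicReference<>();

        String activeMaster() {
            return owner.get();
        }
    }

    // Atomically claim the active role; only the first caller succeeds.
    static boolean tryBecomeActive(Znode znode, String serverName) {
        return znode.owner.compareAndSet(null, serverName);
    }

    // Backup-mode behaviour: poll until an active master exists or maxPolls is exhausted.
    static boolean stallUntilActiveMasterExists(Znode znode, long pollMillis, int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            if (znode.activeMaster() != null) {
                return true;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return znode.activeMaster() != null;
    }

    public static void main(String[] args) {
        Znode znode = new Znode();
        System.out.println("master-a active: " + tryBecomeActive(znode, "master-a"));
        System.out.println("master-b active: " + tryBecomeActive(znode, "master-b"));
        System.out.println("backup sees active master: "
            + stallUntilActiveMasterExists(znode, 10, 5));
    }
}
```

In the excerpt above the polling interval is the ZooKeeper session timeout, which is why the source comment notes that the wait should only last a minute or so during cluster startup.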

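13. The finishActiveMasterInitialization method of HMaster

This is the description given in the source code for the most important part of bringing up HMaster (295 lines of code):

Finish initialization of HMaster after becoming the primary master. The startup order is a bit complicated but very important; do not change it unless you know what you are doing.

Initialize the file system based components: file system manager, wal manager, table descriptors, etc.
Publish the cluster ID
Here comes the most complicated part: initialize the server manager, the assignment manager and the region server tracker
1. Create the server manager
2. Create the procedure executor and load the procedures, but do not start the workers. They are started later, after scheduling of SCPs has finished, to avoid scheduling duplicate SCPs for the same server
3. Create the assignment manager and start it, load the meta region state, but do not load data from the meta region
4. Start the region server tracker, construct the set of online servers, find the dead servers and schedule SCPs for them. The online servers are constructed by scanning zk; the wal directory is also scanned to find possible live region servers, and the difference between these two sets is the set of dead servers
If this is a new deployment, schedule an InitMetaProcedure to initialize meta
Start the necessary service threads: balancer, catalog janitor, executor services, the procedure executor, etc. Note that the balancer must be created first, because the assignment manager may use it when assigning regions.
Wait for meta to be initialized if necessary, then start the table state manager.
Wait for enough region servers to check in
Let the assignment manager load data from meta and construct the region states
Start everything else, such as the chore services
Note that no special procedure is scheduled any more to bring meta online (except the first time, when meta has not been created yet); SCP is relied on to bring meta online.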
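Step 4 of the complicated part above boils down to a set difference, which a few lines of Java can illustrate. This is a sketch under assumptions: server names are plain strings, the two input sets are supplied by the caller instead of being read from ZooKeeper and the WAL directory, and `findDeadServers` is a hypothetical helper rather than an HBase method.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class DeadServerSketch {
    // Servers that left a WAL directory behind but are not registered in zk
    // are treated as dead and would each get an SCP scheduled.
    static Set<String> findDeadServers(Set<String> onlineFromZk, Set<String> seenInWalDir) {
        Set<String> dead = new TreeSet<>(seenInWalDir);
        dead.removeAll(onlineFromZk);
        return dead;
    }

    public static void main(String[] args) {
        // Illustrative names only.
        Set<String> online = new TreeSet<>(Arrays.asList("rs1", "rs2"));
        Set<String> walDirs = new TreeSet<>(Arrays.asList("rs1", "rs2", "rs3"));
        System.out.println("dead servers: " + findDeadServers(online, walDirs));
    }
}
```

The description's remark about delaying the procedure workers fits this picture: the dead set is computed and its SCPs are scheduled once, before anything starts executing them.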

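13.1 Pre-initializing in-memory structures depending on the Master

Before the file system components are initialized, MemStoreLAB is initialized depending on whether the master carries tables.

This presumably corresponds to the in-memory layer of the LSM-Tree.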

  protected void initializeMemStoreChunkCreator() {
    // Check whether MemStoreLAB is enabled
    if (MemStoreLAB.isEnabled(conf)) {
      // MSLAB is enabled. So initialize MemStoreChunkPool
      // By this time, the MemstoreFlusher is already initialized. We can get the global limits from
      // it.
      // Get the global MemStore size, i.e. the limit on how much HBase keeps in memory;
      // this value is used to compute the size of the chunk pool
      Pair<Long, MemoryType> pair = MemorySizeUtil.getGlobalMemStoreSize(conf);
      long globalMemStoreSize = pair.getFirst();

      // Check whether off-heap memory is used. Off-heap here means direct (non-heap) memory;
      // when enabled, memory is managed differently from regular heap memory
      boolean offheap = this.regionServerAccounting.isOffheap();
      // When off heap memstore in use, take full area for chunk pool.

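      // Determine the pool size percentage.
      // With off-heap memory the pool takes 100%; otherwise the maximum MemStoreLAB pool
      // percentage is read from the configuration (default MemStoreLAB.POOL_MAX_SIZE_DEFAULT)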
      float poolSizePercentage = offheap? 1.0F:
          conf.getFloat(MemStoreLAB.CHUNK_POOL_MAXSIZE_KEY, MemStoreLAB.POOL_MAX_SIZE_DEFAULT);
      float initialCountPercentage = conf.getFloat(MemStoreLAB.CHUNK_POOL_INITIALSIZE_KEY,
          MemStoreLAB.POOL_INITIAL_SIZE_DEFAULT);

      // Get the chunk size
      int chunkSize = conf.getInt(MemStoreLAB.CHUNK_SIZE_KEY, MemStoreLAB.CHUNK_SIZE_DEFAULT);
      // init the chunkCreator
      // chunkSize: size of each memory chunk
      // offheap: whether off-heap memory is used
      // globalMemStoreSize: global MemStore memory limit
      // poolSizePercentage: maximum size of the pool, as a percentage
      // initialCountPercentage: initial size of the pool, as a percentage
      // hMemManager: the manager that handles memory reclamation

      ChunkCreator.initialize(chunkSize, offheap, globalMemStoreSize, poolSizePercentage,
        initialCountPercentage, this.hMemManager);
    }
  }
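The parameters passed to ChunkCreator.initialize suggest some simple arithmetic, sketched below. This assumes the maximum number of pooled chunks is the global MemStore size times the pool percentage divided by the chunk size, and that the initial count is a fraction of that maximum; the actual ChunkCreator may differ in detail. The class and method names are hypothetical and the numbers are illustrative, not HBase defaults.

```java
class ChunkPoolSizingSketch {
    // Assumed relation: how many chunks of chunkSize fit into the pool's share of the MemStore.
    static int maxChunkCount(long globalMemStoreSize, float poolSizePercentage, int chunkSize) {
        return (int) (globalMemStoreSize * poolSizePercentage / chunkSize);
    }

    // Assumed relation: the pool is pre-filled with a fraction of its maximum chunk count.
    static int initialChunkCount(int maxCount, float initialCountPercentage) {
        return (int) (initialCountPercentage * maxCount);
    }

    public static void main(String[] args) {
        long globalMemStoreSize = 1L << 30; // illustrative: 1 GiB global MemStore limit
        int chunkSize = 2 * 1024 * 1024;    // illustrative: 2 MiB per chunk

        int max = maxChunkCount(globalMemStoreSize, 1.0f, chunkSize);
        int initial = initialChunkCount(max, 0.5f);
        System.out.println("max chunks = " + max);         // prints "max chunks = 512"
        System.out.println("initial chunks = " + initial); // prints "initial chunks = 256"
    }
}
```

Under this reading, the off-heap case (pool percentage 1.0) lets the pool cover the whole global MemStore area, matching the "take full area for chunk pool" comment in the excerpt.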

13.2 Initializing the file system components

    
    this.fileSystemManager = new MasterFileSystem(conf);
    this.walManager = new MasterWalManager(this);

    // enable table descriptors cache
    this.tableDescriptors.setCacheOn();

    // warm-up HTDs cache on master initialization
    if (preLoadTableDescriptors) {
      status.setStatus("Pre-loading table descriptors");
      this.tableDescriptors.getAll();
    }

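13.3 Publishing the cluster ID

Publish the cluster ID and set it in the Master too. The superclass RegionServer does this later, but
only after it has checked in with the Master. At least a few tests ask the Master for the clusterId before the Master has called its run method and before the RegionServer has done its reportForDuty.

A precaution: put the old hbck1 lock file in place to fence out old hbase1 installations running
their hbck1 against an hbase2 cluster, which could do damage. To skip this behavior, set
hbase.write.hbck1.lock.file to false.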

    ClusterId clusterId = fileSystemManager.getClusterId();
    status.setStatus("Publishing Cluster ID " + clusterId + " in ZooKeeper");
    ZKClusterId.setClusterId(this.zooKeeper, fileSystemManager.getClusterId());
    this.clusterId = clusterId.toString();

    if (this.conf.getBoolean("hbase.write.hbck1.lock.file", true)) {
      HBaseFsck.checkAndMarkRunningHbck(this.conf,
          HBaseFsck.createLockRetryCounterFactory(this.conf).create());
    }

The rest is fairly long, so I will not go through it item by item; you can study it by clicking through the function names in IDEA.