
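Problems encountered when integrating Iceberg with Hive

Background

I wanted to do real-time streaming development on top of the Atguigu (尚硅谷) data warehouse environment, where the Hadoop components and Hive were already installed, so I reused that environment when integrating Iceberg. Creating an Iceberg table succeeded, but inserting data into it failed while the MR job was running.

The error

In the earlier data warehouse work I always configured the Hive environment variables and entered the client directly with the hive command to run inserts, deletes, updates, and queries, and never noticed any problem.

I took the same approach after integrating Iceberg, and enabled Iceberg support in the configuration file, but the MR job failed.

Added configuration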

<property>
    <name>iceberg.engine.hive.enabled</name>
    <value>true</value>
</property>

Error message
Running yarn logs -applicationId with the job's application ID showed the specific error: a class was missing.

2024-12-30 21:21:24,687 ERROR [CommitterEvent Processor #1] org.apache.hadoop.hive.metastore.RetryingHMSHandler: java.lang.NoClassDefFoundError: org/datanucleus/NucleusContext
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.hadoop.hive.metastore.utils.JavaUtils.getClass(JavaUtils.java:52)
	at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:65)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStoreForConf(HiveMetaStore.java:718)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:696)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:690)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:767)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:538)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:80)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:93)
	at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:8667)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:169)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:137)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.iceberg.common.DynConstructors$Ctor.newInstanceChecked(DynConstructors.java:60)
	at org.apache.iceberg.common.DynConstructors$Ctor.newInstance(DynConstructors.java:73)
	at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:53)
	at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:32)
	at org.apache.iceberg.ClientPoolImpl.get(ClientPoolImpl.java:118)
	at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:49)
	at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:76)
	at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:181)
	at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:94)
	at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:77)
	at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:93)
	at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:115)
	at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:105)
	at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitTable(HiveIcebergOutputCommitter.java:280)
	at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.lambda$commitJob$2(HiveIcebergOutputCommitter.java:193)
	at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:405)
	at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:214)
	at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:198)
	at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:190)
	at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitJob(HiveIcebergOutputCommitter.java:188)
	at org.apache.hadoop.mapred.OutputCommitter.commitJob(OutputCommitter.java:291)
	at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:286)
	at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:238)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.datanucleus.NucleusContext
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 50 more

Further down, the log reports that the job could not connect to the Hive Metastore.

2024-12-30 21:21:25,128 INFO [Thread-71] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: Setting job diagnostics to Job commit failed: org.apache.iceberg.hive.RuntimeMetaException: Failed to connect to Hive Metastore
	at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:62)
	at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:32)
	at org.apache.iceberg.ClientPoolImpl.get(ClientPoolImpl.java:118)
	at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:49)
	at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:76)
	at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:181)
	at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:94)
	at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:77)
	at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:93)
	at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:115)
	at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:105)
	at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitTable(HiveIcebergOutputCommitter.java:280)
	at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.lambda$commitJob$2(HiveIcebergOutputCommitter.java:193)
	at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:405)
	at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:214)
	at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:198)
	at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:190)
	at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitJob(HiveIcebergOutputCommitter.java:188)
	at org.apache.hadoop.mapred.OutputCommitter.commitJob(OutputCommitter.java:291)
	at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:286)
	at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:238)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: MetaException(message:org/datanucleus/NucleusContext)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:84)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:93)
	at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:8667)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:169)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:137)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.iceberg.common.DynConstructors$Ctor.newInstanceChecked(DynConstructors.java:60)
	at org.apache.iceberg.common.DynConstructors$Ctor.newInstance(DynConstructors.java:73)
	at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:53)
	... 23 more
Caused by: java.lang.NoClassDefFoundError: org/datanucleus/NucleusContext
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.hadoop.hive.metastore.utils.JavaUtils.getClass(JavaUtils.java:52)
	at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:65)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStoreForConf(HiveMetaStore.java:718)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:696)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:690)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:767)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:538)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:80)
	... 34 more
Caused by: java.lang.ClassNotFoundException: org.datanucleus.NucleusContext
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 50 more
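
The aggregated log is long, and the useful part is the chain of "Caused by:" lines. As a minimal sketch (the function name root_causes is made up for illustration and is not part of Hadoop or Hive), a small filter can reduce the log to that chain:

```shell
# Hypothetical helper: read an aggregated YARN log on stdin and print
# each distinct "Caused by:" line once, sorted alphabetically.
root_causes() {
  grep -E '^Caused by:' | sort -u
}

# Usage on a cluster (substitute the application ID of the failed job):
#   yarn logs -applicationId <application ID> | root_causes
```

For the log above this leaves only the MetaException, NoClassDefFoundError, and ClassNotFoundException lines, all of which name org.datanucleus.NucleusContext.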

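Solution

At first I assumed a version incompatibility and upgraded Hadoop from 3.1.3 to 3.1.4, which did not help. I then tried different versions of iceberg-runtime.jar, which did not help either.

From material I found online, others running this setup had to add the Hive Metastore connection setting and start the Hive Metastore service. This fits the stack trace above, in which HiveMetaStoreClient goes through HiveMetaStore.newRetryingHMSHandler: the job tries to start an embedded metastore in its own process and fails because the DataNucleus classes are not on its classpath. The steps are:
1. Add the Metastore connection setting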

<property>
    <name>hive.metastore.uris</name>
    <value>thrift://hadoop101:9083</value>
</property>

2. Start the Hive Metastore service in the background first

hive --service metastore &
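
Before rerunning the insert, it can help to confirm that the property from step 1 is really present in the file Hive reads. The following is only a sketch: the function name get_hive_property is made up for illustration, and it assumes the <name> line comes before the <value> line inside each property, as in the snippet above.

```shell
# Hypothetical helper: print the value of one property from a
# Hadoop-style XML configuration file.
#   $1 = path to the XML file, $2 = property name
# Assumption: <name> and <value> are on separate lines, name first.
get_hive_property() {
  awk -v name="$2" '
    # Remember that the wanted <name> line has been seen.
    index($0, "<name>" name "</name>") { found = 1; next }
    # Print the text between <value> and </value> on the next value line.
    found && match($0, /<value>.*<\/value>/) {
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}

# Usage (assumes the usual location $HIVE_HOME/conf/hive-site.xml):
#   get_hive_property "$HIVE_HOME/conf/hive-site.xml" hive.metastore.uris
```

An empty result means the property is missing from that file, in which case the client would still fall back to the embedded metastore.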

With that, the problem was solved.

