Installing Hive on a Mac M1
I. Download and extract Hive
1. Official download site
https://dlcdn.apache.org/hive/
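2. Pick the version you need and download it; this guide uses 3.1.3 as the example.
3. Once the download finishes, extract the archive, rename the folder to hive-3.1.3, and move it into the /Library directory.
II. Configure the system environment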
1. Open the ~/.bash_profile file (use ~/.zshrc instead if your shell is zsh, the default on recent macOS)
open -e ~/.bash_profile
2. Add the Hadoop and Hive environment variables
export HADOOP_HOME=/Library/hadoop-3.4.0
export PATH=$PATH:$HADOOP_HOME/bin
export HIVE_HOME=/Library/hive-3.1.3
export PATH=$HIVE_HOME/bin:$PATH
3. Apply the configuration
source ~/.bash_profile
4. Stop Hadoop and restart it
If Hadoop is currently running, stop it first.
# Go to the Hadoop directory
cd /Library/hadoop-3.4.0
# Stop the Hadoop services
./sbin/stop-all.sh
# Start Hadoop
./sbin/start-all.sh
5. Check the Hive version
hive --version
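If macOS reports a permission error, see the linked solution.
Sample output (this capture comes from a Hive 4.0.0 build, so expect the version line to read 3.1.3 if you installed 3.1.3):
hive --version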
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Library/hive-3.1.3/lib/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Library/hadoop-3.4.0/libexec/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Library/hive-3.1.3/lib/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Library/hadoop-3.4.0/libexec/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive 4.0.0
Git git://MacBook-Air.local/Users/xxx/projects/hive/fork/hive -r 183f8cb41d3dbed961ffd27999876468ff06690c
Compiled by xxx on Mon Mar 25 12:44:09 CET 2024
From source with checksum e3c64bec52632c61cf7214c8b545b564
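If hive --version fails instead of printing output like the above, the usual cause is a variable that never made it into the shell. The following is a minimal sketch of a check, not part of the original setup: the helper names missing_env_vars and report are hypothetical, and it only inspects the two variables this guide defines.

```python
import os
import shutil


def missing_env_vars(env, names):
    """Return the names from `names` that are unset or empty in `env`."""
    return [name for name in names if not env.get(name)]


def report(env=os.environ):
    """Print which variables are missing and where the CLI tools resolve."""
    for name in missing_env_vars(env, ["HADOOP_HOME", "HIVE_HOME"]):
        print(f"{name} is not set")
    for tool in ("hadoop", "hive"):
        # shutil.which searches the current PATH, like the shell does
        print(f"{tool}: {shutil.which(tool) or 'not found on PATH'}")


report()
```

Run it from the same terminal session in which you sourced the profile file, since a new environment variable is only visible to processes started afterwards.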
III. Edit the Hive configuration files
1. Rename hive-default.xml.template in the conf folder
cd /Library/hive-3.1.3/conf
mv hive-default.xml.template hive-default.xml
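2. Create hive-site.xml
# Either create and edit the file in vim
vim hive-site.xml
# or create it first and then open it in TextEdit
touch hive-site.xml && open -e hive-site.xml
Add the following content to hive-site.xml: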
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false&amp;allowPublicKeyRetrieval=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive</value>
<description>password to use against metastore database</description>
</property>
</configuration>
Save and close the file.
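One detail that is easy to get wrong when typing this file by hand is that an ampersand inside an XML value has to be written as &amp;, which matters for the JDBC URL. As a rough sketch, the file can also be generated so that escaping is handled automatically. The function name build_hive_site is hypothetical, and only three of the four properties are shown to keep the example short.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def build_hive_site(properties):
    """Return hive-site.xml text for a dict of property name -> value."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # ElementTree escapes special characters such as & when serializing
    raw = ET.tostring(root, encoding="unicode")
    return minidom.parseString(raw).toprettyxml(indent="  ")


settings = {
    "javax.jdo.option.ConnectionURL": (
        "jdbc:mysql://localhost:3306/hive"
        "?createDatabaseIfNotExist=true&useSSL=false&allowPublicKeyRetrieval=true"
    ),
    "javax.jdo.option.ConnectionUserName": "hive",
    "javax.jdo.option.ConnectionPassword": "hive",
}

xml_text = build_hive_site(settings)
print(xml_text)
```

The printed text can be redirected into hive-site.xml; parsing it back yields the original URL with plain ampersands.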
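IV. Install and configure MySQL
1. Download the MySQL driver
MySQL driver download page
2. Extract the downloaded archive, locate mysql-connector-j-8.2.0.jar, and copy it into the /Library/hive-3.1.3/lib directory.
3. Make sure MySQL is installed on your machine
Open a terminal and run: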
mysql -u root -p
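If you are sure MySQL is installed but the shell cannot find the mysql command, the MySQL bin directory is missing from your PATH.
# Check where mysql is installed
which mysql
If a path is printed, mysql is already on your PATH. Otherwise, add its bin directory:
# Open the profile file
open -e ~/.bash_profile
# Append this line to the file
export PATH=${PATH}:/usr/local/mysql/bin/
Save the file, then apply the configuration.
source ~/.bash_profile
4. Log in to MySQL again
mysql -u root -p
Press Enter, then type the password.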
V. Create the hive database
1. Create the hive database and user
create database hive;
CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
GRANT ALL ON *.* TO 'hive'@'localhost';
# Reload the MySQL grant tables
flush privileges;
2. Initialize the metastore schema with Hive's bundled schematool
cd /Library/hive-3.1.3
./bin/schematool -initSchema -dbType mysql
After a stretch of blank output, the message Initialization script completed appears.
VI. Edit the Hadoop configuration and restart
1. Edit Hadoop's core-site.xml configuration file
# core-site.xml normally sits under etc/hadoop; adjust the path if your layout differs
cd /Library/hadoop-3.4.0/etc/hadoop
open -e core-site.xml
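Add the following properties inside the <configuration> element, replacing USERNAME with your macOS username:
<property>
<name>hadoop.proxyuser.USERNAME.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.USERNAME.groups</name>
<value>*</value>
</property>
Save and exit.
You can look up your username with:
whoami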
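Because the property names embed the username, it is easy to mistype them. A small sketch that builds the two entries for the current user is shown below. The helper names proxyuser_properties and to_xml are hypothetical, and the code assumes the username contains no characters that need XML escaping.

```python
import getpass


def proxyuser_properties(user):
    """Return the two proxy-user settings for `user` as a dict."""
    return {
        f"hadoop.proxyuser.{user}.hosts": "*",
        f"hadoop.proxyuser.{user}.groups": "*",
    }


def to_xml(properties):
    """Render the settings as <property> blocks ready to paste."""
    blocks = []
    for name, value in properties.items():
        blocks.append(
            f"<property>\n<name>{name}</name>\n<value>{value}</value>\n</property>"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # getpass.getuser() reports the login name, like whoami does
    print(to_xml(proxyuser_properties(getpass.getuser())))
```

Paste the printed blocks inside the configuration element of core-site.xml.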
2. Restart the Hadoop cluster
cd /Library/hadoop-3.4.0/sbin
./stop-all.sh
./start-all.sh
VII. Start Hive and connect
1. Start Hive
cd /Library/hive-3.1.3/bin
hive --service hiveserver2&
Output like the following indicates that startup succeeded:
SLF4J: Found binding in [jar:file:/Library/hive-3.1.3/lib/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Library/hadoop-3.4.0/libexec/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 7afcdcf2-3f5d-4912-99b9-4e57a7ef3a03
Hive Session ID = 51a21ecb-ae06-4ed3-a6b7-a62444b5836e
2. Browser access
Open the following address in a browser:
http://localhost:10002
The HiveServer2 web UI should then appear.
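HiveServer2 can take a while to finish starting, so the page may not load on the first attempt. The following sketch checks whether the two ports used in this guide are accepting connections: 10002 for the web UI and 10000 for the JDBC URL. The function name port_is_open is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


for name, port in (("HiveServer2 JDBC", 10000), ("HiveServer2 web UI", 10002)):
    state = "open" if port_is_open("localhost", port) else "closed"
    print(f"{name} port {port}: {state}")
```

If both ports still report closed after a minute or so, check the terminal in which hiveserver2 was started for errors.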
3. Client access
Open a new terminal in the bin directory and run:
beeline
!connect jdbc:hive2://localhost:10000
Log in with username hive and password hive.
After a successful login the prompt looks like this:
0: jdbc:hive2://localhost:10000>
show databases;
If the command above returns results, the setup is complete.
To shut down HiveServer2, first look up the Hive process:
ps aux | grep hive
The process entry looks roughly like this:
52843 0.1 2.9 413527440 489776 s003 SN 8:58下午 0:17.23 /Library/Java/JavaVirtualMachines/jdk-1.8.jdk/Contents/Home/bin/java -Dproc_jar -Dproc_hiveserver2 -Dlog4j2.formatMsgNoLookups=true -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/Library/hive-3.1.3/conf/parquet-logging.properties -Djline.terminal=jline.Unsuppo
4. Kill the process by its PID (52843 in this example)
kill -9 52843
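Picking the PID out of the ps listing by eye can be error-prone, since grep hive also matches the grep process itself. A minimal sketch of extracting it programmatically follows. It assumes the standard ps aux column order (USER, PID, ...) and matches on the -Dproc_hiveserver2 flag visible in the listing above. The function name hiveserver2_pids and the sample text are illustrative only.

```python
def hiveserver2_pids(ps_output):
    """Return PIDs of lines in `ps aux` output that belong to HiveServer2."""
    pids = []
    for line in ps_output.splitlines():
        if "-Dproc_hiveserver2" not in line:
            continue
        fields = line.split()
        # In `ps aux` output the columns are USER, PID, %CPU, ...
        if len(fields) > 1 and fields[1].isdigit():
            pids.append(int(fields[1]))
    return pids


# Illustrative sample, not real output from this machine
sample = (
    "USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND\n"
    "me 52843 0.1 2.9 413527440 489776 s003 SN 8:58PM 0:17.23 java -Dproc_jar -Dproc_hiveserver2\n"
    "me 52901 0.0 0.0 408626880 1312 s003 S+ 9:10PM 0:00.00 grep hive\n"
)
print(hiveserver2_pids(sample))
```

With real output, feed the function the text returned by subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout.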
VIII. Connect to Hive from DBeaver
Host: localhost
Authentication: Database Native
Username: hive
Password: hive
IX. Connect to Hive from Python and write to the database
The code is as follows:
from pyhive import hive


def ConnectHive(addr, port, user, pwd, db, auth):
    conn = None  # stays None if the connection attempt fails
    try:
        # 1. Create connection
        conn = hive.Connection(host=addr, port=port, username=user, password=pwd, database=db, auth=auth)
        print("Hive connection successful!")
        # 2. Use Hive SQL to create table
        create_tab_sql = """CREATE TABLE users (id INT, name STRING, age INT, address STRING)"""
        # 3. Execute SQL with cursor
        with conn.cursor() as cursor:
            cursor.execute(create_tab_sql)
        print("User table created successfully!")
    except Exception as e:
        print(f"An error occurred: {e}")
    finally:
        # 4. Close connection (only if one was actually opened)
        if conn:
            conn.close()
            print("Hive connection closed.")


if __name__ == '__main__':
    addr = '127.0.0.1'
    port = 10000
    user = 'hive'
    pwd = 'hive'
    db = 'default'
    auth = 'LDAP'
    ConnectHive(addr, port, user, pwd, db, auth)
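The script above only creates the users table. As a hedged sketch of the insertion step, the helper below writes rows through any DB-API style cursor. The name insert_users and the sample row are hypothetical, and it assumes the cursor fills %s placeholders from a parameter tuple, which is how PyHive's pyformat parameter style is expected to behave.

```python
def insert_users(cursor, rows):
    """Insert (id, name, age, address) tuples into the users table.

    `cursor` is any object with a DB-API style execute(sql, params) method.
    Returns the number of rows submitted.
    """
    sql = "INSERT INTO users VALUES (%s, %s, %s, %s)"
    for row in rows:
        cursor.execute(sql, row)
    return len(rows)


# Usage with a live connection (requires PyHive and a running HiveServer2):
#   from pyhive import hive
#   conn = hive.Connection(host="127.0.0.1", port=10000, username="hive",
#                          password="hive", database="default", auth="LDAP")
#   with conn.cursor() as cursor:
#       insert_users(cursor, [(1, "Alice", 30, "Beijing")])
#   conn.close()
```

Each INSERT runs as its own Hive job, so this approach only suits a handful of rows; for bulk data, loading a file is the more usual route.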
Running this code produced the following error:
org.apache.hadoop.security.AccessControlException Permission denied: user=hive, access=WRITE, inode="/"
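This means the hive user has no write permission on the HDFS root directory (/).
2. Solution
# List the contents of the HDFS root directory and check the permissions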
hdfs dfs -ls /
hdfs dfs -chmod 755 /
hdfs dfs -chown hive:supergroup /
Then run the following command again to confirm the new owner and permissions:
hdfs dfs -ls /
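The listing shows permissions as a string such as drwxr-xr-x, while chmod takes an octal number. As a small illustration of how the two relate, the sketch below converts an octal mode into the string form using Python's standard stat module. The function name describe_mode is hypothetical.

```python
import stat


def describe_mode(octal_mode, is_dir=True):
    """Convert an octal mode string such as "755" to ls-style notation."""
    mode = int(octal_mode, 8)
    # Add the file-type bits so the leading character is rendered too
    mode |= stat.S_IFDIR if is_dir else stat.S_IFREG
    return stat.filemode(mode)


print(describe_mode("755"))  # drwxr-xr-x
```

With mode 755 only the owner has write access, which is why the chown to hive is what actually lets the hive user write to /.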