[AI] Installing Docker and the NVIDIA Container Toolkit on openEuler 22.03 LTS SP4

NVIDIA Container Toolkit

Open the following page:

Unsupported distribution or misconfigured repository settings | NVIDIA Container Toolkit

To make offline installation easier, download the packages first:

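# Save the repo file under /etc/yum.repos.d/ so that yumdownloader can resolve the package
wget -P /etc/yum.repos.d/ https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo
mkdir rpms
yumdownloader --resolve --destdir=./rpms/ nvidia-container-toolkit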
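
Before copying the rpms directory to the offline machine, it can help to confirm that all four packages named in the install transcript in this article were downloaded. A minimal sketch, assuming the files follow the usual name-version-release.arch.rpm pattern; check_rpms is a hypothetical helper, not part of yum or the toolkit:

```shell
#!/usr/bin/env bash
# check_rpms DIR: report which of the expected packages have no RPM in DIR.
# The package list is taken from the yum transaction shown in this article.
check_rpms() {
    local dir="$1" pkg missing=0
    for pkg in libnvidia-container-tools libnvidia-container1 \
               nvidia-container-toolkit nvidia-container-toolkit-base; do
        # Require a digit right after "<pkg>-" so that nvidia-container-toolkit
        # does not accidentally match nvidia-container-toolkit-base.
        if ! ls "$dir"/"$pkg"-[0-9]*.rpm >/dev/null 2>&1; then
            echo "missing: $pkg"
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_rpms ./rpms prints one line per missing package and returns a non-zero status if anything is absent.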

Offline installation

# yum install ./*.rpm
Last metadata expiration check: 0:12:41 ago on Fri 21 Feb 2025 05:15:45 PM CST.
Dependencies resolved.
=================================================================================================================================================================
 Package                                              Architecture                  Version                            Repository                           Size
=================================================================================================================================================================
Installing:
 libnvidia-container-tools                            x86_64                        1.17.4-1                           @commandline                         40 k
 libnvidia-container1                                 x86_64                        1.17.4-1                           @commandline                        1.0 M
 nvidia-container-toolkit                             x86_64                        1.17.4-1                           @commandline                        1.2 M
 nvidia-container-toolkit-base                        x86_64                        1.17.4-1                           @commandline                        5.6 M

Transaction Summary
=================================================================================================================================================================
Install  4 Packages

Total size: 7.9 M
Installed size: 26 M
Is this ok [y/N]: y
Downloading Packages:
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                                                                                         1/1
  Installing       : nvidia-container-toolkit-base-1.17.4-1.x86_64                                                                                           1/4
  Installing       : libnvidia-container1-1.17.4-1.x86_64                                                                                                    2/4
  Running scriptlet: libnvidia-container1-1.17.4-1.x86_64                                                                                                    2/4
  Installing       : libnvidia-container-tools-1.17.4-1.x86_64                                                                                               3/4
  Installing       : nvidia-container-toolkit-1.17.4-1.x86_64                                                                                                4/4
  Running scriptlet: nvidia-container-toolkit-1.17.4-1.x86_64                                                                                                4/4
  Verifying        : libnvidia-container1-1.17.4-1.x86_64                                                                                                    1/4
  Verifying        : libnvidia-container-tools-1.17.4-1.x86_64                                                                                               2/4
  Verifying        : nvidia-container-toolkit-1.17.4-1.x86_64                                                                                                3/4
  Verifying        : nvidia-container-toolkit-base-1.17.4-1.x86_64                                                                                           4/4

Installed:
  libnvidia-container-tools-1.17.4-1.x86_64                 libnvidia-container1-1.17.4-1.x86_64             nvidia-container-toolkit-1.17.4-1.x86_64
  nvidia-container-toolkit-base-1.17.4-1.x86_64

Complete!

Docker

Manually download the latest version (28.0.0 at the time of writing):

https://download.docker.com/linux/static/stable/x86_64/docker-28.0.0.tgz

wget https://download.docker.com/linux/static/stable/x86_64/docker-28.0.0.tgz
[root@localhost media]# tar -xvf docker-28.0.0.tgz
docker/
docker/containerd-shim-runc-v2
docker/containerd
docker/docker
docker/runc
docker/ctr
docker/dockerd
docker/docker-init
docker/docker-proxy
[root@localhost media]# mv -v docker/* /usr/local/bin/
renamed 'docker/containerd' -> '/usr/local/bin/containerd'
renamed 'docker/containerd-shim-runc-v2' -> '/usr/local/bin/containerd-shim-runc-v2'
renamed 'docker/ctr' -> '/usr/local/bin/ctr'
renamed 'docker/docker' -> '/usr/local/bin/docker'
renamed 'docker/dockerd' -> '/usr/local/bin/dockerd'
renamed 'docker/docker-init' -> '/usr/local/bin/docker-init'
renamed 'docker/docker-proxy' -> '/usr/local/bin/docker-proxy'
renamed 'docker/runc' -> '/usr/local/bin/runc'
[root@localhost media]# ll docker
total 0
[root@localhost media]# ll /usr/local/bin/
total 206856
-rwxr-xr-x. 1 1000 1000 40415384 Feb 20 06:11 containerd
-rwxr-xr-x. 1 1000 1000 13299864 Feb 20 06:11 containerd-shim-runc-v2
-rwxr-xr-x. 1 1000 1000 20394136 Feb 20 06:11 ctr
-rwxr-xr-x. 1 1000 1000 41532216 Feb 20 06:11 docker
-rwxr-xr-x. 1 1000 1000 76647872 Feb 20 06:11 dockerd
-rwxr-xr-x. 1 1000 1000   708448 Feb 20 06:11 docker-init
-rwxr-xr-x. 1 1000 1000  2377328 Feb 20 06:11 docker-proxy
-rwxr-xr-x. 1 1000 1000 16426200 Feb 20 06:11 runc
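
The listing above can also be checked mechanically. A minimal sketch, assuming the eight binary names from the tar output; missing_binaries is a hypothetical helper written for illustration:

```shell
#!/usr/bin/env bash
# missing_binaries DIR: print each Docker static binary that is absent
# or not executable in DIR. The names come from the tar listing above.
missing_binaries() {
    local dir="$1" bin
    for bin in containerd containerd-shim-runc-v2 ctr docker dockerd \
               docker-init docker-proxy runc; do
        [ -x "$dir/$bin" ] || echo "$bin"
    done
}
```

Running missing_binaries /usr/local/bin prints nothing when all eight binaries are in place.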

Create /usr/lib/systemd/system/docker.service:

[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network
Environment=GOTRACEBACK=crash

ExecStart=/usr/local/bin/dockerd $OPTIONS \
                           $DOCKER_STORAGE_OPTIONS \
                           $DOCKER_NETWORK_OPTIONS \
                           $INSECURE_REGISTRY
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process

[Install]
WantedBy=multi-user.target
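
Because the binaries were moved to /usr/local/bin, the ExecStart path in the unit has to point there. A small sketch for checking this, assuming the unit has a single ExecStart= line as above; unit_exec_path is a hypothetical helper:

```shell
#!/usr/bin/env bash
# unit_exec_path FILE: print the program path from the first ExecStart= line,
# skipping any systemd prefix characters (-, @, :, +, !) before the path.
unit_exec_path() {
    sed -n 's/^ExecStart=[-@:+!]*\([^[:space:]]*\).*/\1/p' "$1" | head -n 1
}
```

For example, [ -x "$(unit_exec_path /usr/lib/systemd/system/docker.service)" ] succeeds only if that program exists and is executable. After creating or editing the unit file, run systemctl daemon-reload so that systemd picks up the change.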

Configure the runtime with nvidia-ctk

[root@localhost media]# nvidia-ctk runtime configure --runtime=docker
INFO[0000] Config file does not exist; using empty config
INFO[0000] Wrote updated config to /etc/docker/daemon.json
INFO[0000] It is recommended that docker daemon be restarted.
[root@localhost media]# cat /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}
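
The result can be checked in a script as well. A minimal sketch that looks for the two keys shown above with grep; it is a plain-text check rather than a real JSON parse, and has_nvidia_runtime is a hypothetical helper:

```shell
#!/usr/bin/env bash
# has_nvidia_runtime FILE: succeed if FILE contains an "nvidia" key and a
# "path" entry set to nvidia-container-runtime. Text matching only; the file
# is not validated as JSON.
has_nvidia_runtime() {
    local file="$1"
    grep -q '"nvidia"[[:space:]]*:' "$file" &&
        grep -q '"path"[[:space:]]*:[[:space:]]*"nvidia-container-runtime"' "$file"
}
```

For example, has_nvidia_runtime /etc/docker/daemon.json && echo "nvidia runtime registered".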

Start the Docker service

[root@localhost media]# systemctl enable docker --now
Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /usr/lib/systemd/system/docker.service.
[root@localhost ~]# docker info
Client:
 Version:    28.0.0
 Context:    default
 Debug Mode: false

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 28.0.0
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: nvidia runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: bcc810d6b9066471b0b6fa75f557a15a1cbf31bb
 runc version: v1.2.5-0-g59923ef
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 5.10.0-216.0.0.115.oe2203sp4.x86_64
 Operating System: openEuler 22.03 (LTS-SP4)
 OSType: linux
 Architecture: x86_64
 CPUs: 128
 Total Memory: 30.46GiB
 Name: localhost.localdomain
 ID: e146eb60-c3e3-41d9-bf61-71e7cd5707f9
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  ::1/128
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

Verify nvidia-smi in Docker

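Pick any image and run nvidia-smi with the --gpus=all option. Without the --gpus option, the nvidia-smi command is not injected into the container, so the first run below fails: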

[root@localhost ollama]# docker run --rm -it ubuntu:22.04 nvidia-smi -l 1
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: exec: "nvidia-smi": executable file not found in $PATH: unknown

Run 'docker run --help' for more information
[root@localhost ollama]# docker run --rm -it --gpus=all ubuntu:22.04 nvidia-smi -l 1
Fri Feb 21 10:08:47 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.10              Driver Version: 570.86.10      CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090        Off |   00000000:0C:00.0 Off |                  Off |
| 30%   27C    P8             18W /  450W |    8173MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 4090        Off |   00000000:25:00.0 Off |                  Off |
| 30%   28C    P8             28W /  450W |    7821MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA GeForce RTX 4090        Off |   00000000:32:00.0 Off |                  Off |
| 30%   27C    P8              5W /  450W |    7821MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA GeForce RTX 4090        Off |   00000000:45:00.0 Off |                  Off |
| 30%   27C    P8             30W /  450W |    7821MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  NVIDIA GeForce RTX 4090        Off |   00000000:58:00.0 Off |                  Off |
| 30%   28C    P8             18W /  450W |    7327MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   5  NVIDIA GeForce RTX 4090        Off |   00000000:84:00.0 Off |                  Off |
| 30%   28C    P8             21W /  450W |    7327MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   6  NVIDIA GeForce RTX 4090        Off |   00000000:D4:00.0 Off |                  Off |
| 30%   28C    P8             22W /  450W |    8009MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
Fri Feb 21 10:08:49 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.10              Driver Version: 570.86.10      CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
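
For a scripted check, the list form nvidia-smi -L is easier to parse than the table above. A minimal sketch, assuming each GPU is reported on a line of the form "GPU <index>: <name> (UUID: ...)"; count_gpus is a hypothetical helper:

```shell
#!/usr/bin/env bash
# count_gpus: read `nvidia-smi -L` output on stdin and print how many lines
# look like "GPU <index>: ...". Prints 0 (and still succeeds) if none match.
count_gpus() {
    grep -c '^GPU [0-9][0-9]*:' || true
}
```

For example, docker run --rm --gpus=all ubuntu:22.04 nvidia-smi -L | count_gpus should report the seven cards seen in the transcript above on this host. The -it flags are left out because the output is piped.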

References

Installing the NVIDIA Container Toolkit | NVIDIA Container Toolkit

