Installing git and deploying CosyVoice on Ubuntu (2)
The previous part got the environment partly working. This part is mainly about actually making it run.
1. The first error
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$ python webui.py
Traceback (most recent call last):
File "webui.py", line 17, in <module>
import gradio as gr
ModuleNotFoundError: No module named 'gradio'
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$
Fix:
pip install gradio
Actually, I forgot something: the dependencies should be installed from the requirements file.
Ha.
pip install -r requirements.txt
For reference, the repo's requirements.txt at this point looked like this:
--extra-index-url https://download.pytorch.org/whl/cu118
conformer==0.3.2
deepspeed==0.14.2; sys_platform == 'linux'
diffusers==0.27.2
gdown==5.1.0
gradio==4.32.2
grpcio==1.57.0
grpcio-tools==1.57.0
huggingface-hub==0.23.5
hydra-core==1.3.2
HyperPyYAML==1.2.2
inflect==7.3.1
librosa==0.10.2
lightning==2.2.4
matplotlib==3.7.5
modelscope==1.15.0
networkx==3.1
omegaconf==2.3.0
onnx==1.16.0
onnxruntime-gpu==1.16.0; sys_platform == 'linux'
onnxruntime==1.16.0; sys_platform == 'darwin' or sys_platform == 'windows'
openai-whisper==20231117
protobuf==4.25
pydantic==2.7.0
rich==13.7.1
soundfile==0.12.1
tensorboard==2.14.0
torch==2.0.1
torchaudio==2.0.2
uvicorn==0.30.0
wget==3.2
fastapi==0.111.0
fastapi-cli==0.0.4
WeTextProcessing==1.0.3
A long wait.
Too slow, so I used a download manager to grab the big torch wheel instead,
and installed it after downloading:
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$ pip install /home/duyicheng/Downloads/torch-2.0.1+cu118-cp38-cp38-linux_x86_64_1.whl
Looking in indexes: https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
ERROR: torch-2.0.1+cu118-cp38-cp38-linux_x86_64_1.whl is not a supported wheel on this platform.
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$
That failed.
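The culprit is most likely the file name: the download manager saved the wheel as `...linux_x86_64_1.whl`, and pip derives the platform tag from the filename, so `linux_x86_64_1` matches nothing this interpreter supports; renaming the file back to `...linux_x86_64.whl` should make it installable. To see which tags pip actually accepts you can run `pip debug --verbose`, or use a small hedged sketch with the `packaging` library (usually already present in the environment):

```python
# Hedged sketch: list the most-preferred wheel tags this interpreter accepts,
# to compare against the tags embedded in a wheel's filename.
# The downloaded file ends in "linux_x86_64_1.whl", so its platform tag
# ("linux_x86_64_1") cannot match any of these.
from packaging.tags import sys_tags

for tag in list(sys_tags())[:10]:  # only the first few, most specific tags
    print(tag)  # e.g. cp38-cp38-manylinux_... on a Python 3.8 Linux box
```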
sudo apt install nvidia-cuda-toolkit
Another very long wait.
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Jan__6_16:45:21_PST_2023
Cuda compilation tools, release 12.0, V12.0.140
Build cuda_12.0.r12.0/compiler.32267302_0
If you're not sure which wheel to pick, you can simply install online, e.g.:
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118
My CUDA toolkit is 12.0,
so I tried:
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$ pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu120
Looking in indexes: https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple, https://download.pytorch.org/whl/cu120
Collecting torch
Downloading https://mirrors.tuna.tsinghua.edu.cn/pypi/web/packages/a9/71/45aac46b75742e08d2d6f9fc2b612223b5e36115b8b2ed673b23c21b5387/torch-2.4.1-cp38-cp38-manylinux1_x86_64.whl (797.1 MB)
━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━ 393.5/797.1 MB 11.1 MB/s eta 0:00:37
Note that pip actually pulled torch 2.4.1 from the Tsinghua mirror here; PyTorch does not appear to publish a cu120 wheel index, so that extra index contributed nothing. I also modified the requirements list:
--extra-index-url https://download.pytorch.org/whl/cu120
conformer==0.3.2
deepspeed==0.14.2
diffusers==0.27.2
gdown==5.1.0
gradio==4.32.2
grpcio==1.57.0
grpcio-tools==1.57.0
huggingface-hub==0.23.5
hydra-core==1.3.2
HyperPyYAML==1.2.2
inflect==7.3.1
librosa==0.10.2
lightning==2.2.4
matplotlib==3.7.5
modelscope==1.15.0
networkx==3.1
omegaconf==2.3.0
onnx==1.16.0
onnxruntime-gpu==1.16.0
openai-whisper==20231117
protobuf==4.25
pydantic==2.7.0
rich==13.7.1
soundfile==0.12.1
tensorboard==2.14.0
torch==2.0.1
torchaudio==2.0.2
uvicorn==0.30.0
wget==3.2
fastapi==0.111.0
fastapi-cli==0.0.4
WeTextProcessing==1.0.3
There is one warning, a compatibility issue with the CUDA 12.x builds, which seems safe to ignore for now. I'll test first and update it if things don't work:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchvision 0.19.1 requires torch==2.4.1, but you have torch 2.0.1 which is incompatible.
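To confirm exactly which versions ended up installed after this mixed pip run, a quick hedged sketch:

```python
# Hedged sketch: print the versions pip resolved for the three core packages,
# to confirm the torch / torchvision mismatch the resolver is warning about.
import importlib.metadata as md

for pkg in ("torch", "torchvision", "torchaudio"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "not installed")
```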
Test again.
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$ python webui.py
2024-11-06 17:48:34,901 - modelscope - INFO - PyTorch version 2.0.1 Found.
2024-11-06 17:48:34,902 - modelscope - INFO - Loading ast index from /home/duyicheng/.cache/modelscope/ast_indexer
2024-11-06 17:48:34,902 - modelscope - INFO - No valid ast index found from /home/duyicheng/.cache/modelscope/ast_indexer, generating ast index from prebuilt!
2024-11-06 17:48:35,021 - modelscope - INFO - Loading done! Current index file version is 1.15.0, with md5 87f1fcfbd39133baac40066388797472 and a total number of 980 components indexed
transformer is not installed, please install it if you want to use related modules
failed to import ttsfrd, use WeTextProcessing instead
2024-11-06 17:48:44,066 DEBUG Starting new HTTPS connection (1): www.modelscope.cn:443
2024-11-06 17:48:44,260 DEBUG https://www.modelscope.cn:443 "GET /api/v1/models/pretrained_models/CosyVoice-300M/revisions HTTP/1.1" 404 152
2024-11-06 17:48:44,262 - modelscope - ERROR - The request model: pretrained_models/CosyVoice-300M does not exist!
Traceback (most recent call last):
File "webui.py", line 184, in <module>
cosyvoice = CosyVoice(args.model_dir)
File "/home/duyicheng/gitee/CosyVoice/cosyvoice/cli/cosyvoice.py", line 30, in __init__
model_dir = snapshot_download(model_dir)
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/modelscope/hub/snapshot_download.py", line 94, in snapshot_download
revision_detail = _api.get_valid_revision_detail(
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/modelscope/hub/api.py", line 499, in get_valid_revision_detail
all_branches_detail, all_tags_detail = self.get_model_branches_and_tags_details(
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/modelscope/hub/api.py", line 579, in get_model_branches_and_tags_details
handle_http_response(r, logger, cookies, model_id)
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/modelscope/hub/errors.py", line 117, in handle_http_response
raise HTTPError(http_error_msg, response=response)
requests.exceptions.HTTPError: The request model: pretrained_models/CosyVoice-300M does not exist!
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$
The models aren't installed. Write a small .py file to download them (if modelscope is missing, `pip install modelscope` first):
# Download the models via the ModelScope SDK
from modelscope import snapshot_download
snapshot_download('iic/CosyVoice-300M', local_dir='pretrained_models/CosyVoice-300M')
snapshot_download('iic/CosyVoice-300M-25Hz', local_dir='pretrained_models/CosyVoice-300M-25Hz')
snapshot_download('iic/CosyVoice-300M-SFT', local_dir='pretrained_models/CosyVoice-300M-SFT')
snapshot_download('iic/CosyVoice-300M-Instruct', local_dir='pretrained_models/CosyVoice-300M-Instruct')
snapshot_download('iic/CosyVoice-ttsfrd', local_dir='pretrained_models/CosyVoice-ttsfrd')
Run this .py:
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$ python downmodels.py
2024-11-07 07:49:34,382 - modelscope - INFO - PyTorch version 2.0.1 Found.
2024-11-07 07:49:34,383 - modelscope - INFO - Loading ast index from /home/duyicheng/.cache/modelscope/ast_indexer
2024-11-07 07:49:34,453 - modelscope - INFO - Loading done! Current index file version is 1.15.0, with md5 87f1fcfbd39133baac40066388797472 and a total number of 980 components indexed
transformer is not installed, please install it if you want to use related modules
Another long wait.
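Once the downloads finish, a quick hedged check that all five model directories actually landed on disk (the names below just mirror the `local_dir` arguments in the download script):

```python
# Hedged sketch: verify each pretrained_models/* directory exists and is non-empty.
import os

models = [
    "CosyVoice-300M",
    "CosyVoice-300M-25Hz",
    "CosyVoice-300M-SFT",
    "CosyVoice-300M-Instruct",
    "CosyVoice-ttsfrd",
]
for name in models:
    path = os.path.join("pretrained_models", name)
    ok = os.path.isdir(path) and len(os.listdir(path)) > 0
    print(f"{path}: {'ok' if ok else 'MISSING or empty'}")
```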
It said transformer isn't installed? I'll look at that in a moment.
`transformer` is a deep-learning model architecture, first proposed by Vaswani et al. in the 2017 paper "Attention Is All You Need". It has been remarkably successful in natural language processing (NLP) tasks and is now widely used across many fields, including but not limited to text generation, machine translation, sentiment analysis, and question answering.
### Key features
1. **Self-attention mechanism**:
   - **Parallelism**: unlike traditional recurrent neural networks (RNNs), a Transformer can be fully parallelized, which greatly speeds up training.
   - **Long-range dependencies**: self-attention lets the model capture long-distance dependencies in the input sequence.
2. **Positional encoding**:
   - Because the Transformer has no built-in notion of time or order, positional encodings are added to the input embeddings to convey each position in the sequence.
3. **Multi-head attention**:
   - Multi-head attention lets the model attend to different information in different representation subspaces in parallel, increasing its expressive power.
4. **Feed-forward network**:
   - Each Transformer layer contains a feed-forward network that applies a non-linear transformation to its input.
### Applications
1. **Natural language processing (NLP)**:
   - **Machine translation**: e.g. Google's Transformer models brought large gains across many language pairs.
   - **Text generation**: models such as GPT-3 and BERT generate high-quality text.
   - **Sentiment analysis**: identifying and classifying the sentiment expressed in text.
   - **Question answering**: e.g. BERT's results on the SQuAD dataset.
2. **Computer vision (CV)**:
   - **Image classification**: ViT (Vision Transformer) matches or beats traditional convolutional neural networks (CNNs).
   - **Object detection**: DETR (DEtection TRansformer) performs strongly on detection tasks.
3. **Speech processing**:
   - **Speech recognition**: e.g. the Conformer model brought significant gains in ASR.
   - **Speech synthesis**: generating high-quality speech signals.
### Libraries
- **Hugging Face Transformers**: a very popular library that provides many pretrained Transformer models for a wide range of tasks and languages.
- **PyTorch** and **TensorFlow**: both deep-learning frameworks provide the tools needed to implement Transformer models.
### Example code
A minimal example of text classification with Hugging Face's `transformers` library:
```python
from transformers import pipeline

# Load a pretrained text-classification model
classifier = pipeline("sentiment-analysis")

# Input text
text = "I love using transformers for NLP tasks!"

# Run sentiment analysis
result = classifier(text)
print(result)
```
### Summary
With its self-attention mechanism and parallelism, the Transformer has shown strong performance across many tasks. Beyond its breakthroughs in NLP, it has broad applications in computer vision, speech processing, and other fields.
pip install transformers
export PYTHONPATH=third_party/Matcha-TTS   (non-Windows platforms)
set PYTHONPATH=third_party\Matcha-TTS   (Windows; what a trap)
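If you would rather not depend on the shell environment at all, a hedged alternative is to prepend the path at the top of whichever script you launch (assuming the script sits in, and is run from, the CosyVoice repo root):

```python
# Hedged sketch: make third_party/Matcha-TTS importable without setting PYTHONPATH.
import os
import sys

REPO_ROOT = os.path.dirname(os.path.abspath(__file__))  # assumes this file lives in the repo root
sys.path.insert(0, os.path.join(REPO_ROOT, "third_party", "Matcha-TTS"))
```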
Then another error:
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$ python webui.py
2024-11-07 08:34:18,477 - modelscope - INFO - PyTorch version 2.0.1 Found.
2024-11-07 08:34:18,478 - modelscope - INFO - Loading ast index from /home/duyicheng/.cache/modelscope/ast_indexer
2024-11-07 08:34:18,546 - modelscope - INFO - Loading done! Current index file version is 1.15.0, with md5 87f1fcfbd39133baac40066388797472 and a total number of 980 components indexed
Traceback (most recent call last):
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/transformers/utils/import_utils.py", line 1778, in _get_module
return importlib.import_module("." + module_name, self.__name__)
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 843, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/transformers/modeling_utils.py", line 48, in <module>
from .loss.loss_utils import LOSS_MAPPING
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/transformers/loss/loss_utils.py", line 19, in <module>
from .loss_deformable_detr import DeformableDetrForObjectDetectionLoss, DeformableDetrForSegmentationLoss
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/transformers/loss/loss_deformable_detr.py", line 4, in <module>
from ..image_transforms import center_to_corners_format
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/transformers/image_transforms.py", line 22, in <module>
from .image_utils import (
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/transformers/image_utils.py", line 58, in <module>
from torchvision.transforms import InterpolationMode
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torchvision/__init__.py", line 10, in <module>
from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils # usort:skip
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torchvision/_meta_registrations.py", line 4, in <module>
import torch._custom_ops
ModuleNotFoundError: No module named 'torch._custom_ops'

The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "webui.py", line 25, in <module>
from cosyvoice.cli.cosyvoice import CosyVoice
File "/home/duyicheng/gitee/CosyVoice/cosyvoice/cli/cosyvoice.py", line 18, in <module>
from modelscope import snapshot_download
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/modelscope/__init__.py", line 115, in <module>
fix_transformers_upgrade()
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/modelscope/utils/automodel_utils.py", line 44, in fix_transformers_upgrade
from transformers import PreTrainedModel
File "<frozen importlib._bootstrap>", line 1039, in _handle_fromlist
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/transformers/utils/import_utils.py", line 1766, in __getattr__
module = self._get_module(self._class_to_module[name])
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/transformers/utils/import_utils.py", line 1780, in _get_module
raise RuntimeError(
RuntimeError: Failed to import transformers.modeling_utils because of the following error (look up to see its traceback):
No module named 'torch._custom_ops'
Upgrade:
pip install --upgrade torch torchvision torchaudio
And it failed again.
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$ python webui.py
Traceback (most recent call last):
File "webui.py", line 19, in <module>
import torch
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/__init__.py", line 290, in <module>
from torch._C import * # noqa: F403
ImportError: /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/lib/libtorch_cuda.so: undefined symbol: ncclCommRegister
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$
I followed a CSDN post titled "【ImportError】from torch._C import * # noqa: F403; ImportError: xxx: undefined symbol: iJIT_NotifyEvent". After finishing those steps, it failed again.
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$ python webui.py
Traceback (most recent call last):
File "webui.py", line 19, in <module>
import torch
File "/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/__init__.py", line 290, in <module>
from torch._C import * # noqa: F403
ImportError: /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/lib/libtorch_cuda.so: undefined symbol: ncclCommRegister
Another round of changes; here is the advice I got:
Based on the error you provided, `libtorch_cuda.so` is missing the `ncclCommRegister` symbol, which usually means the NCCL library is not installed correctly or the versions don't match. Here are some detailed steps to work through it:
### 1. Confirm NCCL is installed
First, make sure the NCCL library is installed. You can check with:
```bash
dpkg -l | grep libnccl
```
If it isn't installed, install it with:
```bash
sudo apt-get update
sudo apt-get install libnccl2 libnccl-dev
```
### 2. Confirm the NCCL library path
Make sure the NCCL library path is set correctly. You can check with:
```bash
ldconfig -p | grep nccl
```
If the path is wrong or missing, add it to `LD_LIBRARY_PATH` manually:
```bash
export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
```
### 3. Reinstall PyTorch
Sometimes reinstalling PyTorch resolves dependency problems. Try uninstalling and reinstalling it:
```bash
pip uninstall torch torchvision torchaudio
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu120
```
### 4. Install PyTorch with Conda
If you are using Anaconda, you can try installing PyTorch with Conda to keep dependencies consistent:
```bash
conda remove pytorch torchvision torchaudio
conda install pytorch torchvision torchaudio cudatoolkit=12.0 -c pytorch
```
### 5. Check the CUDA version
Make sure your CUDA version is compatible with your PyTorch version. You can check it with:
```bash
nvcc --version
```
### 6. Check the `libtorch_cuda.so` file
Make sure `libtorch_cuda.so` exists and isn't corrupted:
```bash
ls -l /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/lib/libtorch_cuda.so
```
### 7. Check the NCCL library path
Make sure the NCCL library path is correct:
```bash
ldd /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/lib/libtorch_cuda.so | grep nccl
```
If the `nccl` library path is wrong or missing, add it to `LD_LIBRARY_PATH`:
```bash
export LD_LIBRARY_PATH=/usr/local/cuda-12.0/lib64:/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
```
### 8. Recreate the virtual environment
If none of the above work, try recreating the environment and reinstalling all dependencies:
```bash
conda create -n new_cosyvoice python=3.8
conda activate new_cosyvoice
conda install pytorch torchvision torchaudio cudatoolkit=12.0 -c pytorch
pip install -r requirements.txt  # assuming you have a requirements.txt
```
### 9. Check the PyTorch installation metadata
If the problem persists, look at PyTorch's installation metadata to see whether anything is missing or misconfigured:
```bash
pip show torch
```
### 10. Install NCCL manually
If none of the above work, you can download the NCCL library from NVIDIA's official website and install it by hand following their instructions.
### 11. Check the PyTorch version
Make sure the PyTorch build you are using supports CUDA 12.0. You can check the version with:
```python
import torch
print(torch.__version__)
```
If the versions don't match, try installing a PyTorch build that supports CUDA 12.0. Hopefully these steps solve the problem; if not, please share more details about the error and your environment so it can be diagnosed further.
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$ dpkg -l | grep libnccl
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$ ldconfig -p | grep nccl
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Jan__6_16:45:21_PST_2023
Cuda compilation tools, release 12.0, V12.0.140
Build cuda_12.0.r12.0/compiler.32267302_0
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$ conda install pytorch torchvision torchaudio cudatoolkit=12.0 -c pytorch
Channels:
- pytorch
- defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: failed
PackagesNotFoundError: The following packages are not available from current channels:
- cudatoolkit=12.0*
Current channels:
- https://conda.anaconda.org/pytorch
- https://repo.anaconda.com/pkgs/main
- https://repo.anaconda.com/pkgs/r
To search for alternate channels that may provide the conda package you're
looking for, navigate to
https://anaconda.org
and use the search bar at the top of the page.
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$ pip show torch
Name: torch
Version: 2.4.1
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages
Requires: filelock, fsspec, jinja2, networkx, nvidia-cublas-cu12, nvidia-cuda-cupti-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-runtime-cu12, nvidia-cudnn-cu12, nvidia-cufft-cu12, nvidia-curand-cu12, nvidia-cusolver-cu12, nvidia-cusparse-cu12, nvidia-nccl-cu12, nvidia-nvtx-cu12, sympy, triton, typing-extensions
Required-by: conformer, deepspeed, lightning, openai-whisper, pytorch-lightning, torchaudio, torchmetrics, torchvision
Running step 7 from the list above:
(cosyvoice) duyicheng@duyicheng-computer:~/gitee/CosyVoice$ ldd /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/lib/libtorch_cuda.so | grep nccl
libnccl.so.2 => /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/lib/../../nvidia/nccl/lib/libnccl.so.2 (0x00007e4264a00000)
Based on your `ldd` output, `libtorch_cuda.so` is linking against `/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/lib/../../nvidia/nccl/lib/libnccl.so.2`. That means the NCCL library path is correct, yet the `undefined symbol: ncclCommRegister` error still appears. This is probably caused by an NCCL version mismatch or some other dependency problem.
### Steps to try
#### 1. Check the NCCL library version
Make sure the installed NCCL version is compatible with PyTorch. You can check it with:
```bash
nccl-info -v
```
If you don't have the `nccl-info` tool, you can install it with:
```bash
sudo apt-get install libnccl-dev
```
#### 2. Reinstall the NCCL library
Try reinstalling NCCL to make sure you have the latest version:
```bash
sudo apt-get update
sudo apt-get install --reinstall libnccl2 libnccl-dev
```
#### 3. Check PyTorch / NCCL compatibility
Make sure your PyTorch build is compatible with your NCCL version. You can install a specific NCCL build with:
```bash
conda install -c nvidia nccl
```
#### 4. Check the CUDA version
Make sure your CUDA version is compatible with both PyTorch and NCCL:
```bash
nvcc --version
```
#### 5. Reinstall PyTorch
Try reinstalling PyTorch, making sure you get a build that supports CUDA 12.0:
```bash
pip uninstall torch torchvision torchaudio
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu120
```
#### 6. Check environment variables
Make sure `LD_LIBRARY_PATH` includes the NCCL library path. You can check and set it with:
```bash
echo $LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/lib/../../nvidia/nccl/lib:$LD_LIBRARY_PATH
```
#### 7. Check the PyTorch installation
Make sure PyTorch installed without missing dependencies. You can check the PyTorch and CUDA versions with:
```python
import torch
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA version: {torch.version.cuda}")
print(f"Is CUDA available: {torch.cuda.is_available()}")
```
#### 8. Check the `libtorch_cuda.so` file
Make sure `libtorch_cuda.so` exists and isn't corrupted:
```bash
ls -l /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/lib/libtorch_cuda.so
```
#### 9. Check the NCCL library path
Make sure the NCCL path is correct:
```bash
ldd /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/lib/libtorch_cuda.so | grep nccl
```
If the `nccl` path is wrong or missing, add it to `LD_LIBRARY_PATH`:
```bash
export LD_LIBRARY_PATH=/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/lib/../../nvidia/nccl/lib:$LD_LIBRARY_PATH
```
#### 10. Recreate the virtual environment
If none of the above work, try recreating the environment and reinstalling all dependencies:
```bash
conda create -n new_cosyvoice python=3.8
conda activate new_cosyvoice
conda install pytorch torchvision torchaudio cudatoolkit=12.0 -c pytorch
pip install -r requirements.txt  # assuming you have a requirements.txt
```
Hopefully these steps solve the problem; if not, please share more details about the error and your environment so it can be diagnosed further.
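One more hedged check that is useful here: recent pip wheels of PyTorch bundle their own NCCL (the `nvidia-nccl-cu12` package visible in the `pip show torch` output above), and torch can report which NCCL version it was built against. Of course this only helps once `import torch` itself succeeds:

```python
# Hedged sketch: report the CUDA and NCCL versions the installed torch wheel was built with.
import torch

print("torch:", torch.__version__)
print("built for CUDA:", torch.version.cuda)
print("NCCL:", torch.cuda.nccl.version())  # e.g. a (major, minor, patch) tuple on recent builds
```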
Out of ideas, so I downgraded torch.
pip uninstall torch torchvision torchaudio
pip install torch==2.0.1 torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu120
The requirements list after the downgrade:
--extra-index-url https://download.pytorch.org/whl/cu120
conformer==0.3.2
deepspeed==0.14.2
diffusers==0.27.2
gdown==5.1.0
gradio==4.32.2
grpcio==1.57.0
grpcio-tools==1.57.0
huggingface-hub==0.23.5
hydra-core==1.3.2
HyperPyYAML==1.2.2
inflect==7.3.1
librosa==0.10.2
lightning==2.2.4
matplotlib==3.7.5
modelscope==1.15.0
networkx==3.1
omegaconf==2.3.0
onnx==1.16.0
onnxruntime-gpu==1.16.0
openai-whisper==20231117
protobuf==4.25
pydantic==2.7.0
rich==13.7.1
soundfile==0.12.1
tensorboard==2.14.0
torch==2.0.1
torchvision==0.15.2
torchaudio==2.0.2
uvicorn==0.30.0
wget==3.2
fastapi==0.111.0
fastapi-cli==0.0.4
WeTextProcessing==1.0.3
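Before running the full demo, a quick hedged sanity check that the downgraded torch imports cleanly and can see the GPU:

```python
# Hedged sketch: confirm the downgraded torch works and CUDA is usable.
import torch

print("torch:", torch.__version__)            # expecting 2.0.1 here
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```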
Run the test code below with `python cs.py`. It succeeded:
from cosyvoice.cli.cosyvoice import CosyVoice
from cosyvoice.utils.file_utils import load_wav
import torchaudio

cosyvoice = CosyVoice('pretrained_models/CosyVoice-300M-SFT', load_jit=True, load_onnx=False, fp16=True)
# sft usage
print(cosyvoice.list_avaliable_spks())
# change stream=True for chunk stream inference
for i, j in enumerate(cosyvoice.inference_sft('你好,我是通义生成式语音大模型,请问有什么可以帮您的吗?', '中文女', stream=False)):
    torchaudio.save('sft_{}.wav'.format(i), j['tts_speech'], 22050)

cosyvoice = CosyVoice('pretrained_models/CosyVoice-300M-25Hz')  # or change to pretrained_models/CosyVoice-300M for 50Hz inference
# zero_shot usage, <|zh|><|en|><|jp|><|yue|><|ko|> for Chinese/English/Japanese/Cantonese/Korean
prompt_speech_16k = load_wav('zero_shot_prompt.wav', 16000)
for i, j in enumerate(cosyvoice.inference_zero_shot('收到好友从远方寄来的生日礼物,那份意外的惊喜与深深的祝福让我心中充满了甜蜜的快乐,笑容如花儿般绽放。', '希望你以后能够做的比我还好呦。', prompt_speech_16k, stream=False)):
    torchaudio.save('zero_shot_{}.wav'.format(i), j['tts_speech'], 22050)

# cross_lingual usage
prompt_speech_16k = load_wav('cross_lingual_prompt.wav', 16000)
for i, j in enumerate(cosyvoice.inference_cross_lingual('<|en|>And then later on, fully acquiring that company. So keeping management in line, interest in line with the asset that\'s coming into the family is a reason why sometimes we don\'t buy the whole thing.', prompt_speech_16k, stream=False)):
    torchaudio.save('cross_lingual_{}.wav'.format(i), j['tts_speech'], 22050)

# vc usage
prompt_speech_16k = load_wav('zero_shot_prompt.wav', 16000)
source_speech_16k = load_wav('cross_lingual_prompt.wav', 16000)
for i, j in enumerate(cosyvoice.inference_vc(source_speech_16k, prompt_speech_16k, stream=False)):
    torchaudio.save('vc_{}.wav'.format(i), j['tts_speech'], 22050)

cosyvoice = CosyVoice('pretrained_models/CosyVoice-300M-Instruct')
# instruct usage, support <laughter></laughter><strong></strong>[laughter][breath]
for i, j in enumerate(cosyvoice.inference_instruct('在面对挑战时,他展现了非凡的<strong>勇气</strong>与<strong>智慧</strong>。', '中文男', 'Theo \'Crimson\', is a fiery, passionate rebel leader. Fights with fervor for justice, but struggles with impulsiveness.', stream=False)):
    torchaudio.save('instruct_{}.wav'.format(i), j['tts_speech'], 22050)
It generated the audio files, as shown in the screenshot:
Running the provided webui.py
gives the following output:
python3 webui.py --port 8888 --model_dir pretrained_models/CosyVoice-300M
2024-11-07 10:12:11,500 - modelscope - INFO - PyTorch version 2.0.1 Found.
2024-11-07 10:12:11,500 - modelscope - INFO - Loading ast index from /home/duyicheng/.cache/modelscope/ast_indexer
2024-11-07 10:12:11,569 - modelscope - INFO - Loading done! Current index file version is 1.15.0, with md5 87f1fcfbd39133baac40066388797472 and a total number of 980 components indexed
/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torchvision/datapoints/__init__.py:12: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
warnings.warn(_BETA_TRANSFORMS_WARNING)
/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/torchvision/transforms/v2/__init__.py:54: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
warnings.warn(_BETA_TRANSFORMS_WARNING)
failed to import ttsfrd, use WeTextProcessing instead
/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/diffusers/models/lora.py:393: FutureWarning: `LoRACompatibleLinear` is deprecated and will be removed in version 1.0.0. Use of `LoRACompatibleLinear` is deprecated. Please switch to PEFT backend by installing PEFT: `pip install peft`.
deprecate("LoRACompatibleLinear", "1.0.0", deprecation_message)
2024-11-07 10:12:28,980 INFO input frame rate=50
2024-11-07 10:12:34.343806583 [W:onnxruntime:, session_state.cc:1162 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-11-07 10:12:34.343849370 [W:onnxruntime:, session_state.cc:1164 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
2024-11-07 10:12:35,290 WETEXT INFO found existing fst: /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/tn/zh_tn_tagger.fst
2024-11-07 10:12:35,290 INFO found existing fst: /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/tn/zh_tn_tagger.fst
2024-11-07 10:12:35,290 WETEXT INFO /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/tn/zh_tn_verbalizer.fst
2024-11-07 10:12:35,290 INFO /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/tn/zh_tn_verbalizer.fst
2024-11-07 10:12:35,290 WETEXT INFO skip building fst for zh_normalizer ...
2024-11-07 10:12:35,290 INFO skip building fst for zh_normalizer ...
2024-11-07 10:12:36,132 WETEXT INFO found existing fst: /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/tn/en_tn_tagger.fst
2024-11-07 10:12:36,132 INFO found existing fst: /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/tn/en_tn_tagger.fst
2024-11-07 10:12:36,132 WETEXT INFO /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/tn/en_tn_verbalizer.fst
2024-11-07 10:12:36,132 INFO /home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/tn/en_tn_verbalizer.fst
2024-11-07 10:12:36,132 WETEXT INFO skip building fst for en_normalizer ...
2024-11-07 10:12:36,132 INFO skip building fst for en_normalizer ...
2024-11-07 10:12:48,711 DEBUG load_ssl_context verify=True cert=None trust_env=True http2=False
2024-11-07 10:12:48,712 DEBUG load_ssl_context verify=True cert=None trust_env=True http2=False
2024-11-07 10:12:48,713 DEBUG Using selector: EpollSelector
2024-11-07 10:12:48,713 DEBUG load_verify_locations cafile='/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/certifi/cacert.pem'
2024-11-07 10:12:48,714 DEBUG load_verify_locations cafile='/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/certifi/cacert.pem'
2024-11-07 10:12:48,778 DEBUG connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2024-11-07 10:12:48,789 DEBUG connect_tcp.started host='checkip.amazonaws.com' port=443 local_address=None timeout=3 socket_options=None
/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/components/base.py:190: UserWarning: 'scale' value should be an integer. Using 0.5 will cause issues.
warnings.warn(
/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/components/base.py:190: UserWarning: 'scale' value should be an integer. Using 0.25 will cause issues.
warnings.warn(
/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/layouts/column.py:55: UserWarning: 'scale' value should be an integer. Using 0.25 will cause issues.
warnings.warn(
2024-11-07 10:12:48,914 DEBUG connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x73142afe62b0>
2024-11-07 10:12:48,919 DEBUG start_tls.started ssl_context=<ssl.SSLContext object at 0x731439a92540> server_hostname='checkip.amazonaws.com' timeout=3
2024-11-07 10:12:49,026 DEBUG connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x73142af86370>
2024-11-07 10:12:49,026 DEBUG start_tls.complete return_value=<httpcore._backends.sync.SyncStream object at 0x73142afe6760>
2024-11-07 10:12:49,026 DEBUG start_tls.started ssl_context=<ssl.SSLContext object at 0x731439a92340> server_hostname='api.gradio.app' timeout=3
2024-11-07 10:12:49,042 DEBUG send_request_headers.started request=<Request [b'GET']>
2024-11-07 10:12:49,051 DEBUG send_request_headers.complete
2024-11-07 10:12:49,051 DEBUG send_request_body.started request=<Request [b'GET']>
2024-11-07 10:12:49,051 DEBUG send_request_body.complete
2024-11-07 10:12:49,051 DEBUG receive_response_headers.started request=<Request [b'GET']>
Running on local URL: http://0.0.0.0:8888
2024-11-07 10:12:49,103 DEBUG load_ssl_context verify=True cert=None trust_env=True http2=False
2024-11-07 10:12:49,104 DEBUG load_verify_locations cafile='/home/duyicheng/anaconda3/envs/cosyvoice/lib/python3.8/site-packages/certifi/cacert.pem'
2024-11-07 10:12:49,131 DEBUG receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'', [(b'Date', b'Thu, 07 Nov 2024 02:12:49 GMT'), (b'Content-Type', b'text/plain;charset=UTF-8'), (b'Content-Length', b'14'), (b'Connection', b'keep-alive'), (b'Server', b'nginx'), (b'Vary', b'Origin'), (b'Vary', b'Access-Control-Request-Method'), (b'Vary', b'Access-Control-Request-Headers')])
2024-11-07 10:12:49,132 INFO HTTP Request: GET https://checkip.amazonaws.com/ "HTTP/1.1 200 "
2024-11-07 10:12:49,132 DEBUG receive_response_body.started request=<Request [b'GET']>
2024-11-07 10:12:49,132 DEBUG receive_response_body.complete
2024-11-07 10:12:49,132 DEBUG response_closed.started
2024-11-07 10:12:49,132 DEBUG response_closed.complete
2024-11-07 10:12:49,132 DEBUG close.started
2024-11-07 10:12:49,133 DEBUG close.complete
2024-11-07 10:12:49,136 DEBUG Starting new HTTPS connection (1): huggingface.co:443
2024-11-07 10:12:49,159 DEBUG connect_tcp.started host='localhost' port=8888 local_address=None timeout=None socket_options=None
2024-11-07 10:12:49,160 DEBUG connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x73142c381310>
2024-11-07 10:12:49,160 DEBUG send_request_headers.started request=<Request [b'GET']>
2024-11-07 10:12:49,160 DEBUG send_request_headers.complete
2024-11-07 10:12:49,160 DEBUG send_request_body.started request=<Request [b'GET']>
2024-11-07 10:12:49,160 DEBUG send_request_body.complete
2024-11-07 10:12:49,161 DEBUG receive_response_headers.started request=<Request [b'GET']>
2024-11-07 10:12:49,162 DEBUG receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 07 Nov 2024 02:12:49 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2024-11-07 10:12:49,163 INFO HTTP Request: GET http://localhost:8888/startup-events "HTTP/1.1 200 OK"
2024-11-07 10:12:49,163 DEBUG receive_response_body.started request=<Request [b'GET']>
2024-11-07 10:12:49,163 DEBUG receive_response_body.complete
2024-11-07 10:12:49,163 DEBUG response_closed.started
2024-11-07 10:12:49,163 DEBUG response_closed.complete
2024-11-07 10:12:49,163 DEBUG close.started
2024-11-07 10:12:49,163 DEBUG close.complete
2024-11-07 10:12:49,165 DEBUG load_ssl_context verify=False cert=None trust_env=True http2=False
2024-11-07 10:12:49,167 DEBUG connect_tcp.started host='localhost' port=8888 local_address=None timeout=3 socket_options=None
2024-11-07 10:12:49,167 DEBUG connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x73142c391430>
2024-11-07 10:12:49,167 DEBUG send_request_headers.started request=<Request [b'HEAD']>
2024-11-07 10:12:49,168 DEBUG send_request_headers.complete
2024-11-07 10:12:49,168 DEBUG send_request_body.started request=<Request [b'HEAD']>
2024-11-07 10:12:49,168 DEBUG send_request_body.complete
2024-11-07 10:12:49,168 DEBUG receive_response_headers.started request=<Request [b'HEAD']>
2024-11-07 10:12:49,204 DEBUG receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 07 Nov 2024 02:12:49 GMT'), (b'server', b'uvicorn'), (b'content-length', b'17055'), (b'content-type', b'text/html; charset=utf-8')])
2024-11-07 10:12:49,205 INFO HTTP Request: HEAD http://localhost:8888/ "HTTP/1.1 200 OK"
2024-11-07 10:12:49,205 DEBUG receive_response_body.started request=<Request [b'HEAD']>
2024-11-07 10:12:49,205 DEBUG receive_response_body.complete
2024-11-07 10:12:49,205 DEBUG response_closed.started
2024-11-07 10:12:49,206 DEBUG response_closed.complete
2024-11-07 10:12:49,206 DEBUG close.started
2024-11-07 10:12:49,206 DEBUG close.complete
To create a public link, set `share=True` in `launch()`.
2024-11-07 10:12:49,210 DEBUG Starting new HTTPS connection (1): huggingface.co:443
2024-11-07 10:12:49,489 DEBUG start_tls.complete return_value=<httpcore._backends.sync.SyncStream object at 0x73142afb9a30>
2024-11-07 10:12:49,490 DEBUG send_request_headers.started request=<Request [b'GET']>
2024-11-07 10:12:49,490 DEBUG send_request_headers.complete
2024-11-07 10:12:49,490 DEBUG send_request_body.started request=<Request [b'GET']>
2024-11-07 10:12:49,490 DEBUG send_request_body.complete
2024-11-07 10:12:49,490 DEBUG receive_response_headers.started request=<Request [b'GET']>
2024-11-07 10:12:49,713 DEBUG receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 07 Nov 2024 02:12:49 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2024-11-07 10:12:49,714 INFO HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-11-07 10:12:49,714 DEBUG receive_response_body.started request=<Request [b'GET']>
2024-11-07 10:12:49,714 DEBUG receive_response_body.complete
2024-11-07 10:12:49,714 DEBUG response_closed.started
2024-11-07 10:12:49,714 DEBUG response_closed.complete
2024-11-07 10:12:49,714 DEBUG close.started
2024-11-07 10:12:49,714 DEBUG close.complete
IMPORTANT: You are using gradio version 4.32.2, however version 4.44.1 is available, please upgrade.
I asked for an analysis of this log and got the following:
Judging from the log, your application started and is running at the local URL `http://0.0.0.0:8888`. Key points from the log:
- **PyTorch version detection**: PyTorch 2.0.1 was detected successfully.
- **Model loading**: the model index was loaded from cache, version 1.15.0, with 980 components indexed.
- **Warnings**:
  - torchvision's `datapoints` and `transforms.v2` namespaces are still in beta and may change based on user feedback.
  - `LoRACompatibleLinear` is deprecated; switching to the PEFT backend is recommended.
  - gradio's `scale` values should be integers; using floats may cause problems.
- **ONNX Runtime warning**: some nodes were not assigned to the preferred execution provider, which may or may not affect performance.
- **FST files**: existing FST files were found, so the build step was skipped.
- **SSL context**: SSL contexts and certificates loaded successfully.
- **HTTP requests**: several HTTP requests succeeded, including the IP check and the Gradio version check.
- **Gradio app startup**: the app started successfully and is running at the local URL.
### Suggestions
1. **Handle the warnings**:
   - For the torchvision beta warnings, consider submitting feedback or checking the linked issues for upcoming changes.
   - For the `LoRACompatibleLinear` deprecation, install PEFT and update the code accordingly.
   - For the gradio `scale` warnings, make sure all `scale` arguments are integers.
2. **Performance**:
   - Review the ONNX Runtime warnings; if performance suffers, try adjusting node assignment or using a different execution provider.
3. **Security**:
   - Make sure SSL certificates and contexts are configured correctly to keep communication secure.
4. **Public link**:
   - If you need a public link, set `share=True` in `launch()`. If you have other questions or need further help, just ask.
Open the web page.
There still seems to be a lot left to do. This is a freshly installed system and it doesn't even have an audio player yet, so I'll stop here for now; my graphics card is weak anyway.