[Self-notes] Building a base AI container image for an Nvidia GPU (including choosing the Torch and ONNXRuntime versions)

Source: original, 2025/4/28 20:47:02

1 Host machine

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla V100S-PCIE-32GB          Off | 00000000:00:0D.0 Off |                    0 |
| N/A   28C    P0              37W / 250W |  15572MiB / 32768MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

2 Building the image

Base image

docker run --gpus=all --rm -it nvidia/cuda:12.2.2-base-ubuntu22.04
apt update && apt install -y python3 python3-pip
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

The Python version that apt installs by default is 3.10.12.

Choosing the torch version

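For torch, download the packages from the official PyTorch site rather than relying on a plain pip install, so that the CUDA build (cu121 here) is chosen explicitly.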

torch 2.5.1+cu121

torchaudio 2.5.1+cu121
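
The choice of a cu121 build can be sketched as a small check. This is a minimal sketch, assuming a conservative rule that the CUDA version of the wheel should not exceed the CUDA version reported by nvidia-smi on the host (12.2 in section 1). CUDA minor-version compatibility may allow newer builds in practice, so treat the rule as an assumption. The helper names cuda_tag_to_version and wheel_fits_driver are hypothetical and not part of torch.

```python
import re


def cuda_tag_to_version(wheel_version: str) -> tuple[int, int]:
    """Extract the CUDA (major, minor) from a version such as '2.5.1+cu121'."""
    # The last digit of the tag is the minor version, the rest is the major.
    match = re.search(r"\+cu(\d+)(\d)$", wheel_version)
    if match is None:
        raise ValueError(f"no CUDA tag in {wheel_version!r}")
    return int(match.group(1)), int(match.group(2))


def wheel_fits_driver(wheel_version: str, driver_cuda: str) -> bool:
    """Assumed rule: wheel CUDA version <= CUDA version shown by nvidia-smi."""
    major, minor = (int(part) for part in driver_cuda.split(".")[:2])
    return cuda_tag_to_version(wheel_version) <= (major, minor)


print(wheel_fits_driver("2.5.1+cu121", "12.2"))  # True
print(wheel_fits_driver("2.5.1+cu124", "12.2"))  # False
```

Under this rule, the 2.5.1+cu121 build fits the host driver from section 1, while a cu124 build would be rejected.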

Choosing onnxruntime

pip install "onnxruntime-gpu[cuda,cudnn]==1.21.1"

onnxruntime version compatibility check
With an older torch version, also make sure that the cuDNN versions do not conflict.
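
A quick first check after installation is whether the installed onnxruntime build exposes the CUDA execution provider at all. The following is a minimal sketch; onnxruntime_cuda_status is a hypothetical helper name. It only confirms that a GPU-enabled build is installed. It does not prove that the CUDA and cuDNN libraries load correctly, which only shows up when an InferenceSession is created with a real model.

```python
import importlib.util


def onnxruntime_cuda_status() -> str:
    """Return 'missing', 'broken', 'cpu-only', or 'cuda' for the local install."""
    if importlib.util.find_spec("onnxruntime") is None:
        return "missing"
    try:
        import onnxruntime as ort
    except ImportError:
        # The package is present but could not be imported.
        return "broken"
    # Lists the providers this build supports, not the ones that will load.
    providers = ort.get_available_providers()
    return "cuda" if "CUDAExecutionProvider" in providers else "cpu-only"


print(onnxruntime_cuda_status())
```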

3 Results

$ python3
Python 3.10.12 (main, Feb  4 2025, 14:57:36) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.__version__)
2.5.1+cu121
>>> print(torch.version.cuda)
12.1
>>> print(torch.backends.cudnn.version())
90100
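
The integer printed by torch.backends.cudnn.version() packs the major, minor, and patch numbers into one value. A minimal sketch of decoding it follows, assuming the cuDNN encoding of major*10000 + minor*100 + patch for cuDNN 9 and later, and major*1000 + minor*100 + patch for earlier releases. The name decode_cudnn_version is a hypothetical helper, not a torch API.

```python
def decode_cudnn_version(value: int) -> tuple[int, int, int]:
    """Split a packed cuDNN version integer into (major, minor, patch)."""
    # Assumption: cuDNN 9+ uses a 10000 multiplier for major, older ones 1000.
    major_unit = 10000 if value >= 10000 else 1000
    major = value // major_unit
    minor = (value % major_unit) // 100
    patch = value % 100
    return major, minor, patch


print(decode_cudnn_version(90100))  # (9, 1, 0)
print(decode_cudnn_version(8902))   # (8, 9, 2)
```

Read this way, the 90100 above is cuDNN 9.1.0, which is the number to compare against the cuDNN major version that the chosen onnxruntime-gpu release expects.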