Running nvidia-docker failed with this error:

docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #1: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: nvml error: driver not loaded: unknown.

It looks like a GPU driver problem. Running nvidia-smi next produced this error:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

That confirms the GPU driver is broken.

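This has happened many times, always out of the blue, with nobody having touched the server. Thinking it over now, the most likely cause is that Ubuntu installed updates automatically: an unattended upgrade that replaces the kernel or the NVIDIA packages can leave the system without a matching driver module loaded.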
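One way to test that guess is to look at the apt history log for transactions that touched NVIDIA or kernel packages. The following is only a sketch: the function name recent_driver_changes is made up for illustration, and the default path assumes Ubuntu's usual log location.

```shell
# Hypothetical helper: print the apt history entries that mention NVIDIA
# or kernel image packages. Entries in the log are separated by blank
# lines, so awk is put in paragraph mode (RS="").
# Assumption: the log lives at /var/log/apt/history.log (Ubuntu default).
recent_driver_changes() {
  log="${1:-/var/log/apt/history.log}"
  awk 'BEGIN { RS = ""; ORS = "\n\n" } /nvidia|linux-image/' "$log"
}
```

Calling recent_driver_changes with no argument reads the default log. An entry whose Commandline is /usr/bin/unattended-upgrade, dated shortly before the driver stopped working, would support the automatic-update explanation.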

The fix is to disable automatic system updates. Edit /etc/apt/apt.conf.d/20auto-upgrades. By default it looks like this:

APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";

Change the value of APT::Periodic::Unattended-Upgrade to "0" and save the file. That is all.
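The same edit can be scripted instead of done by hand. This is a minimal sketch, assuming GNU sed (the one shipped with Ubuntu); the function name disable_unattended_upgrades is hypothetical, and the real file can only be modified as root.

```shell
# Hypothetical helper: set APT::Periodic::Unattended-Upgrade to "0" in the
# given file, leaving every other line untouched.
# Assumption: GNU sed, which supports in-place editing with -i.
disable_unattended_upgrades() {
  conf="${1:-/etc/apt/apt.conf.d/20auto-upgrades}"
  sed -i 's/^\(APT::Periodic::Unattended-Upgrade\) "[0-9]*";/\1 "0";/' "$conf"
}
```

Afterwards, apt-config dump | grep Periodic should show the new value as apt sees it.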

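APT::Periodic::Update-Package-Lists refreshes the package lists periodically, which is how the system finds out that new package versions exist. I think that is worth keeping: the system still needs updates, but they should be controlled updates. So we need to know when packages have updates pending in order to schedule a time to apply them.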
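With the package lists still being refreshed, pending updates can be reviewed before scheduling a maintenance window. As a rough sketch, the filter below keeps only the packages most likely to affect the GPU driver; the function name driver_related_upgrades and the package-name prefixes are assumptions, not a complete list.

```shell
# Hypothetical filter: read `apt list --upgradable` output on stdin and
# keep only NVIDIA and kernel packages. The prefixes are a guess at the
# usual Ubuntu package names and may need adjusting.
driver_related_upgrades() {
  # `|| true` keeps the exit status at 0 when nothing matches.
  grep -E '^(nvidia|libnvidia|linux-(image|headers|modules))' || true
}
```

A typical call would be apt list --upgradable 2>/dev/null | driver_related_upgrades. Empty output means none of the pending updates match those prefixes.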


