Environment
Ubuntu 16.04
GPU: GTX 1650
Kernel: 4.15.0-142-generic
Old NVIDIA driver: 430
Old CUDA version: 10.1
Steps
1. Uninstall the original NVIDIA driver and CUDA 10.0
Uninstall nvidia
sudo apt-get remove --purge nvidia*
Uninstall cuda
sudo apt-get remove cuda
sudo apt autoremove
sudo apt-get remove cuda*
Then change the terminal's working directory to /usr/local/ (this is the default installation path for CUDA):
cd /usr/local/
dir  # You should see a "cuda" or "cuda-xxx" folder, or both
sudo rm -r cuda-10.0
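Before deleting anything, it can help to see exactly which CUDA folders are left under the install prefix. The helper below is only a sketch: the function name list_cuda_dirs is made up for this post, and it assumes the folders follow the "cuda" / "cuda-xxx" naming described above.

```shell
# List leftover CUDA installation folders under a prefix.
# The prefix defaults to /usr/local, the default CUDA install path.
# (list_cuda_dirs is a hypothetical helper name, not an NVIDIA tool.)
list_cuda_dirs() {
    local prefix="${1:-/usr/local}"
    local entry
    for entry in "$prefix"/cuda "$prefix"/cuda-*; do
        # Print real folders and symlinks; an unmatched glob stays
        # literal, fails both tests, and is skipped.
        if [ -e "$entry" ] || [ -L "$entry" ]; then
            printf '%s\n' "${entry##*/}"
        fi
    done
}

# Usage:
#   list_cuda_dirs          # inspect /usr/local
#   list_cuda_dirs /opt     # inspect another prefix
```

If the list still shows a folder for the old version, that is the one to pass to sudo rm -r.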
2. Install the new NVIDIA driver and CUDA
No need to install the NVIDIA driver separately!
Install with the commands below, super simple!!!
Go to the official website, select the version you want, and follow the commands it gives:
https://developer.nvidia.com/cuda-11.1.1-download-archive
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-ubuntu1604.pin
sudo mv cuda-ubuntu1604.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.1.1/local_installers/cuda-repo-ubuntu1604-11-1-local_11.1.1-455.32.00-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604-11-1-local_11.1.1-455.32.00-1_amd64.deb
sudo apt-key add /var/cuda-repo-ubuntu1604-11-1-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
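The installer file name carries both the CUDA version and the driver version it ships with, which is a quick way to see which driver you should end up with. A rough sketch follows; cuda_deb_versions is a made-up helper name, and it assumes the "name_cuda-driver-rev_arch.deb" layout of the file used above, so other installers may not match.

```shell
# Split a CUDA local-installer file name into CUDA and driver versions.
# Assumes the "<name>_<cuda>-<driver>-<rev>_<arch>.deb" layout seen above.
# (cuda_deb_versions is a hypothetical helper name.)
cuda_deb_versions() {
    local file="${1##*/}"          # drop any leading directory
    local middle="${file#*_}"      # 11.1.1-455.32.00-1_amd64.deb
    middle="${middle%%_*}"         # 11.1.1-455.32.00-1
    local cuda="${middle%%-*}"     # 11.1.1
    local rest="${middle#*-}"      # 455.32.00-1
    local driver="${rest%%-*}"     # 455.32.00
    printf 'cuda=%s driver=%s\n' "$cuda" "$driver"
}

# Usage:
#   cuda_deb_versions cuda-repo-ubuntu1604-11-1-local_11.1.1-455.32.00-1_amd64.deb
```

For the file downloaded above this reports CUDA 11.1.1 with driver 455.32.00.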
3. Problems encountered
1. nvcc -V does not display the version
sudo gedit ~/.bashrc
Add the following:
export PATH=/usr/local/cuda-11.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.1/lib64:$LD_LIBRARY_PATH
Then open a new terminal (or run source ~/.bashrc) so the change takes effect.
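One small annoyance with plain export lines is that every time ~/.bashrc is sourced again, the same folder gets prepended once more. The sketch below adds a folder only if it is missing; path_has_dir and path_prepend are names invented for this example, not standard commands.

```shell
# Return 0 if directory $1 is an entry of the colon-separated list $2.
# (path_has_dir and path_prepend are hypothetical helper names.)
path_has_dir() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Print list $2 with directory $1 prepended, unless it is already there.
path_prepend() {
    if path_has_dir "$1" "$2"; then
        printf '%s\n' "$2"
    elif [ -z "$2" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$1:$2"
    fi
}

# Usage (in ~/.bashrc):
#   PATH=$(path_prepend /usr/local/cuda-11.1/bin "$PATH"); export PATH
#   LD_LIBRARY_PATH=$(path_prepend /usr/local/cuda-11.1/lib64 "$LD_LIBRARY_PATH")
#   export LD_LIBRARY_PATH
```

Wrapping the list in colons before matching means /usr does not count as a match for /usr/bin.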
2. nvidia-smi reports "Failed to initialize NVML: Driver/library version mismatch"
Cause: the driver module loaded in the kernel is a different version from the driver libraries installed on the system
Check the version of the loaded driver module:
cat /proc/driver/nvidia/version
The output shows nvidia-430 (that is, the NVIDIA driver left over from the previous setup)
(This post was written after the environment was already installed, so there is no screenshot here.)
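Since there is no screenshot, here is a small sketch of how the loaded module version could be pulled out of that file and compared with the driver you expect. It assumes the usual "NVRM version: ... Kernel Module  <version> ..." line format; loaded_driver_version and loaded_driver_is are names invented for this example.

```shell
# Print the version of the loaded NVIDIA kernel module.
# $1 is the file to read; it defaults to /proc/driver/nvidia/version.
# Assumes a line of the form "NVRM version: ... Kernel Module  <version> ...".
# (Both function names are hypothetical helpers.)
loaded_driver_version() {
    sed -n 's/^NVRM version:.*Kernel Module  *\([0-9][0-9.]*\).*/\1/p' \
        "${1:-/proc/driver/nvidia/version}"
}

# Return 0 if the loaded module's major version equals $1 (e.g. 455).
# $2 optionally names the file to read, as above.
loaded_driver_is() {
    local version
    version=$(loaded_driver_version "$2")
    [ "${version%%.*}" = "$1" ]
}

# Usage:
#   loaded_driver_is 455 || echo "kernel module is not 455, expect a mismatch"
```

If the loaded module is still the old one, nvidia-smi from the new install will keep reporting the mismatch until the old module is unloaded.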
List all the NVIDIA driver packages currently installed on the computer
sudo dpkg --list | grep nvidia
Solution 1: Reboot (did not work for me)
After rebooting, if the machine gets stuck while entering Ubuntu, press Ctrl+Alt+F1 to switch to the command-line interface and uninstall the NVIDIA driver before going back into Ubuntu
sudo apt-get remove --purge nvidia*
Once you are in the system, reinstall the NVIDIA 455 driver, or the mismatch error will be reported again
Solution 2: Unload the driver modules that are currently loaded (temporary: after the computer restarts, the Driver/library version mismatch problem comes back)
# Unload the loaded nvidia driver modules
sudo rmmod nvidia_drm
sudo rmmod nvidia_uvm
sudo rmmod nvidia_modeset
sudo rmmod nvidia
# Execute nvidia-smi again
nvidia-smi
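Not every machine has all four modules loaded, and rmmod complains about the ones that are missing. A sketch that only lists the modules that are really there, in the same unload order as above, could look like this; loaded_nvidia_modules is a made-up name, and it assumes the module name is the first field on each line of /proc/modules.

```shell
# Print the NVIDIA modules that are currently loaded, in unload order
# (dependent modules first, the core "nvidia" module last).
# $1 is the module list to read; it defaults to /proc/modules.
# (loaded_nvidia_modules is a hypothetical helper name.)
loaded_nvidia_modules() {
    local list="${1:-/proc/modules}"
    local module
    for module in nvidia_drm nvidia_uvm nvidia_modeset nvidia; do
        # The module name is the first space-separated field of a line.
        if grep -q "^$module " "$list"; then
            printf '%s\n' "$module"
        fi
    done
}

# Usage:
#   for m in $(loaded_nvidia_modules); do sudo rmmod "$m"; done
#   nvidia-smi
```

The trailing space in the pattern keeps "nvidia" from matching "nvidia_uvm" and the other longer names.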
If unloading a module fails with an error (for example, because the module is still in use), run
sudo lsof /dev/nvidia*
Then kill the processes it lists
sudo kill 37667
If more than one process is using nvidia, put all of their PIDs after kill
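When several processes hold the device files, copying PIDs by hand gets tedious. The sketch below collects them from the lsof listing; pids_from_lsof is a name invented here, and it assumes the default lsof layout with a header line and the PID in the second column.

```shell
# Read lsof output on stdin and print each PID once.
# Assumes the default lsof layout: one header line, then rows whose
# second column is the PID.
# (pids_from_lsof is a hypothetical helper name.)
pids_from_lsof() {
    awk 'NR > 1 && !seen[$2]++ { print $2 }'
}

# Usage:
#   sudo lsof /dev/nvidia* | pids_from_lsof | xargs -r sudo kill
```

lsof also has a -t option that prints bare PIDs directly, which may be enough if you do not need to look at the full listing first.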
Solution 3: Uninstall the current version of the NVIDIA driver and switch to the version shown by the cat /proc/driver/nvidia/version command (useless here, because upgrading CUDA requires upgrading the NVIDIA driver; if you have no requirement on the graphics card driver version, you can take this route)
Solution 4: Uninstall the NVIDIA driver and use the sudo ubuntu-drivers devices command to see which drivers the computer supports
Then use the command ubuntu-drivers autoinstall to install the graphics card driver automatically
Conclusion: no use, the same error still appears after installing
Solution 5: After carrying out the steps of Solution 2, also run sudo apt-get upgrade (solved!!!)
But be careful!!
If the upgrade asks you to choose between two versions of a configuration file, just keep the default (press N)
Configuration file 'xxxx'
 ==> File on system created by you or by a script.
 ==> File also in package provided by package maintainer.
   What would you like to do about it ?  Your options are:
    Y or I  : install the package maintainer's version
    N or O  : keep your currently-installed version
      D     : show the differences between the versions
      Z     : start a shell to examine the situation
 The default action is to keep your current version.
3. ata1.00: failed command: READ FPDMA QUEUED
Ubuntu prints this error when booting into both normal mode and recovery mode, and the time to get into the system goes from 10 s to 30 s.
ata2.00: status: { DRDY ERR }
ata2.00: error: { UNC }
ata2.00: failed command: READ FPDMA QUEUED
ata2.00: cmd 60/28:70:28:19:89/00:00:6c:01:00/40 tag 14 ncq 20480 in
In the end, I pressed F2 to enter the BIOS and reset all BIOS settings to their defaults (the EXIT page has a RESTORE option), after which Ubuntu booted smoothly