Howto/CUDA

Differences between revisions 1 and 70 (spanning 69 versions)
Revision 1 as of 2016-12-14 11:12:50
Size: 2258
Comment:
Revision 70 as of 2024-10-15 07:26:21
Size: 9115
Comment:
Line 1: Line 1:
## page was renamed from Howto/NVIDIA_CUDA
#acl -All:write Default
#format wiki
#language en
Line 2: Line 10:
== NVIDIA CUDA Installation ==
This Howto provides a way to install the official NVIDIA packages for CUDA.
== Installation ==
This Howto provides a way to install the official CUDA packages from NVIDIA alongside our packaged NVIDIA driver from RPM Fusion.
Line 5: Line 13:
== NVIDIA CUDA Repository ==
This repository contains a given version of CUDA that is parallel installable along with another version.
While this repository contains both the NVIDIA driver and the CUDA toolkit, we recommend using our packaged driver instead, in order to receive the fixes needed after Fedora kernel updates.

== NVIDIA official repositories ==
These repositories contain versions of CUDA that are parallel installable along with another version.

=== CUDA Toolkit ===
Line 10: Line 22:
* RHEL/CentOS 7 {{{
yum install http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-8.0.44-1.x86_64.rpm
yum install cuda
}}}
* Fedora 23 (and later) {{{
dnf install http://developer.download.nvidia.com/compute/cuda/repos/fedora23/x86_64/cuda-repo-fedora23-8.0.44-1.x86_64.rpm
dnf install cuda
}}}
 * Fedora 39 and later (if using a compatible compiler; see the GCC version section under Known issues) {{{
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/fedora39/x86_64/cuda-fedora39.repo
sudo dnf clean all
sudo dnf module disable nvidia-driver
sudo dnf -y install cuda
}}}
Line 19: Line 29:
== GCC version issue ==
When using a later version of Fedora than what is supported by the NVIDIA CUDA Official repository, you might be unable to compile.
You can either:
* Tweak the /usr/local/cuda-8.0/targets/x86_64-linux/include/host_defines.h to accept the Fedora default compiler.
* Install the appropriate gcc version from the CentOS developer toolset. Please see https://www.softwarecollections.org/en/scls/rhscl/devtoolset-4/
{{{
dnf install http://ftp.ciril.fr/pub/linux/centos/7.3.1611/extras/x86_64/Packages/centos-release-scl-2-2.el7.centos.noarch.rpm
dnf install devtoolset-4-toolchain
}}}
 * RHEL/Rocky/Alma 9 {{{
sudo dnf config-manager --add-repo http://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
sudo dnf clean all
sudo dnf module disable nvidia-driver
sudo dnf -y install cuda
}}}
Line 28: Line 35:
You cannot install the whole devtoolset-4 collection, but the toolchain is enough. Then, each time you need to build using CUDA, start with:
{{{
$ scl run devtoolset-4 bash
$ gcc --version
gcc (GCC) 5.2.1 20150902 (Red Hat 5.2.1-2)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE
$ exit
$ gcc --version
gcc (GCC) 6.2.1 20160916 (Red Hat 6.2.1-2)
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
}}}
 * RHEL/Rocky/Alma 8 {{{
sudo dnf config-manager --add-repo http://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo
sudo dnf clean all
sudo dnf module disable nvidia-driver
sudo dnf -y install cuda
}}}
 * RHEL/CentOS 7 {{{
sudo yum-config-manager --add-repo http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo
sudo yum clean all
sudo yum install cuda
}}}
Line 46: Line 48:
=== Machine Learning repository ===

Please use the official link: https://developer.nvidia.com/nccl/nccl-download

 * RHEL 9/Fedora
Merged with the regular CUDA repository

 * RHEL/CentOS 8 {{{
sudo dnf install https://developer.download.nvidia.com/compute/machine-learning/repos/rhel8/x86_64/nvidia-machine-learning-repo-rhel8-1.0.0-1.x86_64.rpm
sudo dnf install libcudnn7 libcudnn7-devel libnccl libnccl-devel
}}}

 * RHEL/CentOS 7 {{{
sudo yum install https://developer.download.nvidia.com/compute/machine-learning/repos/rhel7/x86_64/nvidia-machine-learning-repo-rhel7-1.0.0-1.x86_64.rpm
sudo yum install libcudnn7 libcudnn7-devel libnccl libnccl-devel
}}}


=== TensorRT repository ===

You can download the TensorRT component using the appropriate version from https://developer.nvidia.com/nvidia-tensorrt-download

This requires logging in with an NVIDIA CUDA program subscription.


=== Legacy NVIDIA 340xx/CUDA 6.5 ===
This repository contains a legacy version of CUDA 6.5 that works with the NVIDIA 340xx series.

Please use the Official link: https://developer.nvidia.com/cuda-toolkit-65

 * RHEL/CentOS 6 {{{
sudo yum install http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-6.5-14.x86_64.rpm
sudo yum install cuda
}}}
 * Fedora 20 (and later) {{{
sudo yum install http://developer.download.nvidia.com/compute/cuda/repos/fedora20/x86_64/cuda-repo-fedora20-6.5-14.x86_64.rpm
sudo yum install cuda
}}}

Please verify that you have a compatible compiler.

== Community repositories ==
=== RPM Fusion CUDA ===
This repository is intended to hold content dedicated to CUDA and is built with the official CUDA releases.
So far it is only available for Fedora (latest supported CUDA release) and is still a work in progress.


=== AI/ML Fedora nvidia-container-toolkit ===
The AI-ML working group at Fedora provides an nvidia-container-toolkit that is built fully from source and integrates well with Fedora.
See also https://copr.fedorainfracloud.org/coprs/g/ai-ml/nvidia-container-toolkit/


== Known issues ==

=== Newer/Beta driver ===
Sometimes recent CUDA releases require a newer/beta driver version. We usually package such drivers in RPM Fusion for rawhide. To ease the installation in stable Fedora branches, you can follow the guideline at:
https://rpmfusion.org/Howto/NVIDIA#Latest.2FBeta_driver

It can be a good idea to keep using the rawhide drivers by default.

=== GCC version ===
When using a later version of Fedora than what is supported by the official NVIDIA CUDA repository, you might be unable to compile.
You can either:

Install an older gcc dedicated to CUDA from COPR (recommended on Fedora).


 * GCC8 Works up to Fedora 32 for cuda-10.1 and later (up to CUDA 11)
{{{
dnf copr enable kwizart/cuda-gcc-10.1 -y
dnf install cuda-gcc cuda-gcc-c++ -y
}}}

You will need to tell CUDA to use it instead of the default g++. For the cuda-samples, this can be done with:
{{{
export HOST_COMPILER=cuda-g++
}}}

 * Install the appropriate gcc version from developer toolset. It will install in parallel. Please see https://www.softwarecollections.org/en/scls/rhscl/devtoolset-8/

{{{
sudo dnf install https://rpmfind.net/linux/centos/7/extras/x86_64/Packages/centos-release-scl-rh-2-3.el7.centos.noarch.rpm
sudo dnf install devtoolset-8-toolchain
}}}
You cannot install the whole devtoolset-8 collection, but the toolchain is enough. Then, each time you need to build using CUDA, start with:
{{{
scl run devtoolset-8 bash
gcc --version
gcc (GCC) 8.3.1 20190311 (Red Hat 8.3.1-3)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
exit
gcc --version
gcc (GCC) 9.2.1 20190827 (Red Hat 9.2.1-1)
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
}}}

 * Tweak the /usr/local/cuda*/targets/x86_64-linux/include/crt/host_defines.h to accept the Fedora default compiler. (Not recommended - not always working).

=== Which driver Package ===
Both the "CUDA" and "RPM Fusion" repositories provide NVIDIA driver packages. Unfortunately, the packaging methods are too different and can conflict. We recommend using the public, community-based packaging (RPM Fusion) and avoiding the NVIDIA-packaged nvidia-driver. From time to time, NVIDIA uses a driver that has not been publicly released, so you will have to wait for a public driver before the RPM Fusion counterpart is available.

With current RHEL8 repositories, the nvidia-driver is packaged as a module. So it's easy to disable with:
{{{
sudo dnf module disable nvidia-driver
}}}


=== NVIDIA driver higher in CUDA repo ===
Often when NVIDIA releases a newer CUDA version, or in the case of pre-release software, the NVIDIA driver is at a higher version than the driver provided by RPM Fusion. There is no way for us to provide a version that matches the newer CUDA requirement "ahead" of any public NVIDIA driver release. That said, the dependencies can sometimes be faked at the RPM level with:
{{{
dnf module enable nvidia-driver -y && dnf download cuda-drivers && dnf module disable nvidia-driver -y
rpm -Uvh cuda-drivers*.rpm --nodeps
dnf update
}}}

Please remember to remove the cuda-drivers package once the driver provided by RPM Fusion is recent enough.
Complain to NVIDIA about this bad behaviour, not to us.

Once a newer version of the driver is publicly available, it will likely appear in the RPM Fusion rawhide repository first. Please follow this guide on how to upgrade to the newer driver (this is currently the case with CUDA 11 and the 450xx driver series):
https://rpmfusion.org/Howto/NVIDIA#Latest.2FBeta_driver


=== NVIDIA provided libOpenCL ===
NVIDIA only advertises OpenCL 1.2 with the binary driver at this time. As a consequence, they provide an old version of libOpenCL.so.1, which works fine with their binary driver.
As most software in Fedora and RPM Fusion is built using a newer libOpenCL, the dynamic linker detects that and issues the following message:
{{{
 /usr/local/cuda-9.2/targets/x86_64-linux/lib/libOpenCL.so.1: no version information available (required by ffmpeg)
}}}

You can either ignore the message or manually delete the libOpenCL.so.1 provided by NVIDIA (run sudo ldconfig once deleted). Please verify that no other OpenCL providers interfere with NVIDIA OpenCL usage
(look at /etc/OpenCL/vendors).

=== Running blender ===
Even when only running Blender, you need a CUDA-compatible compiler as described above. This is because Blender compiles the "CUDA kernels" optimized for your own GPU. You can run Blender with:
{{{
 scl run devtoolset-7 blender
}}}
Once the "CUDA kernels" are compiled, you can run Blender normally.

Line 47: Line 193:
* CUDA Quick Start Guide: https://developer.nvidia.com/compute/cuda/8.0/prod/docs/sidebar/CUDA_Quick_Start_Guide-pdf
* CUDA documentation: https://docs.nvidia.com/cuda/index.html
 * CUDA whatsnew : https://developer.nvidia.com/cuda-toolkit/whatsnew

 
* CUDA documentation: https://docs.nvidia.com/cuda/index.html

Installation

This Howto provides a way to install the official CUDA packages from NVIDIA alongside our packaged NVIDIA driver from RPM Fusion.

While this repository contains both the NVIDIA driver and the CUDA toolkit, we recommend using our packaged driver instead, in order to receive the fixes needed after Fedora kernel updates.

NVIDIA official repositories

These repositories contain versions of CUDA that are parallel installable along with another version.

CUDA Toolkit

Please use the Official link: https://developer.nvidia.com/cuda-downloads

  • Fedora 39 and later (if using a compatible compiler; see the GCC version section under Known issues)

    sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/fedora39/x86_64/cuda-fedora39.repo
    sudo dnf clean all
    sudo dnf module disable nvidia-driver
    sudo dnf -y install cuda
  • RHEL/Rocky/Alma 9

    sudo dnf config-manager --add-repo http://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
    sudo dnf clean all
    sudo dnf module disable nvidia-driver
    sudo dnf -y install cuda
  • RHEL/Rocky/Alma 8

    sudo dnf config-manager --add-repo http://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo
    sudo dnf clean all
    sudo dnf module disable nvidia-driver
    sudo dnf -y install cuda
  • RHEL/CentOS 7

    sudo yum-config-manager --add-repo http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo
    sudo yum clean all
    sudo yum install cuda
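
The `.repo` locations used above share one pattern, so the URL can be derived from the distribution tag. The following is a minimal sketch, assuming that pattern also holds for the distribution you target. The helper name `cuda_repo_url` is hypothetical, it always uses `https`, and it only prints the URL without checking that the file exists on the server.

```shell
#!/bin/sh
# Hypothetical helper: build the CUDA .repo URL for a distribution tag
# such as fedora39, rhel9 or rhel8, following the pattern used above.
cuda_repo_url() {
    distro="$1"
    arch="${2:-x86_64}"
    printf 'https://developer.download.nvidia.com/compute/cuda/repos/%s/%s/cuda-%s.repo\n' \
        "$distro" "$arch" "$distro"
}

# Print the URL for Fedora 39.
cuda_repo_url fedora39
```

The printed URL can then be passed to `sudo dnf config-manager --add-repo`, as in the commands above.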

Machine Learning repository

Please use the official link: https://developer.nvidia.com/nccl/nccl-download

  • RHEL 9/Fedora

Merged with the regular CUDA repository

  • RHEL/CentOS 8

    sudo dnf install https://developer.download.nvidia.com/compute/machine-learning/repos/rhel8/x86_64/nvidia-machine-learning-repo-rhel8-1.0.0-1.x86_64.rpm
    sudo dnf install libcudnn7 libcudnn7-devel libnccl libnccl-devel
  • RHEL/CentOS 7

    sudo yum install https://developer.download.nvidia.com/compute/machine-learning/repos/rhel7/x86_64/nvidia-machine-learning-repo-rhel7-1.0.0-1.x86_64.rpm
    sudo yum install libcudnn7 libcudnn7-devel libnccl libnccl-devel

TensorRT repository

You can download the TensorRT component using the appropriate version from https://developer.nvidia.com/nvidia-tensorrt-download

This requires logging in with an NVIDIA CUDA program subscription.

Legacy NVIDIA 340xx/CUDA 6.5

This repository contains a legacy version of CUDA 6.5 that works with the NVIDIA 340xx series.

Please use the Official link: https://developer.nvidia.com/cuda-toolkit-65

  • RHEL/CentOS 6

    sudo yum install http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-6.5-14.x86_64.rpm
    sudo yum install cuda
  • Fedora 20 (and later)

    sudo yum install http://developer.download.nvidia.com/compute/cuda/repos/fedora20/x86_64/cuda-repo-fedora20-6.5-14.x86_64.rpm
    sudo yum install cuda

Please verify that you have a compatible compiler.

Community repositories

RPM Fusion CUDA

This repository is intended to hold content dedicated to CUDA and is built with the official CUDA releases. So far it is only available for Fedora (latest supported CUDA release) and is still a work in progress.

AI/ML Fedora nvidia-container-toolkit

The AI-ML working group at Fedora provides an nvidia-container-toolkit that is built fully from source and integrates well with Fedora. See also https://copr.fedorainfracloud.org/coprs/g/ai-ml/nvidia-container-toolkit/

Known issues

Newer/Beta driver

Sometimes recent CUDA releases require a newer/beta driver version. We usually package such drivers in RPM Fusion for rawhide. To ease the installation in stable Fedora branches, you can follow the guideline at https://rpmfusion.org/Howto/NVIDIA#Latest.2FBeta_driver

It can be a good idea to keep using the rawhide drivers by default.

GCC version

When using a later version of Fedora than what is supported by the official NVIDIA CUDA repository, you might be unable to compile. You can either:

Install an older gcc dedicated to CUDA from COPR (recommended on Fedora).

  • GCC8 Works up to Fedora 32 for cuda-10.1 and later (up to CUDA 11)

dnf copr enable kwizart/cuda-gcc-10.1 -y
dnf install cuda-gcc cuda-gcc-c++ -y

You will need to tell CUDA to use it instead of the default g++. For the cuda-samples, this can be done with:

export HOST_COMPILER=cuda-g++

  • Install the appropriate gcc version from the developer toolset. It installs in parallel. Please see https://www.softwarecollections.org/en/scls/rhscl/devtoolset-8/

sudo dnf install https://rpmfind.net/linux/centos/7/extras/x86_64/Packages/centos-release-scl-rh-2-3.el7.centos.noarch.rpm
sudo dnf install devtoolset-8-toolchain

You cannot install the whole devtoolset-8 collection, but the toolchain is enough. Then, each time you need to build using CUDA, start with:

scl run devtoolset-8 bash
gcc --version
gcc (GCC) 8.3.1 20190311 (Red Hat 8.3.1-3)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
exit
gcc --version
gcc (GCC) 9.2.1 20190827 (Red Hat 9.2.1-1)
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
  • Tweak the /usr/local/cuda*/targets/x86_64-linux/include/crt/host_defines.h to accept the Fedora default compiler. (Not recommended - not always working).
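
Whichever option is chosen, it helps to check the host compiler's major version before building. The following is a minimal sketch, assuming a dotted version string such as the ones shown in the `gcc --version` output above. The helper names `gcc_major` and `host_compiler_ok` are hypothetical, and the limit of 8 is only illustrative; the real limit depends on your CUDA release and should be taken from the NVIDIA documentation.

```shell
#!/bin/sh
# Extract the major version from a dotted version string such as "8.3.1".
gcc_major() {
    printf '%s\n' "${1%%.*}"
}

# Succeed when the compiler major version does not exceed max_major.
# max_major is an assumption supplied by the caller, not a value taken
# from any NVIDIA tool.
host_compiler_ok() {
    [ "$(gcc_major "$1")" -le "$2" ]
}

# Example with the versions shown above and an illustrative limit of 8.
host_compiler_ok 8.3.1 8 && echo "8.3.1: ok"
host_compiler_ok 9.2.1 8 || echo "9.2.1: too new"
```

On a real system the first argument could come from `gcc -dumpfullversion` (available in GCC 7 and later), run inside the environment you intend to build in.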

Which driver Package

Both the "CUDA" and "RPM Fusion" repositories provide NVIDIA driver packages. Unfortunately, the packaging methods are too different and can conflict. We recommend using the public, community-based packaging (RPM Fusion) and avoiding the NVIDIA-packaged nvidia-driver. From time to time, NVIDIA uses a driver that has not been publicly released, so you will have to wait for a public driver before the RPM Fusion counterpart is available.

With current RHEL8 repositories, the nvidia-driver is packaged as a module. So it's easy to disable with:

sudo dnf module disable nvidia-driver

NVIDIA driver higher in CUDA repo

Often when NVIDIA releases a newer CUDA version, or in the case of pre-release software, the NVIDIA driver is at a higher version than the driver provided by RPM Fusion. There is no way for us to provide a version that matches the newer CUDA requirement "ahead" of any public NVIDIA driver release. That said, the dependencies can sometimes be faked at the RPM level with:

dnf module enable nvidia-driver -y && dnf download cuda-drivers && dnf module disable nvidia-driver -y
rpm -Uvh cuda-drivers*.rpm --nodeps
dnf update

Please remember to remove the cuda-drivers package once the driver provided by RPM Fusion is recent enough. Complain to NVIDIA about this bad behaviour, not to us.
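
To decide when the workaround is no longer needed, the installed driver version can be compared with the version the CUDA packages require. The following is a minimal sketch, assuming GNU `sort -V` is available. The helper name `version_ge` and both version numbers are hypothetical and serve only as an illustration.

```shell
#!/bin/sh
# Succeed when version $1 is greater than or equal to version $2.
# Relies on "sort -V" (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical version numbers, for illustration only.
installed="450.57"   # e.g. the driver version installed from RPM Fusion
required="450.51"    # e.g. the version the CUDA packages ask for

if version_ge "$installed" "$required"; then
    echo "driver is new enough: cuda-drivers can be removed"
else
    echo "driver is still too old: keep the workaround for now"
fi
```

In practice the installed version could be read with `rpm -q --qf '%{VERSION}\n'` followed by the name of your driver package.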

Once a newer version of the driver is publicly available, it will likely appear in the RPM Fusion rawhide repository first. Please follow this guide on how to upgrade to the newer driver (this is currently the case with CUDA 11 and the 450xx driver series): https://rpmfusion.org/Howto/NVIDIA#Latest.2FBeta_driver

NVIDIA provided libOpenCL

NVIDIA only advertises OpenCL 1.2 with the binary driver at this time. As a consequence, they provide an old version of libOpenCL.so.1, which works fine with their binary driver. As most software in Fedora and RPM Fusion is built using a newer libOpenCL, the dynamic linker detects that and issues the following message:

 /usr/local/cuda-9.2/targets/x86_64-linux/lib/libOpenCL.so.1: no version information available (required by ffmpeg)

You can either ignore the message or manually delete the libOpenCL.so.1 provided by NVIDIA (run sudo ldconfig once deleted). Please verify that no other OpenCL providers interfere with NVIDIA OpenCL usage (look at /etc/OpenCL/vendors).
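
The check of /etc/OpenCL/vendors can be scripted. The following is a minimal sketch, assuming each `.icd` file there names the vendor library on its first line, as is usual for OpenCL ICD files. The helper name `list_opencl_vendors` is hypothetical.

```shell
#!/bin/sh
# List the OpenCL ICD files in a vendors directory together with the
# library each one points to. The directory defaults to the path
# mentioned above; pass another one as the first argument.
list_opencl_vendors() {
    dir="${1:-/etc/OpenCL/vendors}"
    for icd in "$dir"/*.icd; do
        # Skip the unexpanded pattern when no .icd file exists.
        [ -e "$icd" ] || continue
        printf '%s: %s\n' "$(basename "$icd")" "$(head -n 1 "$icd")"
    done
}

list_opencl_vendors
```

More than one line of output means several OpenCL providers are registered, which is the situation the paragraph above asks you to review.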

Running blender

Even when only running Blender, you need a CUDA-compatible compiler as described above. This is because Blender compiles the "CUDA kernels" optimized for your own GPU. You can run Blender with:

 scl run devtoolset-7 blender

Once the "CUDA kernels" are compiled, you can run Blender normally.

References

  • CUDA documentation: https://docs.nvidia.com/cuda/index.html
  • CUDA whatsnew: https://developer.nvidia.com/cuda-toolkit/whatsnew


CategoryHowto

Howto/CUDA (last edited 2024-10-15 07:26:21 by NicolasChauvet)