My setup for AI work on my Debian PC

setup
Debian
fastai
Python
CUDA
Published

March 3, 2023

Disclaimer: What you should do instead

Introduction

This blog post is highly technical, so with the help of ChatGPT I have included a glossary of terms and concepts for each section. Thanks ChatGPT, I wouldn’t have done that without you!

I am using my home PC for AI work, with Debian GNU/Linux 12 “bookworm” (testing) and an NVIDIA GPU.

The main AI tools that I use are:

  • Jupyter for experiments, development and blogging;
  • fastai for training neural networks;
  • Huggingface tools such as transformers and diffusers;
  • PyTorch, Tensorflow and other lower-level AI libraries; and
  • stable-diffusion-webui for experimenting with AI image generation.

This document is mostly for my own reference. It wasn’t so easy getting everything working nicely and I want to have a record of how I did it. I don’t recommend that you try to do this unless you are experienced with Debian, and you don’t mind taking time to deal with problems when they occur.

I wanted to avoid using docker, conda, or Python venvs. Instead, I installed the necessary Python modules under /usr/local with pip. I was thinking that I would be able to use my AI tools directly from the command line and other scripts, like any other tools.

This worked for a little while, but then Debian started using Python 3.11 for their default Python. Torch isn’t compatible with Python 3.11 yet, and everything broke for me.

After trying to fix it for a while and encountering numerous problems, I decided to switch to Python 3.10. After more problems, I switched to Python 3.10 venvs. I think the original method of installing under /usr/local could still work, but the Debian Python packages are in a state of flux at the moment.

As they say, using a virtual environment is a best practice in Python development and can help avoid issues with package conflicts and dependency management. So, venvs it is!
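For reference, the basic venv workflow looks like this (a generic sketch with illustrative paths; my actual venvs are set up under /opt/venvs later in this post):

```shell
# Create a venv. (--without-pip only keeps this demo self-contained;
# normally omit it so the venv gets its own pip.)
dir=$(mktemp -d)
python3 -m venv --without-pip "$dir/venv"

# "Activating" just puts the venv's bin directory first on PATH.
. "$dir/venv/bin/activate"
python -c 'import sys; print(sys.prefix)'   # prints the venv's own prefix
deactivate
```

With the venv active, pip install affects only that environment, which is what avoids the package conflicts mentioned above.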

Glossary
  • conda: a package management system and environment management system for installing multiple versions of software packages and their dependencies, including both Python and non-Python packages.
  • Debian: a free and open-source operating system based on the Linux kernel, widely used for servers and workstations
  • Debian “testing”: a rolling release version of Debian that is in the process of being tested for the next stable release, and contains newer packages than the current stable release. It is not recommended for production use.
  • dependency management: the process of identifying and resolving dependencies between software packages, to ensure that they can be installed and run together without conflicts
  • docker: a platform for developing, shipping, and running applications using containers, which are lightweight, portable, and self-contained environments that run applications and their dependencies.
  • Fastai: a free, open-source deep learning library built on top of PyTorch that provides a high-level API and a range of state-of-the-art models and techniques
  • GNU: A project started by Richard Stallman in 1983 to create a free and open-source operating system, consisting of a complete set of tools and utilities to replace proprietary software. GNU stands for “GNU’s Not Unix”.
  • GNU/Linux: A term used to describe the operating system that consists of the GNU tools and the Linux kernel. The GNU tools provide the user interface and the software development tools, while the Linux kernel provides the low-level system functions, such as process management, memory management, and device drivers.
  • Hugging Face 🤗: a company that develops tools for building machine learning apps, with a focus on natural language processing (NLP).
  • Hugging Face Diffusers: a popular open-source library that provides pretrained vision and audio diffusion models, and serves as a modular toolbox for inference and training
  • Hugging Face Hub: a platform that allows users to share machine learning models and datasets.
  • Jupyter: an open-source web application that allows users to create and share documents that contain live code, equations, visualizations and narrative text
  • Linux: A kernel, or the core component of an operating system, originally created by Linus Torvalds in 1991. Linux is based on Unix, and is released under an open-source license, which allows anyone to modify and distribute the source code. The combination of the Linux kernel and the GNU tools forms the GNU/Linux operating system.
  • Neural networks: a machine learning technique that allows computers to learn from data by adjusting the strengths of connections between neurons
  • NVIDIA GPU: a graphics processing unit manufactured by NVIDIA that is commonly used for machine learning and other compute-intensive tasks
  • package conflicts: situations where two or more Python packages require different versions of the same dependency, leading to issues when trying to install or run the packages together
  • pip: a package installer for Python that allows you to easily install and manage third-party packages and their dependencies. It is commonly used with virtual environments to manage Python package dependencies.
  • PyTorch: a popular open-source machine learning library based on the Torch library
  • Stable-diffusion-webui: an open-source web user interface for state-of-the-art image generation based on stable diffusion, by automatic1111
  • Tensorflow: an open-source machine learning library developed by Google Brain Team
  • transformers: a state-of-the-art library for natural language processing (NLP) tasks such as text classification and question answering, built by Huggingface
  • /usr/local: a directory in Unix-like operating systems that is typically used for installing software manually, outside of the system package manager.
  • virtual environment (venv): a self-contained Python environment that allows you to install and manage packages without affecting the system-level Python installation or other virtual environments

Debian apt setup

I installed some NVIDIA packages and alternative Python versions from repositories intended for Ubuntu, which is a bit hacky, but it can work. The following shows how to create such a FrankenDebian install without totally trashing your system.

Glossary
  • FrankenDebian: a term used to describe a Debian installation that has been modified or customized in non-standard ways, often resulting in instability or other issues
  • NVIDIA packages: software components and drivers provided by NVIDIA Corporation to support their graphics processing units (GPUs)
  • Python versions: different releases of the Python programming language, each with its own set of features and bug fixes
  • repositories: online locations where software packages can be downloaded and installed from, typically maintained by a software distributor or community
  • Ubuntu: a popular distribution of the GNU/Linux operating system, known for its ease of use and large user community

Don’t break Debian: release pinning

Release pinning in Debian is a way of specifying which versions of packages to install from which releases. The file shown below, which I added as /etc/apt/preferences.d/99dontbreakdebian, contains pins for the various releases I use. Each pin names a release or origin and assigns it a priority, which determines which version apt chooses when several are available: roughly, the highest-priority version wins, and versions with a priority below 100 are only installed when the package isn’t installed already (or when a release is requested explicitly, e.g. with apt install -t jammy). With these pins, packages come from the main Debian release by default, and I can still pull individual packages from the other sources without accidentally breaking the system.

cat /etc/apt/preferences.d/99dontbreakdebian
Package: *
Pin: release o=Debian,a=experimental
Pin-Priority: 1

Package: *
Pin: release o=Debian,a=unstable
Pin-Priority: 90

Package: *
Pin: release a=focal
Pin-Priority: 70

Package: *
Pin: release a=jammy
Pin-Priority: 80

Package: *
Pin: release o=LP-PPA-deadsnakes
Pin-Priority: 90

Package: *
Pin: origin ppa.launchpad.net
Pin-Priority: 90

Package: *
Pin: release o=Debian
Pin-Priority: 990

Glossary
  • Release pinning: A way of specifying which versions of packages to install from which Debian releases.
  • /etc/apt/preferences.d/: A directory in Debian where you can add files to specify package release pins.
  • Pin: A rule that specifies a release or origin and a priority for a package.
  • Experimental: The Debian release channel where packages are the least stable, but often contain the latest features and updates.
  • Unstable: The Debian release channel where packages are more stable than experimental, but still not considered release quality.
  • Focal and Jammy: Ubuntu release code names.
  • LP-PPA-deadsnakes: A Personal Package Archive (PPA) on Launchpad for providing alternative versions of Python for Ubuntu and Debian systems.
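To illustrate how the priorities interact, here’s a hedged example (not part of my actual config): a stanza like this would let one package family track sid while everything else stays on the pinned Debian release:

```
Package: python3.10*
Pin: release o=Debian,a=unstable
Pin-Priority: 995
```

Because 995 outranks the 990 pin on the main Debian release, apt would prefer the sid version of anything matching python3.10*; a priority of 1000 or more would also permit downgrades. Running apt-cache policy on a package shows which version wins under the current pins.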

My main apt sources.list

cat /etc/apt/sources.list
deb http://deb.debian.org/debian/ bookworm main contrib non-free
deb http://deb.debian.org/debian/ sid main contrib non-free
deb-src http://deb.debian.org/debian/ bookworm main contrib non-free
deb-src http://deb.debian.org/debian/ sid main contrib non-free

deb http://security.debian.org/debian-security bookworm-security main contrib non-free
deb-src http://security.debian.org/debian-security bookworm-security main contrib non-free

deb http://deb.debian.org/debian/ bookworm-updates main contrib non-free
deb-src http://deb.debian.org/debian/ bookworm-updates main contrib non-free

deb http://deb.debian.org/debian/ bookworm-backports main contrib non-free
deb-src http://deb.debian.org/debian/ bookworm-backports main contrib non-free

deb http://deb.debian.org/debian/ experimental main contrib non-free
deb-src http://deb.debian.org/debian/ experimental main contrib non-free

Glossary
  • sources.list: a configuration file in Debian-based systems that specifies the package repositories from which the system can download and install software.
  • deb: a Debian binary package format used to distribute software packages for Debian and its derivatives, including Ubuntu and Mint.
  • bookworm: This is the codename for the Debian 12 release, currently in testing, which will become the next stable release.
  • sid: the codename for Debian’s unstable release.
  • main: This is one of the Debian software repositories, which contains the core packages that make up the Debian operating system.
  • contrib and non-free: two categories of software packages in Debian that include packages with dependencies on non-free or proprietary software.
  • deb-src: This stands for “Debian source”. It refers to the source code repositories for Debian, which users can access in order to download, modify, and compile the source code of various packages.
  • security: This is the Debian repository that contains security updates for Debian packages. It is important to include this repository in your sources.list file to ensure that your system stays secure.
  • backports: This is the Debian repository that contains newer versions of packages that have been backported from newer Debian releases.
  • experimental repo: This is the Debian repository that contains packages that are still in the testing phase and are not yet stable enough for inclusion in the main Debian repositories.

Apt sources for NVIDIA CUDA, cuDNN, TensorRT, and containers

We will use some packages that were built for Ubuntu, but that doesn’t seem to cause any problems.

cd /etc/apt/sources.list.d
cat cuda-debian11-x86_64.list
deb [signed-by=/usr/share/keyrings/cuda-archive-keyring.gpg] https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/ /
cat cudnn-local-debian11-8.6.0.163.list 
deb [signed-by=/usr/share/keyrings/cudnn-local-C922C4FD-keyring.gpg] file:///var/cudnn-local-repo-debian11-8.6.0.163 /
cat nv-tensorrt-local-ubuntu2204-8.5.3-cuda-11.8.list
deb [signed-by=/usr/share/keyrings/nv-tensorrt-local-3E951519-keyring.gpg] file:///var/nv-tensorrt-local-repo-ubuntu2204-8.5.3-cuda-11.8 /
cat nvidia-container-runtime.list
deb https://nvidia.github.io/libnvidia-container/stable/debian11/$(ARCH) /
# deb https://nvidia.github.io/libnvidia-container/experimental/debian10/$(ARCH) /
deb https://nvidia.github.io/nvidia-container-runtime/stable/debian11/$(ARCH) /
# deb https://nvidia.github.io/nvidia-container-runtime/experimental/debian10/$(ARCH) /

Glossary
  • NVIDIA CUDA: a parallel computing platform and programming model for NVIDIA GPUs
  • cuDNN: a GPU-accelerated library for deep neural networks
  • TensorRT: a high-performance deep learning inference optimizer and runtime library
  • Containers: lightweight, standalone executables that include everything needed to run a piece of software, including the code, libraries, and system tools
  • NVIDIA container runtime: a platform developed by NVIDIA to support containerized applications that require access to GPU resources.

Apt sources for old versions of Python

The “deadsnakes” PPA was built for Ubuntu, but works fine on Debian too.

cat deadsnakes.list
deb http://ppa.launchpad.net/deadsnakes/ppa/ubuntu jammy main
deb-src http://ppa.launchpad.net/deadsnakes/ppa/ubuntu jammy main

Glossary
  • deadsnakes: A Personal Package Archive (PPA) on Launchpad for providing alternative versions of Python for Ubuntu and Debian systems.
  • PPA: Stands for Personal Package Archive, a software repository for Ubuntu users that allows them to distribute software and updates that are not available in official Ubuntu repositories.
  • jammy: The code name for Ubuntu 22.04 LTS (Jammy Jellyfish), the Ubuntu release that these deadsnakes packages were built for.

Install required Debian packages

Python 3.10

As I mentioned, Debian recently started using Python 3.11 for their default Python, and Torch isn’t compatible with Python 3.11 yet:

python3.11 -m pip install --break-system-packages torchvision
ERROR: Could not find a version that satisfies the requirement torchvision (from versions: 0.1.6, 0.1.7, 0.1.8, 0.1.9, 0.2.0, 0.2.1, 0.2.2, 0.2.2.post2, 0.2.2.post3)
ERROR: No matching distribution found for torchvision
: 1

I tried using nightly Torch, but there were more compatibility problems. So I decided to go back to using Python 3.10 for AI work:

sudo apt install -qq python3.10-venv
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 python3.10-venv : Depends: python3.10-distutils but it is not installable
E: Unable to correct problems, you have held broken packages.
: 100

Unfortunately, the packaging for python3.10-venv is currently broken. I solved this by making a python3.10-distutils-bogus package, like this:

cd ~/pkg
cat python3.10-distutils-bogus
Section: python
Priority: optional
Standards-Version: 3.9.2

Package: python3.10-distutils-bogus
Version: 1.0
Maintainer: Sam Watkins <sam@ucm.dev>
Provides: python3.10-distutils
Description: Dummy package to satisfy python3.10-distutils
equivs-build python3.10-distutils-bogus
dpkg-buildpackage: info: source package python3.10-distutils-bogus
dpkg-buildpackage: info: source version 1.0
dpkg-buildpackage: info: source distribution unstable
dpkg-buildpackage: info: source changed by Sam Watkins <sam@ucm.dev>
dpkg-buildpackage: info: host architecture amd64
 dpkg-source --before-build .
 debian/rules clean
dh clean
   dh_clean
 debian/rules binary
dh binary
   dh_update_autotools_config
   dh_autoreconf
   create-stamp debian/debhelper-build-stamp
   dh_prep
   dh_auto_install --destdir=debian/python3.10-distutils-bogus/
   dh_install
   dh_installdocs
   dh_installchangelogs
   dh_perl
   dh_link
   dh_strip_nondeterminism
   dh_compress
   dh_fixperms
   dh_missing
   dh_installdeb
   dh_gencontrol
   dh_md5sums
   dh_builddeb
dpkg-deb: building package 'python3.10-distutils-bogus' in '../python3.10-distutils-bogus_1.0_all.deb'.
 dpkg-genbuildinfo --build=binary -O../python3.10-distutils-bogus_1.0_amd64.buildinfo
 dpkg-genchanges --build=binary -O../python3.10-distutils-bogus_1.0_amd64.changes
dpkg-genchanges: info: binary-only upload (no source code included)
 dpkg-source --after-build .
dpkg-buildpackage: info: binary-only upload (no source included)

The package has been created.
Attention, the package has been created in the current directory,
not in ".." as indicated by the message above!
sudo dpkg -i ./python3.10-distutils-bogus_1.0_all.deb
(Reading database ... 1158515 files and directories currently installed.)
Preparing to unpack .../python3.10-distutils-bogus_1.0_all.deb ...
Unpacking python3.10-distutils-bogus (1.0) over (1.0) ...
Setting up python3.10-distutils-bogus (1.0) ...
sudo apt-get -q install python3.10-venv
Reading package lists...
Building dependency tree...
Reading state information...
python3.10-venv is already the newest version (3.10.10-2).
0 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.

Glossary
  • Debian package dependencies: packages that are required for a specific package to install and function properly
  • python3.11: the default version of Python now used in “bookworm” (the latest Debian testing release) and “sid” (unstable)
  • python3.10-venv: A package that provides the “venv” module for Python 3.10, which is used to create Python virtual environments.
  • python3.10-distutils: a module in Python 3.10 that provides tools for building and installing Python modules and packages
  • equivs-build: a tool for creating Debian packages that provide empty or dummy packages to satisfy dependencies
  • dpkg-buildpackage: a tool for building Debian packages from source code
  • broken packages: Packages that cannot be installed due to missing dependencies or conflicts with other packages.
  • Torch: A popular machine learning library, which is not compatible with Python 3.11 yet.
  • Python 3.10: A version of Python that is compatible with Torch.
  • Maintainer: The person or organization responsible for maintaining a package in the Debian package repository.
  • Provides: A field in a Debian package control file that specifies the name of a package that the current package provides. This can be used to satisfy dependencies of other packages that require the provided package.

Old and alpha versions of Python, from deadsnakes

I’m not actually using these alternative Python packages for AI at the moment.

However, it’s often useful to be able to use different versions of Python, so it’s an important part of my setup.

  • Python 3.11 is the new default version for Debian.
  • Python 3.10 is still available in Debian; it’s the one we need to use.
  • I have Python 2.7 left over from a previous Debian release.
  • I installed 3.7, 3.8, 3.9 and 3.12 from deadsnakes.
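A quick way to see which interpreters are actually available on PATH (a generic check, nothing deadsnakes-specific):

```shell
# Print the version of each python executable found on PATH.
for p in python2 python3 python3.7 python3.8 python3.9 python3.10 python3.11 python3.12; do
    if command -v "$p" > /dev/null 2>&1; then
        printf '%-12s %s\n' "$p" "$("$p" --version 2>&1)"
    fi
done
```

(python2 prints its version on stderr, hence the 2>&1.)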
sudo apt install -qq -t jammy python3.7-venv python3.8-venv python3.9-venv
python3.7-venv is already the newest version (3.7.16-1+jammy1).
python3.8-venv is already the newest version (3.8.16-1+jammy1).
python3.9-venv is already the newest version (3.9.16-1+jammy1).
0 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.
sudo apt install -qq -t jammy python3.12-venv
python3.12-venv is already the newest version (3.12.0~a5-1+jammy2).
0 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.
apt policy python2; echo
for v in `seq 7 12`; do apt policy python3.$v; echo; done
python2:
  Installed: 2.7.18-3
  Candidate: 2.7.18-3
  Version table:
 *** 2.7.18-3 100
        100 /var/lib/dpkg/status

python3.7:
  Installed: 3.7.16-1+jammy1
  Candidate: 3.7.16-1+jammy1
  Version table:
 *** 3.7.16-1+jammy1 100
         80 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu jammy/main amd64 Packages
        100 /var/lib/dpkg/status

python3.8:
  Installed: 3.8.16-1+jammy1
  Candidate: 3.8.16-1+jammy1
  Version table:
 *** 3.8.16-1+jammy1 100
         80 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu jammy/main amd64 Packages
        100 /var/lib/dpkg/status

python3.9:
  Installed: 3.9.16-1+jammy1
  Candidate: 3.9.16-1+jammy1
  Version table:
 *** 3.9.16-1+jammy1 100
         80 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu jammy/main amd64 Packages
        100 /var/lib/dpkg/status

python3.10:
  Installed: 3.10.10-2
  Candidate: 3.10.10-2
  Version table:
 *** 3.10.10-2 100
         90 http://deb.debian.org/debian sid/main amd64 Packages
        100 /var/lib/dpkg/status

python3.11:
  Installed: 3.11.2-4
  Candidate: 3.11.2-4
  Version table:
 *** 3.11.2-4 990
        990 http://deb.debian.org/debian bookworm/main amd64 Packages
         90 http://deb.debian.org/debian sid/main amd64 Packages
        100 /var/lib/dpkg/status
     3.11.2-1+jammy1 80
         80 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu jammy/main amd64 Packages

python3.12:
  Installed: 3.12.0~a5-1+jammy2
  Candidate: 3.12.0~a5-1+jammy2
  Version table:
 *** 3.12.0~a5-1+jammy2 100
         80 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu jammy/main amd64 Packages
        100 /var/lib/dpkg/status

Glossary
  • deadsnakes PPA: A Personal Package Archive (PPA) containing various versions of Python packages, maintained by the deadsnakes team on Launchpad.
  • Python 3.11: The new default version of Python for Debian, which is not yet compatible with Torch.
  • Python 3.10: A compatible version of Python that is still available in Debian and needed to use Torch.
  • Python 2.7: A legacy version of Python that I had installed from a previous Debian release.
  • Python 3.7, 3.8, 3.9, 3.12: Alternative versions of Python that I installed from the deadsnakes PPA; not strictly needed for most AI work.

NVIDIA CUDA

sudo apt -qq install cuda=12.0.1-1 cuda-drivers=525.85.12-1 cuda-11-7 cuda-11-8
cuda is already the newest version (12.0.1-1).
cuda-drivers is already the newest version (525.85.12-1).
cuda-11-7 is already the newest version (11.7.1-1).
cuda-11-8 is already the newest version (11.8.0-1).
0 upgraded, 0 newly installed, 0 to remove and 2 not upgraded.

Glossary
  • NVIDIA CUDA: A parallel computing platform and programming model developed by NVIDIA for general computing on GPUs (graphics processing units).
  • CUDA drivers: Software components that enable communication between the CUDA runtime and the hardware.
  • CUDA 11-7, CUDA 11-8 and CUDA 12-0: Different versions of the CUDA toolkit that are compatible with different GPU architectures.
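The toolkits coexist because each installs under its own /usr/local/cuda-X.Y prefix (the usual NVIDIA layout). To point one shell at a particular version, something like the following works; this is a hedged sketch, so verify the paths exist on your system:

```shell
# Select CUDA 11.8 for this shell session.
CUDA_HOME=/usr/local/cuda-11.8
export CUDA_HOME
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
# nvcc --version   # should now report release 11.8
```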

NVIDIA Container Runtime

sudo apt -qq install nvidia-container-runtime
nvidia-container-runtime is already the newest version (3.12.0-1).
0 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.

Glossary
  • NVIDIA Container Runtime: An open-source container runtime that enables the use of GPUs within containers, allowing applications to leverage the power of NVIDIA GPUs while maintaining the flexibility and portability of containerization. It is designed to work with Docker and other container engines and is optimized for use with NVIDIA GPUs.
  • Docker: A popular platform for developing, deploying, and running applications in containers. It provides a way to package an application and its dependencies into a single container that can be easily moved between environments.
  • GPU: Short for Graphics Processing Unit, a specialized processor designed to accelerate the rendering of images and video. In recent years, GPUs have become increasingly popular for running compute-intensive workloads, such as machine learning and scientific simulations.

NVIDIA cuDNN

sudo apt -qq install libcudnn8-dev
libcudnn8-dev is already the newest version (8.8.0.121-1+cuda12.0).
0 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.
sudo apt -qq install /var/cudnn-local-repo-ubuntu2204-8.6.0.163/libcudnn8-samples_8.6.0.163-1+cuda11.8_amd64.deb
libcudnn8-samples is already the newest version (8.6.0.163-1+cuda11.8).
0 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.

Glossary
  • NVIDIA cuDNN: A GPU-accelerated library for deep neural networks that is used to improve training and inference performance.
  • libcudnn8-dev: A package that provides the development files needed to compile software that uses NVIDIA cuDNN.
  • libcudnn8-samples: A package that contains sample code and programs that demonstrate how to use NVIDIA cuDNN in applications.

NVIDIA TensorRT

NVIDIA TensorRT is an inference accelerator for deep learning models. It optimizes and deploys trained neural networks for inferencing on NVIDIA GPUs. In order to use TensorRT, the python3-libnvinfer package needs to be installed.

Unfortunately, this package has a dependency issue: it requires a Python version less than 3.11, and I couldn’t downgrade the system’s Python. To resolve this, I modified the package to depend on the python3.10 package instead of python3 (< 3.11): I copied the package to a local directory, unpacked it, edited the dependency information, and repacked it. After that, I was able to install the modified package and its development counterpart with dpkg.

Finally, I was able to install the main tensorrt package.

sudo apt-get -q install python3-libnvinfer-dev
Reading package lists...
Building dependency tree...
Reading state information...
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 python3-libnvinfer : Depends: python3 (< 3.11) but 3.11.2-1 is to be installed
E: Unable to correct problems, you have held broken packages.
: 100

I’d like to install it at least as a python3.10 library, but it demands that the system Python, i.e. the version of Debian’s python3 package, be less than 3.11, and I don’t want to try to change that. So I’ll adjust the python3-libnvinfer package to depend on the python3.10 package instead of python3 (< 3.11).

cd /var/nv-tensorrt-local-repo-ubuntu2204-8.5.3-cuda-11.8
ls python3-libnvinfer*
python3-libnvinfer_8.5.3-1+cuda11.8_amd64.deb
python3-libnvinfer-dev_8.5.3-1+cuda11.8_amd64.deb
cp -v python3-libnvinfer* ~/soft-ai/
'python3-libnvinfer_8.5.3-1+cuda11.8_amd64.deb' -> '/home/sam/soft-ai/python3-libnvinfer_8.5.3-1+cuda11.8_amd64.deb'
'python3-libnvinfer-dev_8.5.3-1+cuda11.8_amd64.deb' -> '/home/sam/soft-ai/python3-libnvinfer-dev_8.5.3-1+cuda11.8_amd64.deb'
cd ~/soft-ai
command rm -rf unpacked
dpkg-deb -R python3-libnvinfer_8.5.3-1+cuda11.8_amd64.deb unpacked
grep Depends unpacked/DEBIAN/control
Depends: python3 (>= 3.10), python3 (<< 3.11), libnvinfer8 (= 8.5.3-1+cuda11.8), libnvinfer-plugin8 (= 8.5.3-1+cuda11.8), libnvparsers8 (= 8.5.3-1+cuda11.8), libnvonnxparsers8 (= 8.5.3-1+cuda11.8)
sed -i 's/python3 (>= 3.10), python3 (<< 3.11), /python3.10, /' unpacked/DEBIAN/control
grep Depends unpacked/DEBIAN/control
Depends: python3.10, libnvinfer8 (= 8.5.3-1+cuda11.8), libnvinfer-plugin8 (= 8.5.3-1+cuda11.8), libnvparsers8 (= 8.5.3-1+cuda11.8), libnvonnxparsers8 (= 8.5.3-1+cuda11.8)
dpkg-deb -b unpacked python3-libnvinfer_8.5.3-1+cuda11.8_amd64_fixed.deb
dpkg-deb: building package 'python3-libnvinfer' in 'python3-libnvinfer_8.5.3-1+cuda11.8_amd64_fixed.deb'.
sudo dpkg -i ./python3-libnvinfer_8.5.3-1+cuda11.8_amd64_fixed.deb ./python3-libnvinfer-dev_8.5.3-1+cuda11.8_amd64.deb
Selecting previously unselected package python3-libnvinfer.
(Reading database ... 1157567 files and directories currently installed.)
Preparing to unpack .../python3-libnvinfer_8.5.3-1+cuda11.8_amd64_fixed.deb ...
Unpacking python3-libnvinfer (8.5.3-1+cuda11.8) ...
Selecting previously unselected package python3-libnvinfer-dev.
Preparing to unpack .../python3-libnvinfer-dev_8.5.3-1+cuda11.8_amd64.deb ...
Unpacking python3-libnvinfer-dev (8.5.3-1+cuda11.8) ...
Setting up python3-libnvinfer (8.5.3-1+cuda11.8) ...
Setting up python3-libnvinfer-dev (8.5.3-1+cuda11.8) ...
sudo apt -qq install tensorrt
tensorrt is already the newest version (8.5.3.1-1+cuda11.8).
0 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.

Glossary
  • NVIDIA TensorRT: An inference accelerator for deep learning models that optimizes and deploys trained neural networks for inferencing on NVIDIA GPUs.
  • python3-libnvinfer package: A package required to use TensorRT but has a dependency issue with the system’s Python version.
  • Dependency issue: A problem that arises when a software package requires certain libraries or packages to run, and those libraries or packages are either not installed or not compatible with the system.
  • dpkg command: A command used to install Debian packages.

Install Rust

Some Python modules now depend on Rust to build. Also, I wanted to build Anki from source. I decided to install Rust system-wide, in /opt/rust:

cd /tmp
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs > install_rust.sh
sudo RUSTUP_HOME=/opt/rust CARGO_HOME=/opt/rust sh ./install_rust.sh -y --no-modify-path
. /opt/rust/env
rustup default stable
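Because of --no-modify-path, rustup leaves shell profiles alone, so each user either sources /opt/rust/env as above, or the PATH can be set system-wide; a sketch (the profile.d filename is my own choice, not something rustup creates):

```shell
# Contents for e.g. /etc/profile.d/rust.sh (hypothetical filename):
export RUSTUP_HOME=/opt/rust
export CARGO_HOME=/opt/rust
export PATH="/opt/rust/bin:$PATH"
```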

Glossary
  • Rust: A systems programming language that is known for its speed, reliability, and memory safety features. It is often used for developing web browsers, operating systems, and game engines, and has gained popularity in the field of machine learning for its ease of integration with Python.
  • rustup: A command-line tool for managing Rust installations and its various components, such as different toolchains and target platforms.
  • CARGO_HOME: An environment variable used by Rust to specify the directory where Cargo, the package manager for Rust, stores its configuration and package cache.
  • RUSTUP_HOME: An environment variable used by Rust to specify the directory where rustup stores its configuration and installed toolchains.
  • Anki: a popular open-source flashcard application designed to help users learn and memorize information efficiently. It allows users to create digital flashcards containing text, images, audio, and video, and use various study techniques such as spaced repetition to optimize learning and retention. Anki is available for Windows, macOS, Linux, Android, and iOS platforms.

Python environments

I’m going to add Python venvs under /opt/venvs, and use hard linking to share large files between them.

Folder             Depends               Uses          Purpose
python3.10-ai      Debian’s python3.10   torch stable  AI development
python3.10-webui   python3.10-ai         torch stable  stable-diffusion-webui
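The hard-linking idea in sketch form (a demo on temporary directories; for the real thing, src and dst would be the two venv trees under /opt/venvs, which must be on the same filesystem, since hard links can’t cross filesystems):

```shell
# Replace duplicate files in one tree with hard links to byte-identical
# files in another tree.
tmp=$(mktemp -d)
mkdir -p "$tmp/a" "$tmp/b"
echo "same contents" > "$tmp/a/libbig.so"
echo "same contents" > "$tmp/b/libbig.so"

src=$tmp/a
dst=$tmp/b
find "$src" -type f | while read -r f; do
    g="$dst${f#"$src"}"
    # Link only when the counterpart exists and is byte-identical.
    if [ -f "$g" ] && cmp -s "$f" "$g"; then
        ln -f "$f" "$g"
    fi
done
ls -i "$tmp/a/libbig.so" "$tmp/b/libbig.so"   # both show the same inode
```

In practice a tool like rdfind or hardlink does this more carefully; the loop above is just to show the mechanism.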

The base venv python3.10-ai

This venv is a base environment for AI development, containing various packages and libraries useful for working with deep learning models and related tasks. These include popular libraries like fastai, PyTorch, TensorFlow, scikit-learn, and NumPy, as well as some more specialized tools. The different packages are described in the glossary for this section.

mkdir -p /opt/venvs
cd /opt/venvs
mkdir -p python3.10-ai
python3.10 -m venv python3.10-ai/venv
printf "%s\n" torch pipdeptree torchvision torchaudio tensorflow \
    jupyter jupyterlab ipywidgets bash_kernel jupyter-c-kernel nbdev fastai \
    pandas matplotlib scipy scikit-learn scikit-image gradio onnx \
    huggingface_hub transformers diffusers accelerate timm safetensors \
    numba fastbook > python3.10-ai/require.txt
(
set -e
. /opt/venvs/python3.10-ai/venv/bin/activate
pip install -qq -U -r python3.10-ai/require.txt
python -m bash_kernel.install
install_c_kernel
)
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
open-clip-torch 2.15.0 requires protobuf==3.20.*, but you have protobuf 3.19.6 which is incompatible.
Installing IPython kernel spec
Installing IPython kernel spec
/opt/venvs/python3.10-ai/venv/bin/install_c_kernel:32: DeprecationWarning: replace is ignored. Installing a kernelspec always replaces an existing installation
  KernelSpecManager().install_kernel_spec(td, 'c', user=user, replace=True, prefix=prefix)

Upgrade to the latest ipywidgets. This supposedly conflicts with fastbook, but is needed for Jupyter Lab.

(
set -e
. /opt/venvs/python3.10-ai/venv/bin/activate
pip install -qq -U ipywidgets
jupyter nbextension enable --py --sys-prefix widgetsnbextension
)
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
fastbook 0.0.29 requires ipywidgets<8, but you have ipywidgets 8.0.4 which is incompatible.
Enabling notebook extension jupyter-js-widgets/extension...
      - Validating: OK

The pydoc command doesn’t seem to be installed properly in venvs, so I added it:

cat <<'END' > python3.10-ai/venv/bin/pydoc
#!/bin/sh
python -m pydoc "$@"
END

chmod +x python3.10-ai/venv/bin/pydoc
Glossary
  • Python venvs: Virtual environments created by the Python venv module that allow users to create isolated Python environments with their own packages, versions, and configurations.
  • hard linking: A method of creating a new file that shares the same content as an existing file without duplicating it, saving storage space and reducing the time needed to create a copy.
  • torch: PyTorch, an open-source machine learning framework for building and training neural networks.
  • torchvision: A package that provides access to popular datasets, model architectures, and image transformations for PyTorch.
  • torchaudio: A package that provides audio processing functionalities for PyTorch, such as loading and decoding audio files, applying transforms, and computing spectrograms.
  • tensorflow: An open-source machine learning framework developed by Google for building and training neural networks.
  • jupyter: Jupyter Notebook, an open-source web application that allows users to create and share documents containing live code, equations, visualizations, and narrative text.
  • jupyterlab: The next-generation web-based user interface for Jupyter Notebook, featuring a more modern and flexible interface, multiple panes, and support for Jupyter extensions.
  • ipywidgets: A library that provides interactive HTML widgets for Jupyter Notebook and JupyterLab, enabling users to create sliders, dropdowns, buttons, and other graphical controls that can be used to modify code outputs and visualizations.
  • bash_kernel: A Jupyter kernel that allows users to run Bash commands and scripts in Jupyter Notebook and JupyterLab.
  • jupyter-c-kernel: A Jupyter kernel that allows users to run C code in Jupyter Notebook and JupyterLab.
  • nbdev: A library that allows users to create Python modules from Jupyter Notebooks, making it easier to develop, test, and publish code.
  • fastai: An open-source library built on top of PyTorch that provides high-level abstractions for training and deploying machine learning models, including computer vision, natural language processing, and tabular data analysis.
  • pandas: A data analysis library for Python that provides powerful data structures and tools for manipulating and analyzing data.
  • matplotlib: A plotting library for Python that provides a variety of visualizations, including line plots, scatter plots, bar charts, histograms, and more.
  • scipy: A library for scientific computing in Python that provides a wide range of mathematical algorithms, including optimization, integration, interpolation, signal processing, and more.
  • scikit-learn: A machine learning library for Python that provides a variety of supervised and unsupervised learning algorithms, including regression, classification, clustering, and dimensionality reduction.
  • scikit-image: An image processing library for Python that provides a variety of algorithms for image enhancement, filtering, segmentation, and feature extraction.
  • gradio: An open-source library that allows users to quickly create custom web interfaces for machine learning models, enabling users to interact with models using sliders, dropdowns, text boxes, and other controls.
  • onnx: Open Neural Network Exchange, an open-source format for representing machine learning models that allows interoperability between different frameworks and platforms.
  • huggingface_hub: A library that provides access to a wide range of pre-trained machine learning models for natural language processing, computer vision, and other tasks, hosted on the Hugging Face Hub.
  • transformers: A library that provides state-of-the-art natural language processing models for tasks such as sentiment analysis, question answering, and language translation, based on transformer architectures.
  • diffusers: A library that provides a set of PyTorch modules for training diffusion models, a type of probabilistic generative model that can be used for tasks such as image synthesis, denoising, and inpainting.
  • pipdeptree: A command-line utility that displays the installed Python packages in the form of a dependency tree.
  • Accelerate: A library that enables PyTorch code to run across any distributed configuration with just four lines of code.
  • PyTorch Image Models (timm): A deep-learning library that includes a collection of state-of-the-art computer vision models, layers, utilities, optimizers, schedulers, data-loaders, augmentations, and training/validating scripts with the ability to reproduce ImageNet training results.
  • Safetensors: A repository that implements a new simple format for storing tensors safely and efficiently, instead of using pickle.
  • Numba: A high-performance Python compiler that translates Python functions to optimized machine code at runtime using the industry-standard LLVM compiler library; it offers a range of options for parallelizing Python code for CPUs and GPUs, often with only minor code changes.
  • fastbook: The Fast.ai book published as Jupyter Notebooks, covering deep learning using fastai and PyTorch.

The secondary venv python3.10-webui

This is a secondary Python venv called python3.10-webui. It will be used to support the stable-diffusion web user interface, and is based on the python3.10-ai venv. Hard linking will be used to share large files between the two venvs.

cd /opt/venvs
mkdir -p python3.10-webui
command rm -rf python3.10-webui/venv
cp -al python3.10-ai/venv python3.10-webui/venv
yes | venv_move python3.10-webui/venv
(
cat python3.10-ai/require.txt ~/soft-ai/stable-diffusion-webui/requirements.txt 
echo fastapi==0.90.1
) > python3.10-webui/require.txt
(
set -e
. python3.10-webui/venv/bin/activate
pip install -qq -U -r python3.10-webui/require.txt
)
ln -f python3.10-ai/venv/bin/pydoc python3.10-webui/venv/bin/pydoc
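
venv_move is my own helper script, but the job it does is easy to sketch. A venv hard-codes its absolute path in the shebang lines of bin/* and in the activate scripts, so after cp -al the copy still points at the original venv and has to be rewritten. The relocate_venv function below is a hypothetical illustration of that fix-up, not the real script:

```python
import os

def relocate_venv(venv_dir, old_path, new_path):
    """Hypothetical sketch of the fix-up a tool like venv_move performs:
    rewrite the old absolute path wherever it appears in bin/* scripts."""
    bindir = os.path.join(venv_dir, "bin")
    for name in os.listdir(bindir):
        path = os.path.join(bindir, name)
        if not os.path.isfile(path) or os.path.islink(path):
            continue
        with open(path, "rb") as f:
            data = f.read()
        fixed = data.replace(old_path.encode(), new_path.encode())
        if fixed != data:
            mode = os.stat(path).st_mode
            # These files are hard-linked into the original venv, so
            # break the link first; an in-place write would change both.
            os.remove(path)
            with open(path, "wb") as f:
                f.write(fixed)
            os.chmod(path, mode)
```

A complete tool would also handle symlinks such as bin/python; this sketch only rewrites regular files, and breaking the hard link before writing is what keeps the original venv untouched.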
Glossary
  • hard linking: a file system feature that allows multiple files to share the same physical storage location on disk. Hard linking a file creates a new file that points to the same location as the original file. This can be used to save disk space and reduce redundancy in a file system.
  • venv_move: this is a Bash script that is used to move a Python virtual environment (venv) to a new location. The script takes one argument, which is the path to the venv directory that needs to be moved.

stable-diffusion-webui

In this section, we will install the automatic1111 stable-diffusion-webui app and set up custom scripts to update and launch it. The stable-diffusion-webui app is a web user interface for Stable Diffusion, a deep learning model for text-to-image generation. The update script will allow us to easily update the app when new changes are pushed to the Git repository. The launch script will activate the correct virtual environment and launch the web user interface.

The app works without these custom scripts, but I wanted to allow for careful updates and custom launch options.

cd ~/soft-ai
[ -d stable-diffusion-webui ] ||
git clone git@github.com:AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui

my update script

cat update
#!/bin/sh
set -e
. /opt/venvs/python3.10-webui/venv/bin/activate
git stash
git pull
git stash pop || true
pip install -r requirements.txt

my launch script

cat launch
#!/bin/bash
set -ae
. /opt/venvs/python3.10-webui/venv/bin/activate
cd "$(dirname "$(readlink -f "$0")")"
SAFETENSORS_FAST_GPU=1
COMMANDLINE_ARGS="--xformers --api $*"  # --no-half-vae
REQS_FILE="requirements.txt"
python launch.py

Next, I launch the webui to install some extra requirements, and check that it works.
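
Since the launch script passes --api, the webui also serves a REST API alongside the web page. As a sketch (assuming the default port 7860 and the /sdapi/v1/txt2img endpoint of current AUTOMATIC1111 builds; check your version's API docs), a minimal text-to-image request from Python might look like this:

```python
import json
import urllib.request

# Build a txt2img request for a locally running stable-diffusion-webui
# started with --api. Endpoint and field names follow the AUTOMATIC1111
# API; adjust them if your build differs.
payload = {"prompt": "an astronaut riding a horse", "steps": 20}
req = urllib.request.Request(
    "http://127.0.0.1:7860/sdapi/v1/txt2img",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# With the webui running, this returns JSON with base64-encoded images:
# images = json.load(urllib.request.urlopen(req))["images"]
```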

Glossary
  • stable-diffusion-webui: A web user interface for Stable Diffusion, a deep learning model for text-to-image generation. The stable-diffusion-webui app is installed in this section, along with custom scripts for updating and launching it.
  • update script: A custom script for updating the stable-diffusion-webui app. The script activates the correct virtual environment, stashes local changes, pulls the latest code from the Git repository, and installs any new requirements.
  • launch script: A custom script for launching the stable-diffusion-webui app. The script activates the correct virtual environment and sets some environment variables before running the launch.py script.
  • SAFETENSORS_FAST_GPU=1: an environment variable used with the Safetensors library in deep learning applications. Normally, models are loaded to the CPU and then moved to the GPU, which can involve an additional memory copy step. The “SAFETENSORS_FAST_GPU=1” option allows models to be loaded directly onto the GPU, bypassing the memory copy step and potentially improving performance. However, this option is untested and may not be suitable for all applications.
  • COMMANDLINE_ARGS: An environment variable used in the launch script to pass command line arguments to the launch.py script.
  • REQS_FILE: An environment variable used in the launch script to specify which requirements file should be used for the stable-diffusion-webui app.
  • requirements.txt file: a text file that lists the dependencies of a Python project. It contains a list of package names and optional version numbers that are required for the project to run. This file can be used by the pip package installer to automatically install all required packages and their dependencies, like this: pip install -r requirements.txt

Compare the two venvs

In this section, we compare the packages installed in the two Python virtual environments we created earlier. By running the “pip freeze” command in each venv and storing the output in a text file, we can then compare the contents of those files using the “comm” command. The resulting output shows the packages that are installed in one venv but not the other. We can use this information to ensure that both venvs have the necessary packages for our projects.

Running pip check in both venvs can help ensure that all packages are installed correctly and functioning properly. In the output, we see a package conflict related to the version of protobuf installed, but this does not seem to cause any issues in practice as the app still works.

cd /opt/venvs
(
. python3.10-ai/venv/bin/activate
pip freeze | grep -v '^[#-]' | sort > python3.10-ai/freeze.txt
pip check
)
No broken requirements found.
(
. python3.10-webui/venv/bin/activate
pip freeze | grep -v '^[#-]' | sort > python3.10-webui/freeze.txt
pip check
)
tensorflow 2.11.0 has requirement protobuf<3.20,>=3.9.2, but you have protobuf 3.20.0.
: 1
comm -3 python3.10-ai/freeze.txt python3.10-webui/freeze.txt | tee comm.txt
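
In set terms, the pipeline above computes two set differences: comm -3 suppresses the lines common to both files, leaving one column per file of unique lines. A toy sketch with invented package lists:

```python
# Toy version of: comm -3 python3.10-ai/freeze.txt python3.10-webui/freeze.txt
# The package lists here are made up for illustration.
ai_freeze    = {"torch==1.13.1", "fastai==2.7.11", "numpy==1.23.5"}
webui_freeze = {"torch==1.13.1", "fastapi==0.90.1", "numpy==1.23.5"}

only_ai    = sorted(ai_freeze - webui_freeze)   # comm's first column
only_webui = sorted(webui_freeze - ai_freeze)   # comm's second column
print(only_ai)     # ['fastai==2.7.11']
print(only_webui)  # ['fastapi==0.90.1']
```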
Glossary

  • pip freeze: A command that outputs the names and versions of installed Python packages in the format used by a requirements.txt file. We use it to generate the freeze.txt files for both venvs so we can compare them.
  • pip check: A command that checks the consistency of installed packages, verifying that all declared dependencies are met. It reports any conflicts and can help identify packages that need to be updated or removed.
  • sort: A Unix utility that sorts the lines of a file. We sort the freeze.txt files because the comm utility requires that its inputs be sorted.
  • comm: A Unix utility that compares two sorted files line by line. We use comm to display the lines that are unique to each freeze.txt file.

Check disk usage and savings

In this section, we use the du tool to check the disk usage and savings of the base venv and webui venv. Measured separately the venvs come to 5.9 GB and 6.2 GB, but together they occupy only 6.6 GB, so the hard links save us about 5.5 GB of disk space. We use the du -sh command to show the size of each venv separately and the du -csh command to show the total size of both venvs.

du -sh ./python3.10-ai
du -sh ./python3.10-webui
5.9G    ./python3.10-ai
6.2G    ./python3.10-webui
du -csh ./python3.10-{ai,webui}
5.9G    ./python3.10-ai
652M    ./python3.10-webui
6.6G    total
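
The 652M figure for the second venv in the combined listing isn't a mistake: du charges each inode only once per invocation, so a hard-linked file counts against whichever tree is scanned first. A rough stdlib sketch of that rule, using a hypothetical du_bytes helper:

```python
import os

def du_bytes(*roots):
    """Rough sketch of du's accounting across several arguments: each
    (device, inode) pair is counted once, so hard-linked files are
    charged only to the first tree that reaches them. (Real du counts
    allocated blocks rather than st_size; close enough for the idea.)"""
    seen, total = set(), 0
    for root in roots:
        for dirpath, _, files in os.walk(root):
            for name in files:
                st = os.lstat(os.path.join(dirpath, name))
                key = (st.st_dev, st.st_ino)
                if key not in seen:
                    seen.add(key)
                    total += st.st_size
    return total
```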
Glossary
  • du: A command-line tool used to estimate file space usage in a file system. It can display the file size in human-readable format.
  • disk usage: The amount of disk space occupied by a file or directory in a file system.
  • hard links: A feature of the file system that allows multiple files to share the same data blocks on a storage device. Hard links allow a file to have multiple names in different directories or in the same directory.

Save the setup in a git repo

In this section, we initialize a new Git repository to save our virtual environment setup. We create a “.gitignore” file to exclude the venvs from version control, add all files to the Git staging area, and commit the changes with a message. This allows us to easily track and version our venv setup and changes over time.

git init
echo venv > .gitignore
git add -A
git commit -m 'venvs'
Reinitialized existing shared Git repository in /opt/venvs/.git/
[main 8fd5a24] venvs
 2 files changed, 15 insertions(+), 1 deletion(-)
Glossary
  • Git: A version control system used for tracking changes to files and collaborating on projects.
  • Repository: A central location in which data is stored and managed.
  • .gitignore: A file used to exclude files and directories from being tracked by Git.
  • Staging area: A place where files are stored before they are committed to the repository.
  • Commit: A permanent record of changes to files in the repository, along with a message that describes the changes.

Optional extras

Building xformers

If you need to build and install xformers from source, this is how to do it. It took nearly half an hour on a fast computer; judging by the 119% CPU figure in the build output, the compile ran mostly serially despite MAKEFLAGS. I used this with stable-diffusion-webui, when the binary packages of xformers weren’t working.

cd ~/soft-ai
[ -d xformers ] || git clone https://github.com/facebookresearch/xformers.git
cd xformers
git submodule update --init --recursive
git pull
git submodule update --recursive
Already up to date.
python setup.py clean --all
running clean
'build/lib.linux-x86_64-cpython-310' does not exist -- can't clean it
'build/bdist.linux-x86_64' does not exist -- can't clean it
'build/scripts-3.10' does not exist -- can't clean it
echo $VIRTUAL_ENV
/opt/venvs/python3.10-ai/venv
CUDA_HOME="/usr/local/cuda-11.7" CC=gcc-11 CXX=g++-11 MAKEFLAGS="-j$(nproc)" \
time pip install -e . 2>&1 | tee build.log
Obtaining file:///home/sam/soft-ai/xformers
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Requirement already satisfied: numpy in /opt/venvs/python3.10-ai/venv/lib/python3.10/site-packages (from xformers==0.0.17+b89a493.d20230303) (1.23.5)
Requirement already satisfied: pyre-extensions==0.0.23 in /opt/venvs/python3.10-ai/venv/lib/python3.10/site-packages (from xformers==0.0.17+b89a493.d20230303) (0.0.23)
Requirement already satisfied: torch>=1.12 in /opt/venvs/python3.10-ai/venv/lib/python3.10/site-packages (from xformers==0.0.17+b89a493.d20230303) (1.13.1)
Requirement already satisfied: typing-extensions in /opt/venvs/python3.10-ai/venv/lib/python3.10/site-packages (from pyre-extensions==0.0.23->xformers==0.0.17+b89a493.d20230303) (4.5.0)
Requirement already satisfied: typing-inspect in /opt/venvs/python3.10-ai/venv/lib/python3.10/site-packages (from pyre-extensions==0.0.23->xformers==0.0.17+b89a493.d20230303) (0.8.0)
Requirement already satisfied: nvidia-cuda-nvrtc-cu11==11.7.99 in /opt/venvs/python3.10-ai/venv/lib/python3.10/site-packages (from torch>=1.12->xformers==0.0.17+b89a493.d20230303) (11.7.99)
Requirement already satisfied: nvidia-cudnn-cu11==8.5.0.96 in /opt/venvs/python3.10-ai/venv/lib/python3.10/site-packages (from torch>=1.12->xformers==0.0.17+b89a493.d20230303) (8.5.0.96)
Requirement already satisfied: nvidia-cuda-runtime-cu11==11.7.99 in /opt/venvs/python3.10-ai/venv/lib/python3.10/site-packages (from torch>=1.12->xformers==0.0.17+b89a493.d20230303) (11.7.99)
Requirement already satisfied: nvidia-cublas-cu11==11.10.3.66 in /opt/venvs/python3.10-ai/venv/lib/python3.10/site-packages (from torch>=1.12->xformers==0.0.17+b89a493.d20230303) (11.10.3.66)
Requirement already satisfied: setuptools in /opt/venvs/python3.10-ai/venv/lib/python3.10/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch>=1.12->xformers==0.0.17+b89a493.d20230303) (66.1.1)
Requirement already satisfied: wheel in /opt/venvs/python3.10-ai/venv/lib/python3.10/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch>=1.12->xformers==0.0.17+b89a493.d20230303) (0.38.4)
Requirement already satisfied: mypy_extensions>=0.3.0 in /opt/venvs/python3.10-ai/venv/lib/python3.10/site-packages (from typing-inspect->pyre-extensions==0.0.23->xformers==0.0.17+b89a493.d20230303) (1.0.0)
Installing collected packages: xformers
  Running setup.py develop for xformers
Successfully installed xformers-0.0.17+b89a493.d20230303
1498.23user 47.47system 21:29.43elapsed 119%CPU (0avgtext+0avgdata 1777728maxresident)k
4848inputs+12701968outputs (64major+40118051minor)pagefaults 0swaps
Glossary
  • xformers: a library for accelerating transformer-based models in PyTorch.
  • CUDA_HOME: an environment variable that specifies the path to the CUDA installation directory.
  • gcc-11 and g++-11: the GNU Compiler Collection version 11 for the C and C++ programming languages, respectively. Xformers won’t build with gcc-12, so we need to specify the older version.
  • nproc: a command that outputs the number of processing units available to the current process.
  • setup.py: a Python script that is used to package and distribute Python modules.
  • submodule: a feature in Git that allows a repository to contain another repository as a subdirectory.
  • tee: a command that reads standard input and writes it to both standard output and one or more files.
  • time: a command that reports how long another command took to execute, broken down into user, system, and elapsed (wall-clock) time.

Conclusion

In this post, I described my journey of setting up a development environment for deep learning on Debian, and the challenges I faced while doing so. I went through various steps, such as setting up apt sources, installing required packages (including Python, NVIDIA CUDA, cuDNN, and TensorRT), installing Rust, and setting up Python virtual environments. I also described how I set up stable-diffusion-webui, and how I built xformers from source.

I hope the post can be useful for anyone looking to set up a development environment for deep learning on Debian, if only as a cautionary tale of what not to do; and in particular I expect that this might be useful as a reference for myself in future.