Building the Library
This page outlines the process of building the dtFFT library using CMake, including compiler requirements, configuration options, and integration instructions for downstream projects.
The library supports both host and GPU environments, leveraging modern Fortran and optional dependencies like CUDA, FFTW3, MKL, cuFFT, and VkFFT.
Prerequisites
Since dtFFT is primarily written in Fortran, a modern Fortran compiler (2008 standard or later) is required. The library has been successfully tested with:
GNU Fortran (gfortran): Version 12 and above
Intel Fortran (ifort / ifx): Version 18 and above
NVHPC Fortran (nvfortran): Version 24.5 and above
Currently, dtFFT can only be built with CMake (version 3.25 or higher). Ensure CMake is installed and available in your PATH before proceeding.
Requirements:
CMake: Version 3.25 or higher
Modern Fortran compiler: 2008 standard or later
MPI: Message Passing Interface (MPI) implementation
Caliper (optional): For performance profiling and analysis
For CUDA support:
CUDA-aware MPI: Required for GPU acceleration
NCCL (optional): NVIDIA Collective Communications Library (automatically linked if nvfortran is used)
nvfortran (optional): NVHPC Fortran compiler (enables additional features like NCCL and cuFFTMp)
NVTX3 (optional): NVIDIA Tools Extension for profiling and debugging
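The CMake requirement listed above can be checked before configuring. The following is a minimal sketch, not part of dtFFT: the helper names `parse_cmake_version` and `check_cmake` are hypothetical, and the script only inspects the banner printed by `cmake --version`.

```python
import re
import shutil
import subprocess

MIN_CMAKE = (3, 25)  # minimum version stated in the requirements above


def parse_cmake_version(text):
    """Extract (major, minor) from the output of `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def check_cmake():
    """Return a one-line status message about the cmake found in PATH."""
    exe = shutil.which("cmake")
    if exe is None:
        return "cmake not found in PATH"
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    version = parse_cmake_version(result.stdout)
    if version is None:
        return "could not parse cmake version"
    major, minor = version
    if version < MIN_CMAKE:
        return f"cmake {major}.{minor} is older than {MIN_CMAKE[0]}.{MIN_CMAKE[1]}"
    return f"cmake {major}.{minor} satisfies the requirement"


print(check_cmake())
```

The same pattern could be extended to the Fortran compiler and MPI wrapper, but their version banners differ between vendors, so that is left out here.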
Configuration Options
The build process is controlled via CMake options, listed below. These options enable or disable features such as GPU support, FFT library integration, and additional utilities.
Set them using -D<OPTION>=<VALUE> during CMake configuration.
| Option | Possible Values | Default | Description |
|---|---|---|---|
| | | | Enables CUDA support. Requires |
| | | | Enables FFTW3 support. Requires the |
| | | | Enables MKL DFTI support. Requires the |
| | | | Enables cuFFT support. Automatically sets |
| | | | Enables VkFFT support. Requires the |
| | | | Builds the library’s test suite. |
| | | | Enables code coverage analysis (gfortran only). |
| | | | Builds a shared library instead of a static one. |
| | | | Uses the Fortran |
| | | | Builds the C and C++ APIs alongside the Fortran API. |
| | | | Enables persistent MPI communications for multiple plan executions. Communications are initialized on the first call to |
| | | | Enables profiling. Uses NVTX3 with CUDA support or Caliper otherwise (requires |
| | | | Requires the |
| | | | Disables use of the NCCL shipped with HPC-SDK. NCCL backends will be unavailable. |
| | | | Disables use of the NVSHMEM-based backends shipped with HPC-SDK. |
| | | | Enables error checking for all GPU library calls. Can be turned off for best performance. |
| | | | Enables MPI RMA backends (currently in beta). It has been observed that a call to |
| | | | Enables input parameter checks for plan execution functions. Advanced users can turn this off for best performance. |
| | | | Enables ZFP support for compressed transposes. Requires the |
| | | | Enables a mock version of CUDA support for testing and development without a CUDA-capable GPU. This option is intended for development and testing only and should not be used in production environments. |
| | | | Enables OpenMP support. |
| | | | Enables the threads library of FFTW instead of OpenMP. This option is only applicable if both |
Building the Library
Configure the Build: Run CMake to generate build files, specifying the installation prefix and desired options. For example:
cmake -S . -B build -DCMAKE_INSTALL_PREFIX=/path/to/install -DDTFFT_WITH_CUDA=ON -DDTFFT_WITH_CUFFT=ON
Replace /path/to/install with your target installation directory.
Note
CUDA support in dtFFT does not replace the host version but extends it. For more details, refer to the guide
here and the environment variable DTFFT_PLATFORM.
Build the Library: Compile the library using:
cmake --build build --target install
This compiles and installs dtFFT to the specified prefix.
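When the build is driven from a script, the configure and build steps above can be assembled programmatically. This is a minimal sketch under stated assumptions: `configure_command` and `build_command` are hypothetical helper names, and only options that appear on this page (`DTFFT_WITH_CUDA`, `DTFFT_WITH_CUFFT`) are used.

```python
import shlex


def configure_command(options, install_prefix, source_dir=".", build_dir="build"):
    """Build the `cmake` configure command as an argument list."""
    cmd = [
        "cmake", "-S", source_dir, "-B", build_dir,
        f"-DCMAKE_INSTALL_PREFIX={install_prefix}",
    ]
    for name, value in options.items():
        # Map Python booleans to the ON/OFF spelling used on this page.
        if isinstance(value, bool):
            value = "ON" if value else "OFF"
        cmd.append(f"-D{name}={value}")
    return cmd


def build_command(build_dir="build"):
    """Build the command that compiles and installs the library."""
    return ["cmake", "--build", build_dir, "--target", "install"]


configure = configure_command(
    {"DTFFT_WITH_CUDA": True, "DTFFT_WITH_CUFFT": True},
    install_prefix="/path/to/install",
)
print(shlex.join(configure))
print(shlex.join(build_command()))
```

The returned lists can be passed directly to `subprocess.run(..., check=True)`; printing them first makes it easy to confirm they match the commands shown above.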
Integration with CMake Projects
Once installed, dtFFT can be integrated into other CMake projects using find_package. Example configuration:
find_package(dtfft REQUIRED)
add_executable(my_prog my_prog.c)
target_link_libraries(my_prog PRIVATE dtfft)
The dtfft target automatically sets include directories and links required libraries. Specify the installation path when configuring your project (the library directory is lib or lib64, depending on the platform):
cmake -S . -B build -Ddtfft_DIR=/path/to/install/lib[64]/cmake/dtfft
The installation also provides the following CMake variables for conditional compilation:
DTFFT_WITH_CUDA: Indicates CUDA support
DTFFT_WITH_C_CXX_API: Indicates C/C++ API availability
DTFFT_WITH_MPI_MODULE: Indicates use of the mpi module
DTFFT_WITH_NCCL: Indicates NCCL support
DTFFT_WITH_NVSHMEM: Indicates NVSHMEM support
DTFFT_WITH_OPENMP: Indicates OpenMP support
Python Package
dtFFT exposes a large number of CMake configuration options (see Configuration Options above) that cannot all be reflected in pre-built packages. The wheels published on PyPI cover only a limited subset of the available functionality.
For most real-world use cases — especially those requiring a specific MPI implementation, FFTW3, cuFFT, or other optional backends — it is strongly recommended to build the package from source so that all required options can be enabled.
Installing from PyPI
Pre-built wheels are available on PyPI for Linux (x86_64 and aarch64) and macOS (Apple Silicon). Choose the variant that matches your environment:
| Package | FFT backend | MPI | OS | Platform | Extra system requirements |
|---|---|---|---|---|---|
| | none (transpose only) | OpenMPI | Linux (x86_64, aarch64), macOS (arm64) | CPU | — |
| | none (transpose only) | MPICH | Linux (x86_64, aarch64), macOS (arm64) | CPU | — |
| | FFTW3 | OpenMPI | Linux (x86_64, aarch64), macOS (arm64) | CPU | system |
| | FFTW3 | MPICH | Linux (x86_64, aarch64) | CPU | system |
| | cuFFT | OpenMPI | Linux (x86_64, aarch64) | CPU + NVIDIA GPU (CUDA 12) | CUDA 12 toolkit, |
| | cuFFT | MPICH | Linux (x86_64, aarch64) | CPU + NVIDIA GPU (CUDA 12) | CUDA 12 toolkit, |
All packages share the same importable namespace: import dtfft.
To install, simply run:
# Transpose-only (no FFT backend) — OpenMPI
pip install dtfft-openmpi
# Transpose-only (no FFT backend) — MPICH
pip install dtfft-mpich
# With FFTW3 — OpenMPI
pip install dtfft-fftw-openmpi
# CUDA 12 — OpenMPI
pip install dtfft-cuda12x-openmpi
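Because every variant is imported as `dtfft`, it can be unclear which distribution is actually present in an environment. The sketch below lists installed distributions whose name starts with `dtfft`, using only the standard library. The prefix match is an assumption based on the package names shown above, and `installed_dtfft_distributions` is a hypothetical helper name.

```python
from importlib import metadata


def installed_dtfft_distributions():
    """Map installed distribution names starting with 'dtfft' to versions."""
    found = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        # Assumption: all dtFFT wheels and the sdist share the "dtfft" prefix.
        if name and name.lower().startswith("dtfft"):
            found[name] = dist.version
    return found


print(installed_dtfft_distributions())
```

An empty dictionary means no matching distribution is installed. More than one entry suggests that several variants were installed into the same environment, which is worth resolving since they all provide the same `dtfft` namespace.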
Installing from Source
A source distribution (sdist) of dtfft is published to PyPI alongside the pre-built wheels as part of the CI/CD pipeline.
This allows installation with fully custom CMake options on any supported platform — without cloning the repository.
Build-time Python dependencies (scikit-build, cmake, ninja, pybind11) are declared in pyproject.toml and are installed automatically by pip.
Building from source requires:
A Fortran compiler (GCC ≥ 10, Intel, or NVHPC)
CMake ≥ 3.25
An MPI implementation with development headers
Python ≥ 3.9
Warning
The Python environment must contain:
mpi4py compiled against the exact MPI implementation that dtFFT will be linked with. Installing a pre-built binary via pip install mpi4py may link against a different MPI and cause runtime errors. Always build it from source: pip install --no-binary mpi4py mpi4py.
cupy matching your CUDA toolkit version (e.g. cupy-cuda12x for CUDA 12) when building with CUDA support. Using a mismatched CuPy version will lead to import failures or silent data corruption.
Install runtime Python dependencies:
# mpi4py must be compiled against the MPI you have installed
pip install --no-binary mpi4py mpi4py numpy
# for CUDA builds, install cupy matching your CUDA version, e.g.:
# pip install cupy-cuda12x
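To confirm which MPI implementation mpi4py was built against, its library banner can be inspected before building dtFFT. This is a minimal sketch: `MPI.Get_library_version()` is part of mpi4py, while `classify_mpi` and `mpi4py_library_banner` are hypothetical helpers, and the substring matching is a heuristic rather than an exhaustive check.

```python
def classify_mpi(banner):
    """Guess the MPI family from a library version banner (heuristic)."""
    text = banner.lower()
    if "open mpi" in text:
        return "OpenMPI"
    if "mpich" in text:
        return "MPICH"
    return "unknown"


def mpi4py_library_banner():
    """Return the MPI library banner reported by mpi4py, or None if unavailable."""
    try:
        from mpi4py import MPI  # importing this module initializes MPI
    except ImportError:
        return None
    return MPI.Get_library_version().strip()


banner = mpi4py_library_banner()
if banner is None:
    print("mpi4py is not importable in this environment")
else:
    print(classify_mpi(banner), "-", banner.splitlines()[0])
```

If the reported family differs from the MPI that dtFFT will be linked with, rebuilding mpi4py with `--no-binary` as described above is the safer path.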
Build and install from source:
Use the CMAKE_ARGS environment variable to enable optional backends. pip will automatically fetch the source distribution from PyPI and build it locally.
Transpose-only (no FFT backend):
pip install dtfft
With FFTW3:
CMAKE_ARGS="-DDTFFT_WITH_FFTW=ON" pip install dtfft
With cuFFT (CUDA):
CMAKE_ARGS="-DDTFFT_WITH_CUDA=ON -DDTFFT_WITH_CUFFT=ON" pip install dtfft
Note
If FFTW3 is not in a standard location, pass its path via CMAKE_ARGS as well:
CMAKE_ARGS="-DDTFFT_WITH_FFTW=ON -DFFTWDIR=/path/to/fftw" pip install dtfft
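For automated environments, the same source install can be prepared from Python by setting `CMAKE_ARGS` in the child process environment. The following is a sketch only: `source_install_plan` is a hypothetical helper, and invoking pip as `python -m pip` is a choice made here, not something this page prescribes.

```python
import os
import shlex
import sys


def source_install_plan(cmake_flags):
    """Return (command, environment) for a source install of dtfft."""
    env = dict(os.environ)
    # pip forwards the environment to the build backend, which reads CMAKE_ARGS.
    env["CMAKE_ARGS"] = " ".join(cmake_flags)
    command = [sys.executable, "-m", "pip", "install", "dtfft"]
    return command, env


command, env = source_install_plan(
    ["-DDTFFT_WITH_FFTW=ON", "-DFFTWDIR=/path/to/fftw"]
)
# Show the equivalent shell invocation without running it.
print("CMAKE_ARGS=" + shlex.quote(env["CMAKE_ARGS"]), shlex.join(command))
```

To perform the install, pass both values to `subprocess.run(command, env=env, check=True)`.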
Verify the installation:
import dtfft
print(dtfft.is_fftw_enabled())   # True if built with FFTW3
print(dtfft.is_cufft_enabled())  # True if built with cuFFT
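In CI or on shared systems it can help if the verification step reports a missing or incomplete installation instead of raising. The sketch below wraps the two checks shown above, `is_fftw_enabled` and `is_cufft_enabled`, the only dtfft functions it relies on. `report_dtfft_backends` is a hypothetical helper name.

```python
import importlib


def report_dtfft_backends():
    """Summarize whether dtfft imports and which FFT backends it reports."""
    try:
        dtfft = importlib.import_module("dtfft")
    except ImportError as exc:
        return {"installed": False, "error": str(exc)}
    report = {"installed": True}
    checks = (("fftw", "is_fftw_enabled"), ("cufft", "is_cufft_enabled"))
    for label, func_name in checks:
        func = getattr(dtfft, func_name, None)
        # None marks a check that this build of dtfft does not expose.
        report[label] = bool(func()) if callable(func) else None
    return report


print(report_dtfft_backends())
```

A result with `"installed": False` usually points to the MPI or CuPy mismatch described in the warning above, so the captured error message is worth reading first.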