C++ API Reference

This page describes all classes, enumerators, and functions available in the dtFFT C++ API. To use them, #include <dtfft.hpp>. The entire API is contained within the dtfft namespace.

Predefined Macros

DTFFT_CXX_CALL(call)

Safe call macro.

Should be used to check error codes returned by dtFFT.

Throws an exception with a message explaining the error if one occurs.

Example

DTFFT_CXX_CALL( plan.execute(a, b, dtfft::Execute::FORWARD) )
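This reference does not show how the macro is defined. The following is only a minimal sketch of the check-and-throw pattern that such a macro commonly follows. `Status` and `CHECK_CALL` are hypothetical stand-ins introduced for illustration, not dtFFT names, and the real macro may differ (for example, it throws its own exception type rather than `std::runtime_error`).

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for dtfft::Error, so the sketch compiles without dtFFT.
enum class Status { SUCCESS, FAILED };

// Evaluate `call` exactly once and throw if it did not return SUCCESS.
// The do { ... } while (false) wrapper makes the macro behave like one statement.
#define CHECK_CALL(call)                                                 \
    do {                                                                 \
        const Status status_ = (call);                                   \
        if (status_ != Status::SUCCESS) {                                \
            throw std::runtime_error(std::string(__FILE__) + ":" +       \
                                     std::to_string(__LINE__) +          \
                                     ": call failed: " #call);           \
        }                                                                \
    } while (false)
```

The point of the pattern is that the caller cannot silently ignore a returned error code: every wrapped call either succeeds or leaves the normal control flow.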

Enumerators

enum class dtfft::Error

This enum lists the different error codes that dtFFT can return.

Values:

enumerator SUCCESS

Successful execution.

enumerator MPI_FINALIZED

MPI_Init has not been called or MPI_Finalize has already been called.

enumerator PLAN_NOT_CREATED

Plan not created.

enumerator INVALID_TRANSPOSE_TYPE

Invalid transpose_type provided.

enumerator INVALID_N_DIMENSIONS

Invalid Number of dimensions provided.

Valid options are 2 and 3

enumerator INVALID_DIMENSION_SIZE

One or more provided dimension sizes <= 0.

enumerator INVALID_COMM_TYPE

Invalid communicator type provided.

enumerator INVALID_PRECISION

Invalid precision parameter provided.

enumerator INVALID_EFFORT

Invalid effort parameter provided.

enumerator INVALID_EXECUTOR

Invalid executor parameter provided.

enumerator INVALID_COMM_DIMS

The number of dimensions in the provided Cartesian communicator is greater than the number of dimensions passed to the plan constructor.

enumerator INVALID_COMM_FAST_DIM

The provided Cartesian communicator has more than one process in the 1st (fastest varying) dimension.

enumerator MISSING_R2R_KINDS

For R2R plan, kinds parameter must be passed if executor != Executor::NONE

enumerator INVALID_R2R_KINDS

Invalid values detected in kinds parameter.

enumerator R2C_TRANSPOSE_PLAN

Transpose plan is not supported in R2C, use R2R or C2C plan instead.

enumerator INPLACE_TRANSPOSE

Inplace transpose is not supported.

enumerator INVALID_AUX

Invalid aux buffer provided.

enumerator INVALID_DIM

Invalid dim passed to Plan.get_pencil

enumerator INVALID_USAGE

Invalid API Usage.

enumerator PLAN_IS_CREATED

Trying to create already created plan.

enumerator R2R_FFT_NOT_SUPPORTED

Selected executor does not support R2R FFTs.

enumerator ALLOC_FAILED

Internal call of Plan.mem_alloc failed.

enumerator FREE_FAILED

Internal call of Plan.mem_free failed.

enumerator DLOPEN_FAILED

Failed to dynamically load library.

enumerator DLSYM_FAILED

Failed to dynamically load symbol.

enumerator R2C_TRANSPOSE_CALLED

Calling Plan.transpose on an R2C plan is not allowed.

enumerator PENCIL_ARRAYS_SIZE_MISMATCH

Sizes of starts and counts arrays passed to Pencil constructor do not match.

enumerator PENCIL_ARRAYS_INVALID_SIZES

The starts and counts arrays provided to the Pencil constructor have fewer than 2 or more than 3 elements.

enumerator PENCIL_INVALID_COUNTS

Invalid counts provided to Pencil constructor.

enumerator PENCIL_INVALID_STARTS

Invalid starts provided to Pencil constructor.

enumerator PENCIL_SHAPE_MISMATCH

Processes have same lower bounds (starts) but different sizes in some dimensions.

enumerator PENCIL_OVERLAP

Pencil overlap detected, i.e. two processes share the same part of global space.

enumerator PENCIL_NOT_CONTINUOUS

Local pencils do not cover the global space without gaps.

enumerator PENCIL_NOT_INITIALIZED

Pencil is not initialized, i.e. the constructor was not called.

enumerator INVALID_ALLOC_BYTES

Invalid alloc_bytes provided.

enumerator GPU_INVALID_STREAM

Invalid stream provided.

enumerator GPU_INVALID_BACKEND

Invalid GPU backend provided.

enumerator GPU_NOT_SET

Multiple MPI processes located on the same host share the same GPU, which is not supported.

enumerator VKFFT_R2R_2D_PLAN

When using R2R FFT and executor type is vkFFT and plan uses Z-slab optimization, it is required that types of R2R transform are same in X and Y directions.

enumerator GPU_BACKENDS_DISABLED

Passed effort == Effort::PATIENT but all GPU backends have been disabled by Config

enumerator NOT_DEVICE_PTR

One of pointers passed to Plan.execute or Plan.transpose cannot be accessed from device.

enumerator NOT_NVSHMEM_PTR

One of pointers passed to Plan.execute or Plan.transpose is not an NVSHMEM pointer.

enumerator INVALID_PLATFORM

Invalid platform provided.

enumerator INVALID_PLATFORM_EXECUTOR_TYPE

Invalid executor provided for selected platform.
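To make the condition behind Error::PENCIL_OVERLAP concrete, here is a minimal sketch of an overlap test between the local boxes of two processes. It is not dtFFT's internal check. The function name `pencils_overlap` is hypothetical, and the sketch assumes each box is described by starts and counts of equal length, with each axis covering the half-open range [start, start + count).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if two boxes share at least one grid point.
// Assumption: all four vectors have the same length (2 or 3).
// Two boxes overlap only if their ranges overlap along every axis.
bool pencils_overlap(const std::vector<std::int32_t>& starts_a,
                     const std::vector<std::int32_t>& counts_a,
                     const std::vector<std::int32_t>& starts_b,
                     const std::vector<std::int32_t>& counts_b)
{
    for (std::size_t d = 0; d < starts_a.size(); ++d) {
        const bool disjoint = starts_a[d] + counts_a[d] <= starts_b[d] ||
                              starts_b[d] + counts_b[d] <= starts_a[d];
        if (disjoint) {
            return false;  // separated along axis d, so no shared points
        }
    }
    return true;
}
```

Running such a check on user-defined decompositions before building Pencil objects can catch layout mistakes early, before plan creation reports them.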

enum class dtfft::Execute

This enum lists valid execute_type parameters that can be passed to Plan.execute.

Values:

enumerator FORWARD

Perform XYZ --> YXZ --> ZXY plan execution (Forward)

enumerator BACKWARD

Perform ZXY --> YXZ --> XYZ plan execution (Backward)

enum class dtfft::Transpose

This enum lists valid transpose_type parameters that can be passed to Plan.transpose.

Values:

enumerator X_TO_Y

Transpose from Fortran X aligned to Fortran Y aligned.

enumerator Y_TO_X

Transpose from Fortran Y aligned to Fortran X aligned.

enumerator Y_TO_Z

Transpose from Fortran Y aligned to Fortran Z aligned.

enumerator Z_TO_Y

Transpose from Fortran Z aligned to Fortran Y aligned.

enumerator X_TO_Z

Transpose from Fortran X aligned to Fortran Z aligned.

Note

This value is valid only for a 3D Plan, and the value returned by Plan.get_z_slab_enabled must be true.

enumerator Z_TO_X

Transpose from Fortran Z aligned to Fortran X aligned.

Note

This value is valid only for a 3D Plan, and the value returned by Plan.get_z_slab_enabled must be true.

enum class dtfft::Precision

This enum lists valid precision parameters that can be passed to Plan constructors.

Values:

enumerator SINGLE

Use Single precision.

enumerator DOUBLE

Use Double precision.

enum class dtfft::Effort

This enum lists valid effort parameters that can be passed to Plan constructors.

Values:

enumerator ESTIMATE

Create plan as fast as possible.

enumerator MEASURE

Will attempt to find best MPI Grid decomposition.

Passing this flag together with an MPI communicator that has a Cartesian topology to any Plan constructor is the same as passing Effort::ESTIMATE.

enumerator PATIENT

Same as Effort::MEASURE plus cycle through various send and receive MPI_Datatypes.

For GPU builds, this flag also runs an autotuning procedure to find the best backend.

enum class dtfft::Executor

This enum lists available FFT executors.

Values:

enumerator NONE

Do not create any FFT plans.

Creates transpose only plan.

enumerator FFTW3

FFTW3 Executor (Host only)

enumerator MKL

MKL DFTI Executor (Host only)

enumerator CUFFT

CUFFT Executor (GPU Only)

enumerator VKFFT

VkFFT Executor (GPU Only)

enum class dtfft::R2RKind

Real-to-Real FFT kinds available in dtFFT.

Values:

enumerator DCT_1

DCT-I (Logical N=2*(n-1), inverse is R2RKind::DCT_1)

enumerator DCT_2

DCT-II (Logical N=2*n, inverse is R2RKind::DCT_3)

enumerator DCT_3

DCT-III (Logical N=2*n, inverse is R2RKind::DCT_2)

enumerator DCT_4

DCT-IV (Logical N=2*n, inverse is R2RKind::DCT_4)

enumerator DST_1

DST-I (Logical N=2*(n+1), inverse is R2RKind::DST_1)

enumerator DST_2

DST-II (Logical N=2*n, inverse is R2RKind::DST_3)

enumerator DST_3

DST-III (Logical N=2*n, inverse is R2RKind::DST_2)

enumerator DST_4

DST-IV (Logical N=2*n, inverse is R2RKind::DST_4)
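The logical sizes and inverse pairs listed above can be written as two small functions. This is a sketch only: `Kind` is a local mirror of the enumerator names so that the code compiles without dtFFT, and `logical_size` and `inverse_kind` are hypothetical helper names, not part of the API.

```cpp
#include <cstdint>

// Local mirror of the enumerator names above (not the dtFFT type itself).
enum class Kind { DCT_1, DCT_2, DCT_3, DCT_4, DST_1, DST_2, DST_3, DST_4 };

// Logical DFT size N for a transform of n real points, as listed above.
constexpr std::int64_t logical_size(Kind kind, std::int64_t n)
{
    return kind == Kind::DCT_1 ? 2 * (n - 1)
         : kind == Kind::DST_1 ? 2 * (n + 1)
                               : 2 * n;
}

// Kind that undoes `kind`, as listed above.
constexpr Kind inverse_kind(Kind kind)
{
    return kind == Kind::DCT_2 ? Kind::DCT_3
         : kind == Kind::DCT_3 ? Kind::DCT_2
         : kind == Kind::DST_2 ? Kind::DST_3
         : kind == Kind::DST_3 ? Kind::DST_2
                               : kind;  // types I and IV are their own inverses
}
```

For example, a DCT-I over 9 points and a DST-I over 7 points both have logical size 16. In FFTW's convention the logical size is the factor by which a forward transform followed by its inverse scales the data; whether every dtFFT executor follows that convention is an assumption this page does not confirm.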

enum class dtfft::Backend

Various Backends available in dtFFT.

Note

This enum is only present in the API when dtFFT was compiled with CUDA Support.

Values:

enumerator MPI_DATATYPE

Backend that uses MPI datatypes.

Not recommended, since it is far slower than the other backends. It is included only to show how slow MPI datatypes are for GPU usage.

enumerator MPI_P2P

MPI peer-to-peer algorithm.

enumerator MPI_P2P_PIPELINED

MPI peer-to-peer algorithm with overlapping data copying and unpacking.

enumerator MPI_A2A

MPI backend using MPI_Alltoallv.

enumerator NCCL

NCCL backend.

enumerator NCCL_PIPELINED

NCCL backend with overlapping data copying and unpacking.

enumerator CUFFTMP

cuFFTMp backend

enumerator CUFFTMP_PIPELINED

cuFFTMp backend that uses additional buffer to avoid extra copy and gain performance

enum class dtfft::Platform

Enum that specifies the runtime platform, e.g. Host, CUDA, HIP.

Values:

enumerator HOST

Host.

enumerator CUDA

CUDA.

Functions

std::string dtfft::get_backend_string(const Backend backend)

Returns string with name of backend provided as argument.

Note

This function is only present in the API when dtFFT was compiled with CUDA Support.

Parameters:

backend[in] Backend to represent

Returns:

String representation of backend.

inline std::string dtfft::get_error_string(Error error_code) noexcept

Returns the string description of an error code.

Parameters:

error_code[in] Error code to convert to string

Returns:

String representation of error_code

inline std::string dtfft::get_precision_string(Precision precision) noexcept

Returns the string representation of a Precision value.

Parameters:

precision[in] Precision level to convert to string

Returns:

String representation of Precision.

inline std::string dtfft::get_executor_string(Executor executor) noexcept

Returns the string representation of an Executor value.

Parameters:

executor[in] Executor type to convert to string

Returns:

String representation of Executor.

inline Error dtfft::set_config(Config &config) noexcept

Sets configuration values to dtFFT.

Must be called before plan creation to take effect.

See also

Config

Returns:

Error::SUCCESS if the call was successful, error code otherwise

Classes

class Version

dtFFT version information

Public Static Functions

static inline int32_t get() noexcept
Returns:

Version Code defined during compilation

static inline constexpr int32_t get(int32_t major, int32_t minor, int32_t patch) noexcept
Returns:

Version Code based on input parameters

Public Static Attributes

static constexpr int32_t MAJOR = DTFFT_VERSION_MAJOR

dtFFT Major Version

static constexpr int32_t MINOR = DTFFT_VERSION_MINOR

dtFFT Minor Version

static constexpr int32_t PATCH = DTFFT_VERSION_PATCH

dtFFT Patch Version

static constexpr int32_t CODE = DTFFT_VERSION_CODE

dtFFT Version Code.

Can be used for version comparison

class Exception : public std::exception

Basic exception class.

Public Functions

inline Exception(Error error_code, std::string msg, const char *file, int line)

Basic exception constructor.

Parameters:
  • error_code[in] Error code

  • msg[in] Message describing the error that occurred

  • file[in] Filename where the exception was thrown

  • line[in] Line number where the exception was thrown

inline const char *what() const noexcept override

Exception explanation.

inline Error get_error_code() const noexcept

Returns error code of exception.

inline const std::string &get_message() const noexcept

Returns error message of exception.

inline const std::string &get_file() const noexcept

Returns file name where exception occurred.

inline int get_line() const noexcept

Returns line number where exception occurred.
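The accessors above can be combined into a single log line. The helper below is a sketch: `describe` is a hypothetical name, and it is written as a template so that it compiles without dtFFT. It only assumes an exception type that offers get_file(), get_line() and get_message() as documented for Exception.

```cpp
#include <string>

// Builds "file:line: message" from any exception type that exposes the
// accessors documented above, e.g. dtfft::Exception.
template <class ExceptionT>
std::string describe(const ExceptionT& e)
{
    return e.get_file() + ":" + std::to_string(e.get_line()) + ": " + e.get_message();
}
```

Inside a `catch (const dtfft::Exception &e)` handler one could log `describe(e)` and use `e.get_error_code()` to decide whether the error is recoverable.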

class Pencil

Class to handle Pencils.

This is a wrapper around the dtfft_pencil_t C structure.

There are two ways users might find pencils useful inside dtFFT:

  1. To create a Plan using the user’s own grid decomposition, pass a Pencil to the Plan constructor.

  2. To obtain a Pencil from a Plan in all possible layouts, in order to run an FFT that is not available in dtFFT.

Public Functions

inline Pencil()

Default constructor.

inline explicit Pencil(const int32_t n_dims, const int32_t *starts, const int32_t *counts)

Pencil constructor.

After calling this constructor, this pencil can be used to create Plan

Parameters:
  • n_dims[in] Number of dimensions in pencil

  • starts[in] Local starts in natural Fortran order

  • counts[in] Local counts in natural Fortran order

inline explicit Pencil(const std::vector<int32_t> &starts, const std::vector<int32_t> &counts)

Pencil constructor.

After calling this constructor, this pencil can be used to create Plan

Parameters:
  • starts[in] Local starts in natural Fortran order

  • counts[in] Local counts in natural Fortran order
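The starts and counts passed to these constructors have to come from the user's own decomposition. As an illustration, the sketch below splits one axis of the global grid among processes as evenly as possible. `split_axis` is a hypothetical helper, not part of dtFFT, and it assumes zero-based starts, which this page does not state explicitly.

```cpp
#include <cstdint>
#include <utility>

// Splits `n` grid points among `nprocs` ranks as evenly as possible.
// Returns {start, count} for `rank`; the first n % nprocs ranks get one
// extra point. Assumption: starts are zero-based.
std::pair<std::int32_t, std::int32_t>
split_axis(std::int32_t n, std::int32_t nprocs, std::int32_t rank)
{
    const std::int32_t base = n / nprocs;
    const std::int32_t extra = n % nprocs;
    const std::int32_t count = base + (rank < extra ? 1 : 0);
    const std::int32_t start = rank * base + (rank < extra ? rank : extra);
    return {start, count};
}
```

Applying such a split to every distributed axis gives ranges that cover the global space without gaps or overlaps, which is what the PENCIL_NOT_CONTINUOUS and PENCIL_OVERLAP error codes guard against. The resulting values, arranged in natural Fortran order, would then be passed as `starts` and `counts` to the Pencil constructor.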

inline uint8_t get_ndims() const
Returns:

Number of dimensions in a pencil

inline uint8_t get_dim() const
Returns:

Aligned dimension id starting from 1

inline const std::vector<int32_t> &get_starts() const
Returns:

Local starts in natural Fortran order

inline const std::vector<int32_t> &get_counts() const
Returns:

Local counts in natural Fortran order

inline size_t get_size() const
Returns:

Total number of elements in a pencil

inline const dtfft_pencil_t &c_struct() const
Returns:

Underlying C structure

class Config

Class to set additional configuration parameters to dtFFT.

See also

set_config()

Public Functions

inline Config(bool enable_z_slab = true, Platform platform = Platform::HOST, dtfft_stream_t stream = nullptr, Backend backend = Backend::MPI_P2P, bool enable_mpi_backends = false, bool enable_pipelined_backends = true, bool enable_nccl_backends = true, bool enable_nvshmem_backends = true)

Construct a new Config object.

User must call set_config to set this configuration to dtFFT.

Note

This constructor is only present in the API when dtFFT was compiled with CUDA Support.

Parameters:
  • enable_z_slab[in] Whether to enable Z-slab optimization or not

  • platform[in] Platform to use

  • stream[in] Stream to use

  • backend[in] Backend to use

  • enable_mpi_backends[in] Whether to enable MPI backends or not

  • enable_pipelined_backends[in] Whether to enable pipelined backends or not

  • enable_nccl_backends[in] Whether to enable NCCL backends or not

  • enable_nvshmem_backends[in] Whether to enable NVSHMEM backends or not

inline Config &set_enable_z_slab(bool enable_z_slab) noexcept

Sets whether dtFFT uses Z-slab optimization.

Default is true

Consider disabling Z-slab optimization to resolve the Error::VKFFT_R2R_2D_PLAN error or when the underlying FFT implementation of the 2D plan is too slow.

In all other cases Z-slab is considered to be always faster, since it reduces the number of data transpositions.

inline Config &set_platform(Platform platform) noexcept

Sets platform to execute plan.

Default is Platform::HOST

This option is only defined in builds with device support. Even when dtFFT is built with device support, it does not necessarily mean that all plans must run on the device.

inline Config &set_stream(dtfft_stream_t stream) noexcept

Sets Main CUDA stream that will be used in dtFFT.

This parameter is a placeholder that lets the user set a custom stream. The stream that is actually used by a dtFFT plan is returned by the Plan.get_stream function. A user who sets a stream is responsible for destroying it.

The stream must not be destroyed before the call to Plan.destroy.

Note

This method is only present in the API when dtFFT was compiled with CUDA Support.

inline Config &set_backend(Backend backend) noexcept

Sets GPU Backend that will be used by dtFFT when effort is Effort::ESTIMATE or Effort::MEASURE.

Default is Backend::NCCL

Note

This method is only present in the API when dtFFT was compiled with CUDA Support.

inline Config &set_enable_mpi_backends(bool enable_mpi_backends) noexcept

Sets whether MPI GPU backends are enabled when effort is Effort::PATIENT.

Default is false

MPI backends are disabled by default during the autotuning process because of an OpenMPI bug (https://github.com/open-mpi/ompi/issues/12849). It was noticed that GPU memory is not freed completely during plan autotuning.

For example: for a 1024x1024x512 C2C, double precision plan on a single GPU using Z-slab optimization with MPI backends enabled, plan autotuning leaks 8 GB of GPU memory. Without Z-slab optimization, running on 4 GPUs, it leaks 24 GB on each of the GPUs.

One workaround is to disable MPI backends by default, which is done here.

Another is to pass "--mca btl_smcuda_use_cuda_ipc 0" to mpiexec, but it was noticed that disabling CUDA IPC seriously affects the overall performance of MPI algorithms.

Note

This method is only present in the API when dtFFT was compiled with CUDA Support.

inline Config &set_enable_pipelined_backends(bool enable_pipelined_backends) noexcept

Sets whether pipelined GPU backends are enabled when effort is Effort::PATIENT.

Default is true

Note

Pipelined backends require additional buffer that user has no control over.

Note

This method is only present in the API when dtFFT was compiled with CUDA Support.

inline Config &set_enable_nccl_backends(bool enable_nccl_backends) noexcept

Sets whether NCCL backends are enabled when effort is Effort::PATIENT.

Default is true.

Note

This method is only present in the API when dtFFT was compiled with CUDA Support.

inline Config &set_enable_nvshmem_backends(bool enable_nvshmem_backends) noexcept

Sets whether NVSHMEM backends are enabled when effort is Effort::PATIENT.

Default is true.

Note

This method is only present in the API when dtFFT was compiled with CUDA Support.

inline dtfft_config_t c_struct() const
Returns:

Underlying C structure

class Plan

Abstract plan for all dtFFT plans.

This class does not have any constructors. To create a plan, use one of the derived classes.

Subclassed by dtfft::PlanC2C, dtfft::PlanR2C, dtfft::PlanR2R

Public Functions

inline Error get_z_slab_enabled(bool *is_z_slab_enabled) const noexcept

Checks if plan is using Z-slab optimization.

If true then flags Transpose::X_TO_Z and Transpose::Z_TO_X will be valid to pass to Plan.transpose method.

Parameters:

is_z_slab_enabled[out] Boolean value if Z-slab is used.

Returns:

Error::SUCCESS if call was without error, error code otherwise

inline bool get_z_slab_enabled() const

Checks if plan is using Z-slab optimization.

Throws:

Exception – if underlying call fails

Returns:

true if Z-slab is enabled, false otherwise

inline Error report() const noexcept

Prints plan-related information to stdout.

Returns:

Error::SUCCESS if call was without error, error code otherwise

inline Error get_pencil(const int32_t dim, Pencil &pencil) const noexcept

Obtains pencil information from plan.

This can be useful when the user wants to use their own FFT implementation that is unavailable in dtFFT.

Parameters:
  • dim[in] Required dimension:

    • 0 for XYZ layout (real space, valid for PlanR2C only)

    • 1 for XYZ layout (real space for C2C and R2R plans and fourier space for R2C plans)

    • 2 for YXZ layout

    • 3 for ZXY layout

  • pencil[out] Created Pencil object

Returns:

Error::SUCCESS on success or error code on failure.

inline Pencil get_pencil(const int32_t dim) const

Get the pencil object.

Parameters:

dim[in] Required dimension:

  • 0 for XYZ layout (real space, valid for PlanR2C only)

  • 1 for XYZ layout (real space for C2C and R2R plans and fourier space for R2C plans)

  • 2 for YXZ layout

  • 3 for ZXY layout

Throws:

Exception – if underlying call fails

Returns:

Created Pencil object

inline Error execute(void *in, void *out, const Execute execute_type, void *aux = nullptr) const noexcept

Plan execution.

Parameters:
  • in[inout] Incoming pointer

  • out[out] Result pointer

  • execute_type[in] Type of execution

  • aux[inout] Optional auxiliary pointer

Returns:

Error::SUCCESS on success or error code on failure.

inline Error transpose(void *in, void *out, const Transpose transpose_type) const noexcept

Transposes data in a single dimension, e.g. X align -> Y align.

Attention

in and out must not be the same pointer.

Parameters:
  • in[inout] Incoming pointer

  • out[out] Transposed pointer

  • transpose_type[in] Type of transpose to perform.

Returns:

Error::SUCCESS on success or error code on failure.

inline Error get_alloc_size(size_t *alloc_size) const noexcept

Wrapper around Plan.get_local_sizes to obtain alloc_size only.

Parameters:

alloc_size[out] Minimum number of elements to be allocated for in, out or aux buffers. Size of each element in bytes can be obtained by calling Plan.get_element_size.

Returns:

Error::SUCCESS on success or error code on failure.

inline size_t get_alloc_size() const

Wrapper around Plan.get_local_sizes to obtain alloc_size only.

Throws:

Exception – if underlying call fails

Returns:

Minimum number of elements to be allocated for in, out or aux buffers.

inline Error get_local_sizes(std::vector<int32_t> &in_starts, std::vector<int32_t> &in_counts, std::vector<int32_t> &out_starts, std::vector<int32_t> &out_counts, size_t *alloc_size) const noexcept

Get grid decomposition information.

Results may differ on different MPI processes

Note

Before calling this function, user must ensure that in_starts, in_counts, out_starts and out_counts vectors are large enough to hold the data.

Parameters:
  • in_starts[out] Starts of local portion of data in ‘real’ space in reversed order

  • in_counts[out] Sizes of local portion of data in ‘real’ space in reversed order

  • out_starts[out] Starts of local portion of data in ‘fourier’ space in reversed order

  • out_counts[out] Sizes of local portion of data in ‘fourier’ space in reversed order

  • alloc_size[out] Minimum number of elements to be allocated for in, out or aux buffers. Size of each element in bytes can be obtained by calling Plan.get_element_size.

Returns:

Error::SUCCESS on success or error code on failure.

inline Error get_local_sizes(int32_t *in_starts = nullptr, int32_t *in_counts = nullptr, int32_t *out_starts = nullptr, int32_t *out_counts = nullptr, size_t *alloc_size = nullptr) const noexcept

Get grid decomposition information.

Results may differ on different MPI processes

Parameters:
  • in_starts[out] Starts of local portion of data in ‘real’ space in reversed order

  • in_counts[out] Sizes of local portion of data in ‘real’ space in reversed order

  • out_starts[out] Starts of local portion of data in ‘fourier’ space in reversed order

  • out_counts[out] Sizes of local portion of data in ‘fourier’ space in reversed order

  • alloc_size[out] Minimum number of elements to be allocated for in, out or aux buffers. Size of each element in bytes can be obtained by calling Plan.get_element_size.

Returns:

Error::SUCCESS on success or error code on failure.

inline Error get_element_size(size_t *element_size) const noexcept

Obtains number of bytes required to store single element by this plan.

Parameters:

element_size[out] Size of element in bytes

Returns:

Error::SUCCESS on success or error code on failure.

inline size_t get_element_size() const

Obtains number of bytes required to store single element by this plan.

Throws:

Exception – if underlying call fails

Returns:

Size of element in bytes

inline Error get_alloc_bytes(size_t *alloc_bytes) const noexcept

Returns minimum number of bytes required to execute plan.

This function is a combination of two calls: Plan.get_alloc_size and Plan.get_element_size

Parameters:

alloc_bytes[out] Number of bytes required

Returns:

Error::SUCCESS on success or error code on failure.

inline size_t get_alloc_bytes() const

Returns minimum number of bytes required to execute plan.

This function is a combination of two calls: Plan.get_alloc_size and Plan.get_element_size

Throws:

Exception – if underlying call fails

Returns:

Number of bytes of each buffer required to execute plan
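Since get_alloc_bytes is described as a combination of Plan.get_alloc_size and Plan.get_element_size, the byte count is presumably the product of the two values. The sketch below shows that arithmetic with an overflow guard. `buffer_bytes` is a hypothetical helper and not part of dtFFT.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>

// Bytes needed for one buffer: number of elements times bytes per element.
// Throws instead of silently wrapping around if the product overflows size_t.
std::size_t buffer_bytes(std::size_t alloc_size, std::size_t element_size)
{
    if (element_size != 0 &&
        alloc_size > std::numeric_limits<std::size_t>::max() / element_size) {
        throw std::overflow_error("buffer size overflows size_t");
    }
    return alloc_size * element_size;
}
```

For instance, 1000 elements of 16 bytes each (the usual size of a double precision complex value) need 16000 bytes per buffer.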

inline Error get_executor(Executor *executor) const noexcept

Returns executor used by this plan.

Parameters:

executor[out] Executor used by this plan.

Returns:

Error::SUCCESS on success or error code on failure.

inline Executor get_executor() const

Returns executor used by this plan.

Throws:

Exception – if underlying call fails

Returns:

Executor used by this plan.

inline Error get_precision(Precision *precision) const noexcept

Returns precision of the plan.

Parameters:

precision[out] Precision of the plan.

Returns:

Error::SUCCESS on success or error code on failure.

inline Precision get_precision() const

Returns precision of the plan.

Throws:

Exception – if underlying call fails

Returns:

Precision of the plan.

inline Error get_dims(int8_t *ndims, const int32_t *dims[]) const noexcept

Returns global dimensions of the plan.

Note

Do not free the array, it is freed when the dtfft_plan_t is destroyed.

Parameters:
  • ndims[out] Number of dimensions in the plan. User can pass nullptr if this value is not needed.

  • dims[out] Array of dimensions in natural Fortran order. User can pass nullptr if this value is not needed.

Returns:

Error::SUCCESS on success or error code on failure.

inline std::vector<int32_t> get_dims() const

Returns global dimensions of the plan.

Throws:

Exception – if underlying call fails

Returns:

Vector of dimensions in natural Fortran order. Size of vector is equal to number of dimensions in the plan.

inline Error mem_alloc(size_t alloc_bytes, void **ptr) const noexcept

Allocates memory specific for this plan.

Parameters:
  • alloc_bytes[in] Number of bytes to allocate

  • ptr[out] Allocated pointer

Returns:

Error::SUCCESS on success or error code on failure.

inline void *mem_alloc(size_t alloc_bytes) const

Allocates memory specific for this plan.

Parameters:

alloc_bytes – Number of bytes to allocate

Throws:

Exception – if underlying call fails

Returns:

Pointer to allocated memory

inline Error mem_free(void *ptr) const noexcept

Frees memory specific for this plan.

Parameters:

ptr[inout] Allocated pointer

Returns:

Error::SUCCESS on success or error code on failure.
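Because every mem_alloc needs a matching mem_free, it can be convenient to tie the two together in a small RAII wrapper. The class below is a sketch and not part of dtFFT: `PlanBuffer` is a hypothetical name, and it is a template so that it works with any type offering the throwing `mem_alloc(size_t)` overload and `mem_free(void *)` documented above. It assumes the plan is still alive, and not yet destroyed, when the buffer goes out of scope.

```cpp
#include <cstddef>

// Owns one buffer obtained from plan.mem_alloc() and releases it with
// plan.mem_free(). PlanT is any type with the member functions documented
// above, e.g. dtfft::PlanC2C. The plan must outlive the buffer.
template <class PlanT>
class PlanBuffer {
public:
    PlanBuffer(const PlanT& plan, std::size_t bytes)
        : plan_(plan), ptr_(plan.mem_alloc(bytes)) {}

    // The returned error code is discarded because destructors must not throw.
    ~PlanBuffer() { static_cast<void>(plan_.mem_free(ptr_)); }

    // Non-copyable: exactly one owner per allocation.
    PlanBuffer(const PlanBuffer&) = delete;
    PlanBuffer& operator=(const PlanBuffer&) = delete;

    void* data() const noexcept { return ptr_; }

private:
    const PlanT& plan_;
    void* ptr_;
};
```

A typical use would construct one `PlanBuffer` per in, out and aux buffer with the value returned by `get_alloc_bytes()`, and let all of them go out of scope before the plan itself is destroyed.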

inline Error destroy() noexcept

Plan Destructor.

To fully clean all internal memory, this should be called before MPI_Finalize

Returns:

Error::SUCCESS on success or error code on failure.

inline Error get_stream(dtfft_stream_t *stream) const noexcept

Returns stream associated with current Plan.

This can either be stream passed by Config.set_stream followed by set_config() or stream created internally. Returns NULL pointer if plan’s platform is Platform::HOST.

Note

This method is only present in the API when dtFFT was compiled with CUDA Support.

Parameters:

stream[out] CUDA stream associated with plan

Returns:

Error::SUCCESS on success or error code on failure.

inline dtfft_stream_t get_stream() const

Returns stream associated with current Plan.

This can either be stream passed by Config.set_stream followed by set_config() or stream created internally. Returns NULL pointer if plan’s platform is Platform::HOST.

Note

This method is only present in the API when dtFFT was compiled with CUDA Support.

Throws:

Exception – if underlying call fails

Returns:

dtFFT stream associated with plan

inline Error get_backend(Backend &backend) const noexcept

Returns the GPU backend selected during autotuning if effort is Effort::PATIENT.

If effort passed to any create function is Effort::ESTIMATE or Effort::MEASURE returns value set by Config.set_backend followed by set_config() or default value, which is Backend::NCCL.

Note

This method is only present in the API when dtFFT was compiled with CUDA Support.

Returns:

Error::SUCCESS on success or error code on failure.

inline Backend get_backend() const

Returns the GPU backend selected during autotuning if effort is Effort::PATIENT.

If effort passed to any create function is Effort::ESTIMATE or Effort::MEASURE returns value set by Config.set_backend followed by set_config() or default value, which is Backend::NCCL.

Note

This method is only present in the API when dtFFT was compiled with CUDA Support.

Throws:

Exception – if underlying call fails

Returns:

GPU Backend used by this plan.

inline Error get_platform(Platform &platform) const noexcept

Returns plan execution platform.

Returns:

Error::SUCCESS on success or error code on failure.

inline Platform get_platform() const

Returns plan execution platform.

Throws:

Exception – if underlying call fails

Returns:

Platform::HOST if plan is executed on host, Platform::CUDA if plan is executed on CUDA device.

inline dtfft_plan_t c_struct() const
Returns:

Underlying C structure

inline virtual ~Plan() noexcept = 0

Plan Destructor.

To fully clean all internal memory, this should be called before MPI_Finalize

class PlanC2C : public dtfft::Plan

Complex-to-Complex Plan.

Public Functions

inline explicit PlanC2C(const std::vector<int32_t> &dims, MPI_Comm comm = MPI_COMM_WORLD, const Precision precision = Precision::DOUBLE, const Effort effort = Effort::ESTIMATE, const Executor executor = Executor::NONE)

Complex-to-Complex Plan constructor.

Parameters:
  • dims[in] Vector with global dimensions in reversed order. dims.size() must be 2 or 3

  • comm[in] MPI communicator: MPI_COMM_WORLD or Cartesian communicator

  • precision[in] Precision of transform.

  • effort[in] How thoroughly dtFFT searches for the optimal plan

  • executor[in] Type of external FFT executor

Throws:

Exception – In case error occurs during plan creation

inline explicit PlanC2C(const std::vector<int32_t> &dims, const Precision precision, const Effort effort = Effort::ESTIMATE)

Complex-to-Complex Transpose-only Plan constructor.

Parameters:
  • dims[in] Vector with global dimensions in reversed order. dims.size() must be 2 or 3

  • precision[in] Precision of transform.

  • effort[in] How thoroughly dtFFT searches for the optimal plan

Throws:

Exception – In case error occurs during plan creation

inline explicit PlanC2C(const int8_t ndims, const int32_t *dims, MPI_Comm comm = MPI_COMM_WORLD, const Precision precision = Precision::DOUBLE, const Effort effort = Effort::ESTIMATE, const Executor executor = Executor::NONE)

Complex-to-Complex Generic Plan constructor.

Parameters:
  • ndims[in] Number of dimensions: 2 or 3

  • dims[in] Buffer of size ndims with global dimensions in reversed order.

  • comm[in] MPI communicator: MPI_COMM_WORLD or Cartesian communicator

  • precision[in] Precision of transform.

  • effort[in] How thoroughly dtFFT searches for the optimal plan

  • executor[in] Type of external FFT executor

Throws:

Exception – In case error occurs during plan creation
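The constructors above take dims in reversed order, while other parts of this API (for example Pencil and Plan.get_dims) use natural Fortran order. The sketch below converts between the two. It assumes that "reversed order" means the slowest varying dimension comes first, e.g. {nz, ny, nx} for a 3D grid whose X dimension varies fastest. `to_reversed_order` is a hypothetical helper name.

```cpp
#include <cstdint>
#include <vector>

// Converts extents given in natural Fortran order (X first, X varies fastest)
// into the reversed order expected by the plan constructors above.
// Applying the function twice returns the original vector.
std::vector<std::int32_t> to_reversed_order(const std::vector<std::int32_t>& natural)
{
    return std::vector<std::int32_t>(natural.rbegin(), natural.rend());
}
```

With this helper, a grid with nx = 64, ny = 32 and nz = 16 would be passed to a plan constructor as {16, 32, 64}.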

inline explicit PlanC2C(const Pencil &pencil, const Precision precision, const Effort effort = Effort::ESTIMATE)

Complex-to-Complex Plan constructor using pencil decomposition information.

Note

Parameter executor cannot be Executor::NONE. PlanC2C should be used instead.

Parameters:
  • pencil[in] Initialized Pencil object.

  • precision[in] Precision of transform.

  • effort[in] How thoroughly dtFFT searches for the optimal plan

Throws:

Exception – In case error occurs during plan creation

inline explicit PlanC2C(const Pencil &pencil, MPI_Comm comm = MPI_COMM_WORLD, const Precision precision = Precision::DOUBLE, const Effort effort = Effort::ESTIMATE, const Executor executor = Executor::NONE)

Complex-to-Complex Plan constructor using pencil decomposition information.

Note

Parameter executor cannot be Executor::NONE. PlanC2C should be used instead.

Parameters:
  • pencil[in] Initialized Pencil object.

  • comm[in] MPI communicator: MPI_COMM_WORLD or Cartesian communicator

  • precision[in] Precision of transform.

  • effort[in] How thoroughly dtFFT searches for the optimal plan

  • executor[in] Type of external FFT executor

Throws:

Exception – In case error occurs during plan creation

class PlanR2C : public dtfft::Plan

Real-to-Complex Plan.

Note

This class is only present in the API when dtFFT was compiled with any external FFT.

Public Functions

inline explicit PlanR2C(const std::vector<int32_t> &dims, const Executor executor, MPI_Comm comm = MPI_COMM_WORLD, const Precision precision = Precision::DOUBLE, const Effort effort = Effort::ESTIMATE)

Real-to-Complex Plan constructor.

Note

Parameter executor cannot be Executor::NONE. PlanC2C should be used instead.

Parameters:
  • dims[in] Vector with global dimensions in reversed order. dims.size() must be 2 or 3

  • comm[in] MPI communicator: MPI_COMM_WORLD or Cartesian communicator

  • executor[in] Type of external FFT executor

  • precision[in] Precision of transform.

  • effort[in] How thoroughly dtFFT searches for the optimal plan

Throws:

Exception – In case error occurs during plan creation

inline explicit PlanR2C(const int8_t ndims, const int32_t *dims, const Executor executor, MPI_Comm comm = MPI_COMM_WORLD, const Precision precision = Precision::DOUBLE, const Effort effort = Effort::ESTIMATE)

Real-to-Complex Generic Plan constructor.

Note

Parameter executor cannot be Executor::NONE. For a transpose-only plan, use PlanC2C or PlanR2R instead.

Parameters:
  • ndims[in] Number of dimensions: 2 or 3

  • dims[in] Buffer of size ndims with global dimensions in reversed order.

  • executor[in] Type of external FFT executor

  • comm[in] MPI communicator: MPI_COMM_WORLD or Cartesian communicator

  • precision[in] Precision of transform.

  • effort[in] How thoroughly dtFFT searches for the optimal plan

Throws:

Exception – If an error occurs during plan creation.

inline explicit PlanR2C(const Pencil &pencil, const Executor executor, MPI_Comm comm = MPI_COMM_WORLD, const Precision precision = Precision::DOUBLE, const Effort effort = Effort::ESTIMATE)

Real-to-Complex Plan constructor.

Note

Parameter executor cannot be Executor::NONE. For a transpose-only plan, use PlanC2C or PlanR2R instead.

Parameters:
  • pencil[in] Initialized Pencil object.

  • executor[in] Type of external FFT executor

  • comm[in] MPI communicator: MPI_COMM_WORLD or Cartesian communicator

  • precision[in] Precision of transform.

  • effort[in] How thoroughly dtFFT searches for the optimal plan

Throws:

Exception – If an error occurs during plan creation.

class PlanR2R : public dtfft::Plan

Real-to-Real Plan.

Public Functions

inline explicit PlanR2R(const std::vector<int32_t> &dims, const std::vector<R2RKind> &kinds = std::vector<R2RKind>(), MPI_Comm comm = MPI_COMM_WORLD, const Precision precision = Precision::DOUBLE, const Effort effort = Effort::ESTIMATE, const Executor executor = Executor::NONE)

Real-to-Real Plan constructor.

Parameters:
  • dims[in] Vector with global dimensions in reversed order. dims.size() must be 2 or 3

  • kinds[in] Real FFT kinds in reversed order. Can be an empty vector if executor == Executor::NONE

  • comm[in] MPI communicator: MPI_COMM_WORLD or Cartesian communicator

  • precision[in] Precision of transform.

  • effort[in] How thoroughly dtFFT searches for the optimal plan

  • executor[in] Type of external FFT executor.

Throws:

Exception – If an error occurs during plan creation.
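The relationship between kinds and executor can be checked before the constructor is called. The following is a minimal sketch with hypothetical names (`kinds_consistent`, `executor_is_none`) that are not part of dtFFT. It assumes that the vector overload expects one kind per dimension whenever an external executor is selected, as the pointer overload documents for its buffer of size ndims.

```cpp
#include <cstddef>

// Hypothetical pre-check (not part of dtFFT).
// n_dims:           dims.size() passed to PlanR2R
// n_kinds:          kinds.size() passed to PlanR2R
// executor_is_none: true when executor == Executor::NONE
constexpr bool kinds_consistent(std::size_t n_dims, std::size_t n_kinds,
                                bool executor_is_none)
{
    // The documentation allows kinds to be empty when no executor is used.
    if (executor_is_none) {
        return true;
    }
    // Assumption: with an executor, one R2RKind is required per dimension.
    return n_kinds == n_dims;
}
```

A call such as `kinds_consistent(dims.size(), kinds.size(), executor == dtfft::Executor::NONE)` returning false indicates that plan creation would likely fail because of missing or mismatched kinds.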

inline explicit PlanR2R(const std::vector<int32_t> &dims, const Precision precision, const Effort effort = Effort::ESTIMATE)

Real-to-Real Transpose-only Plan constructor.

Parameters:
  • dims[in] Vector with global dimensions in reversed order. dims.size() must be 2 or 3

  • precision[in] Precision of transform.

  • effort[in] How thoroughly dtFFT searches for the optimal plan

Throws:

Exception – If an error occurs during plan creation.

inline explicit PlanR2R(const int8_t ndims, const int32_t *dims, const R2RKind *kinds = nullptr, MPI_Comm comm = MPI_COMM_WORLD, const Precision precision = Precision::DOUBLE, const Effort effort = Effort::ESTIMATE, const Executor executor = Executor::NONE)

Real-to-Real Generic Plan constructor.

Parameters:
  • ndims[in] Number of dimensions: 2 or 3

  • dims[in] Buffer of size ndims with global dimensions in reversed order.

  • kinds[in] Buffer of size ndims with Real FFT kinds in reversed order. Can be nullptr if executor == Executor::NONE

  • comm[in] MPI communicator: MPI_COMM_WORLD or Cartesian communicator

  • precision[in] Precision of transform.

  • effort[in] How thoroughly dtFFT searches for the optimal plan

  • executor[in] Type of external FFT executor.

Throws:

Exception – If an error occurs during plan creation.
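When the sizes already live in std::vector containers, the leading arguments of the generic constructor can be derived from them. The sketch below uses a hypothetical `RawPlanArgs` struct and `to_raw_args` helper; neither is part of dtFFT. It is templated on the kind type so that it compiles without the dtFFT headers; with dtFFT available, `Kind` would be `dtfft::R2RKind`.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical bundle of the leading arguments of the generic constructor.
template <typename Kind>
struct RawPlanArgs {
    int8_t ndims;         // number of dimensions: 2 or 3
    const int32_t *dims;  // buffer of size ndims, reversed order
    const Kind *kinds;    // buffer of size ndims, or nullptr if no kinds given
};

// Borrows pointers from the vectors; the vectors must outlive the result.
template <typename Kind>
RawPlanArgs<Kind> to_raw_args(const std::vector<int32_t> &dims,
                              const std::vector<Kind> &kinds)
{
    if (dims.size() != 2 && dims.size() != 3) {
        throw std::invalid_argument("dims.size() must be 2 or 3");
    }
    if (!kinds.empty() && kinds.size() != dims.size()) {
        throw std::invalid_argument(
            "kinds must be empty or hold one entry per dimension");
    }
    return RawPlanArgs<Kind>{
        static_cast<int8_t>(dims.size()),
        dims.data(),
        kinds.empty() ? nullptr : kinds.data()
    };
}
```

With `auto args = to_raw_args(dims, kinds);`, the plan could then be created as `dtfft::PlanR2R plan(args.ndims, args.dims, args.kinds, comm, precision, effort, executor);`.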

inline explicit PlanR2R(const Pencil &pencil, const Precision precision, const Effort effort = Effort::ESTIMATE)

Real-to-Real Transpose-only Plan constructor.

Parameters:
  • pencil[in] Initialized Pencil object.

  • precision[in] Precision of transform.

  • effort[in] How thoroughly dtFFT searches for the optimal plan

Throws:

Exception – If an error occurs during plan creation.

inline explicit PlanR2R(const Pencil &pencil, const std::vector<R2RKind> &kinds, MPI_Comm comm = MPI_COMM_WORLD, const Precision precision = Precision::DOUBLE, const Effort effort = Effort::ESTIMATE, const Executor executor = Executor::NONE)

Real-to-Real Plan constructor.

Parameters:
  • pencil[in] Initialized Pencil object.

  • kinds[in] Real FFT kinds in reversed order. Can be an empty vector if executor == Executor::NONE

  • comm[in] MPI communicator: MPI_COMM_WORLD or Cartesian communicator

  • precision[in] Precision of transform.

  • effort[in] How thoroughly dtFFT searches for the optimal plan

  • executor[in] Type of external FFT executor.

Throws:

Exception – If an error occurs during plan creation.

inline explicit PlanR2R(const Pencil &pencil, const R2RKind *kinds = nullptr, MPI_Comm comm = MPI_COMM_WORLD, const Precision precision = Precision::DOUBLE, const Effort effort = Effort::ESTIMATE, const Executor executor = Executor::NONE)

Real-to-Real Generic Plan constructor.

Parameters:
  • pencil[in] Initialized Pencil object.

  • kinds[in] Buffer with one Real FFT kind per pencil dimension, in reversed order. Can be nullptr if executor == Executor::NONE

  • comm[in] MPI communicator: MPI_COMM_WORLD or Cartesian communicator

  • precision[in] Precision of transform.

  • effort[in] How thoroughly dtFFT searches for the optimal plan

  • executor[in] Type of external FFT executor.

Throws:

Exception – If an error occurs during plan creation.