Fortran API Reference

The dtFFT library provides a header file, dtfft.f03, which can be included in Fortran source files using #include "dtfft.f03". This file defines the DTFFT_CHECK macro for checking error codes returned by library functions:

call plan%execute(a, b, DTFFT_EXECUTE_FORWARD, aux, error_code)
DTFFT_CHECK(error_code)  ! Checks for execution errors

All library functionality is contained in the Fortran module dtfft, which should be imported with the use dtfft statement in your code.

Error Codes

function  dtfft_get_error_string(error_code)

Gets the string description of an error code

Parameters:

error_code [integer(int32),in] :: Error code to convert to string

Return:

character (len=*) :: Error string explaining error.
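
As a minimal sketch of how this function might be used where the DTFFT_CHECK macro is not available, the hypothetical helper below compares an error code with DTFFT_SUCCESS and prints the corresponding message. The module and subroutine names are illustrative and are not part of dtFFT.

```fortran
module dtfft_check_sketch
  use iso_fortran_env, only: int32, error_unit
  use dtfft
  implicit none
contains
  ! Hypothetical helper (not part of dtFFT): stops the program
  ! when error_code differs from DTFFT_SUCCESS.
  subroutine check_dtfft(error_code)
    integer(int32), intent(in) :: error_code

    if (error_code /= DTFFT_SUCCESS) then
      ! dtfft_get_error_string is the function documented above.
      write(error_unit, '(a)') "dtFFT error: " // dtfft_get_error_string(error_code)
      ! In an MPI program, calling MPI_Abort here may be preferable.
      error stop 1
    end if
  end subroutine check_dtfft
end module dtfft_check_sketch
```

A caller would pass the optional error_code argument to a library routine and then call check_dtfft(error_code).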

All error codes that dtFFT can return are listed below.

DTFFT_SUCCESS

Successful execution

DTFFT_ERROR_MPI_FINALIZED

MPI_Init has not been called or MPI_Finalize has already been called

DTFFT_ERROR_PLAN_NOT_CREATED

Plan not created

DTFFT_ERROR_INVALID_TRANSPOSE_TYPE

Invalid transpose_type provided

DTFFT_ERROR_INVALID_N_DIMENSIONS

Invalid number of dimensions provided. Valid options are 2 and 3

DTFFT_ERROR_INVALID_DIMENSION_SIZE

One or more provided dimension sizes <= 0

DTFFT_ERROR_INVALID_COMM_TYPE

Invalid communicator type provided

DTFFT_ERROR_INVALID_PRECISION

Invalid precision parameter provided

DTFFT_ERROR_INVALID_EFFORT

Invalid effort parameter provided

DTFFT_ERROR_INVALID_EXECUTOR

Invalid executor parameter provided

DTFFT_ERROR_INVALID_COMM_DIMS

Number of dimensions in the provided Cartesian communicator > number of dimensions passed to the create subroutine

DTFFT_ERROR_INVALID_COMM_FAST_DIM

Passed Cartesian communicator with number of processes in 1st (fastest varying) dimension > 1

DTFFT_ERROR_MISSING_R2R_KINDS

For R2R plan, kinds parameter must be passed if executor != DTFFT_EXECUTOR_NONE

DTFFT_ERROR_INVALID_R2R_KINDS

Invalid values detected in kinds parameter

DTFFT_ERROR_R2C_TRANSPOSE_PLAN

Transpose plan is not supported in R2C, use C2C plan instead

DTFFT_ERROR_INPLACE_TRANSPOSE

Inplace transpose is not supported

DTFFT_ERROR_INVALID_AUX

Invalid aux buffer provided

DTFFT_ERROR_INVALID_DIM

Invalid dim passed to get_pencil()

DTFFT_ERROR_INVALID_USAGE

Invalid API Usage. Probably passed NULL pointer

DTFFT_ERROR_PLAN_IS_CREATED

Trying to create already created plan

DTFFT_ERROR_ALLOC_FAILED

Internal allocation failed

DTFFT_ERROR_FREE_FAILED

Internal memory free failed

DTFFT_ERROR_INVALID_ALLOC_BYTES

Invalid alloc_bytes provided

DTFFT_ERROR_DLOPEN_FAILED

Dynamic library loading failed

DTFFT_ERROR_DLSYM_FAILED

Dynamic library symbol lookup failed

DTFFT_ERROR_PENCIL_ARRAYS_SIZE_MISMATCH

Sizes of starts and counts arrays passed to dtfft_pencil_t constructor do not match.

DTFFT_ERROR_PENCIL_ARRAYS_INVALID_SIZES

Sizes of starts and counts < 2 or > 3 provided to dtfft_pencil_t constructor.

DTFFT_ERROR_PENCIL_INVALID_COUNTS

Invalid counts provided to dtfft_pencil_t constructor.

DTFFT_ERROR_PENCIL_INVALID_STARTS

Invalid starts provided to dtfft_pencil_t constructor.

DTFFT_ERROR_PENCIL_SHAPE_MISMATCH

Processes have same lower bounds but different sizes in some dimensions.

DTFFT_ERROR_PENCIL_OVERLAP

Pencil overlap detected, i.e. two processes share same part of global space

DTFFT_ERROR_PENCIL_NOT_CONTINUOUS

Local pencils do not cover the global space without gaps.

DTFFT_ERROR_PENCIL_NOT_INITIALIZED

Pencil is not initialized, i.e. constructor subroutine was not called

DTFFT_ERROR_INVALID_MEASURE_WARMUP_ITERS

Invalid n_measure_warmup_iters provided

DTFFT_ERROR_INVALID_MEASURE_ITERS

Invalid n_measure_iters provided

DTFFT_ERROR_INVALID_REQUEST

Invalid dtfft_request_t provided

DTFFT_ERROR_TRANSPOSE_ACTIVE

Attempting to execute already active transposition

DTFFT_ERROR_TRANSPOSE_NOT_ACTIVE

Attempting to finalize non-active transposition

DTFFT_ERROR_R2R_FFT_NOT_SUPPORTED

Selected executor does not support R2R FFTs

DTFFT_ERROR_GPU_INVALID_STREAM

Invalid stream provided

DTFFT_ERROR_INVALID_BACKEND

Invalid backend provided

DTFFT_ERROR_GPU_NOT_SET

Multiple MPI processes located on the same host share the same GPU, which is not supported

DTFFT_ERROR_VKFFT_R2R_2D_PLAN

When using R2R FFT and executor type is vkFFT and plan uses Z-slab optimization, it is required that types of R2R transform are same in X and Y directions

DTFFT_ERROR_BACKENDS_DISABLED

Passed effort == DTFFT_PATIENT but all backends have been disabled by dtfft_config_t.

DTFFT_ERROR_NOT_DEVICE_PTR

One of the pointers passed to execute() or transpose() cannot be accessed from the device

DTFFT_ERROR_NOT_NVSHMEM_PTR

One of the pointers passed to execute() or transpose() is not an NVSHMEM pointer

DTFFT_ERROR_INVALID_PLATFORM

Invalid platform provided

DTFFT_ERROR_INVALID_PLATFORM_EXECUTOR

Invalid executor provided for selected platform

DTFFT_ERROR_INVALID_PLATFORM_BACKEND

Invalid backend provided for selected platform

DTFFT_ERROR_COMPRESSION_CUDA_NOT_SUPPORTED

CUDA support is not available for compression

DTFFT_ERROR_COMPRESSION_INVALID_RATE

Invalid compression rate

DTFFT_ERROR_COMPRESSION_INVALID_PRECISION

Invalid compression precision

DTFFT_ERROR_COMPRESSION_INVALID_TOLERANCE

Invalid compression tolerance

Basic types

dtfft_execute_t

type  dtfft_execute_t

Enumerated type used to specify the direction of execution in the execute() method.

Type Parameters

DTFFT_EXECUTE_FORWARD

Forward execution: Performs the sequence XYZ to YXZ to ZXY.

DTFFT_EXECUTE_BACKWARD

Backward execution: Performs the sequence ZXY to YXZ to XYZ.


dtfft_transpose_t

type  dtfft_transpose_t

Enumerated type used to specify the transposition direction in the transpose() method.

Type Parameters

DTFFT_TRANSPOSE_X_TO_Y

Transpose from Fortran X-aligned to Fortran Y-aligned

DTFFT_TRANSPOSE_Y_TO_X

Transpose from Fortran Y-aligned to Fortran X-aligned

DTFFT_TRANSPOSE_Y_TO_Z

Transpose from Fortran Y-aligned to Fortran Z-aligned

DTFFT_TRANSPOSE_Z_TO_Y

Transpose from Fortran Z-aligned to Fortran Y-aligned

DTFFT_TRANSPOSE_X_TO_Z

Transpose from Fortran X-aligned to Fortran Z-aligned

Note

This value is valid only for 3D plans, and only when get_z_slab_enabled() returns .true.

DTFFT_TRANSPOSE_Z_TO_X

Transpose from Fortran Z-aligned to Fortran X-aligned

Note

This value is valid only for 3D plans, and only when get_z_slab_enabled() returns .true.


dtfft_reshape_t

type  dtfft_reshape_t

Enumerated type used to specify the reshape direction in the reshape() method.

Type Parameters

DTFFT_RESHAPE_X_BRICKS_TO_PENCILS

Perform reshape to X-aligned pencils from X-aligned bricks

DTFFT_RESHAPE_X_PENCILS_TO_BRICKS

Perform reshape to X-aligned bricks from X-aligned pencils

DTFFT_RESHAPE_Z_PENCILS_TO_BRICKS

Perform reshape to Z-aligned bricks from Z-aligned pencils

DTFFT_RESHAPE_Z_BRICKS_TO_PENCILS

Perform reshape to Z-aligned pencils from Z-aligned bricks

DTFFT_RESHAPE_Y_PENCILS_TO_BRICKS

Perform reshape to Y-aligned bricks from Y-aligned pencils. This is an alias to DTFFT_RESHAPE_Z_PENCILS_TO_BRICKS for 2D plans

DTFFT_RESHAPE_Y_BRICKS_TO_PENCILS

Perform reshape to Y-aligned pencils from Y-aligned bricks. This is an alias to DTFFT_RESHAPE_Z_BRICKS_TO_PENCILS for 2D plans


dtfft_executor_t

type  dtfft_executor_t

Type that specifies external FFT executor

Type Parameters

DTFFT_EXECUTOR_NONE

Do not create any FFT plans. Creates transpose only plan.

DTFFT_EXECUTOR_FFTW3

FFTW3 Executor (Host only)

DTFFT_EXECUTOR_MKL

MKL DFTI Executor (Host only)

DTFFT_EXECUTOR_CUFFT

CUFFT Executor (GPU Only)

DTFFT_EXECUTOR_VKFFT

VkFFT Executor (GPU Only)


dtfft_effort_t

type  dtfft_effort_t

Type that specifies effort that dtFFT should use when creating plan

Type Parameters

DTFFT_ESTIMATE

Create plan as fast as possible

DTFFT_MEASURE

Attempts to find the best MPI grid decomposition. Passing this flag together with an MPI communicator that has a Cartesian topology to any plan constructor is the same as passing DTFFT_ESTIMATE.

DTFFT_PATIENT

This effort option extends DTFFT_MEASURE by selecting the best-performing communication backend for All-to-All communications.

DTFFT_EXHAUSTIVE

This maximum-effort option extends DTFFT_PATIENT by including kernel autotuning (both host and GPU, depending on the execution platform) and selecting the best-performing backend for reshape operations.

This level also enables autotuning of dtfft_transpose_mode_t by executing each generic backend twice. It is not recommended to use this effort level with Global-dimension workflow on a huge number of processes.


dtfft_precision_t

type  dtfft_precision_t

Type that specifies precision of dtFFT plan

Type Parameters

DTFFT_SINGLE

Use Single precision

DTFFT_DOUBLE

Use Double precision

Related Type functions

function  dtfft_get_precision_string(precision)

Gets the string description of a precision level

Parameters:

precision [dtfft_precision_t,in] :: Precision level to convert to string

Return:

character (len=*) :: String representation of dtfft_precision_t


dtfft_r2r_kind_t

type  dtfft_r2r_kind_t

Type that specifies various kinds of R2R FFTs

Type Parameters

DTFFT_DCT_1

DCT-I (Logical N=2*(n-1), inverse is DTFFT_DCT_1)

DTFFT_DCT_2

DCT-II (Logical N=2*n, inverse is DTFFT_DCT_3)

DTFFT_DCT_3

DCT-III (Logical N=2*n, inverse is DTFFT_DCT_2)

DTFFT_DCT_4

DCT-IV (Logical N=2*n, inverse is DTFFT_DCT_4)

DTFFT_DST_1

DST-I (Logical N=2*(n+1), inverse is DTFFT_DST_1)

DTFFT_DST_2

DST-II (Logical N=2*n, inverse is DTFFT_DST_3)

DTFFT_DST_3

DST-III (Logical N=2*n, inverse is DTFFT_DST_2 )

DTFFT_DST_4

DST-IV (Logical N=2*n, inverse is DTFFT_DST_4)


dtfft_backend_t

type  dtfft_backend_t

Type that specifies the communication backends available in dtFFT

Type Parameters

DTFFT_BACKEND_MPI_DATATYPE

Backend that uses MPI datatypes.

Not recommended when platform is DTFFT_PLATFORM_CUDA, since it is far slower than the other backends. It is included only to demonstrate how slow MPI datatypes are for GPU usage.

DTFFT_BACKEND_MPI_P2P

MPI peer-to-peer algorithm

DTFFT_BACKEND_MPI_P2P_PIPELINED

MPI peer-to-peer algorithm with overlapping data copying and unpacking

DTFFT_BACKEND_MPI_A2A

MPI backend using MPI_Alltoallv

DTFFT_BACKEND_MPI_RMA

MPI RMA backend

DTFFT_BACKEND_MPI_RMA_PIPELINED

Pipelined MPI RMA backend

DTFFT_BACKEND_MPI_P2P_SCHEDULED

MPI peer-to-peer algorithm with scheduled communication

DTFFT_BACKEND_MPI_P2P_FUSED

MPI peer-to-peer pipelined algorithm with overlapping packing, exchange and unpacking with scheduled communication

DTFFT_BACKEND_MPI_RMA_FUSED

MPI RMA pipelined algorithm with overlapping packing, exchange and unpacking with scheduled communication

DTFFT_BACKEND_MPI_P2P_COMPRESSED

MPI peer-to-peer compressed algorithm

DTFFT_BACKEND_MPI_RMA_COMPRESSED

MPI RMA compressed algorithm

DTFFT_BACKEND_NCCL

NCCL backend

DTFFT_BACKEND_NCCL_PIPELINED

NCCL backend with overlapping data copying and unpacking

DTFFT_BACKEND_NCCL_COMPRESSED

NCCL backend that performs compression before data exchange and decompression after.

DTFFT_BACKEND_CUFFTMP

cuFFTMp backend

DTFFT_BACKEND_CUFFTMP_PIPELINED

cuFFTMp backend that uses additional buffer to avoid extra copy and gain performance.

DTFFT_BACKEND_ADAPTIVE

Adaptive backend that selects best backend for each transpose/reshape operation during plan creation.

Related Type functions

function  dtfft_get_backend_string(backend)

Gets the string description of a backend

Parameters:

backend [dtfft_backend_t,in] :: Backend

Return:

character (len=*) :: Backend string

function  dtfft_get_backend_pipelined(backend)

Check if backend is pipelined

Parameters:

backend [dtfft_backend_t,in] :: Backend to check

Return:

logical :: .true. if backend is pipelined, .false. otherwise
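
The two helper functions above could be combined, for instance, to report which backend is in use. This is only a sketch: the subroutine name describe_backend is hypothetical, and how the backend value is obtained is left to the caller.

```fortran
! Hypothetical helper (not part of dtFFT): prints a short report
! about the given backend using the functions documented above.
subroutine describe_backend(backend)
  use dtfft
  implicit none
  type(dtfft_backend_t), intent(in) :: backend

  write(*, '(a)') "Backend: " // dtfft_get_backend_string(backend)
  if (dtfft_get_backend_pipelined(backend)) then
    write(*, '(a)') "This backend is pipelined"
  else
    write(*, '(a)') "This backend is not pipelined"
  end if
end subroutine describe_backend
```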


dtfft_transpose_mode_t

type  dtfft_transpose_mode_t

Type that specifies at which stage the local transposition is performed during global exchange.

Type Parameters

DTFFT_TRANSPOSE_MODE_PACK

Perform transposition during the packing stage (Sender side).

DTFFT_TRANSPOSE_MODE_UNPACK

Perform transposition during the unpacking stage (Receiver side).


dtfft_access_mode_t

type  dtfft_access_mode_t

Type that specifies the memory access pattern (optimization target) for local transposition in Generic backends.

Type Parameters

DTFFT_ACCESS_MODE_WRITE

Optimize for contiguous write access (default).

DTFFT_ACCESS_MODE_READ

Optimize for contiguous read access.


dtfft_compression_mode_t

type  dtfft_compression_mode_t

Type that specifies compression mode.

Type Parameters

DTFFT_COMPRESSION_MODE_LOSSLESS

Lossless compression mode.

DTFFT_COMPRESSION_MODE_FIXED_RATE

Fixed rate compression mode.

DTFFT_COMPRESSION_MODE_FIXED_PRECISION

Fixed precision compression mode.

DTFFT_COMPRESSION_MODE_FIXED_ACCURACY

Fixed accuracy compression mode.


dtfft_compression_lib_t

type  dtfft_compression_lib_t

Type that specifies compression library.

Type Parameters

DTFFT_COMPRESSION_LIB_ZFP

ZFP compression library.


dtfft_compression_config_t

type  dtfft_compression_config_t

Type that specifies compression configuration.

Type fields:
  • % compression_lib [dtfft_compression_lib_t] :: Compression library to use.

  • % compression_mode [dtfft_compression_mode_t] :: Compression mode to use.

  • % rate [real(real64)] :: Rate for DTFFT_COMPRESSION_MODE_FIXED_RATE.

  • % precision [integer(int32)] :: Precision for DTFFT_COMPRESSION_MODE_FIXED_PRECISION.

  • % tolerance [real(real64)] :: Tolerance for DTFFT_COMPRESSION_MODE_FIXED_ACCURACY.
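
A possible way to fill this type for fixed-rate compression is sketched below. It assumes the fields listed above are public and that dtFFT was built with compression support. The subroutine name and the idea of passing the rate in from the caller are illustrative choices, not part of dtFFT.

```fortran
! Hypothetical helper (not part of dtFFT): switches an existing
! compression configuration to fixed-rate ZFP compression.
subroutine use_fixed_rate(cfg, rate)
  use iso_fortran_env, only: real64
  use dtfft
  implicit none
  type(dtfft_compression_config_t), intent(inout) :: cfg
  real(real64), intent(in) :: rate   ! Caller-chosen rate; valid range is not stated here

  cfg%compression_lib  = DTFFT_COMPRESSION_LIB_ZFP
  cfg%compression_mode = DTFFT_COMPRESSION_MODE_FIXED_RATE
  ! Only the rate field is documented as relevant in fixed-rate mode;
  ! precision and tolerance are left untouched.
  cfg%rate = rate
end subroutine use_fixed_rate
```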


dtfft_config_t

type  dtfft_config_t

Type that can be used to set additional configuration parameters to dtFFT

Type fields:
  • % enable_log [logical] ::

    Should dtFFT print additional information during plan creation or not.

    Default is .false.

  • % enable_z_slab [logical] ::

    Should dtFFT use Z-slab optimization or not.

    Default is .true.

    One should consider disabling Z-slab optimization in order to resolve DTFFT_ERROR_VKFFT_R2R_2D_PLAN error or when underlying FFT implementation of 2D plan is too slow.

    In all other cases it is considered that Z-slab is always faster, since it reduces number of data transpositions.

  • % enable_y_slab [logical] ::

    Should dtFFT use Y-slab optimization or not.

    Default is .false.

    One should consider disabling Y-slab optimization in order to resolve DTFFT_ERROR_VKFFT_R2R_2D_PLAN error or when underlying FFT implementation of 2D plan is too slow.

    In all other cases it is considered that Y-slab is always faster, since it reduces number of data transpositions.

  • % n_measure_warmup_iters [integer(int32)] ::

    Number of warmup iterations to execute during backend and kernel autotuning when effort level is DTFFT_MEASURE or higher.

    Default is 2

  • % n_measure_iters [integer(int32)] ::

    Number of iterations to execute during backend and kernel autotuning when effort level is DTFFT_MEASURE or higher.

    Default is 5

    When dtFFT is built with CUDA support, this value is also used to determine the number of iterations when selecting the block of threads for the NVRTC transpose kernel.

  • % platform [type(dtfft_platform_t)] ::

    Selects platform to execute plan.

    Default is DTFFT_PLATFORM_HOST

    This option is only defined in a build with device support. Even when dtFFT is built with device support, it does not necessarily mean that all plans must be device-related.

    Note

    This field is only present in the API when dtFFT was compiled with CUDA Support.

  • % stream [type(dtfft_stream_t)] ::

    Main CUDA stream that will be used in dtFFT.

    This parameter lets the user supply a custom stream.

    The stream actually used by a dtFFT plan is returned by the get_stream() function.

    A user who sets a custom stream is responsible for destroying it.

    Stream must not be destroyed before call to destroy().

    Note

    This field is only present in the API when dtFFT was compiled with CUDA Support.

  • % backend [type(dtfft_backend_t)] ::

    Backend that will be used by dtFFT when effort is DTFFT_ESTIMATE or DTFFT_MEASURE.

    Default for HOST platform is DTFFT_BACKEND_MPI_DATATYPE.

    Default for CUDA platform is DTFFT_BACKEND_NCCL if NCCL is enabled, otherwise DTFFT_BACKEND_MPI_P2P.

  • % reshape_backend [type(dtfft_backend_t)] ::

    Backend that will be used by dtFFT for reshape operations.

    Defaults are same as backend.

  • % enable_datatype_backend [logical] ::

    Should DTFFT_BACKEND_MPI_DATATYPE be considered for autotuning when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

    Default is .true.

    This option only works when platform is DTFFT_PLATFORM_HOST. When platform is DTFFT_PLATFORM_CUDA, DTFFT_BACKEND_MPI_DATATYPE is always disabled during autotuning.

  • % enable_mpi_backends [logical] ::

    Should MPI Backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

    Default is .false.

    The following applies only to CUDA builds. MPI backends are disabled by default during autotuning because of an OpenMPI bug (https://github.com/open-mpi/ompi/issues/12849): during plan autotuning, GPU memory is not freed completely. For example, for a 1024x1024x512 C2C plan in double precision on a single GPU with Z-slab optimization and MPI backends enabled, plan autotuning leaks 8 GB of GPU memory. Without Z-slab optimization, running on 4 GPUs, it leaks 24 GB on each GPU.

    One workaround is to disable MPI backends by default, which is done here.

    Another is to pass "--mca btl_smcuda_use_cuda_ipc 0" to mpiexec, but disabling CUDA IPC was observed to seriously degrade the overall performance of MPI algorithms.

  • % enable_pipelined_backends [logical] ::

    Should pipelined backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

    Default is .true.

  • % enable_rma_backends [logical] ::

    Should RMA backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

    Default is .true.

  • % enable_fused_backends [logical] ::

    Should fused backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

    Default is .true.

  • % enable_nccl_backends [logical] ::

    Should NCCL Backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

    Default is .true.

    Note

    This field is only present in the API when dtFFT was compiled with CUDA Support.

  • % enable_nvshmem_backends [logical] ::

    Should NVSHMEM Backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

    Default is .true.

    Note

    This field is only present in the API when dtFFT was compiled with CUDA Support.

  • % enable_kernel_autotune [logical] ::

    Should dtFFT try to optimize kernel launch parameters during plan creation when effort is below DTFFT_EXHAUSTIVE.

    Default is .false.

    Kernel optimization is always enabled for DTFFT_EXHAUSTIVE effort level. Setting this option to true enables kernel optimization for lower effort levels (DTFFT_ESTIMATE, DTFFT_MEASURE, DTFFT_PATIENT). This may increase plan creation time but can improve runtime performance. Since kernel optimization is performed without data transfers, the time increase is usually minimal.

    This option can also be controlled using the DTFFT_ENABLE_KERNEL_AUTOTUNE environment variable.

  • % enable_fourier_reshape [logical] ::

    Should dtFFT execute reshapes from pencils to bricks and vice versa in Fourier space during calls to execute.

    Default is .false.

    When enabled, data will be in brick layout in Fourier space, which may be useful for certain operations between forward and backward transforms. However, this requires additional data transpositions and will reduce overall FFT performance.

    This option can also be controlled using the DTFFT_ENABLE_FOURIER_RESHAPE environment variable.

  • % transpose_mode [dtfft_transpose_mode_t] ::

    Specifies at which stage the local transposition is performed during global exchange.

    Default is DTFFT_TRANSPOSE_MODE_PACK

  • % access_mode [dtfft_access_mode_t] ::

    Specifies the memory access pattern (write/read) for local transposition in Generic backends.

    Default is DTFFT_ACCESS_MODE_WRITE

    This option can also be controlled using the DTFFT_ACCESS_MODE environment variable.

  • % enable_compressed_backends [logical] ::

    Should compressed backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

    Default is .false.

    Only fixed-rate compression can be used during autotuning, since it provides predictable performance characteristics and does not require data-dependent decisions at runtime. To enable compressed backends during autotuning, set this option to .true., set the compression mode to DTFFT_COMPRESSION_MODE_FIXED_RATE and provide the desired compression rate.

    This option can also be controlled using the DTFFT_ENABLE_COMPRESSED_BACKENDS environment variable.

  • % compression_config_transpose [dtfft_compression_config_t] :: Options for compression approach during transpositions.

  • % compression_config_reshape [dtfft_compression_config_t] :: Options for compression approach during reshape operations.

Related Type functions

function  dtfft_create_config(config)

Creates dtfft_config_t object and sets default values to it.

Parameters:

config [dtfft_config_t,out] :: Constructed dtFFT config ready to be set by call to dtfft_set_config()


function  dtfft_config_t([enable_log, enable_z_slab, enable_y_slab, n_measure_warmup_iters, n_measure_iters, platform, stream, backend, reshape_backend, enable_datatype_backend, enable_mpi_backends, enable_pipelined_backends, enable_rma_backends, enable_fused_backends, enable_nccl_backends, enable_nvshmem_backends, enable_kernel_autotune, enable_fourier_reshape, transpose_mode, access_mode, enable_compressed_backends, compression_config_transpose, compression_config_reshape])

Type-bound constructor. All parameters are optional. Once constructed, the config can be passed to dtfft_set_config() to apply a custom configuration to dtFFT.

Note

Some of the parameters are only available when dtFFT is built with CUDA or with Compression support.

Options:
  • enable_log [logical,in, optional] :: Should dtFFT print additional information during plan creation or not.

  • enable_z_slab [logical,in, optional] :: Should dtFFT use Z-slab optimization or not.

  • enable_y_slab [logical,in, optional] :: Should dtFFT use Y-slab optimization or not.

  • n_measure_warmup_iters [integer(int32),in, optional] :: Number of warmup iterations to execute during backend and kernel autotuning when effort level is DTFFT_MEASURE or higher.

  • n_measure_iters [integer(int32),in, optional] :: Number of iterations to execute during backend and kernel autotuning when effort level is DTFFT_MEASURE or higher.

  • platform [dtfft_platform_t,in, optional] :: Selects platform to execute plan.

  • stream [dtfft_stream_t,in, optional] :: Main CUDA stream that will be used in dtFFT.

  • backend [dtfft_backend_t,in, optional] :: Backend that will be used by dtFFT when effort is DTFFT_ESTIMATE or DTFFT_MEASURE.

  • reshape_backend [dtfft_backend_t,in, optional] :: Backend that will be used by dtFFT for data reshaping from bricks to pencils and vice versa when effort is DTFFT_ESTIMATE or DTFFT_MEASURE.

  • enable_datatype_backend [logical,in, optional] :: Should DTFFT_BACKEND_MPI_DATATYPE be considered for autotuning when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

  • enable_mpi_backends [logical,in, optional] :: Should MPI Backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

  • enable_pipelined_backends [logical,in, optional] :: Should pipelined backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

  • enable_rma_backends [logical,in, optional] :: Should RMA backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

  • enable_fused_backends [logical,in, optional] :: Should fused backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

  • enable_nccl_backends [logical,in, optional] :: Should NCCL Backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

  • enable_nvshmem_backends [logical,in, optional] :: Should NVSHMEM Backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

  • enable_kernel_autotune [logical,in, optional] :: Should dtFFT try to optimize kernel launch parameters during plan creation when effort is below DTFFT_EXHAUSTIVE.

  • enable_fourier_reshape [logical,in, optional] :: Should dtFFT execute reshapes from pencils to bricks in Fourier space.

  • transpose_mode [dtfft_transpose_mode_t,in, optional] :: Specifies at which stage the local transposition is performed during global exchange.

  • access_mode [dtfft_access_mode_t,in, optional] :: Specifies the memory access pattern (write/read) for local transposition in Generic backends.

  • enable_compressed_backends [logical,in, optional] :: Should compressed backends be enabled when effort is DTFFT_PATIENT or DTFFT_EXHAUSTIVE.

  • compression_config_transpose [dtfft_compression_config_t,in, optional] :: Options for compression approach during transpositions.

  • compression_config_reshape [dtfft_compression_config_t,in, optional] :: Options for compression approach during reshape operations.

Return:

dtfft_config_t :: Constructed dtFFT config ready to be set by call to dtfft_set_config()


subroutine  dtfft_set_config(config[, error_code])

Set configuration values to dtFFT.

To take effect, this subroutine must be called before plan creation.

Parameters:

config [dtfft_config_t,in] :: Config to set

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user
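
A minimal sketch of the workflow described above: build a config with the keyword constructor, then apply it before any plan is created. The chosen option values and the subroutine name configure_dtfft are illustrative assumptions. Only fields that are documented as present in every build are used.

```fortran
! Hypothetical helper (not part of dtFFT): applies a custom
! configuration. Intended to be called before plan creation.
subroutine configure_dtfft()
  use iso_fortran_env, only: int32, error_unit
  use dtfft
  implicit none
  type(dtfft_config_t) :: config
  integer(int32) :: ierr

  ! All constructor arguments are optional; the values are examples only.
  config = dtfft_config_t(enable_log=.true.,     &
                          enable_z_slab=.false., &
                          n_measure_iters=10_int32)

  call dtfft_set_config(config, error_code=ierr)
  if (ierr /= DTFFT_SUCCESS) then
    write(error_unit, '(a)') dtfft_get_error_string(ierr)
    error stop 1
  end if
end subroutine configure_dtfft
```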


dtfft_pencil_t

type  dtfft_pencil_t

Type used to hold pencil decomposition info.

There are two ways users might find pencils useful inside dtFFT:

  1. To create a plan using the user's own grid decomposition, you can pass a pencil to the plan constructors.

  2. To obtain a pencil from a plan in any of the possible layouts, in order to run an FFT that is not available in dtFFT.

When pencil is returned from get_pencil(), all pencil properties are defined.

Type fields:
  • % dim [integer(int8)] :: Aligned dimension id starting from 1

  • % ndims [integer(int8)] :: Number of dimensions in a pencil

  • % starts (*) [integer(int32),allocatable] :: Local starts in natural Fortran order

  • % counts (*) [integer(int32),allocatable] :: Local counts in natural Fortran order

  • % size [integer(int64)] :: Total number of elements in a pencil

Related Type functions

function  dtfft_pencil_t(starts, counts)

Type bound constructor

Parameters:
  • starts (*) [integer(int32),in] :: Local starts in natural Fortran order

  • counts (*) [integer(int32),in] :: Local counts in natural Fortran order
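
The constructor above might be wrapped as follows. This sketch checks up front the two size conditions that the error codes DTFFT_ERROR_PENCIL_ARRAYS_SIZE_MISMATCH and DTFFT_ERROR_PENCIL_ARRAYS_INVALID_SIZES describe. The module and function names are hypothetical, and the caller supplies the actual starts and counts of its local portion of the grid.

```fortran
module pencil_sketch
  use iso_fortran_env, only: int32
  use dtfft
  implicit none
contains
  ! Hypothetical helper (not part of dtFFT): builds a pencil from
  ! the local starts and counts given in natural Fortran order.
  function make_pencil(starts, counts) result(pencil)
    integer(int32), intent(in) :: starts(:), counts(:)
    type(dtfft_pencil_t) :: pencil

    ! Mirror the documented constraints before calling the library.
    if (size(starts) /= size(counts)) error stop "starts/counts size mismatch"
    if (size(starts) < 2 .or. size(starts) > 3) error stop "expected 2 or 3 dimensions"

    pencil = dtfft_pencil_t(starts, counts)
  end function make_pencil
end module pencil_sketch
```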


dtfft_platform_t

type  dtfft_platform_t

Type that specifies the execution platform, such as Host, CUDA, or HIP

Type Parameters

DTFFT_PLATFORM_HOST

Create HOST-related plan

DTFFT_PLATFORM_CUDA

Create CUDA-related plan


dtfft_stream_t

type  dtfft_stream_t

dtFFT stream representation.

Type fields:
  • % stream [type(c_ptr)] :: Actual stream pointer

Related Type functions

function  dtfft_stream_t(stream)

C-pointer constructor

Parameters:

stream [type(c_ptr),in] :: Stream pointer

Return:

dtfft_stream_t :: Stream object

function  dtfft_stream_t(stream)

CUDA-Fortran stream constructor

Parameters:

stream [integer(cuda_stream_kind),in] :: CUDA-Fortran stream

Return:

dtfft_stream_t :: Stream object

function  dtfft_get_cuda_stream(stream)

Gets CUDA stream from dtfft_stream_t object

Parameters:

stream [dtfft_stream_t,in] :: Stream object

Return:

integer (cuda_stream_kind) :: CUDA-Fortran stream

dtfft_request_t

type  dtfft_request_t

Helper type to manage asynchronous operations

See transpose_start(), transpose_end(), reshape_start(), reshape_end()

Version handling

Parameters

DTFFT_VERSION_MAJOR

dtFFT Major Version

DTFFT_VERSION_MINOR

dtFFT Minor Version

DTFFT_VERSION_PATCH

dtFFT Patch Version

DTFFT_VERSION_CODE

dtFFT Version Code. Can be used in Version comparison


Functions

function  dtfft_get_version()

Return:

integer (int32) :: Version Code defined during compilation

function  dtfft_get_version(major, minor, patch)

Computes Version Code based on Major, Minor and Patch versions

Parameters:
  • major [integer(int32)] :: Major version

  • minor [integer(int32)] :: Minor version

  • patch [integer(int32)] :: Patch version

Return:

integer (int32) :: Requested Version Code
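
Since the version code is described as usable in version comparisons, the two overloads could be combined into a runtime check like the one sketched here. The function name dtfft_at_least is hypothetical, and the sketch assumes that a newer version always yields a larger version code.

```fortran
! Hypothetical helper (not part of dtFFT): returns .true. when the
! linked dtFFT is at least version major.minor.patch.
function dtfft_at_least(major, minor, patch) result(ok)
  use iso_fortran_env, only: int32
  use dtfft
  implicit none
  integer(int32), intent(in) :: major, minor, patch
  logical :: ok

  ! Left side: version code defined during compilation of dtFFT.
  ! Right side: version code computed from the requested version.
  ok = dtfft_get_version() >= dtfft_get_version(major, minor, patch)
end function dtfft_at_least
```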


Abstract plan

type  dtfft_plan_t

Abstract class for all dtFFT plans

Type bound procedures

reshape

subroutine  reshape(in, out, reshape_type[, aux, error_code])

Performs reshape from bricks to pencils layout or vice versa

Parameters:
  • in [type(*), dimension(..),inout] :: Incoming buffer of any rank and kind.

  • out [type(*), dimension(..),inout] :: Resulting buffer of any rank and kind

  • reshape_type [dtfft_reshape_t,in] :: Type of reshape

Options:
  • aux [type(*), dimension(..),inout, optional] :: Optional auxiliary buffer. If provided, size must be at least the value returned by get_aux_size_reshape().

  • error_code [integer(int32),out, optional] :: Optional error code returned to user


reshape_ptr

subroutine  reshape_ptr(in, out, reshape_type, aux[, error_code])

Performs reshape from bricks to pencils layout or vice versa using pointers

Parameters:
  • in [type(c_ptr),in] :: Incoming pointer

  • out [type(c_ptr),in] :: Resulting pointer

  • reshape_type [dtfft_reshape_t,in] :: Type of reshape

  • aux [type(c_ptr),in] :: Auxiliary pointer. Not optional. Must pass c_null_ptr if not used. If provided, size must be at least the value returned by get_aux_size_reshape().

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user


reshape_start

function  reshape_start(in, out, reshape_type [, aux, error_code]) result(request)

Starts asynchronous reshape operation

Parameters:
  • in [type(*), dimension(..),inout] :: Incoming buffer of any rank and kind.

  • out [type(*), dimension(..),inout] :: Resulting buffer of any rank and kind

  • reshape_type [dtfft_reshape_t,in] :: Type of reshape

Options:
  • aux [type(*), dimension(..),inout, optional] :: Optional auxiliary buffer. If provided, size must be at least the value returned by get_aux_size_reshape().

  • error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

request [dtfft_request_t] :: Asynchronous handle describing started reshape operation


reshape_start_ptr

function  reshape_start_ptr(in, out, reshape_type, aux [, error_code]) result(request)

Starts asynchronous reshape operation using pointers

Parameters:
  • in [type(c_ptr),in] :: Incoming pointer

  • out [type(c_ptr),in] :: Resulting pointer

  • reshape_type [dtfft_reshape_t,in] :: Type of reshape

  • aux [type(c_ptr),in] :: Auxiliary pointer. Not optional. Must pass c_null_ptr if not used. If provided, size must be at least the value returned by get_aux_size_reshape().

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

request [dtfft_request_t] :: Asynchronous handle describing started reshape operation


reshape_end

subroutine  reshape_end(request[, error_code])

Ends previously started reshape operation

Parameters:

request [dtfft_request_t,inout] :: Asynchronous handle describing started reshape operation

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user
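
The asynchronous pair reshape_start() / reshape_end() might be used as sketched below. The module and subroutine names and the buffer types are illustrative assumptions. The sketch also assumes that neither buffer may be touched between the two calls, which this reference does not state explicitly.

```fortran
module reshape_async_sketch
  use iso_fortran_env, only: int32, real64
  use dtfft
  implicit none
contains
  ! Hypothetical helper (not part of dtFFT): reshapes X-aligned bricks
  ! into X-aligned pencils asynchronously.
  subroutine reshape_async(plan, bricks, pencils, ierr)
    class(dtfft_plan_t), intent(inout) :: plan
    real(real64), intent(inout) :: bricks(:), pencils(:)
    integer(int32), intent(out) :: ierr
    type(dtfft_request_t) :: request

    request = plan%reshape_start(bricks, pencils, &
                                 DTFFT_RESHAPE_X_BRICKS_TO_PENCILS, error_code=ierr)
    if (ierr /= DTFFT_SUCCESS) return

    ! Work that does not depend on either buffer could go here.

    call plan%reshape_end(request, ierr)
  end subroutine reshape_async
end module reshape_async_sketch
```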


transpose

subroutine  transpose(in, out, transpose_type[, aux, error_code])

Performs single transposition

Parameters:
  • in [type(*), dimension(..),inout] :: Incoming buffer of any rank and kind.

  • out [type(*), dimension(..),inout] :: Resulting buffer of any rank and kind

  • transpose_type [dtfft_transpose_t,in] :: Type of transposition

Options:
  • aux [type(*), dimension(..),inout, optional] :: Optional auxiliary buffer. If provided, size must be at least the value returned by get_aux_size_transpose().

  • error_code [integer(int32),out, optional] :: Optional error code returned to user


transpose_ptr

subroutine  transpose_ptr(in, out, transpose_type, aux[, error_code])

Performs single transposition using type(c_ptr) pointers instead of buffers

Parameters:
  • in [type(c_ptr),in] :: Incoming pointer

  • out [type(c_ptr),in] :: Resulting pointer

  • transpose_type [dtfft_transpose_t,in] :: Type of transposition

  • aux [type(c_ptr),in] :: Auxiliary pointer. Not optional. Must pass c_null_ptr if not used. If provided, size must be at least the value returned by get_aux_size_transpose().

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user


transpose_start

function  transpose_start(in, out, transpose_type[, aux, error_code])

Starts an asynchronous transpose operation.

Parameters:
  • in [type(*), dimension(..),inout] :: Incoming buffer of any rank and kind.

  • out [type(*), dimension(..),inout] :: Resulting buffer of any rank and kind

  • transpose_type [dtfft_transpose_t,in] :: Type of transposition

Options:
  • aux [type(*), dimension(..),inout, optional] :: Optional auxiliary buffer. If provided, size must be at least the value returned by get_aux_size_transpose().

  • error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

request [dtfft_request_t] :: Asynchronous handle describing started transpose operation


transpose_start_ptr

function  transpose_start_ptr(in, out, transpose_type, aux[, error_code])

Starts an asynchronous transpose operation using type(c_ptr) pointers instead of buffers

Parameters:
  • in [type(c_ptr),in] :: Incoming pointer

  • out [type(c_ptr),in] :: Resulting pointer

  • transpose_type [dtfft_transpose_t,in] :: Type of transposition

  • aux [type(c_ptr),in] :: Auxiliary pointer. Not optional. Must pass c_null_ptr if not used. If provided, size must be at least the value returned by get_aux_size_transpose().

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

request [dtfft_request_t] :: Asynchronous handle describing started transpose operation


transpose_end

subroutine  transpose_end(request[, error_code])

Ends previously started transposition

Parameters:

request [dtfft_request_t,inout] :: Asynchronous handle describing started transpose operation

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user
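
The transpose_start()/transpose_end() pair can be used to overlap a transposition with unrelated work. The following is a minimal sketch, not a definitive usage pattern. It assumes a host (non-CUDA) build, an already created double-precision plan, and buffers holding at least get_alloc_size() elements. The constant DTFFT_TRANSPOSE_X_TO_Y is not listed in this section and is assumed here to be a valid dtfft_transpose_t value. The module and subroutine names are hypothetical.

```fortran
module async_transpose_sketch
  use iso_fortran_env, only: int32, real64
  use dtfft
  implicit none
contains
  ! Hypothetical helper: starts a transposition, leaves room for
  ! independent work, then waits for the transposition to finish.
  subroutine overlapped_transpose(plan, a, b, ierr)
    class(dtfft_plan_t), intent(inout) :: plan
    real(real64), contiguous, intent(inout) :: a(:), b(:)
    integer(int32), intent(out) :: ierr
    type(dtfft_request_t) :: request

    ! DTFFT_TRANSPOSE_X_TO_Y is an assumed constant name (see lead-in).
    request = plan%transpose_start(a, b, DTFFT_TRANSPOSE_X_TO_Y, error_code=ierr)
    if (ierr /= DTFFT_SUCCESS) return

    ! Work that touches neither a nor b may be placed here.

    call plan%transpose_end(request, ierr)
  end subroutine overlapped_transpose
end module async_transpose_sketch
```

The buffers should not be read or modified between the two calls, since the operation is still in flight until transpose_end() returns.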


execute

subroutine  execute(in, out, execute_type[, aux, error_code])

Executes plan

Parameters:
  • in [type(*), dimension(..),inout] :: Incoming buffer of any rank and kind.

  • out [type(*), dimension(..),inout] :: Resulting buffer of any rank and kind

  • execute_type [dtfft_execute_t,in] :: Type of execution

Options:
  • aux [type(*), dimension(..),inout, optional] :: Optional auxiliary buffer. If provided, size must be at least the value returned by get_aux_size().

  • error_code [integer(int32),out, optional] :: Optional error code returned to user
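
As an illustration of how execute() fits together with the size queries documented in this section, the sketch below creates a C2C plan, allocates the in, out and aux buffers, and runs a forward execution. It assumes a host (non-CUDA) build and an MPI installation that provides the mpi_f08 module. The program name and the check helper are hypothetical. With the default executor (DTFFT_EXECUTOR_NONE) no external FFT library is involved.

```fortran
program execute_sketch
  use iso_fortran_env, only: int32, int64, real64
  use mpi_f08
  use dtfft
  implicit none

  type(dtfft_plan_c2c_t) :: plan
  complex(real64), allocatable :: a(:), b(:), aux(:)
  integer(int64) :: alloc_size, aux_size
  integer(int32) :: ierr

  call MPI_Init()

  ! Default communicator, precision, effort and executor are used.
  call plan%create([32_int32, 32_int32, 32_int32], error_code=ierr)
  call check(ierr)

  alloc_size = plan%get_alloc_size(ierr)
  call check(ierr)
  aux_size = plan%get_aux_size(ierr)
  call check(ierr)

  allocate(a(alloc_size), b(alloc_size), aux(aux_size))
  a = (1.0_real64, 0.0_real64)

  call plan%execute(a, b, DTFFT_EXECUTE_FORWARD, aux, ierr)
  call check(ierr)

  call plan%destroy(ierr)
  call check(ierr)
  call MPI_Finalize()

contains

  ! Hypothetical helper: aborts with the library's error string on failure.
  subroutine check(code)
    integer(int32), intent(in) :: code
    if (code /= DTFFT_SUCCESS) then
      write(*, '(a)') trim(dtfft_get_error_string(code))
      call MPI_Abort(MPI_COMM_WORLD, 1)
    end if
  end subroutine check

end program execute_sketch
```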


execute_ptr

subroutine  execute_ptr(in, out, execute_type, aux[, error_code])

Executes plan using type(c_ptr) pointers instead of buffers

Parameters:
  • in [type(c_ptr),in] :: Incoming pointer

  • out [type(c_ptr),in] :: Resulting pointer

  • execute_type [dtfft_execute_t,in] :: Type of execution

  • aux [type(c_ptr),in] :: Auxiliary pointer. Not optional. Must pass c_null_ptr if not used. If provided, size must be at least the value returned by get_aux_bytes().

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user
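
Because aux is a required argument of execute_ptr(), a caller that has no auxiliary buffer passes c_null_ptr explicitly. A minimal sketch under the assumptions of a host build and an already created double-precision complex plan (module and subroutine names are hypothetical):

```fortran
module execute_ptr_sketch
  use iso_c_binding, only: c_loc, c_null_ptr
  use iso_fortran_env, only: int32, real64
  use dtfft
  implicit none
contains
  ! Hypothetical helper: forward execution through the pointer interface
  ! without a user-supplied auxiliary buffer.
  subroutine forward_without_aux(plan, a, b, ierr)
    class(dtfft_plan_t), intent(inout) :: plan
    complex(real64), contiguous, target, intent(inout) :: a(:), b(:)
    integer(int32), intent(out) :: ierr

    ! aux is not optional here, so c_null_ptr marks it as unused.
    call plan%execute_ptr(c_loc(a), c_loc(b), DTFFT_EXECUTE_FORWARD, c_null_ptr, ierr)
  end subroutine forward_without_aux
end module execute_ptr_sketch
```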


destroy

subroutine  destroy([error_code])

Destroys plan, frees all memory

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user


get_local_sizes

subroutine  get_local_sizes([in_starts, in_counts, out_starts, out_counts, alloc_size, error_code])

Obtains local starts and counts in real and Fourier spaces

Options:
  • in_starts (*) [integer(int32),out, optional] :: Start indexes in real space (0-based)

  • in_counts (*) [integer(int32),out, optional] :: Number of elements in real space

  • out_starts (*) [integer(int32),out, optional] :: Start indexes in Fourier space (0-based)

  • out_counts (*) [integer(int32),out, optional] :: Number of elements in Fourier space

  • alloc_size [integer(int64),out, optional] :: Minimum number of elements to be allocated for the in and out buffers required by execute(), transpose(), and reshape(). Size of each element in bytes can be obtained by calling get_element_size().

  • error_code [integer(int32),out, optional] :: Optional error code returned to user
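
The following sketch shows one way the local decomposition might be queried and printed. It assumes a 3D plan that has already been created. The module and subroutine names are hypothetical, and the element count is obtained through get_alloc_size() rather than the alloc_size argument.

```fortran
module local_sizes_sketch
  use iso_fortran_env, only: int32, int64
  use dtfft
  implicit none
contains
  ! Hypothetical helper: prints the local portion of a 3D plan.
  subroutine print_local_sizes(plan, ierr)
    class(dtfft_plan_t), intent(inout) :: plan
    integer(int32), intent(out) :: ierr
    integer(int32) :: in_starts(3), in_counts(3)
    integer(int32) :: out_starts(3), out_counts(3)
    integer(int64) :: alloc_size

    call plan%get_local_sizes(in_starts, in_counts, out_starts, out_counts, &
                              error_code=ierr)
    if (ierr /= DTFFT_SUCCESS) return

    alloc_size = plan%get_alloc_size(ierr)
    if (ierr /= DTFFT_SUCCESS) return

    ! Starts are 0-based, so add 1 to get 1-based Fortran indices.
    write(*, '(a,3i8)') 'real space starts:    ', in_starts
    write(*, '(a,3i8)') 'real space counts:    ', in_counts
    write(*, '(a,3i8)') 'Fourier space starts: ', out_starts
    write(*, '(a,3i8)') 'Fourier space counts: ', out_counts
    write(*, '(a,i0)')  'elements to allocate: ', alloc_size
  end subroutine print_local_sizes
end module local_sizes_sketch
```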


get_alloc_size

function  get_alloc_size([error_code])

Wrapper around get_local_sizes() to obtain number of elements only

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

integer (int64) :: Minimum number of elements to be allocated for in, out buffers required by execute(), transpose(), and reshape().


get_element_size

function  get_element_size([error_code])

Returns number of bytes required to store a single element.

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

integer (int64) :: Size of element in bytes


get_alloc_bytes

function  get_alloc_bytes([error_code])

Returns minimum number of bytes required for in and out buffers. This also returns minimum number of bytes required for aux buffer in transpose and reshape operations. Minimum number of aux bytes required by execute can be obtained by calling get_aux_bytes.

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

integer (int64) :: Minimum number of bytes to be allocated for in and out buffers required by execute. This also returns minimum number of bytes required for aux buffer required by transpose and reshape.
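
The element-based and byte-based queries can be compared side by side. The sketch below prints all three values for an existing plan. The relation n_bytes = n_elements * element_size is what the descriptions above suggest, but it is treated here as an expectation, not a guarantee. The module and subroutine names are hypothetical.

```fortran
module alloc_bytes_sketch
  use iso_fortran_env, only: int32, int64
  use dtfft
  implicit none
contains
  ! Hypothetical helper: reports buffer requirements in elements and bytes.
  subroutine report_buffer_sizes(plan, ierr)
    class(dtfft_plan_t), intent(inout) :: plan
    integer(int32), intent(out) :: ierr
    integer(int64) :: n_elements, element_size, n_bytes

    n_elements = plan%get_alloc_size(ierr)
    if (ierr /= DTFFT_SUCCESS) return
    element_size = plan%get_element_size(ierr)
    if (ierr /= DTFFT_SUCCESS) return
    n_bytes = plan%get_alloc_bytes(ierr)
    if (ierr /= DTFFT_SUCCESS) return

    write(*, '(a,i0)') 'elements:          ', n_elements
    write(*, '(a,i0)') 'bytes per element: ', element_size
    write(*, '(a,i0)') 'bytes:             ', n_bytes
    ! Expected (assumption): n_bytes == n_elements * element_size
    write(*, '(a,l1)') 'sizes consistent:  ', n_bytes == n_elements * element_size
  end subroutine report_buffer_sizes
end module alloc_bytes_sketch
```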


get_aux_size

function  get_aux_size([error_code])

Returns minimum number of elements required for the auxiliary buffer, which may differ from alloc_size when the backend is pipelined.

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

integer (int64) :: Minimum number of elements required for auxiliary buffer.


get_aux_bytes

function  get_aux_bytes([error_code])

Returns minimum number of bytes required for auxiliary buffer.

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

integer (int64) :: Minimum number of bytes required for auxiliary buffer.


get_aux_size_reshape

function  get_aux_size_reshape([error_code])

Returns minimum number of elements required for reshape() auxiliary buffer

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

integer (int64) :: Minimum number of elements required for auxiliary buffer.


get_aux_bytes_reshape

function  get_aux_bytes_reshape([error_code])

Returns minimum number of bytes required for reshape() auxiliary buffer

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

integer (int64) :: Minimum number of bytes required for auxiliary buffer.


get_aux_size_transpose

function  get_aux_size_transpose([error_code])

Returns minimum number of elements required for transpose() auxiliary buffer

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

integer (int64) :: Minimum number of elements required for auxiliary buffer.


get_aux_bytes_transpose

function  get_aux_bytes_transpose([error_code])

Returns minimum number of bytes required for transpose() auxiliary buffer

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

integer (int64) :: Minimum number of bytes required for auxiliary buffer.


mem_alloc

Allocates memory tailored to the specific needs of the plan.

subroutine  mem_alloc(alloc_size, ptr[, lbound, error_code])
Parameters:
  • alloc_size [integer(int64),in] :: Number of elements to allocate

  • ptr (*) [type(*),pointer, out] :: 1D pointer to allocate

Options:
  • lbound [integer(int32),in, optional] :: Lower boundary of allocated pointer

  • error_code [integer(int32),out, optional] :: Optional error code returned to user


subroutine  mem_alloc(alloc_size, ptr, sizes[, lbounds, error_code])
Parameters:
  • alloc_size [integer(int64),in] :: Number of elements to allocate

  • ptr (..) [type(*),pointer, out] :: 2D or 3D pointer to allocate

  • sizes (*) [integer(int32),in] :: Sizes of each dimension in natural Fortran order. Size of sizes must match rank of pointer.

Options:
  • lbounds (*) [integer(int32),in, optional] :: Lower boundaries of allocated pointer. Size of lbounds must match rank of pointer.

  • error_code [integer(int32),out, optional] :: Optional error code returned to user


mem_alloc_ptr

Allocates memory tailored to the specific needs of the plan.

function  mem_alloc_ptr(alloc_bytes[, error_code])
Parameters:

alloc_bytes [integer(int64),in] :: Number of bytes to allocate

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

type (c_ptr) :: Allocated pointer


mem_free

Frees memory previously allocated by mem_alloc().

subroutine  mem_free(ptr[, error_code])
Parameters:

ptr (..) [type(*),inout] :: Pointer allocated with mem_alloc

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user


mem_free_ptr

Frees memory previously allocated by mem_alloc_ptr().

subroutine  mem_free_ptr(ptr[, error_code])
Parameters:

ptr [type(c_ptr),in] :: Pointer allocated with mem_alloc_ptr

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user
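
The two allocation routes (typed Fortran pointer and raw type(c_ptr)) could be exercised as in the sketch below. It assumes a host (non-CUDA) build, so that the memory can be touched from Fortran directly, and an already created double-precision R2R plan, so that real(real64) matches the element type. The module and subroutine names are hypothetical.

```fortran
module mem_alloc_sketch
  use iso_c_binding, only: c_ptr, c_f_pointer
  use iso_fortran_env, only: int32, int64, real64
  use dtfft
  implicit none
contains
  ! Hypothetical helper: allocates and frees plan memory in both styles.
  subroutine alloc_and_free(plan, ierr)
    class(dtfft_plan_t), intent(inout) :: plan
    integer(int32), intent(out) :: ierr
    real(real64), pointer :: a(:), view(:)
    type(c_ptr) :: raw
    integer(int64) :: alloc_size, alloc_bytes, n_view

    ! Typed variant: mem_alloc() pairs with mem_free().
    alloc_size = plan%get_alloc_size(ierr)
    if (ierr /= DTFFT_SUCCESS) return
    call plan%mem_alloc(alloc_size, a, error_code=ierr)
    if (ierr /= DTFFT_SUCCESS) return
    a(:) = 0.0_real64
    call plan%mem_free(a, ierr)
    if (ierr /= DTFFT_SUCCESS) return

    ! Untyped variant: mem_alloc_ptr() pairs with mem_free_ptr().
    alloc_bytes = plan%get_alloc_bytes(ierr)
    if (ierr /= DTFFT_SUCCESS) return
    raw = plan%mem_alloc_ptr(alloc_bytes, ierr)
    if (ierr /= DTFFT_SUCCESS) return

    ! View the raw bytes as real64 values; length is derived from the byte count.
    n_view = alloc_bytes / (storage_size(0.0_real64) / 8)
    call c_f_pointer(raw, view, [n_view])
    view(:) = 0.0_real64
    call plan%mem_free_ptr(raw, ierr)
  end subroutine alloc_and_free
end module mem_alloc_sketch
```

Memory obtained from one variant is released with the matching free routine, as stated in the descriptions above.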


get_z_slab_enabled

function  get_z_slab_enabled([error_code])

Returns a logical value indicating whether Z-slab optimization is enabled internally

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

logical :: Value indicating whether Z-slab is used.


get_y_slab_enabled

function  get_y_slab_enabled([error_code])

Returns a logical value indicating whether Y-slab optimization is enabled internally

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

logical :: Value indicating whether Y-slab is used.


get_pencil

function  get_pencil(layout[, error_code])

Obtains pencil information from the plan. This can be useful when the user wants to use their own FFT implementation that is unavailable in dtFFT.

Parameters:

layout [type(dtfft_layout_t),in] :: Required layout

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

dtfft_pencil_t :: Pencil data


report

subroutine  report([error_code])

Prints plan-related information to stdout

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user


report_compression

subroutine  report_compression([error_code])

Reports compression ratios for all operations where compression was performed. This subroutine can be called repeatedly after plan creation and after execution to see how compression ratios evolve.

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user


get_executor

function  get_executor([error_code])

Returns FFT Executor associated with plan

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

dtfft_executor_t :: FFT Executor used by this plan.


get_precision

function  get_precision([error_code])

Returns precision of the plan

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

dtfft_precision_t :: Precision of the plan.


get_dims

subroutine  get_dims(dims[, error_code])

Returns global dimensions of the plan.

Parameters:

dims [integer(int32),out, pointer] :: Global dimensions of the plan. Users should not attempt to change values in this pointer.

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user


get_grid_dims

subroutine  get_grid_dims(dims[, error_code])

Returns grid decomposition dimensions

Parameters:

dims [integer(int32),out, pointer] :: Grid dimensions of the plan. Users should not attempt to change values in this pointer.

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user


get_backend

function  get_backend([error_code])

Returns backend that is used during data transpositions.

If effort is DTFFT_ESTIMATE or DTFFT_MEASURE, returns the value set by dtfft_set_config() or via environment variable DTFFT_BACKEND.

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

dtfft_backend_t :: Selected backend


get_reshape_backend

function  get_reshape_backend([error_code])

Returns backend that is used during data reshapes from bricks to pencils and vice versa.

If effort is DTFFT_ESTIMATE or DTFFT_MEASURE, returns the value set by dtfft_set_config() or via environment variable DTFFT_RESHAPE_BACKEND.

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

dtfft_backend_t :: Selected backend


get_platform

function  get_platform([error_code])

Returns execution platform of the plan (HOST or CUDA)

Note

This method is only present in the API when dtFFT was compiled with CUDA Support.

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Return:

dtfft_platform_t :: Execution platform


get_stream

This method is overloaded to support both CUDA and dtFFT streams.

subroutine  get_stream(stream[, error_code])

Returns CUDA stream associated with plan

Note

This method is only present in the API when dtFFT was compiled with CUDA Support.

Parameters:

stream [integer(cuda_stream_kind),out] :: CUDA stream associated with plan

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

subroutine  get_stream(stream[, error_code])

Returns dtFFT stream associated with plan

Note

This method is only present in the API when dtFFT was compiled with CUDA Support.

Parameters:

stream [type(dtfft_stream_t),out] :: dtFFT stream associated with plan

Options:

error_code [integer(int32),out, optional] :: Optional error code returned to user

Real-to-Real plan

type  dtfft_plan_r2r_t

Real-to-real plan class

Extends dtfft_plan_t

Type bound procedures

create

subroutine  create(dims[, kinds, comm, precision, effort, executor, error_code])

R2R Plan Constructor.

Parameters:

dims (*) [integer(int32),in] :: Global dimensions of the transform as an integer array.

Options:
  • kinds (*) [dtfft_r2r_kind_t,in, optional] :: Kinds of R2R transforms, default = empty.

  • comm [MPI_Comm,in, optional] :: Communicator for parallel execution, default = MPI_COMM_WORLD.

  • precision [dtfft_precision_t,in, optional] :: Precision of the transform, default = DTFFT_DOUBLE.

  • effort [dtfft_effort_t,in, optional] :: How hard dtFFT should look for best plan, default = DTFFT_ESTIMATE.

  • executor [dtfft_executor_t,in, optional] :: Type of external FFT executor, default = DTFFT_EXECUTOR_NONE.

  • error_code [integer(int32),out, optional] :: Optional error code returned to the user


subroutine  create(pencil[, kinds, comm, precision, effort, executor, error_code])

R2R Plan Constructor using local pencil information

Parameters:

pencil [dtfft_pencil_t,in] :: Local pencil of data to be transformed

Options:
  • kinds (*) [dtfft_r2r_kind_t,in, optional] :: Kinds of R2R transforms, default = empty.

  • comm [MPI_Comm,in, optional] :: Communicator for parallel execution, default = MPI_COMM_WORLD.

  • precision [dtfft_precision_t,in, optional] :: Precision of the transform, default = DTFFT_DOUBLE.

  • effort [dtfft_effort_t,in, optional] :: How hard dtFFT should look for best plan, default = DTFFT_ESTIMATE.

  • executor [dtfft_executor_t,in, optional] :: Type of external FFT executor, default = DTFFT_EXECUTOR_NONE.

  • error_code [integer(int32),out, optional] :: Optional error code returned to the user


Complex-to-Complex plan

type  dtfft_plan_c2c_t

Complex-to-complex plan class

Extends dtfft_plan_t

Type bound procedures

create

subroutine  create(dims[, comm, precision, effort, executor, error_code])

C2C Plan Constructor.

Parameters:

dims (*) [integer(int32),in] :: Global dimensions of the transform as an integer array.

Options:
  • comm [MPI_Comm,in, optional] :: Communicator for parallel execution, default = MPI_COMM_WORLD.

  • precision [dtfft_precision_t,in, optional] :: Precision of the transform, default = DTFFT_DOUBLE.

  • effort [dtfft_effort_t,in, optional] :: How hard dtFFT should look for best plan, default = DTFFT_ESTIMATE.

  • executor [dtfft_executor_t,in, optional] :: Type of external FFT executor, default = DTFFT_EXECUTOR_NONE.

  • error_code [integer(int32),out, optional] :: Optional error code returned to the user


subroutine  create(pencil[, comm, precision, effort, executor, error_code])

C2C Plan Constructor using local pencil information

Parameters:

pencil [dtfft_pencil_t,in] :: Local pencil of data to be transformed

Options:
  • comm [MPI_Comm,in, optional] :: Communicator for parallel execution, default = MPI_COMM_WORLD.

  • precision [dtfft_precision_t,in, optional] :: Precision of the transform, default = DTFFT_DOUBLE.

  • effort [dtfft_effort_t,in, optional] :: How hard dtFFT should look for best plan, default = DTFFT_ESTIMATE.

  • executor [dtfft_executor_t,in, optional] :: Type of external FFT executor, default = DTFFT_EXECUTOR_NONE.

  • error_code [integer(int32),out, optional] :: Optional error code returned to the user
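
Plan creation with keyword arguments might look like the sketch below. The values passed are the documented defaults, written out only to show the keyword form. The grid sizes and the program name are arbitrary choices for illustration, and an MPI installation providing the mpi_f08 module is assumed.

```fortran
program create_c2c_sketch
  use iso_fortran_env, only: int32
  use mpi_f08
  use dtfft
  implicit none

  type(dtfft_plan_c2c_t) :: plan
  integer(int32) :: dims(3), ierr

  call MPI_Init()

  dims = [64, 32, 16]   ! arbitrary global sizes for illustration

  ! comm is omitted, so the documented default MPI_COMM_WORLD applies.
  call plan%create(dims, precision=DTFFT_DOUBLE, effort=DTFFT_ESTIMATE, &
                   executor=DTFFT_EXECUTOR_NONE, error_code=ierr)
  if (ierr /= DTFFT_SUCCESS) then
    write(*, '(a)') trim(dtfft_get_error_string(ierr))
    call MPI_Abort(MPI_COMM_WORLD, 1)
  end if

  call plan%report(ierr)    ! prints plan-related information to stdout
  call plan%destroy(ierr)   ! frees all memory held by the plan

  call MPI_Finalize()
end program create_c2c_sketch
```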


Real-to-Complex plan

type  dtfft_plan_r2c_t

Real-to-complex plan class

Extends dtfft_plan_t

Type bound procedures

create

subroutine  create(dims[, comm, precision, effort, executor, error_code])

R2C Plan Constructor.

Parameters:

dims (*) [integer(int32),in] :: Global dimensions of the transform as an integer array.

Options:
  • comm [MPI_Comm,in, optional] :: Communicator for parallel execution, default = MPI_COMM_WORLD.

  • precision [dtfft_precision_t,in, optional] :: Precision of the transform, default = DTFFT_DOUBLE.

  • effort [dtfft_effort_t,in, optional] :: How hard dtFFT should look for best plan, default = DTFFT_ESTIMATE.

  • executor [dtfft_executor_t,in, optional] :: Type of external FFT executor, default = DTFFT_EXECUTOR_NONE.

  • error_code [integer(int32),out, optional] :: Optional error code returned to the user


subroutine  create(pencil[, comm, precision, effort, executor, error_code])

R2C Plan Constructor using local pencil information

Parameters:

pencil [dtfft_pencil_t,in] :: Local pencil of data to be transformed

Options:
  • comm [MPI_Comm,in, optional] :: Communicator for parallel execution, default = MPI_COMM_WORLD.

  • precision [dtfft_precision_t,in, optional] :: Precision of the transform, default = DTFFT_DOUBLE.

  • effort [dtfft_effort_t,in, optional] :: How hard dtFFT should look for best plan, default = DTFFT_ESTIMATE.

  • executor [dtfft_executor_t,in, optional] :: Type of external FFT executor, default = DTFFT_EXECUTOR_NONE.

  • error_code [integer(int32),out, optional] :: Optional error code returned to the user