Fortran API Reference
The dtFFT library provides a header file, dtfft.f03, which can be included in Fortran source files using #include "dtfft.f03".
This file defines the DTFFT_CHECK macro for checking error codes returned by library functions:
call plan%execute(a, b, DTFFT_EXECUTE_FORWARD, aux, error_code)
DTFFT_CHECK(error_code) ! Checks for execution errors
All library functionality is contained in the Fortran module dtfft, which should be imported with the use dtfft statement in your code.
Error Codes
- function dtfft_get_error_string(error_code)
Gets the string description of an error code
- Parameters:
error_code [integer(int32),in] :: Error code to convert to string
- Return:
character (len=*) :: Error string explaining error.
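As an alternative to DTFFT_CHECK, an error code can be inspected by hand. The following is a minimal sketch, assuming an MPI installation that provides the mpi_f08 module; the program name, the variables, and the deliberately invalid 4D dims array are illustrative only.

```fortran
program error_string_sketch
  use iso_fortran_env, only: int32, error_unit
  use mpi_f08   ! assumption: the MPI library provides the mpi_f08 module
  use dtfft
  implicit none

  type(dtfft_plan_c2c_t) :: plan
  integer(int32) :: ierr

  call MPI_Init()

  ! Four dimensions are deliberately invalid (only 2 and 3 are documented
  ! as valid), so create is expected to return a non-success code.
  call plan%create([integer(int32) :: 8, 8, 8, 8], error_code=ierr)

  if (ierr /= DTFFT_SUCCESS) then
    ! Convert the numeric code into a readable message
    write(error_unit, '(a)') 'dtFFT error: ' // dtfft_get_error_string(ierr)
  end if

  call MPI_Finalize()
end program error_string_sketch
```

Passing error_code keeps control in the caller, which is useful when a failure should be reported or recovered from in an application-specific way.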
All error codes that dtFFT can return are listed below.
- DTFFT_SUCCESS
Successful execution
- DTFFT_ERROR_MPI_FINALIZED
MPI_Init has not been called or MPI_Finalize has already been called
- DTFFT_ERROR_PLAN_NOT_CREATED
Plan not created
- DTFFT_ERROR_INVALID_TRANSPOSE_TYPE
Invalid transpose_type provided
- DTFFT_ERROR_INVALID_N_DIMENSIONS
Invalid number of dimensions provided. Valid options are 2 and 3
- DTFFT_ERROR_INVALID_DIMENSION_SIZE
One or more provided dimension sizes <= 0
- DTFFT_ERROR_INVALID_COMM_TYPE
Invalid communicator type provided
- DTFFT_ERROR_INVALID_PRECISION
Invalid precision parameter provided
- DTFFT_ERROR_INVALID_EFFORT
Invalid effort parameter provided
- DTFFT_ERROR_INVALID_EXECUTOR
Invalid executor parameter provided
- DTFFT_ERROR_INVALID_COMM_DIMS
Number of dimensions in the provided Cartesian communicator > number of dimensions passed to the create subroutine
- DTFFT_ERROR_INVALID_COMM_FAST_DIM
Passed Cartesian communicator with number of processes in the 1st (fastest varying) dimension > 1
- DTFFT_ERROR_MISSING_R2R_KINDS
For an R2R plan, the kinds parameter must be passed if executor != DTFFT_EXECUTOR_NONE
- DTFFT_ERROR_INVALID_R2R_KINDS
Invalid values detected in the kinds parameter
- DTFFT_ERROR_R2C_TRANSPOSE_PLAN
Transpose plan is not supported in R2C, use C2C plan instead
- DTFFT_ERROR_INPLACE_TRANSPOSE
In-place transpose is not supported
- DTFFT_ERROR_INVALID_AUX
Invalid aux buffer provided
- DTFFT_ERROR_INVALID_DIM
Invalid dim passed to get_pencil()
- DTFFT_ERROR_INVALID_USAGE
Invalid API usage. A NULL pointer was probably passed
- DTFFT_ERROR_PLAN_IS_CREATED
Trying to create an already created plan
- DTFFT_ERROR_ALLOC_FAILED
Internal allocation failed
- DTFFT_ERROR_FREE_FAILED
Internal memory free failed
- DTFFT_ERROR_INVALID_ALLOC_BYTES
Invalid alloc_bytes provided
- DTFFT_ERROR_DLOPEN_FAILED
Dynamic library loading failed
- DTFFT_ERROR_DLSYM_FAILED
Dynamic library symbol lookup failed
- DTFFT_ERROR_R2C_TRANSPOSE_CALLED
Calling the transpose method for an R2C plan is not allowed
- DTFFT_ERROR_PENCIL_ARRAYS_SIZE_MISMATCH
Sizes of starts and counts arrays passed to dtfft_pencil_t constructor do not match.
- DTFFT_ERROR_PENCIL_ARRAYS_INVALID_SIZES
Sizes of starts and counts < 2 or > 3 provided to dtfft_pencil_t constructor.
- DTFFT_ERROR_PENCIL_INVALID_COUNTS
Invalid counts provided to dtfft_pencil_t constructor.
- DTFFT_ERROR_PENCIL_INVALID_STARTS
Invalid starts provided to dtfft_pencil_t constructor.
- DTFFT_ERROR_PENCIL_SHAPE_MISMATCH
Processes have the same lower bounds but different sizes in some dimensions.
- DTFFT_ERROR_PENCIL_OVERLAP
Pencil overlap detected, i.e. two processes share the same part of global space
- DTFFT_ERROR_PENCIL_NOT_CONTINUOUS
Local pencils do not cover the global space without gaps.
- DTFFT_ERROR_PENCIL_NOT_INITIALIZED
Pencil is not initialized, i.e. the constructor subroutine was not called
- DTFFT_ERROR_R2R_FFT_NOT_SUPPORTED
Selected executor does not support R2R FFTs
- DTFFT_ERROR_GPU_INVALID_STREAM
Invalid stream provided
- DTFFT_ERROR_GPU_INVALID_BACKEND
Invalid GPU backend provided
- DTFFT_ERROR_GPU_NOT_SET
Multiple MPI processes located on the same host share the same GPU, which is not supported
- DTFFT_ERROR_VKFFT_R2R_2D_PLAN
When using R2R FFT with the VkFFT executor and a plan that uses Z-slab optimization, the kinds of R2R transform must be the same in the X and Y directions
- DTFFT_ERROR_GPU_BACKENDS_DISABLED
Passed effort == DTFFT_PATIENT but all GPU backends have been disabled by dtfft_config_t.
- DTFFT_ERROR_NOT_DEVICE_PTR
One of the pointers passed to execute() or transpose() cannot be accessed from the device
- DTFFT_ERROR_NOT_NVSHMEM_PTR
One of the pointers passed to execute() or transpose() is not an NVSHMEM pointer
- DTFFT_ERROR_INVALID_PLATFORM
Invalid platform provided
- DTFFT_ERROR_INVALID_PLATFORM_EXECUTOR_TYPE
Invalid executor provided for selected platform
Basic types
dtfft_execute_t
- type dtfft_execute_t
Enumerated type used to specify the direction of execution in the execute() method.
Type Parameters
- DTFFT_EXECUTE_FORWARD
Forward execution: Performs the sequence XYZ to YXZ to ZXY.
- DTFFT_EXECUTE_BACKWARD
Backward execution: Performs the sequence ZXY to YXZ to XYZ.
dtfft_transpose_t
- type dtfft_transpose_t
Enumerated type used to specify the transposition direction in the transpose() method.
Type Parameters
- DTFFT_TRANSPOSE_X_TO_Y
Transpose from Fortran X-aligned to Fortran Y-aligned
- DTFFT_TRANSPOSE_Y_TO_X
Transpose from Fortran Y-aligned to Fortran X-aligned
- DTFFT_TRANSPOSE_Y_TO_Z
Transpose from Fortran Y-aligned to Fortran Z-aligned
- DTFFT_TRANSPOSE_Z_TO_Y
Transpose from Fortran Z-aligned to Fortran Y-aligned
- DTFFT_TRANSPOSE_X_TO_Z
Transpose from Fortran X-aligned to Fortran Z-aligned
Note
This value is valid only for a 3D plan, and the value returned by get_z_slab_enabled() must be .true.
- DTFFT_TRANSPOSE_Z_TO_X
Transpose from Fortran Z-aligned to Fortran X-aligned
Note
This value is valid only for a 3D plan, and the value returned by get_z_slab_enabled() must be .true.
dtfft_executor_t
- type dtfft_executor_t
Type that specifies external FFT executor
Type Parameters
- DTFFT_EXECUTOR_NONE
Do not create any FFT plans. Creates a transpose-only plan.
- DTFFT_EXECUTOR_FFTW3
FFTW3 Executor (Host only)
- DTFFT_EXECUTOR_MKL
MKL DFTI Executor (Host only)
- DTFFT_EXECUTOR_CUFFT
cuFFT Executor (GPU only)
- DTFFT_EXECUTOR_VKFFT
VkFFT Executor (GPU only)
dtfft_effort_t
- type dtfft_effort_t
Type that specifies the effort that dtFFT should use when creating a plan
Type Parameters
- DTFFT_ESTIMATE
Create plan as fast as possible
- DTFFT_MEASURE
Will attempt to find the best MPI grid decomposition. Passing this flag together with an MPI communicator with Cartesian topology to any plan constructor is the same as DTFFT_ESTIMATE
- DTFFT_PATIENT
Same as DTFFT_MEASURE, plus cycles through various send and receive MPI_Datatypes. For a GPU build of the library, this value will cycle through the enabled GPU backends in order to find the fastest.
dtfft_precision_t
- type dtfft_precision_t
Type that specifies precision of dtFFT plan
Type Parameters
- DTFFT_SINGLE
Use Single precision
- DTFFT_DOUBLE
Use Double precision
Related Type functions
- function dtfft_get_precision_string(precision)
Gets the string description of a precision level
- Parameters:
precision [dtfft_precision_t,in] :: Precision level to convert to string
- Return:
character (len=*) :: String representation of dtfft_precision_t
dtfft_r2r_kind_t
- type dtfft_r2r_kind_t
Type that specifies various kinds of R2R FFTs
Type Parameters
- DTFFT_DCT_1
DCT-I (Logical N=2*(n-1), inverse is DTFFT_DCT_1)
- DTFFT_DCT_2
DCT-II (Logical N=2*n, inverse is DTFFT_DCT_3)
- DTFFT_DCT_3
DCT-III (Logical N=2*n, inverse is DTFFT_DCT_2)
- DTFFT_DCT_4
DCT-IV (Logical N=2*n, inverse is DTFFT_DCT_4)
- DTFFT_DST_1
DST-I (Logical N=2*(n+1), inverse is DTFFT_DST_1)
- DTFFT_DST_2
DST-II (Logical N=2*n, inverse is DTFFT_DST_3)
- DTFFT_DST_3
DST-III (Logical N=2*n, inverse is DTFFT_DST_2)
- DTFFT_DST_4
DST-IV (Logical N=2*n, inverse is DTFFT_DST_4)
dtfft_backend_t
- type dtfft_backend_t
Type that specifies the various GPU backends present in dtFFT
Note
This type is only present in the API when dtFFT was compiled with CUDA Support.
Type Parameters
- DTFFT_BACKEND_MPI_DATATYPE
Backend that uses MPI datatypes.
Not recommended, since it is much slower than the other backends. It is present only to show how slow MPI datatypes are for GPU usage
- DTFFT_BACKEND_MPI_P2P
MPI peer-to-peer algorithm
- DTFFT_BACKEND_MPI_P2P_PIPELINED
MPI peer-to-peer algorithm with overlapping data copying and unpacking
- DTFFT_BACKEND_MPI_A2A
MPI backend using MPI_Alltoallv
- DTFFT_BACKEND_NCCL
NCCL backend
- DTFFT_BACKEND_NCCL_PIPELINED
NCCL backend with overlapping data copying and unpacking
- DTFFT_BACKEND_CUFFTMP
cuFFTMp backend
- DTFFT_BACKEND_CUFFTMP_PIPELINED
cuFFTMp backend that uses an additional buffer to avoid an extra copy and gain performance.
Related Type functions
- function dtfft_get_backend_string(backend)
Gets the string description of a GPU backend
This function is only present in the API when dtFFT was compiled with CUDA Support.
- Parameters:
backend [dtfft_backend_t,in] :: GPU backend
- Return:
character (len=*) :: Backend string
dtfft_config_t
- type dtfft_config_t
Type that can be used to set additional configuration parameters to dtFFT
- Type fields:
% enable_z_slab [logical] ::
Should dtFFT use Z-slab optimization or not.
Default is .true.
One should consider disabling Z-slab optimization in order to resolve the DTFFT_ERROR_VKFFT_R2R_2D_PLAN error or when the underlying FFT implementation of the 2D plan is too slow.
In all other cases Z-slab is considered to be always faster, since it reduces the number of data transpositions.
% platform [type(dtfft_platform_t)] ::
Selects platform to execute plan.
Default is DTFFT_PLATFORM_HOST
This option is only defined in a build with device support. Even when dtFFT is built with device support, it does not necessarily mean that all plans must be device-related.
Note
This field is only present in the API when dtFFT was compiled with CUDA Support.
% stream [type(dtfft_stream_t)] ::
Main CUDA stream that will be used in dtFFT.
This parameter is a placeholder for the user to set a custom stream.
The stream that is actually used by the dtFFT plan is returned by get_stream().
A user who sets a stream is responsible for destroying it.
The stream must not be destroyed before the call to destroy().
Note
This field is only present in the API when dtFFT was compiled with CUDA Support.
% backend [type(dtfft_backend_t)] ::
Backend that will be used by dtFFT when effort is DTFFT_ESTIMATE or DTFFT_MEASURE.
Default is DTFFT_BACKEND_NCCL
Note
This field is only present in the API when dtFFT was compiled with CUDA Support.
% enable_mpi_backends [logical] ::
Should MPI GPU Backends be enabled when effort is DTFFT_PATIENT or not.
Default is .false.
MPI Backends are disabled by default during the autotuning process due to an OpenMPI bug (https://github.com/open-mpi/ompi/issues/12849). It was noticed that during plan autotuning GPU memory is not freed completely. For example: 1024x1024x512 C2C, double precision, single GPU, using Z-slab optimization, with MPI backends enabled, plan autotuning will leak 8 GB of GPU memory. Without Z-slab optimization, running on 4 GPUs, it will leak 24 GB on each of the GPUs.
One of the workarounds is to disable MPI Backends by default, which is done here.
The other is to pass "--mca btl_smcuda_use_cuda_ipc 0" to mpiexec, but it was noticed that disabling CUDA IPC seriously affects the overall performance of MPI algorithms.
Note
This field is only present in the API when dtFFT was compiled with CUDA Support.
% enable_pipelined_backends [logical] ::
Should pipelined GPU backends be enabled when effort is DTFFT_PATIENT or not.
Default is .true.
Pipelined backends require an additional buffer that the user has no control over.
Note
This field is only present in the API when dtFFT was compiled with CUDA Support.
% enable_nccl_backends [logical] ::
Should NCCL Backends be enabled when effort is DTFFT_PATIENT or not.
Default is .true.
Note
This field is only present in the API when dtFFT was compiled with CUDA Support.
% enable_nvshmem_backends [logical] ::
Should NVSHMEM Backends be enabled when effort is DTFFT_PATIENT or not.
Default is .true.
Note
This field is only present in the API when dtFFT was compiled with CUDA Support.
Related Type functions
- function dtfft_create_config(config)
Creates a dtfft_config_t object and sets its fields to default values.
- Parameters:
config [dtfft_config_t,out] :: Constructed dtFFT config ready to be set by a call to dtfft_set_config()
- function dtfft_config_t(enable_z_slab)
Type bound constructor
- Options:
enable_z_slab [logical,in, optional] :: Should dtFFT use Z-slab optimization or not.
- Return:
dtfft_config_t :: Constructed dtFFT config ready to be set by a call to dtfft_set_config()
- function dtfft_config_t(enable_z_slab, platform, stream, backend, enable_mpi_backends, enable_pipelined_backends, enable_nccl_backends, enable_nvshmem_backends)
Type bound constructor
Note
This version of the constructor is only present in the API when dtFFT was compiled with CUDA Support.
- Options:
enable_z_slab [logical,in, optional] :: Should dtFFT use Z-slab optimization or not.
platform [dtfft_platform_t,in, optional] :: Selects platform to execute plan.
stream [dtfft_stream_t,in, optional] :: Main CUDA stream that will be used in dtFFT.
backend [dtfft_backend_t,in, optional] :: Backend that will be used by dtFFT when effort is DTFFT_ESTIMATE or DTFFT_MEASURE.
enable_mpi_backends [logical,in, optional] :: Should MPI GPU Backends be enabled when effort is DTFFT_PATIENT or not.
enable_pipelined_backends [logical,in, optional] :: Should pipelined GPU backends be enabled when effort is DTFFT_PATIENT or not.
enable_nccl_backends [logical,in, optional] :: Should NCCL Backends be enabled when effort is DTFFT_PATIENT or not.
enable_nvshmem_backends [logical,in, optional] :: Should NVSHMEM Backends be enabled when effort is DTFFT_PATIENT or not.
- Return:
dtfft_config_t :: Constructed dtFFT config ready to be set by a call to dtfft_set_config()
- subroutine dtfft_set_config(config[, error_code])
Set configuration values to dtFFT. In order to take effect, it should be called before plan creation
- Parameters:
config [dtfft_config_t,in] :: Config to set
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
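The configuration workflow described above can be sketched as follows. This is a minimal example, assuming a host build and an MPI library with the mpi_f08 module; the grid size and variable names are illustrative, and only the enable_z_slab option is used because it is available in every build.

```fortran
program config_sketch
  use iso_fortran_env, only: int32, error_unit
  use mpi_f08   ! assumption: the MPI library provides the mpi_f08 module
  use dtfft
  implicit none

  type(dtfft_config_t)   :: conf
  type(dtfft_plan_c2c_t) :: plan
  integer(int32) :: ierr

  call MPI_Init()

  ! Build a config with Z-slab optimization turned off
  conf = dtfft_config_t(enable_z_slab=.false.)

  ! The config must be set before any plan is created to take effect
  call dtfft_set_config(conf, error_code=ierr)
  if (ierr /= DTFFT_SUCCESS) then
    write(error_unit, '(a)') dtfft_get_error_string(ierr)
    call MPI_Abort(MPI_COMM_WORLD, 1)
  end if

  ! Illustrative 3D grid; default executor gives a transpose-only plan
  call plan%create([integer(int32) :: 64, 32, 16], error_code=ierr)
  if (ierr /= DTFFT_SUCCESS) then
    write(error_unit, '(a)') dtfft_get_error_string(ierr)
    call MPI_Abort(MPI_COMM_WORLD, 1)
  end if

  ! Query whether the plan actually uses Z-slab
  print *, 'Z-slab enabled: ', plan%get_z_slab_enabled()

  call plan%destroy()
  call MPI_Finalize()
end program config_sketch
```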
dtfft_pencil_t
- type dtfft_pencil_t
Type used to hold pencil decomposition info.
There are two ways users might find pencils useful inside dtFFT:
- To create a Plan using the user's own grid decomposition, you can pass a Pencil to Plan constructors.
- To obtain a Pencil from a Plan in all possible layouts, in order to run an FFT not available in dtFFT.
When a pencil is returned from get_pencil(), all pencil properties are defined.
- Type fields:
% dim [int(int8)] :: Aligned dimension id starting from 1
% ndims [int(int8)] :: Number of dimensions in a pencil
% starts (*) [int(int32),allocatable] :: Local starts in natural Fortran order
% counts (*) [int(int32),allocatable] :: Local counts in natural Fortran order
% size [int(int64)] :: Total number of elements in a pencil
Related Type functions
- function dtfft_pencil_t(starts, counts)
Type bound constructor
- Parameters:
starts (*) [int(int32),in] :: Local starts in natural Fortran order
counts (*) [int(int32),in] :: Local counts in natural Fortran order
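A possible use of the pencil constructor is sketched below: each process describes a slab of a 3D grid split along Z and passes it to a plan constructor. The sketch assumes that starts are 0-based (as documented for get_local_sizes), that a Z-only split is an acceptable decomposition, and that the mpi_f08 module is available; the grid size and variable names are illustrative.

```fortran
program pencil_sketch
  use iso_fortran_env, only: int32, error_unit
  use mpi_f08   ! assumption: the MPI library provides the mpi_f08 module
  use dtfft
  implicit none

  integer(int32), parameter :: nx = 32, ny = 32, nz = 64  ! illustrative grid
  type(dtfft_pencil_t)   :: pencil
  type(dtfft_plan_c2c_t) :: plan
  integer(int32) :: starts(3), counts(3), ierr
  integer :: rank, nprocs, base, rem

  call MPI_Init()
  call MPI_Comm_rank(MPI_COMM_WORLD, rank)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs)

  ! Split Z as evenly as possible; the first "rem" ranks get one extra plane
  base = nz / nprocs
  rem  = mod(nz, nprocs)
  counts = [integer(int32) :: nx, ny, base + merge(1, 0, rank < rem)]
  ! Assumption: starts are 0-based offsets into the global grid
  starts = [integer(int32) :: 0, 0, rank * base + min(rank, rem)]

  pencil = dtfft_pencil_t(starts, counts)

  ! Create a plan from the user-defined decomposition
  call plan%create(pencil, error_code=ierr)
  if (ierr /= DTFFT_SUCCESS) then
    write(error_unit, '(a)') dtfft_get_error_string(ierr)
    call MPI_Abort(MPI_COMM_WORLD, 1)
  end if

  call plan%destroy()
  call MPI_Finalize()
end program pencil_sketch
```

If the local pencils overlap or leave gaps, the constructor is expected to report one of the DTFFT_ERROR_PENCIL_* codes listed in the Error Codes section.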
dtfft_platform_t
- type dtfft_platform_t
Type that specifies the execution platform, such as Host, CUDA, or HIP
Type Parameters
- DTFFT_PLATFORM_HOST
Create HOST-related plan
- DTFFT_PLATFORM_CUDA
Create CUDA-related plan
dtfft_stream_t
- type dtfft_stream_t
dtFFT stream representation.
- Type fields:
% stream [type(c_ptr)] :: Actual stream
Related Type functions
- function dtfft_stream_t(stream)
C-pointer constructor
- Parameters:
stream [type(c_ptr),in] :: Stream pointer
- Return:
dtfft_stream_t :: Stream object
- function dtfft_stream_t(stream)
CUDA-Fortran stream constructor
- Parameters:
stream [integer(cuda_stream_kind),in] :: CUDA-Fortran stream
- Return:
dtfft_stream_t :: Stream object
- function dtfft_get_cuda_stream(stream)
Gets CUDA stream from dtfft_stream_t object
- Parameters:
stream [dtfft_stream_t,in] :: Stream object
- Return:
integer (cuda_stream_kind) :: CUDA-Fortran stream
Version handling
Parameters
- DTFFT_VERSION_MAJOR
dtFFT Major Version
- DTFFT_VERSION_MINOR
dtFFT Minor Version
- DTFFT_VERSION_PATCH
dtFFT Patch Version
- DTFFT_VERSION_CODE
dtFFT Version Code. Can be used in Version comparison
Functions
- function dtfft_get_version()
- Return:
integer (int32) :: Version Code defined during compilation
- function dtfft_get_version(major, minor, patch)
Computes Version Code based on Major, Minor and Patch versions
- Parameters:
major [integer(int32)] :: Major version
minor [integer(int32)] :: Minor version
patch [integer(int32)] :: Patch version
- Return:
integer (int32) :: Requested Version Code
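A version check might look like the sketch below. It assumes that version codes compare in the same order as the versions they encode, and the minimum version 1.0.0 is a hypothetical threshold chosen only for illustration.

```fortran
program version_sketch
  use iso_fortran_env, only: int32
  use dtfft
  implicit none

  integer(int32) :: wanted

  ! Hypothetical minimum version 1.0.0, used only for illustration
  wanted = dtfft_get_version(1_int32, 0_int32, 0_int32)

  print '(a,i0,a,i0,a,i0)', 'dtFFT version: ', DTFFT_VERSION_MAJOR, '.', &
        DTFFT_VERSION_MINOR, '.', DTFFT_VERSION_PATCH

  ! Assumption: a larger version code corresponds to a newer release
  if (dtfft_get_version() >= wanted) then
    print '(a)', 'Library is new enough'
  else
    print '(a)', 'Library is older than required'
  end if
end program version_sketch
```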
Abstract plan
- type dtfft_plan_t
Abstract class for all dtFFT plans
Type bound procedures
transpose
- subroutine transpose(in, out, transpose_type[, error_code])
Performs single transposition
Performs single transposition
- Parameters:
in [type(*), dimension(..),inout] :: Incoming buffer of any rank and kind.
out [type(*), dimension(..),inout] :: Resulting buffer of any rank and kind
transpose_type [dtfft_transpose_t,in] :: Type of transposition
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
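A single transposition and its reverse can be sketched with a transpose-only plan (the default executor is DTFFT_EXECUTOR_NONE). The sketch assumes a host build and the mpi_f08 module; the grid size, buffer names, and the check helper are illustrative.

```fortran
program transpose_sketch
  use iso_fortran_env, only: int32, int64, real64, error_unit
  use mpi_f08   ! assumption: the MPI library provides the mpi_f08 module
  use dtfft
  implicit none

  type(dtfft_plan_r2r_t) :: plan
  real(real64), allocatable :: a(:), b(:)
  integer(int64) :: alloc_size
  integer(int32) :: ierr

  call MPI_Init()

  ! Transpose-only R2R plan in double precision (the documented defaults)
  call plan%create([integer(int32) :: 64, 32, 16], error_code=ierr)
  call check(ierr, 'create')

  ! Buffers must hold at least alloc_size elements
  alloc_size = plan%get_alloc_size(ierr)
  call check(ierr, 'get_alloc_size')
  allocate(a(alloc_size), b(alloc_size))
  a = 1.0_real64

  ! X-aligned -> Y-aligned; in and out must be different buffers
  call plan%transpose(a, b, DTFFT_TRANSPOSE_X_TO_Y, ierr)
  call check(ierr, 'transpose X->Y')

  ! Y-aligned -> X-aligned
  call plan%transpose(b, a, DTFFT_TRANSPOSE_Y_TO_X, ierr)
  call check(ierr, 'transpose Y->X')

  call plan%destroy(ierr)
  deallocate(a, b)
  call MPI_Finalize()

contains

  ! Illustrative helper: abort with the library's message on failure
  subroutine check(code, what)
    integer(int32),   intent(in) :: code
    character(len=*), intent(in) :: what
    if (code /= DTFFT_SUCCESS) then
      write(error_unit, '(a)') what // ': ' // dtfft_get_error_string(code)
      call MPI_Abort(MPI_COMM_WORLD, 1)
    end if
  end subroutine check
end program transpose_sketch
```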
transpose_ptr
- subroutine transpose_ptr(in, out, transpose_type[, error_code])
Performs single transposition
- Parameters:
in [type(c_ptr),in] :: Incoming pointer
out [type(c_ptr),in] :: Resulting pointer
transpose_type [dtfft_transpose_t,in] :: Type of transposition
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
execute
- subroutine execute(in, out, execute_type[, aux, error_code])
Executes plan
- Parameters:
in [type(*), dimension(..),inout] :: Incoming buffer of any rank and kind.
out [type(*), dimension(..),inout] :: Resulting buffer of any rank and kind
execute_type [dtfft_execute_t,in] :: Type of execution
- Options:
aux [type(*), dimension(..),inout, optional] :: Optional auxiliary buffer.
error_code [integer(int32),out, optional] :: Optional error code returned to user
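Forward and backward execution with an auxiliary buffer can be sketched as follows. The example assumes dtFFT was built with FFTW3 support and that the mpi_f08 module is available; the grid size, buffer names, and the check helper are illustrative. No claim is made here about normalization of the backward result.

```fortran
program execute_sketch
  use iso_fortran_env, only: int32, int64, real64, error_unit
  use mpi_f08   ! assumption: the MPI library provides the mpi_f08 module
  use dtfft
  implicit none

  type(dtfft_plan_c2c_t) :: plan
  complex(real64), allocatable :: a(:), b(:), aux(:)
  integer(int64) :: alloc_size
  integer(int32) :: ierr

  call MPI_Init()

  ! Assumption: the library was compiled with FFTW3 support
  call plan%create([integer(int32) :: 64, 32, 16], &
                   executor=DTFFT_EXECUTOR_FFTW3, error_code=ierr)
  call check(ierr, 'create')

  ! in, out and aux all need at least alloc_size elements
  alloc_size = plan%get_alloc_size(ierr)
  call check(ierr, 'get_alloc_size')
  allocate(a(alloc_size), b(alloc_size), aux(alloc_size))
  a = (1.0_real64, 0.0_real64)

  ! Forward: XYZ -> YXZ -> ZXY
  call plan%execute(a, b, DTFFT_EXECUTE_FORWARD, aux, ierr)
  call check(ierr, 'forward')

  ! Backward: ZXY -> YXZ -> XYZ
  call plan%execute(b, a, DTFFT_EXECUTE_BACKWARD, aux, ierr)
  call check(ierr, 'backward')

  call plan%destroy(ierr)
  deallocate(a, b, aux)
  call MPI_Finalize()

contains

  ! Illustrative helper: abort with the library's message on failure
  subroutine check(code, what)
    integer(int32),   intent(in) :: code
    character(len=*), intent(in) :: what
    if (code /= DTFFT_SUCCESS) then
      write(error_unit, '(a)') what // ': ' // dtfft_get_error_string(code)
      call MPI_Abort(MPI_COMM_WORLD, 1)
    end if
  end subroutine check
end program execute_sketch
```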
execute_ptr
- subroutine execute_ptr(in, out, execute_type, aux[, error_code])
Executes plan
- Parameters:
in [type(c_ptr),in] :: Incoming pointer
out [type(c_ptr),in] :: Resulting pointer
execute_type [dtfft_execute_t,in] :: Type of execution
aux [type(c_ptr),in] :: Auxiliary pointer. Not optional. Must pass c_null_ptr if not used.
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
destroy
- subroutine destroy([error_code])
Destroys plan, frees all memory
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
get_local_sizes
- subroutine get_local_sizes([in_starts, in_counts, out_starts, out_counts, alloc_size, error_code])
Obtains local starts and counts in real and Fourier spaces
- Options:
in_starts (*) [integer(int32),out, optional] :: Start indexes in real space (0-based)
in_counts (*) [integer(int32),out, optional] :: Number of elements in real space
out_starts (*) [integer(int32),out, optional] :: Start indexes in Fourier space (0-based)
out_counts (*) [integer(int32),out, optional] :: Number of elements in Fourier space
alloc_size (*) [integer(int64),out, optional] :: Minimum number of elements that must be allocated for the in, out or aux buffers. The size of each element in bytes can be obtained by calling get_element_size().
error_code [integer(int32),out, optional] :: Optional error code returned to user
get_alloc_size
- function get_alloc_size([error_code])
Wrapper around get_local_sizes() to obtain the number of elements only
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
- Return:
integer (int64) :: Minimum number of elements that must be allocated for the in, out or aux buffers. The size of each element in bytes can be obtained by calling get_element_size().
get_element_size
- function get_element_size([error_code])
Returns the number of bytes required to store a single element.
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
- Return:
integer (int64) :: Size of element in bytes
get_alloc_bytes
- function get_alloc_bytes([error_code])
Returns the minimum number of bytes required to execute the plan
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
- Return:
integer (int64) :: Minimum number of bytes that must be allocated for the in, out or aux buffers.
mem_alloc
Allocates memory tailored to the specific needs of the plan.
- subroutine mem_alloc(alloc_size, ptr[, lbound, error_code])
- Parameters:
alloc_size [integer(int64),in] :: Number of elements to allocate
ptr (*) [type(*),pointer, out] :: 1D pointer to allocate
- Options:
lbound [integer(int32),in, optional] :: Lower boundary of allocated pointer
error_code [integer(int32),out, optional] :: Optional error code returned to user
- subroutine mem_alloc(alloc_size, ptr, sizes[, lbounds, error_code])
- Parameters:
alloc_size [integer(int64),in] :: Number of elements to allocate
ptr (..) [type(*),pointer, out] :: 2D or 3D pointer to allocate
sizes (*) [integer(int32),in] :: Sizes of each dimension in natural Fortran order. Size of sizes must match rank of pointer.
- Options:
lbounds (*) [integer(int32),in, optional] :: Lower boundaries of allocated pointer. Size of lbounds must match rank of pointer.
error_code [integer(int32),out, optional] :: Optional error code returned to user
mem_alloc_ptr
Allocates memory tailored to the specific needs of the plan.
- subroutine mem_alloc_ptr(alloc_bytes, ptr[, error_code])
- Parameters:
alloc_bytes [integer(int64),in] :: Number of bytes to allocate
ptr [type(c_ptr),out] :: Allocated pointer
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
mem_free
Frees memory previously allocated by mem_alloc().
- subroutine mem_free(ptr[, error_code])
- Parameters:
ptr (..) [type(*),inout] :: Pointer allocated with mem_alloc
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
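Pairing mem_alloc with mem_free can be sketched as below. The example assumes a host build, a double-precision R2R plan (so that a real(real64) pointer is a suitable element type), and the mpi_f08 module; the grid size and names are illustrative.

```fortran
program mem_alloc_sketch
  use iso_fortran_env, only: int32, int64, real64, error_unit
  use mpi_f08   ! assumption: the MPI library provides the mpi_f08 module
  use dtfft
  implicit none

  type(dtfft_plan_r2r_t) :: plan
  real(real64), pointer  :: buf(:) => null()
  integer(int64) :: alloc_size
  integer(int32) :: ierr

  call MPI_Init()

  call plan%create([integer(int32) :: 64, 32, 16], error_code=ierr)
  call check(ierr, 'create')

  alloc_size = plan%get_alloc_size(ierr)
  call check(ierr, 'get_alloc_size')
  print *, 'elements: ', alloc_size, &
           ' bytes per element: ', plan%get_element_size()

  ! Let the plan allocate a 1D buffer sized for its own needs
  call plan%mem_alloc(alloc_size, buf, error_code=ierr)
  call check(ierr, 'mem_alloc')
  buf(:) = 0.0_real64

  ! Memory obtained from mem_alloc is released with mem_free,
  ! before the plan itself is destroyed
  call plan%mem_free(buf, ierr)
  call check(ierr, 'mem_free')

  call plan%destroy(ierr)
  call MPI_Finalize()

contains

  ! Illustrative helper: abort with the library's message on failure
  subroutine check(code, what)
    integer(int32),   intent(in) :: code
    character(len=*), intent(in) :: what
    if (code /= DTFFT_SUCCESS) then
      write(error_unit, '(a)') what // ': ' // dtfft_get_error_string(code)
      call MPI_Abort(MPI_COMM_WORLD, 1)
    end if
  end subroutine check
end program mem_alloc_sketch
```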
mem_free_ptr
Frees memory previously allocated by mem_alloc_ptr().
- subroutine mem_free_ptr(ptr[, error_code])
- Parameters:
ptr [type(c_ptr),in] :: Pointer allocated with mem_alloc_ptr
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
get_z_slab_enabled
- function get_z_slab_enabled([error_code])
Returns a logical value indicating whether Z-slab optimization is enabled internally
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
- Return:
logical :: Boolean value if Z-slab is used.
get_pencil
- function get_pencil(dim[, error_code])
Obtains pencil information from the plan. This can be useful when the user wants to use their own FFT implementation that is unavailable in dtFFT.
- Parameters:
dim [integer(int32),in] ::
- Required dimension:
0 for XYZ layout (real space, valid for R2C plans only)
1 for XYZ layout (real space for C2C and R2R plans and Fourier space for R2C plans)
2 for YXZ layout
3 for ZXY layout
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
- Return:
dtfft_pencil_t :: Pencil data
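Querying the pencil of each layout can be sketched as follows. The example assumes a 3D C2C plan (so dim values 1 to 3 are valid), a host build, and the mpi_f08 module; the grid size and names are illustrative.

```fortran
program get_pencil_sketch
  use iso_fortran_env, only: int32, error_unit
  use mpi_f08   ! assumption: the MPI library provides the mpi_f08 module
  use dtfft
  implicit none

  type(dtfft_plan_c2c_t) :: plan
  type(dtfft_pencil_t)   :: pencil
  integer(int32) :: d, ierr

  call MPI_Init()

  call plan%create([integer(int32) :: 64, 32, 16], error_code=ierr)
  if (ierr /= DTFFT_SUCCESS) then
    write(error_unit, '(a)') dtfft_get_error_string(ierr)
    call MPI_Abort(MPI_COMM_WORLD, 1)
  end if

  ! dim = 1: XYZ layout, 2: YXZ layout, 3: ZXY layout
  do d = 1_int32, 3_int32
    pencil = plan%get_pencil(d, ierr)
    if (ierr /= DTFFT_SUCCESS) then
      write(error_unit, '(a)') dtfft_get_error_string(ierr)
      call MPI_Abort(MPI_COMM_WORLD, 1)
    end if
    print '(a,i0,a,3(1x,i0),a,3(1x,i0))', 'dim ', d, &
          ' starts:', pencil%starts, ' counts:', pencil%counts
  end do

  call plan%destroy()
  call MPI_Finalize()
end program get_pencil_sketch
```

The starts and counts of each layout are what a user-supplied FFT routine would need in order to operate on the local data between transpositions.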
report
- subroutine report([error_code])
Prints plan-related information to stdout
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
get_executor
- function get_executor([error_code])
Returns FFT Executor associated with plan
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
- Return:
dtfft_executor_t :: FFT Executor used by this plan.
get_precision
- function get_precision([error_code])
Returns precision of the plan
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
- Return:
dtfft_precision_t :: Precision of the plan.
get_dims
- subroutine get_dims(dims[, error_code])
Returns global dimensions of the plan.
- Parameters:
dims [integer(int32),out, pointer] ::
Global dimensions of the plan.
Users should not attempt to change values in this pointer.
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
get_backend
- function get_backend([error_code])
Returns the fastest detected GPU backend if effort is DTFFT_PATIENT.
If effort is DTFFT_ESTIMATE or DTFFT_MEASURE, returns the value set by dtfft_set_config() or the default, DTFFT_BACKEND_NCCL.
Note
This method is only present in the API when dtFFT was compiled with CUDA Support.
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
- Return:
dtfft_backend_t :: Selected GPU backend
get_platform
- function get_platform([error_code])
Returns execution platform of the plan (HOST or CUDA)
Note
This method is only present in the API when dtFFT was compiled with CUDA Support.
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
- Return:
dtfft_platform_t :: Execution platform
get_stream
This method is overloaded to support both CUDA and dtFFT streams.
- subroutine get_stream(stream[, error_code])
Returns CUDA stream associated with plan
Note
This method is only present in the API when dtFFT was compiled with CUDA Support.
- Parameters:
stream [integer(cuda_stream_kind),out] :: CUDA stream associated with plan
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
- subroutine get_stream(stream[, error_code])
Returns dtFFT stream associated with plan
Note
This method is only present in the API when dtFFT was compiled with CUDA Support.
- Parameters:
stream [type(dtfft_stream_t),out] :: dtFFT stream associated with plan
- Options:
error_code [integer(int32),out, optional] :: Optional error code returned to user
Real-to-Real plan
- type dtfft_plan_r2r_t
Real-to-real plan class
Extends dtfft_plan_t
Type bound procedures
create
- subroutine create(dims[, kinds, comm, precision, effort, executor, error_code])
R2R Plan Constructor.
- Parameters:
dims (*) [integer(int32),in] :: Global dimensions of the transform as an integer array.
- Options:
kinds (*) [dtfft_r2r_kind_t,in, optional] :: Kinds of R2R transforms, default = empty.
comm [MPI_Comm,in, optional] :: Communicator for parallel execution, default = MPI_COMM_WORLD.
precision [dtfft_precision_t,in, optional] :: Precision of the transform, default = DTFFT_DOUBLE.
effort [dtfft_effort_t,in, optional] :: How hard dtFFT should look for best plan, default = DTFFT_ESTIMATE.
executor [dtfft_executor_t,in, optional] :: Type of external FFT executor, default = DTFFT_EXECUTOR_NONE.
error_code [integer(int32),out, optional] :: Optional error code returned to the user
- subroutine create(pencil[, kinds, comm, precision, effort, executor, error_code])
R2R Plan Constructor using local pencil information
- Parameters:
pencil [dtfft_pencil_t,in] :: Local pencil of data to be transformed
- Options:
kinds (*) [dtfft_r2r_kind_t,in, optional] :: Kinds of R2R transforms, default = empty.
comm [MPI_Comm,in, optional] :: Communicator for parallel execution, default = MPI_COMM_WORLD.
precision [dtfft_precision_t,in, optional] :: Precision of the transform, default = DTFFT_DOUBLE.
effort [dtfft_effort_t,in, optional] :: How hard dtFFT should look for best plan, default = DTFFT_ESTIMATE.
executor [dtfft_executor_t,in, optional] :: Type of external FFT executor, default = DTFFT_EXECUTOR_NONE.
error_code [integer(int32),out, optional] :: Optional error code returned to the user
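Creating an R2R plan with an FFT executor requires the kinds argument. The sketch below assumes dtFFT was built with FFTW3 support and that the mpi_f08 module is available; the grid size and the choice of DCT-II in every direction are illustrative.

```fortran
program r2r_create_sketch
  use iso_fortran_env, only: int32, error_unit
  use mpi_f08   ! assumption: the MPI library provides the mpi_f08 module
  use dtfft
  implicit none

  type(dtfft_plan_r2r_t) :: plan
  type(dtfft_r2r_kind_t) :: kinds(3)
  integer(int32) :: ierr

  call MPI_Init()

  ! One transform kind per dimension; DCT-II everywhere for illustration
  kinds = DTFFT_DCT_2

  ! Assumption: the library was compiled with FFTW3 support.
  ! Keyword arguments avoid relying on the positional order.
  call plan%create([integer(int32) :: 64, 32, 16], &
                   kinds=kinds,                    &
                   precision=DTFFT_DOUBLE,         &
                   effort=DTFFT_ESTIMATE,          &
                   executor=DTFFT_EXECUTOR_FFTW3,  &
                   error_code=ierr)
  if (ierr /= DTFFT_SUCCESS) then
    write(error_unit, '(a)') dtfft_get_error_string(ierr)
    call MPI_Abort(MPI_COMM_WORLD, 1)
  end if

  ! Print plan-related information to stdout
  call plan%report()

  call plan%destroy()
  call MPI_Finalize()
end program r2r_create_sketch
```

Omitting kinds while selecting an executor other than DTFFT_EXECUTOR_NONE is documented to produce DTFFT_ERROR_MISSING_R2R_KINDS.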
Complex-to-Complex plan
- type dtfft_plan_c2c_t
Complex-to-complex plan class
Extends dtfft_plan_t
Type bound procedures
create
- subroutine create(dims[, comm, precision, effort, executor, error_code])
C2C Plan Constructor.
- Parameters:
dims (*) [integer(int32),in] :: Global dimensions of the transform as an integer array.
- Options:
comm [MPI_Comm,in, optional] :: Communicator for parallel execution, default = MPI_COMM_WORLD.
precision [dtfft_precision_t,in, optional] :: Precision of the transform, default = DTFFT_DOUBLE.
effort [dtfft_effort_t,in, optional] :: How hard dtFFT should look for best plan, default = DTFFT_ESTIMATE.
executor [dtfft_executor_t,in, optional] :: Type of external FFT executor, default = DTFFT_EXECUTOR_NONE.
error_code [integer(int32),out, optional] :: Optional error code returned to the user
- subroutine create(pencil[, comm, precision, effort, executor, error_code])
C2C Plan Constructor using local pencil information
- Parameters:
pencil [dtfft_pencil_t,in] :: Local pencil of data to be transformed
- Options:
comm [MPI_Comm,in, optional] :: Communicator for parallel execution, default = MPI_COMM_WORLD.
precision [dtfft_precision_t,in, optional] :: Precision of the transform, default = DTFFT_DOUBLE.
effort [dtfft_effort_t,in, optional] :: How hard dtFFT should look for best plan, default = DTFFT_ESTIMATE.
executor [dtfft_executor_t,in, optional] :: Type of external FFT executor, default = DTFFT_EXECUTOR_NONE.
error_code [integer(int32),out, optional] :: Optional error code returned to the user
Real-to-Complex plan
- type dtfft_plan_r2c_t
Real-to-complex plan class
Extends dtfft_plan_t
Note
This type is only present in the API when dtFFT is compiled with FFT support.
Type bound procedures
create
- subroutine create(dims, executor[, comm, precision, effort, error_code])
R2C Plan Constructor.
- Parameters:
dims (*) [integer(int32),in] :: Global dimensions of the transform as an integer array.
executor [dtfft_executor_t,in] ::
Type of external FFT executor.
Must not be DTFFT_EXECUTOR_NONE.
- Options:
comm [MPI_Comm,in, optional] :: Communicator for parallel execution, default = MPI_COMM_WORLD.
precision [dtfft_precision_t,in, optional] :: Precision of the transform, default = DTFFT_DOUBLE.
effort [dtfft_effort_t,in, optional] :: How hard dtFFT should look for best plan, default = DTFFT_ESTIMATE.
error_code [integer(int32),out, optional] :: Optional error code returned to the user
- subroutine create(pencil, executor[, comm, precision, effort, error_code])
R2C Plan Constructor using local pencil information
- Parameters:
pencil [dtfft_pencil_t,in] :: Local pencil of data to be transformed
executor [dtfft_executor_t,in] ::
Type of external FFT executor.
Must not be DTFFT_EXECUTOR_NONE.
- Options:
comm [MPI_Comm,in, optional] :: Communicator for parallel execution, default = MPI_COMM_WORLD.
precision [dtfft_precision_t,in, optional] :: Precision of the transform, default = DTFFT_DOUBLE.
effort [dtfft_effort_t,in, optional] :: How hard dtFFT should look for best plan, default = DTFFT_ESTIMATE.
error_code [integer(int32),out, optional] :: Optional error code returned to the user
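Creating an R2C plan and inspecting how the local sizes differ between real and Fourier space can be sketched as follows. The example assumes dtFFT was built with FFTW3 support and that the mpi_f08 module is available; the grid size and variable names are illustrative.

```fortran
program r2c_create_sketch
  use iso_fortran_env, only: int32, error_unit
  use mpi_f08   ! assumption: the MPI library provides the mpi_f08 module
  use dtfft
  implicit none

  type(dtfft_plan_r2c_t) :: plan
  integer(int32) :: in_starts(3), in_counts(3)
  integer(int32) :: out_starts(3), out_counts(3)
  integer(int32) :: ierr

  call MPI_Init()

  ! The executor is mandatory for R2C and must not be DTFFT_EXECUTOR_NONE.
  ! Assumption: the library was compiled with FFTW3 support.
  call plan%create([integer(int32) :: 64, 32, 16], DTFFT_EXECUTOR_FFTW3, &
                   error_code=ierr)
  if (ierr /= DTFFT_SUCCESS) then
    write(error_unit, '(a)') dtfft_get_error_string(ierr)
    call MPI_Abort(MPI_COMM_WORLD, 1)
  end if

  ! Local layout in real space (in_*) and Fourier space (out_*)
  call plan%get_local_sizes(in_starts=in_starts, in_counts=in_counts,   &
                            out_starts=out_starts, out_counts=out_counts, &
                            error_code=ierr)
  if (ierr /= DTFFT_SUCCESS) then
    write(error_unit, '(a)') dtfft_get_error_string(ierr)
    call MPI_Abort(MPI_COMM_WORLD, 1)
  end if

  print '(a,3(1x,i0))', 'real space counts:   ', in_counts
  print '(a,3(1x,i0))', 'Fourier space counts:', out_counts

  call plan%destroy()
  call MPI_Finalize()
end program r2c_create_sketch
```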