Fast Trigonometric Transforms For Non-Equispaced Nodes

Computers are used in a growing number of fields, so the need for fast algorithms for fundamental tasks keeps increasing. Among the most frequently needed are transforms such as the DFT (discrete Fourier transform), DCT (discrete cosine transform), and DST (discrete sine transform). Fast algorithms for these transforms have long been known, but they all share one restriction: the nodes have to lie on an equispaced grid. A growing number of applications, however, work with non-uniform sampling sets. To support them, we propose fast algorithms for non-equispaced nodes and provide an implementation.

We consider the index set $I_{a}^{b, d} := \{k \in \mathbb{Z}^{d} : a \leq_{c} k <_{c} b\}$ for $a, b \in \mathbb{Z}^{d}$, where the inequalities hold componentwise. Furthermore, the pointwise product $v = k ⊙ x$ of two vectors is defined by $v := (k_{0} x_{0}, k_{1} x_{1}, \dots, k_{d - 1} x_{d - 1})^{⊤}$. We define the cosine and sine of a vector $v \in \mathbb{R}^{d}$ as the tensor product over its components, i.e., $\cos (v) := \cos (v_{0}) \cos (v_{1}) \dots \cos (v_{d - 1})$ and $\sin (v) := \sin (v_{0}) \sin (v_{1}) \dots \sin (v_{d - 1})$, respectively.
Let $x_{j} \in [0, \frac{1}{2}]^{d}$ $(j = 0, \dots, M - 1)$, ${\hat{f}}_{k}^{C} \in \mathbb{R}$ $(k \in I_{0}^{N, d})$, and ${\hat{f}}_{k}^{S} \in \mathbb{R}$ $(k \in I_{1}^{N, d})$. Then the $d$-variate NDCT and NDST are defined as $f_{j}^{C} := f^{C} (x_{j}) = \sum_{k \in I_{0}^{N, d}} {\hat{f}}_{k}^{C} \cos (2 π (k ⊙ x_{j}))$ and $f_{j}^{S} := f^{S} (x_{j}) = \sum_{k \in I_{1}^{N, d}} {\hat{f}}_{k}^{S} \sin (2 π (k ⊙ x_{j}))$, respectively.

The libraries are written in C and contain the fast transforms NFCT and NFST as well as their inverse transforms, iNFCT and iNFST.

Example

Given a set of $M$ arbitrary points $x_{j} \in [0, \frac{1}{2}]^{d}$ in $d$-dimensional space and samples $f (x_{j})$, we aim to find the coefficients of a $d$-variate trigonometric polynomial $g$ that satisfies $f (x_{j}) \approx g (x_{j})$. This problem is known as scattered data interpolation, and we solve it by combining an iterative solver for systems of linear equations with the NFCT or NFST algorithms. As an example, we use the glacier data set:



Figure 1: These images are the results of two-dimensional scattered data interpolations employing the iterative solver CGNR that uses the NFCT (left) and NFST (right), respectively, for fast matrix-vector multiplication.
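
To make the role of the iterative solver concrete, the following sketch shows the standard CGNR iteration for a least-squares problem $\min_x \| A x - b \|_2$. It is an illustration under stated assumptions, not the library's solver: the name `cgnr_dense` and its helpers are hypothetical, and a small dense row-major matrix stands in for the transform. In the setting described above, the two matrix-vector products would instead be carried out by the fast transform and its adjoint.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* y = A x for a dense M-by-N matrix A stored row-major */
static void mat_vec(size_t M, size_t N, const double *A,
                    const double *x, double *y)
{
  size_t j, k;
  for (j = 0; j < M; j++) {
    y[j] = 0.0;
    for (k = 0; k < N; k++)
      y[j] += A[j * N + k] * x[k];
  }
}

/* y = A^T x for the same matrix */
static void mat_t_vec(size_t M, size_t N, const double *A,
                      const double *x, double *y)
{
  size_t j, k;
  for (k = 0; k < N; k++)
    y[k] = 0.0;
  for (j = 0; j < M; j++)
    for (k = 0; k < N; k++)
      y[k] += A[j * N + k] * x[j];
}

static double dot(size_t n, const double *a, const double *b)
{
  double s = 0.0;
  size_t i;
  for (i = 0; i < n; i++)
    s += a[i] * b[i];
  return s;
}

/* Run at most `iters` CGNR steps on min ||A x - b||_2, starting from the
 * initial guess in x and overwriting it with the current iterate.
 * Returns 0 on success and -1 if memory allocation fails. */
int cgnr_dense(size_t M, size_t N, const double *A, const double *b,
               double *x, int iters)
{
  double *r = malloc(M * sizeof *r); /* residual b - A x           */
  double *w = malloc(M * sizeof *w); /* A p                        */
  double *z = malloc(N * sizeof *z); /* A^T r                      */
  double *p = malloc(N * sizeof *p); /* search direction           */
  double zz, zz_new, ww, alpha, beta;
  size_t j, k;
  int it, status = -1;

  assert(A != NULL && b != NULL && x != NULL);
  if (r && w && z && p) {
    status = 0;
    mat_vec(M, N, A, x, w);
    for (j = 0; j < M; j++)
      r[j] = b[j] - w[j];
    mat_t_vec(M, N, A, r, z);
    for (k = 0; k < N; k++)
      p[k] = z[k];
    zz = dot(N, z, z);

    for (it = 0; it < iters && zz > 0.0; it++) {
      mat_vec(M, N, A, p, w);
      ww = dot(M, w, w);
      if (ww == 0.0)
        break; /* direction lies in the null space of A */
      alpha = zz / ww;
      for (k = 0; k < N; k++)
        x[k] += alpha * p[k];
      for (j = 0; j < M; j++)
        r[j] -= alpha * w[j];
      mat_t_vec(M, N, A, r, z);
      zz_new = dot(N, z, z);
      beta = zz_new / zz;
      for (k = 0; k < N; k++)
        p[k] = z[k] + beta * p[k];
      zz = zz_new;
    }
  }
  free(r); free(w); free(z); free(p);
  return status;
}
```

Each step costs one product with $A$ and one with $A^{⊤}$, so the cost of the whole iteration is dominated by how fast these two products can be computed.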

Runtime

Let $N = (N_{0}, \dots, N_{d - 1})^{⊤}$ be the bandwidths, $M$ the number of nodes $x_{j}$, $d$ the dimension, and $m$ the truncation parameter. Then our fast algorithms (NFCT and NFST) have a complexity of $O (Π (N) \log Π (N) + m^{d} M)$, while the slow direct summations perform the transform in $O (Π (N) M)$ arithmetic operations, where $Π (N) := N_{0} N_{1} \dots N_{d - 1}$.
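
As a rough illustration of these operation counts, consider $d = 2$, $N = (32, 32)^{⊤}$ and $M = 1024$. The value $m = 4$ and the base-2 logarithm are assumptions made for this example only, and all constants hidden in the $O$ notation are ignored:

```latex
\begin{align*}
\Pi(N) &= 32 \cdot 32 = 1024, \\
\text{fast:}\quad \Pi(N) \log_2 \Pi(N) + m^{d} M
  &= 1024 \cdot 10 + 4^{2} \cdot 1024 = 10240 + 16384 = 26624, \\
\text{direct:}\quad \Pi(N)\, M &= 1024 \cdot 1024 = 1048576 .
\end{align*}
```

Under these assumptions the direct summation needs about 39 times as many operations as the fast algorithm, and the gap widens as $Π (N)$ and $M$ grow.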



Figure 2: These graphs show running times (in seconds) of the one-dimensional (left) and two-dimensional (right) NFCT and NFST with respect to $N$ for random Fourier coefficients and nodes. Times are compared to the direct summation (NDCT/NDST).

Approximation Error

The NFCT and NFST are approximate algorithms. Consequently, they incur an approximation error $E (x_{j})$, which we estimate by $E_{\infty}$, the maximum over $j = 0, \dots, M - 1$ of the sum of the aliasing error $| E_{a} (x_{j}) |$ (cutting in the frequency domain) and the truncation error $| E_{t} (x_{j}) |$ (cutting in the spatial domain). The locality of the kernel (Gaussian, B-spline, sinc, or Kaiser-Bessel) in both the spatial and the frequency domain plays a significant role in the accuracy of the results. The truncation parameter $m$ and the oversampling factor $σ$ let us control the error, possibly at the cost of runtime.



Figure 3: These graphs show the approximation error $E_{\infty}$ of the NFCT (left) and NFST (right) with respect to $m$ for various kernels in dimension $d = 1$.
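
One way to measure such an error in practice is to compare the output of the fast transform with the result of the direct summation at every node. The sketch below does this for two given result vectors. The function name `error_l_infty_1` and, in particular, the normalization by $\sum_{k} | {\hat{f}}_{k} |$ are assumptions of this example; the text above does not state which normalization was used for the plots.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Maximum absolute difference between the reference values f (for example
 * from direct summation) and the approximation f_approx over all M nodes,
 * divided by the sum of the absolute values of the N_total coefficients.
 * The normalization is an assumption of this sketch. */
double error_l_infty_1(const double *f, const double *f_approx, size_t M,
                       const double *f_hat, size_t N_total)
{
  double max_diff = 0.0;
  double norm = 0.0;
  size_t j, k;

  assert(f != NULL && f_approx != NULL && f_hat != NULL);

  for (j = 0; j < M; j++) {
    double diff = fabs(f[j] - f_approx[j]);
    if (diff > max_diff)
      max_diff = diff;
  }
  for (k = 0; k < N_total; k++)
    norm += fabs(f_hat[k]);

  /* avoid dividing by zero when all coefficients vanish */
  return norm > 0.0 ? max_diff / norm : max_diff;
}
```

Evaluating this quantity for increasing $m$ and for each kernel gives the kind of data one would need for a comparison like the one in Figure 3.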

Usage

Two simple examples of how to use the NFCT/NFST and iNFCT/iNFST are shown below ("R" is used as a wildcard for "c" and "s"):

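#include <stdio.h>  /* printf */
#include <stdlib.h> /* rand, RAND_MAX */
/* plus the library header that declares the nfRt_* and infRt_* functions */

void my_2d_nfRt_trafo(void)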
{
  int j,k;
  nfRt_plan my_plan;

  // initializing plan for two dimensional transform
  // with bandwidths 32, 32 and 1024 nodes
  nfRt_init_2d( &my_plan, 32, 32, 1024);

  // initializing the nodes
  for( j = 0;  j < my_plan.d * my_plan.M_total;  j++)
    my_plan.x[j]  = 0.5 * ((double)rand()) / RAND_MAX;

  // if PRE_PSI is set, then do precomputation
  if( my_plan.nfRt_flags & PRE_PSI)
    nfRt_precompute_psi( &my_plan);

  // initializing coefficients
  for( k = 0;  k < my_plan.N_total;  k++)
    my_plan.f_hat[k] = ((double)rand()) / RAND_MAX;

  // execute transform
  nfRt_trafo( &my_plan);

  // process results (e.g. print to stdout)
  for( j = 0; j <  my_plan.M_total;  j++)
    printf( "f[%d] = %e\n", j, my_plan.f[j]);

  // deallocate memory
  nfRt_finalize( &my_plan);
}

void my_2d_inv_nfRt(void)
{
  int j,k;
  int iter, iter_end = 5;
  nfRt_plan  my_plan;
  infRt_plan my_inv_plan;

  // initializing plan for two dimensional transform
  // with bandwidths 32, 32 and 1024 nodes
  nfRt_init_2d( &my_plan, 32, 32, 1024);

  // initializing plan for an inverse transform
  infRt_init( &my_inv_plan, &my_plan);

  // initializing nodes
  for( j = 0; j < my_plan.d * my_plan.M_total;  j++)
    my_plan.x[j]  = 0.5 * ((double)rand()) / RAND_MAX;

  // if PRE_PSI is set, then do precomputation
  if( my_plan.nfRt_flags & PRE_PSI)
    nfRt_precompute_psi( &my_plan);

  // initializing samples
  for( j = 0; j < my_plan.M_total; j++)
    my_inv_plan.y[j] = ((double)rand()) / RAND_MAX;
 
  // initializing Fourier coefficients
  for( k = 0; k < my_plan.N_total;  k++)
    my_inv_plan.f_hat_iter[k] = 0.0;

  // precomputations for the iterative solver
  infRt_before_loop( &my_inv_plan);

  // execute iterations
  for( iter = 0;  iter < iter_end;  iter++)
    infRt_loop_one_step( &my_inv_plan);

  // further processing (e.g. print to stdout)
  for( k = 0;  k < my_plan.N_total;  k++)
    printf( "%e\n", my_inv_plan.f_hat_iter[k]);
  
  // deallocate memory
  infRt_finalize( &my_inv_plan);
  nfRt_finalize( &my_plan);
}

Steffen Klatt, Nov. 2005

The algorithms were implemented by Steffen Klatt in ./examples/nfst and ./examples/nfct. The OpenMP parallelization and the Matlab interface in ./matlab/nfct and ./matlab/nfst were implemented by Toni Volkmer. The Julia interface in ./julia/nfct and ./julia/nfst was implemented by Michael Schmischke. A related paper is:

Fenn, M. and Potts, D.
Fast summation based on fast trigonometric transforms at nonequispaced nodes.
Numer. Linear Algebra Appl. 12, 161-169, 2005.