Learn R Programming

spectral (version 2.0)

spec.lomb: Lomb-Scargle Periodigram

Description

The Lomb-Scargle periodigram represents a statistical estimator for the amplitude and phase at a given frequency. This function takes also multivariate (n-dimensional) input data.

Usage

spec.lomb(
  x = NULL,
  y = stop("Missing y-Value"),
  f = NULL,
  ofac = 1,
  w = NULL,
  mode = "normal",
  maxMem = 8,
  cl = NULL
)

Arguments

x

sampling vector or data frame data.frame(x1, x2, x3, ...)

y

input data vector or data frame data.frame(x1, x2, ..., val)

f

optional frequency vector / data frame. If not supplied f is calculated.

ofac

in case f=NULL this value controlls the amount of frequency oversampling.

w

weights for data. It must be a 1D vector.

mode

"normal" calculates the normal Lomb-Scargle periodogram; "generalized" calculates the generalized Lomb-Scargle periodogram including floating average and weights.

maxMem

sets the amount of memory (in MB) to utilize, as a rough approximate.

cl

if numeric, it defines the number of workers to use, or provides a cluster definition of class cluster or SocketCluster from parallel package

Value

The spec.lomb function returns an object of the class lomb, which is a list containg the following information:

A

A vector with amplitude spectrum

f

corresponding frequency vector

phi

phase vector

PSD

power spectral density normalized to the sample variance

floatAvg

floating average value only in case of mode == "generalized"

w

if, mode == "generalized" contains the weighting vector

x,y

original data

p

p-value False Alarm Probability

Speed Up

In general the function calculates everything in a vectorized manner, which speeds up the procedure. If the memory requirement is more than maxMem, the calculation is split into chunks which fit in the memory (cache). Depending on the problem size (number of frequencies and data size) a tuning of this value enhances speed.

Please consider to replpace the BLAS library by a multithreaded version. For example https://prs.ism.ac.jp/~nakama/SurviveGotoBLAS2/binary/windows/x64/ is hosting some Windows RBlas.dll files. Refer to https://mattstats.wordpress.com/2016/02/07/r-with-gotoblas-on-windows-10/ for further information.

The parameter cl controls a possible cluster, which can be invoked. It takes an integer number of workers (i. e. cl = 4), a list with node names c("localhost",...) or an object of class 'cluster' or similar. The first two options cause the function to create the cluster internally. This takes time due to the initialization. The faster way is to provide an already initialized cluster to the function.

Details

Since the given time series does not need to be evenly sampled, the data mainly consists of data pairs x1, x2, x3, ... (sampling points) and (one) corresponding value y, which stores the realisation/measurement data. As can be seen from the data definition above, multivariate (n-dimensional) input data is allowed and properly processed.

Two different methods are implemented: the standard Lomb-Scargle method with

\(y(t) = a * cos(\omega (t - \tau)) + b * sin(\omega (t - \tau))\)

as model function and the generalized Lomb-Scargle (after Zechmeister 2009) method with

\(y(t) = a * cos(\omega t) + b * sin(\omega t) + c\)

as model function, which investigates a floating average parameter \(c\) as well.

Both methods can be supplied by an artifical dense frequency vector f. In conjunction with the resulting phase information the user might be able to build a "Fourier"-like spectrum to reconstruct or interpolate the timeseries in equally spaced sampling. Remind the band limitation which must be fulfilled for this.

f

The frequencies should be stored in a 1D vector or -- in case of multivariate analysis -- in a data.frame structure to preserve variable names

ofac

If the user does not provide a corresponding frequency vector, the ofac parameter causes the function to estimate $$nf = ofac*length(x)/2$$ equidistant frequencies.

p-value

The p-value (aka false alarm probability FAP) gives the probability, wheter the estimated amplitude is NOT significant. However, if p tends to zero the amplidutde is significant. The user must decide which maximum value is acceptable, until an amplitude is not valid.

If missing values NA or NaN appear in any column, the corresponding row is excluded from calculation.

References

A. Mathias, F. Grond, R. Guardans, D. Seese, M. Canela, H. H. Diebner, and G. Baiocchi, "Algorithms for spectral analysis of irregularly sampled time series", Journal of Statistical Software, 11(2), pp. 1--30, 2004.

J. D. Scargle, "Studies in astronomical time series analysis. II - Statistical aspects of spectral analysis of unevenly spaced data", The Astrophysical Journal, 263, pp. 835--853, 1982.

M. Zechmeister and M. Kurster, "The generalised Lomb-Scargle periodogram. A new formalism for the floating-mean and Keplerian periodograms", Astronomy & Astrophysics, 496(2), pp. 577--584, 2009.

See Also

filter.lomb

Examples

Run this code
# NOT RUN {
# create two sin-functions
x_orig <- seq(0,1,by=1e-2)
y_orig <- 2*sin(10*2*pi*x_orig) + 1.5*sin(2*2*pi*x_orig)

# make a 10% gap
i <- round(length(x_orig)*0.2) : round(length(x_orig)*0.3)
x <- x_orig
y <- y_orig
x[i] <- NA
y[i] <- NA


# calculating the lomb periodogram
l <- spec.lomb(x = x, y = y,ofac = 20,mode = "normal")

# select a frequency range
m <- rbind(c(9,11))
# select and reconstruct the most significant component
l2 = filter.lomb(l, x_orig, filt = m)

# plot everything
par(mfrow=c(2,1),mar = c(4,4,2,4))
plot(x,y,"l", main = "Gapped signal")
lines(l2$x, l2$y,lty=2)
legend("bottomleft",c("gapped","10Hz component"),lty=c(1,2))

plot(l,main = "Spectrum")

summary(l)

### Multivariate -- 3D Expample ###
require(lattice)
fx <- 8.1
fy <- 5
fz <- 2

# creating frequency space
f <- expand.grid( fx = seq(-10,10,by = 0.5)
                  ,fy = seq(-10,10,by = 0.5)
                  ,fz = 0:3
)

# creating spatial space
pts <- expand.grid( x = seq(0,1,by = 0.02)
                   ,y = seq(0,1,by = 0.02)
                   ,z = seq(0,1,by = 0.02)
)

# gapping 30%
i <- sample(1:dim(pts)[1],0.7*dim(pts)[1])
pts <- pts[i,]

# caluculating function
pts$val <- cos(2*pi*(  fx*pts$x
                     + fy*pts$y
                     + fz*pts$z
                    ) + pi/4
              ) +
  0.5 * cos(2*pi*(  - 0.5 * fx*pts$x
              + 0.5*fy*pts$y
              + 1 * pts$z
  ) + pi/4
  )

# display with lattice
levelplot(val~x+y,pts,subset = z == 0,main = "with z = 0")

# calculating lomb takes a while
# or we sample only a few points
# which enlarges the noise but accelerates the calculation
l <- spec.lomb(y = pts[sample(1:dim(pts)[1],2e3),]
               ,f = f
               # ,mode = "generalized"
               )

# name the stripes
l$fz_lev <- factor(x = paste("fz =",l$fz)
)

# display output
levelplot(PSD~fx+fy|fz_lev,l)

# the result is an oversampled spectrum of a non equidistant
# sampled function. We recognize a 3D analysis in all provided
# spatial directions x, y, z.

summary(l)
# }

Run the code above in your browser using DataLab