Subsampling

Functions

This package provides a series of methods for partitioning time-series data based on jackknifing and bootstrapping.

Jackknife

The first group of algorithms includes the following generalisations of the jackknife for dependent data:

  • the block jackknife (Kunsch, 1989);
  • the artificial delete-$d$ jackknife (Pellegrino, 2022).

MessyTimeSeries.block_jackknife (Function)
block_jackknife(Y::Union{FloatMatrix, JMatrix{Float64}}, subsample::Float64)

Generate block jackknife (Kunsch, 1989) samples. This implementation is described in Pellegrino (2022).

This technique subsamples a time series dataset by removing, in turn, each block of consecutive observations of a given size.
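The block deletion described above can be sketched in plain Julia. This is only an illustration, not the package's implementation: `block_jackknife_sketch` is a hypothetical helper, the rounding rule `ceil(Int, subsample * T)` is an assumption, and the blocks are dropped outright here, whereas an implementation could equally mark them as missing.

```julia
# Hypothetical helper illustrating block deletion; not part of MessyTimeSeries.
function block_jackknife_sketch(Y::AbstractMatrix, subsample::Float64)
    0 < subsample < 1 || throw(ArgumentError("subsample must lie strictly between 0 and 1"))
    n, T = size(Y)
    block_size = ceil(Int, subsample * T)   # assumed rounding rule
    n_blocks = T - block_size + 1           # one sample per overlapping block
    samples = Vector{Matrix{eltype(Y)}}(undef, n_blocks)
    for start in 1:n_blocks
        # Keep every period that falls outside the current block.
        keep = [t for t in 1:T if !(start <= t < start + block_size)]
        samples[start] = Y[:, keep]
    end
    return samples
end

Y = reshape(collect(1.0:16.0), 2, 8)        # 2 series, 8 periods
samples = block_jackknife_sketch(Y, 0.25)   # blocks of 2 consecutive periods
```

With 8 periods and blocks of 2, there are 7 overlapping blocks, so the sketch returns 7 subsamples with 6 periods each.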

Arguments

  • Y: Observed measurements ($n \times T$), where $n$ and $T$ are the number of series and observations, respectively.
  • subsample: Block size as a fraction of the number of observed periods. It is bounded between 0 and 1.

References

Kunsch (1989) and Pellegrino (2022).

MessyTimeSeries.artificial_jackknife (Function)
artificial_jackknife(Y::Union{FloatMatrix, JMatrix{Float64}}, subsample::Float64, max_samples::Int64, seed::Int64=1)

Generate artificial jackknife samples as in Pellegrino (2022).

The artificial delete-$d$ jackknife extends the delete-$d$ jackknife to dependent data problems.

  • This technique replaces the actual data removal step with a fictitious deletion, which consists of imposing $d$-dimensional (artificial) patterns of missing observations on the data.
  • This approach neither alters the data order nor destroys the correlation structure.
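The fictitious deletion can be illustrated with a short sketch. It is not the package's code: `artificial_jackknife_sketch` is a hypothetical helper that takes $d$ directly as an integer, draws each pattern uniformly at random, and does not check whether two draws happen to coincide.

```julia
using Random

# Hypothetical helper: impose d artificial missing values per sample.
function artificial_jackknife_sketch(Y::AbstractMatrix{Float64}, d::Int, samples::Int; seed::Int=1)
    1 <= d < length(Y) || throw(ArgumentError("d must satisfy 1 <= d < n*T"))
    rng = MersenneTwister(seed)
    out = Vector{Matrix{Union{Missing, Float64}}}(undef, samples)
    for s in 1:samples
        # Copy into a container that can hold missing values.
        Ys = Matrix{Union{Missing, Float64}}(Y)
        # Pick d distinct entries (linear indices) and blank them out.
        idx = randperm(rng, length(Y))[1:d]
        Ys[idx] .= missing
        out[s] = Ys
    end
    return out
end

Y = reshape(collect(1.0:16.0), 2, 8)        # 2 series, 8 periods
out = artificial_jackknife_sketch(Y, 4, 3)  # 3 samples, 4 missing entries each
```

Every sample keeps the original $n \times T$ shape and ordering; only the positions of the missing entries change from one sample to the next.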

Arguments

  • Y: Observed measurements ($n \times T$), where $n$ and $T$ are the number of series and observations, respectively.
  • subsample: $d$ as a fraction of the original sample size. It is bounded between 0 and 1.
  • max_samples: If $\binom{nT}{d}$ is too large, artificial_jackknife generates only max_samples jackknife samples.
  • seed: Random seed (default: 1).
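To see why a cap such as max_samples is needed, the number of possible patterns $\binom{nT}{d}$ can be computed with arbitrary-precision integers. The helper below is hypothetical, and it assumes that "too large" means "larger than max_samples", which the text above does not state explicitly.

```julia
# Hypothetical helper: how many jackknife samples would be generated
# under the assumed rule "cap at max_samples".
function n_jackknife_samples(n::Int, T::Int, d::Int, max_samples::Int)
    # big(...) avoids the overflow error that binomial raises for large Int inputs.
    total = binomial(big(n * T), d)
    return total > max_samples ? max_samples : Int(total)
end

capped = n_jackknife_samples(2, 8, 4, 1_000)   # binomial(16, 4) = 1820 patterns
exact  = n_jackknife_samples(2, 2, 1, 1_000)   # binomial(4, 1) = 4 patterns
```

Even for a small panel the count grows quickly: with 16 entries and $d = 4$ there are already 1820 patterns, so the first call is capped at 1000 while the second returns all 4.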

References

Pellegrino (2022).

MessyTimeSeries.optimal_d (Function)
optimal_d(n::Int64, T::Int64)

Select the optimal value for $d$. See ?artificial_jackknife for more details on $d$.

Arguments

  • n: Number of series
  • T: Number of observations

Bootstrap

The second group includes the following bootstrap versions compatible with time series:

  • the moving block bootstrap (Kunsch, 1989; Liu and Singh, 1992);
  • the stationary block bootstrap (Politis and Romano, 1994).

MessyTimeSeries.moving_block_bootstrap (Function)
moving_block_bootstrap(Y::Union{FloatMatrix, JMatrix{Float64}}, subsample::Float64, samples::Int64, seed::Int64=1)

Generate moving block bootstrap samples.

The moving block bootstrap randomly subsamples a time series into ordered, overlapping blocks of consecutive observations.
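As an illustration, the textbook version of the moving block bootstrap can be sketched as follows. `moving_block_bootstrap_sketch` is a hypothetical helper that takes the block size directly as an integer; the package may build and order its blocks differently.

```julia
using Random

# Hypothetical helper: textbook moving block bootstrap on the columns of Y.
function moving_block_bootstrap_sketch(Y::AbstractMatrix, block_size::Int, samples::Int; seed::Int=1)
    n, T = size(Y)
    1 <= block_size <= T || throw(ArgumentError("block_size must lie between 1 and T"))
    rng = MersenneTwister(seed)
    n_starts = T - block_size + 1        # admissible starting periods
    n_blocks = cld(T, block_size)        # blocks needed to cover T periods
    out = Vector{Matrix{eltype(Y)}}(undef, samples)
    for s in 1:samples
        cols = Int[]
        for _ in 1:n_blocks
            start = rand(rng, 1:n_starts)
            append!(cols, start:start + block_size - 1)
        end
        out[s] = Y[:, cols[1:T]]         # trim to the original length
    end
    return out
end

Y = reshape(collect(1.0:16.0), 2, 8)     # 2 series, 8 periods
boot = moving_block_bootstrap_sketch(Y, 3, 4)
```

Each bootstrap sample has the same dimensions as `Y`, and observations within a block stay in their original order, which preserves short-range dependence.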

Arguments

  • Y: Observed measurements ($n \times T$), where $n$ and $T$ are the number of series and observations, respectively.
  • subsample: Block size as a fraction of the number of observed periods. It is bounded between 0 and 1.
  • samples: Number of bootstrap samples.
  • seed: Random seed (default: 1).

References

Kunsch (1989) and Liu and Singh (1992).

MessyTimeSeries.stationary_block_bootstrap (Function)
stationary_block_bootstrap(Y::Union{FloatMatrix, JMatrix{Float64}}, subsample::Float64, samples::Int64, seed::Int64=1)

Generate stationary block bootstrap samples.

The stationary bootstrap is similar to the block bootstrap proposed independently in Kunsch (1989) and Liu and Singh (1992).

There are two main differences:

  • The blocks have random length.
  • In order to achieve stationarity, the stationary (block) bootstrap "wraps" the data around in a "circle" so that the first observation follows the last.

Note: Block size is exponentially distributed with mean Int64(ceil(subsample*T)).
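The two differences, random block lengths and circular wrapping, can be sketched as below. `stationary_block_bootstrap_sketch` is a hypothetical helper rather than the package's code: it takes the mean block length as an integer and rounds an exponential draw up to a whole number of periods, which is one possible reading of the note above.

```julia
using Random

# Hypothetical helper: stationary bootstrap with circular wrapping.
function stationary_block_bootstrap_sketch(Y::AbstractMatrix, mean_block::Int, samples::Int; seed::Int=1)
    n, T = size(Y)
    mean_block >= 1 || throw(ArgumentError("mean_block must be at least 1"))
    rng = MersenneTwister(seed)
    out = Vector{Matrix{eltype(Y)}}(undef, samples)
    for s in 1:samples
        cols = Int[]
        while length(cols) < T
            start = rand(rng, 1:T)
            # Random block length: exponential draw rounded up (assumed discretisation),
            # so the realised mean is only roughly mean_block.
            len = max(ceil(Int, mean_block * randexp(rng)), 1)
            for k in 0:len - 1
                # mod1 wraps past the last period back to the first.
                push!(cols, mod1(start + k, T))
            end
        end
        out[s] = Y[:, cols[1:T]]         # trim to the original length
    end
    return out
end

Y = reshape(collect(1.0:16.0), 2, 8)     # 2 series, 8 periods
boot = stationary_block_bootstrap_sketch(Y, 3, 4)
```

Because of the wrapping, a block that starts near the end of the sample continues from the first observation instead of being truncated.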

Arguments

  • Y: Observed measurements ($n \times T$), where $n$ and $T$ are the number of series and observations, respectively.
  • subsample: Block size as a fraction of the number of observed periods. It is bounded between 0 and 1.
  • samples: Number of bootstrap samples.
  • seed: Random seed (default: 1).

References

Politis and Romano (1994).


Index