Learn R Programming

highfrequency (version 0.7.0.1)

tradesCleanup: Cleans trade data

Description

This is a wrapper function for cleaning the trade data of all stock data inside the folder dataSource. The result is saved in the folder dataDestination.

In case you supply the argument "rawtData", the on-disk functionality is ignored. The function returns a vector indicating how many trades were removed at each cleaning step in this case. and the function returns an xts or data.table object.

The following cleaning functions are performed sequentially: noZeroPrices, selectExchange, salesCondition, mergeTradesSameTimestamp.

Since the function rmTradeOutliersUsingQuotes also requires cleaned quote data as input, it is not incorporated here and there is a separate wrapper called tradesCleanupUsingQuotes.

Usage

tradesCleanup(
  dataSource = NULL,
  dataDestination = NULL,
  exchanges,
  tDataRaw = NULL,
  report = TRUE,
  selection = "median",
  validConds = c("", "@", "E", "@E", "F", "FI", "@F", "@FI", "I", "@I"),
  saveAsXTS = TRUE,
  tz = "EST"
)

Arguments

dataSource

character indicating the folder in which the original data is stored.

dataDestination

character indicating the folder in which the cleaned data is stored.

exchanges

vector of stock exchange symbols for all data in dataSource, e.g. exchanges = c("T","N") retrieves all stock market data from both NYSE and NASDAQ. The possible exchange symbols are:

  • A: AMEX

  • N: NYSE

  • B: Boston

  • P: Arca

  • C: NSX

  • T/Q: NASDAQ

  • D: NASD ADF and TRF

  • X: Philadelphia

  • I: ISE

  • M: Chicago

  • W: CBOE

  • Z: BATS

tDataRaw

xts object containing (for ONE stock only) raw trade data. This argument is NULL by default. Enabling it means the arguments from, to, dataSource and dataDestination will be ignored. (only advisable for small chunks of data)

report

boolean and TRUE by default. In case it is true the function returns (also) a vector indicating how many trades remained after each cleaning step.

selection

argument to be passed on to the cleaning routine mergeTradesSameTimestamp. The default is "median".

validConds

character vector containing valid sales conditions. Passed through to salesCondition.

saveAsXTS

indicates whether data should be saved in xts format instead of data.table when using on-disk functionality. TRUE by default.

tz

timezone to use

Value

For each day an xts or data.table object is saved into the folder of that date, containing the cleaned data. This procedure is performed for each stock in "ticker". The function returns a vector indicating how many trades remained after each cleaning step.

In case you supply the argument "rawtData", the on-disk functionality is ignored and the function returns a list with the cleaned trades as xts object (see examples).

References

Barndorff-Nielsen, O. E., P. R. Hansen, A. Lunde, and N. Shephard (2009). Realized kernels in practice: Trades and quotes. Econometrics Journal 12, C1-C32.

Brownlees, C.T. and Gallo, G.M. (2006). Financial econometric analysis at ultra-high frequency: Data handling concerns. Computational Statistics & Data Analysis, 51, pp. 2232-2245.

Examples

Run this code
# NOT RUN {
# Consider you have raw trade data for 1 stock for 2 days 
head(sampleTDataRawMicroseconds)
dim(sampleTDataRawMicroseconds)
tDataAfterFirstCleaning <- tradesCleanup(tDataRaw = sampleTDataRaw, exchanges = list("N"))
tDataAfterFirstCleaning$report
dim(tDataAfterFirstCleaning$tData)

#In case you have more data it is advised to use the on-disk functionality
#via "dataSource" and "dataDestination" arguments

# }

Run the code above in your browser using DataLab