Learn R Programming

genio (version 1.1.2)

read_eigenvec: Read Plink eigenvec file

Description

This function reads a Plink eigenvec file, parsing columns strictly. First two must be 'fam' and 'id', which are strings, and all remaining columns (eigenvectors) must be numeric.

Usage

read_eigenvec(
  file,
  ext = "eigenvec",
  plink2 = FALSE,
  comment = if (plink2) "" else "#",
  verbose = TRUE
)

Value

A list with two elements:

  • eigenvec: A numeric R matrix containing the parsed eigenvectors. If plink2 = TRUE, the original column names will be preserved in this matrix.

  • fam: A tibble with two columns, fam and id, which are the first two columns of the parsed file. These column names are always the same even if plink2 = TRUE (i.e. they won't be FID or IID).

Arguments

file

The input file path, potentially excluding extension.

ext

File extension (default "eigenvec") can be changed if desired. Set to NA to force file to exist as-is.

plink2

If TRUE, the header is parsed and preserved in the returned data. The first two columns must be FID and IID, which are mandatory.

comment

A string used to identify comments. Any text after the comment characters will be silently ignored. Passed to readr::read_table(). '#' (default when plink2 = FALSE) works for Plink 2 eigenvec files, which have a header lines that starts with this character (the header is therefore ignored). However, plink2 = TRUE forces the header to be parsed instead.

verbose

If TRUE (default) function reports the path of the file being written (after autocompleting the extension).

See Also

write_eigenvec() for writing an eigenvec file.

Plink 1 eigenvec format reference: https://www.cog-genomics.org/plink/1.9/formats#eigenvec

Plink 2 eigenvec format reference: https://www.cog-genomics.org/plink/2.0/formats#eigenvec

GCTA eigenvec format reference: https://cnsgenomics.com/software/gcta/#PCA

Examples

Run this code
# to read "data.eigenvec", run like this:
# data <- read_eigenvec("data")
# this also works
# data <- read_eigenvec("data.eigenvec")
#
# either way you get a list with these two items:
# numeric eigenvector matrix
# data$eigenvec
# fam/id tibble
# data$fam

# The following example is more awkward
# because package sample data has to be specified in this weird way:

# read an existing *.eigenvec file created by GCTA
file <- system.file("extdata", 'sample-gcta.eigenvec', package = "genio", mustWork = TRUE)
data <- read_eigenvec(file)
# numeric eigenvector matrix
data$eigenvec
# fam/id tibble
data$fam

# can specify without extension
file <- sub('\\.eigenvec$', '', file) # remove extension from this path on purpose
file # verify .eigenvec is missing
data <- read_eigenvec(file) # load it anyway!
data$eigenvec

# read an existing *.eigenvec file created by Plink 2
file <- system.file("extdata", 'sample-plink2.eigenvec', package = "genio", mustWork = TRUE)
# this version ignores header
data <- read_eigenvec(file)
# numeric eigenvector matrix
data$eigenvec
# fam/id tibble
data$fam

# this version uses header
data <- read_eigenvec(file, plink2 = TRUE)
# numeric eigenvector matrix
data$eigenvec
# fam/id tibble
data$fam

Run the code above in your browser using DataLab