createLDsets: Create Linkage Disequilibrium (LD) Sets

Description

This function partitions a named vector of LD scores into sets (blocks) of markers based on cumulative LD scores, subject to constraints on block size and separation. The aim is to distribute blocks evenly and to reduce overlap between blocks based on LD patterns.

Usage

createLDsets(
  ldscores = NULL,
  msize = 200,
  maxsize = 2000,
  nsplit = 200,
  verbose = FALSE
)

Value

A list where each element is a vector of marker names corresponding to a block of LD scores.

Arguments

ldscores: A numeric vector of LD scores for markers, where names correspond to marker identifiers.
msize: An integer specifying the minimum block size for averaging LD scores. Default is 200.
maxsize: An integer specifying the maximum block size. Default is 2000.
nsplit: An integer specifying the number of splits (blocks) to create. Default is 200.
verbose: A logical value. If TRUE, the function will generate diagnostic plots showing the block sizes and cumulative LD score patterns. Default is FALSE.
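
A minimal sketch of a call using the arguments above. The LD scores are simulated, the marker names are hypothetical, and the argument values are chosen only for illustration. The call is guarded so that the snippet still runs in a session where createLDsets is not available.

```r
# Simulated LD scores for 5,000 hypothetical markers (illustrative data only).
set.seed(1)
ldscores <- rexp(5000, rate = 1 / 50)
names(ldscores) <- paste0("m", seq_along(ldscores))

# Run only if createLDsets is available in the current session.
if (exists("createLDsets", mode = "function")) {
  sets <- createLDsets(
    ldscores = ldscores,
    msize = 100,
    maxsize = 1000,
    nsplit = 20,
    verbose = FALSE
  )
  # Each list element is a character vector of marker names.
  print(length(sets))
  print(summary(lengths(sets)))
}
```

Because the return value is a plain list of marker-name vectors, lengths(sets) gives the size of each block directly.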

Details

The function uses cumulative sums of LD scores to create blocks of markers that minimize overlap while satisfying block size constraints. Blocks are defined iteratively by identifying regions with low cumulative LD scores and expanding them until they reach the defined block size limits.

- **Cumulative LD Calculation:** The function calculates the average cumulative LD scores over sliding windows of size msize.
- **Block Splitting:** Regions with the lowest cumulative LD scores are selected iteratively to define block boundaries.
- **Plotting:** If verbose = TRUE, the function generates two diagnostic plots:
  - Block sizes for each LD set.
  - Cumulative LD scores across genome positions, highlighting split positions.
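
The general idea of windowed averaging followed by iterative selection of low-LD split points can be sketched as below. This is an illustration under simplifying assumptions, not the implementation of createLDsets: the helper name sketch_ld_blocks is hypothetical, the maxsize constraint is ignored, and the separation rule (chosen windows must start at least msize markers apart) is an assumption made for the example.

```r
# Hypothetical helper: an illustration only, NOT the createLDsets implementation.
sketch_ld_blocks <- function(ldscores, msize = 5, nsplit = 3) {
  n <- length(ldscores)
  stopifnot(n > msize, !is.null(names(ldscores)))

  # Sliding-window mean via a cumulative sum:
  # window i covers markers i .. (i + msize - 1).
  cs <- c(0, cumsum(ldscores))
  starts <- seq_len(n - msize + 1)
  winmean <- (cs[starts + msize] - cs[starts]) / msize

  # The centre of each window is the candidate split position.
  centre <- starts + msize %/% 2

  splits <- integer(0)
  avail <- rep(TRUE, length(winmean))
  while (length(splits) < nsplit && any(avail)) {
    # Pick the available window with the lowest mean LD score.
    best <- which(avail)[which.min(winmean[avail])]
    splits <- c(splits, centre[best])
    # Assumed separation rule: exclude windows starting within msize of it.
    avail[abs(starts - starts[best]) < msize] <- FALSE
  }
  splits <- sort(splits)

  # Cut the marker sequence at the split positions; the marker at a split
  # position starts the next block.
  block_id <- findInterval(seq_len(n), splits) + 1L
  split(names(ldscores), block_id)
}

# Toy data: uniform LD scores with two low-LD dips (hypothetical markers).
ld <- rep(10, 40)
ld[c(10:12, 25:27)] <- 1
names(ld) <- paste0("m", seq_along(ld))

blocks <- sketch_ld_blocks(ld, msize = 3, nsplit = 2)
print(lengths(blocks))
```

In the toy data the two dips are the lowest-scoring windows, so the splits fall inside them and the markers are divided into three contiguous blocks.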