Learn R Programming

bio3d (version 2.1-2)

geostas: GeoStaS Domain Finder

Description

Identifies geometrically stable domains in biomolecules

Usage

geostas(xyz, amsm = NULL, k = 3, method = "pairwise", fit = TRUE, ...)

amsm.xyz(xyz, ncore = NULL)

Arguments

xyz
numeric matrix of xyz coordinates as obtained e.g. by read.ncdf, read.dcd, or mktrj.
amsm
a numeric matrix as obtained by amsm.xyz (convenient e.g. for re-doing only the clustering analysis of the AMSM matrix).
k
an integer scalar or vector with the desired number of groups.
method
character string specifing the approach for clustering the atomic movement similarity matrix (pariwise or columnwise).
fit
logical, if TRUE coordinate superposition on identified core atoms is performed prior to the calculation of the AMS matrix.
ncore
number of CPU cores used to do the calculation. ncore>1 requires package parallel installed.
...
additional arguments to amsm.xyz.

Value

  • Returns a list object with the following components:
  • grpsa numeric vector containing the domain assignment per residue.
  • amsma numeric matrix of atomic movement similarity (AMSM).
  • fit.indsa numeric vector of xyz indices used for fitting.

Details

This function attempts to identify rigid domains in a protein (or nucleic acid) structure based on an structural ensemble, e.g. obtained from NMR experiments, molecular dynamics simulations, or normal mode analysis. The algorithm is based on a geometric approach for comparing pairwise traces of atomic motion and the search for their best superposition using a quaternion representation of rotation. The result is stored in a NxN atomic movement similarity matrix (AMSM) describing the correspondence between all pairs of atom motion. Rigid domains are obtained by clustering the elements of the AMS matrix (method=pairwise), or alternatively, the columns similarity (method=columnwise), using K-means clustering (see function kmeans()).

Compared to the conventional cross-correlation matrix (see function dccm) the geostas approach provide functionality to also detect domains involved in rotational motions (i.e. two atoms located on opposite sides of a rotating domain will appear as anti-correlated in the cross-correlation matrix, but should obtain a high similarity coefficient in the AMS matrix).

See examples for more details.

References

Romanowska, J. et al. (2012) JCTC 8, 2588--2599. Grant, B.J. et al. (2006) Bioinformatics 22, 2695--2696.

See Also

read.ncdf, read.dcd, read.pdb, mktrj.

Examples

Run this code
#### NMR-ensemble example
## Read a multi-model PDB file 
pdb <- read.pdb("1d1d", multi=TRUE)
ca.inds <- atom.select(pdb, 'calpha')

## Find domains and write PDB
gs  <- geostas(pdb)

## Plot a atomic movement similarity matrix
plot.dccm(gs$amsm, at=seq(0, 1, 0.1), main="AMSM with Domain Assignment",
          col.regions=rev(heat.colors(200)), margin.segments=gs$grps, contour=FALSE)

## Fit all frames to the 'first' domain
domain.inds <- atom2xyz(which(gs$grps==1))

xyz <- fit.xyz(pdb$xyz[1, ca.inds$xyz],
               pdb$xyz[, ca.inds$xyz],
               fixed.inds = domain.inds,
               mobile.inds = domain.inds)

#write.pdb(xyz=xyz, chain=gs$grps)

#### NMA example
## Fetch stucture
pdb <- read.pdb("1crn")

## Calculate (vibrational) normal modes
modes <- nma(pdb)

## Mode trajectories
trj <- rbind(mktrj(modes, mode=7),
             mktrj(modes, mode=8),
             mktrj(modes, mode=9),
             mktrj(modes, mode=10),
             mktrj(modes, mode=11))

## Find domains
gs <- geostas(trj, k=2)

## Write NMA trajectory with domain assignment
mktrj(modes, mode=7, chain=gs$grps)

## Redo geostas domain clustering 
gs <- geostas(trj, amsm=gs$amsm, k=5)




#### Trajectory example
## Read inn DCD trajectory file, fit coordinates
dcdfile <- system.file("examples/hivp.dcd", package = "bio3d")
trj <- read.dcd(dcdfile)
xyz <- fit.xyz(trj[1,], trj)

## Find domains
gs <- geostas(xyz, k=3, fit=FALSE)

## Principal component analysis 
pc.md <- pca.xyz(xyz)

## Visualize PCs with colored domains (chain ID)
mktrj(pc.md, pc=1, chain=gs$grps)




#### X-ray ensemble GroEL subunits
# Define the ensemble PDB-ids
ids <- c("1sx4_[A,B,H,I]", "1xck_[A-B]", "1sx3_[A-B]", "4ab3_[A-B]")

# Download and split PDBs by chain ID
raw.files <- get.pdb(ids, path = "raw_pdbs", gzip = TRUE)
files <- pdbsplit(raw.files, ids, path = "raw_pdbs/split_chain/")

# Align and superimpose coordinates
pdbs <- pdbaln(files)
gs <- geostas(pdbs, k=4, fit=TRUE)

# Principal component analysis 
gaps.pos <- gap.inspect(pdbs$xyz)
xyz <- fit.xyz(pdbs$xyz[1, gaps.pos$f.inds],
               pdbs$xyz[, gaps.pos$f.inds],
               fixed.inds=gs$fit.inds,
               mobile.inds=gs$fit.inds)
pc.xray <- pca.xyz(xyz)

## Visualize PCs with colored domains (chain ID)
mktrj(pc.xray, pc=1, chain=gs$grps)

Run the code above in your browser using DataLab