Page 3 | Janelia Research Campus

Search Results

30 Janelia Publications

Showing 21-30 of 30 results

Your Criteria:

Eddy/Rivas Lab

10/01/13 | nhmmer: DNA homology search with profile HMMs.

Wheeler TJ, Eddy SR

Bioinformatics. 2013 Oct 1;29:2487-9. doi: 10.1093/bioinformatics/btt403

SUMMARY: Sequence database searches are an essential part of molecular biology, providing information about the function and evolutionary history of proteins, RNA molecules and DNA sequence elements. We present a tool for DNA/DNA sequence comparison that is built on the HMMER framework, which applies probabilistic inference methods based on hidden Markov models to the problem of homology search. This tool, called nhmmer, enables improved detection of remote DNA homologs, and has been used in combination with Dfam and RepeatMasker to improve annotation of transposable elements in the human genome. AVAILABILITY: nhmmer is a part of the new HMMER3.1 release. Source code and documentation can be downloaded from http://hmmer.org. HMMER3.1 is freely licensed under the GNU GPLv3 and should be portable to any POSIX-compliant operating system, including Linux and Mac OS/X. CONTACT: wheelert@janelia.hhmi.org.

Eddy/Rivas Lab

12/07/11 | Phosphorylation at the interface.

Davis FP

Structure. 2011 Dec 7;19:1726-7. doi: 10.1016/j.str.2011.11.006

Proteomic studies have identified thousands of eukaryotic phosphorylation sites (phosphosites), but few are functionally characterized. Nishi et al., in this issue of Structure, characterize phosphosites at protein-protein interfaces and estimate the effect of their phosphorylation on interaction affinity, by combining proteomics data with protein structures.

Eddy/Rivas Lab

09/19/08 | Probabilistic phylogenetic inference with insertions and deletions.

Rivas E, Eddy SR

PLoS Computational Biology. 2008 Sep 19;4(9):e1000172. doi: 10.1371/journal.pcbi.1000172

A fundamental task in sequence analysis is to calculate the probability of a multiple alignment given a phylogenetic tree relating the sequences and an evolutionary model describing how sequences change over time. However, the most widely used phylogenetic models only account for residue substitution events. We describe a probabilistic model of a multiple sequence alignment that accounts for insertion and deletion events in addition to substitutions, given a phylogenetic tree, using a rate matrix augmented by the gap character. Starting from a continuous Markov process, we construct a non-reversible generative (birth-death) evolutionary model for insertions and deletions. The model assumes that insertion and deletion events occur one residue at a time. We apply this model to phylogenetic tree inference by extending the program dnaml in phylip. Using standard benchmarking methods on simulated data and a new "concordance test" benchmark on real ribosomal RNA alignments, we show that the extended program dnamlepsilon improves accuracy relative to the usual approach of ignoring gaps, while retaining the computational efficiency of the Felsenstein peeling algorithm.

Eddy/Rivas Lab

03/30/07 | Query-dependent banding (QDB) for faster RNA similarity searches.

Nawrocki EP, Eddy SR

PLoS Computational Biology. 2007 Mar 30;3(3):e56. doi: 10.1371/journal.pcbi.0030056

When searching sequence databases for RNAs, it is desirable to score both primary sequence and RNA secondary structure similarity. Covariance models (CMs) are probabilistic models well-suited for RNA similarity search applications. However, the computational complexity of CM dynamic programming alignment algorithms has limited their practical application. Here we describe an acceleration method called query-dependent banding (QDB), which uses the probabilistic query CM to precalculate regions of the dynamic programming lattice that have negligible probability, independently of the target database. We have implemented QDB in the freely available Infernal software package. QDB reduces the average case time complexity of CM alignment from LN^2.4 to LN^1.3 for a query RNA of N residues and a target database of L residues, resulting in a 4-fold speedup for typical RNA queries. Combined with other improvements to Infernal, including informative mixture Dirichlet priors on model parameters, benchmarks also show increased sensitivity and specificity resulting from improved parameterization.

Eddy/Rivas Lab

01/28/15 | Rfam 12.0: updates to the RNA families database.

Nawrocki EP, Burge SW, Bateman A, Daub J, Eberhardt RY, Eddy SR, Floden EW, Gardner PP, Jones TA, Tate J, Finn RD

Nucleic Acids Research. 2015 Jan 28;43(Database issue):D130-7. doi: 10.1093/nar/gku1063

The Rfam database (available at http://rfam.xfam.org) is a collection of non-coding RNA families represented by manually curated sequence alignments, consensus secondary structures and annotation gathered from corresponding Wikipedia, taxonomy and ontology resources. In this article, we detail updates and improvements to the Rfam data and website for the Rfam 12.0 release. We describe the upgrade of our search pipeline to use Infernal 1.1 and demonstrate its improved homology detection ability by comparison with the previous version. The new pipeline is easier for users to apply to their own data sets, and we illustrate its ability to annotate RNAs in genomic and metagenomic data sets of various sizes. Rfam has been expanded to include 260 new families, including the well-studied large subunit ribosomal RNA family, and for the first time includes information on short sequence- and structure-based RNA motifs present within families.

Eddy/Rivas Lab

01/01/09 | Rfam: updates to the RNA families database.

Gardner PP, Daub J, Tate JG, Nawrocki EP, Kolbe DL, Lindgreen S, Wilkinson AC, Finn RD, Griffiths-Jones S, Eddy SR, Bateman A

Nucleic Acids Research. 2009 Jan;37(Database issue):D136-40. doi: 10.1093/nar/gkn766

Rfam is a collection of RNA sequence families, represented by multiple sequence alignments and covariance models (CMs). The primary aim of Rfam is to annotate new members of known RNA families on nucleotide sequences, particularly complete genomes, using sensitive BLAST filters in combination with CMs. A minority of families with a very broad taxonomic range (e.g. tRNA and rRNA) provide the majority of the sequence annotations, whilst the majority of Rfam families (e.g. snoRNAs and miRNAs) have a limited taxonomic range and provide a limited number of annotations. Recent improvements to the website, methodologies and data used by Rfam are discussed. Rfam is freely available on the Web at http://rfam.sanger.ac.uk/ and http://rfam.janelia.org/.

Eddy/Rivas Lab

01/01/11 | Rfam: Wikipedia, clans and the "decimal" release.

Gardner PP, Daub J, Tate J, Moore BL, Osuch IH, Griffiths-Jones S, Finn RD, Nawrocki EP, Kolbe DL, Eddy SR, Bateman A

Nucleic Acids Research. 2011 Jan;39(Database issue):D141-5. doi: 10.1093/nar/gkq1129

The Rfam database aims to catalogue non-coding RNAs through the use of sequence alignments and statistical profile models known as covariance models. In this contribution, we discuss the pros and cons of using the online encyclopedia, Wikipedia, as a source of community-derived annotation. We discuss the addition of groupings of related RNA families into clans and new developments to the website. Rfam is available on the Web at http://rfam.sanger.ac.uk.

Eddy/Rivas Lab, Scientific Computing Software

01/13/14 | Skylign: a tool for creating informative, interactive logos representing sequence alignments and profile hidden Markov models.

Wheeler TJ, Clements J, Finn RD

BMC Bioinformatics. 2014 Jan 13;15:7. doi: 10.1186/1471-2105-15-7

BACKGROUND: Logos are commonly used in molecular biology to provide a compact graphical representation of the conservation pattern of a set of sequences. They render the information contained in sequence alignments or profile hidden Markov models by drawing a stack of letters for each position, where the height of the stack corresponds to the conservation at that position, and the height of each letter within a stack depends on the frequency of that letter at that position. RESULTS: We present a new tool and web server, called Skylign, which provides a unified framework for creating logos for both sequence alignments and profile hidden Markov models. In addition to static image files, Skylign creates a novel interactive logo plot for inclusion in web pages. These interactive logos enable scrolling, zooming, and inspection of underlying values. Skylign can avoid sampling bias in sequence alignments by down-weighting redundant sequences and by combining observed counts with informed priors. It also simplifies the representation of gap parameters, and can optionally scale letter heights based on alternate calculations of the conservation of a position. CONCLUSION: Skylign is available as a website, a scriptable web service with a RESTful interface, and as a software package for download. Skylign’s interactive logos are easily incorporated into a web page with just a few lines of HTML markup. Skylign may be found at http://skylign.org.

Eddy/Rivas Lab

02/01/10 | The overlap of small molecule and protein binding sites within families of protein structures.

Davis FP, Sali A

PLoS Computational Biology. 2010 Feb;6(2):e1000668. doi: 10.1371/journal.pcbi.1000668

Protein-protein interactions are challenging targets for modulation by small molecules. Here, we propose an approach that harnesses the increasing structural coverage of protein complexes to identify small molecules that may target protein interactions. Specifically, we identify ligand and protein binding sites that overlap upon alignment of homologous proteins. Of the 2,619 protein structure families observed to bind proteins, 1,028 also bind small molecules (250-1000 Da), and 197 exhibit a statistically significant (p<0.01) overlap between ligand and protein binding positions. These "bi-functional positions", which bind both ligands and proteins, are particularly enriched in tyrosine and tryptophan residues, similar to "energetic hotspots" described previously, and are significantly less conserved than mono-functional and solvent exposed positions. Homology transfer identifies ligands whose binding sites overlap at least 20% of the protein interface for 35% of domain-domain and 45% of domain-peptide mediated interactions. The analysis recovered known small-molecule modulators of protein interactions as well as predicted new interaction targets based on the sequence similarity of ligand binding sites. We illustrate the predictive utility of the method by suggesting structural mechanisms for the effects of sanglifehrin A on HIV virion production, bepridil on the cellular entry of anthrax edema factor, and fusicoccin on vertebrate developmental pathways. The results, available at http://pibase.janelia.org, represent a comprehensive collection of structurally characterized modulators of protein interactions, and suggest that homologous structures are a useful resource for the rational design of interaction modulators.

Eddy/Rivas Lab

01/01/10 | The Pfam protein families database.

Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Gunasekaran P, Ceric G, Forslund K, Holm L, Sonnhammer EL, Eddy SR, Bateman A

Nucleic Acids Research. 2010 Jan;38:D211-22. doi: 10.1093/nar/gkp985

Pfam is a widely used database of protein families and domains. This article describes a set of major updates that we have implemented in the latest release (version 24.0). The most important change is that we now use HMMER3, the latest version of the popular profile hidden Markov model package. This software is approximately 100 times faster than HMMER2 and is more sensitive due to the routine use of the forward algorithm. The move to HMMER3 has necessitated numerous changes to Pfam that are described in detail. Pfam release 24.0 contains 11,912 families, of which a large number have been significantly updated during the past two years. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/).
