
About BayesMD

BayesMD is a flexible, fully Bayesian model for motif discovery consisting of motif, background and alignment modules. Our modular approach builds in biological knowledge about the statistical properties of binding sites, background sequences, positional preferences and the number of occurrences of sites, providing a flexible and comprehensive framework for the investigation of cis-regulatory elements. The Bayesian inference is carried out using a combination of exact marginalization and sampling. Robust sampling results are achieved using parallel tempering, an advanced sampling method. BayesMD can be customised to different kinds of biological applications (e.g. microarray, ChIP-chip, ditag and CAGE data analysis) by integrating appropriately chosen features and functionalities.
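
For readers unfamiliar with parallel tempering, the following is a minimal Python sketch of the general technique on a toy one-dimensional target. It is not the BayesMD implementation (which is written in Matlab); the target density, the temperature ladder, the step size and all function names are assumptions chosen for illustration only.

```python
import math
import random


def log_target(x):
    """Toy unnormalised log-density (assumption): equal mixture of N(-4, 1) and N(4, 1)."""
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def swap_log_ratio(beta_i, beta_j, logp_i, logp_j):
    """Log acceptance ratio for exchanging the states of two tempered chains.

    Chain i samples from target**beta_i and holds a state with log-density logp_i
    (likewise for chain j).
    """
    return (beta_i - beta_j) * (logp_j - logp_i)


def accept(log_ratio, rng):
    """Metropolis rule: accept with probability min(1, exp(log_ratio))."""
    return rng.random() < math.exp(min(0.0, log_ratio))


def parallel_tempering(n_sweeps, betas, step=1.0, seed=0):
    """Return the states visited by the cold chain (betas[0], expected to be 1.0)."""
    rng = random.Random(seed)
    states = [-4.0 for _ in betas]  # every chain starts in the left mode
    cold_samples = []
    for _ in range(n_sweeps):
        # 1) One random-walk Metropolis update per chain on its tempered target.
        for k, beta in enumerate(betas):
            proposal = states[k] + rng.gauss(0.0, step)
            log_ratio = beta * (log_target(proposal) - log_target(states[k]))
            if accept(log_ratio, rng):
                states[k] = proposal
        # 2) Propose exchanging the states of one random pair of adjacent chains.
        k = rng.randrange(len(betas) - 1)
        log_ratio = swap_log_ratio(
            betas[k], betas[k + 1],
            log_target(states[k]), log_target(states[k + 1]),
        )
        if accept(log_ratio, rng):
            states[k], states[k + 1] = states[k + 1], states[k]
        cold_samples.append(states[0])
    return cold_samples


if __name__ == "__main__":
    samples = parallel_tempering(20000, betas=[1.0, 0.5, 0.25, 0.1])
    right = sum(1 for s in samples if s > 0.0) / len(samples)
    print(f"fraction of cold-chain samples in the right-hand mode: {right:.2f}")
```

The heated chains (small beta) cross between modes easily, and the exchange moves pass those states down to the cold chain, which is why the method tends to be more robust than a single chain on multimodal posteriors.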

Features of our tool

  • The motif model consists of a multinomial at each position with a mixture of Dirichlet distributions as priors. The mixture prior is learnt empirically from Transfac matrices.
  • The background is modelled with a higher order Markov model, where the prior is a mixture of Dirichlet distributions learnt from organism-specific promoter sequences.
  • The alignment prior enables the user to specify positional biases of the location of motif instances (softmasking), based for example upon conservation, low-complexity or tiling array information.
  • The model includes a priori probabilities for the number of occurrences of the motifs in a sequence.
  • Comprehensive post-analysis and visualization tools to derive the putative cis-regulatory elements.
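
To illustrate what a mixture-of-Dirichlets prior on a multinomial motif position means in practice, here is a minimal Python sketch of the standard closed-form marginalization for a single motif column. It is not taken from the BayesMD source: the function names, the two mixture components and their weights are hypothetical, and a real prior would be learnt from a matrix collection as described above.

```python
import math


def log_dirichlet_marginal(counts, alpha):
    """Log-probability of an observed column of bases under one Dirichlet prior,
    with the multinomial parameters integrated out analytically.

    counts: observed counts for A, C, G, T; alpha: Dirichlet pseudocounts.
    """
    total_alpha = sum(alpha)
    total_counts = sum(counts)
    result = math.lgamma(total_alpha) - math.lgamma(total_alpha + total_counts)
    for n, a in zip(counts, alpha):
        result += math.lgamma(a + n) - math.lgamma(a)
    return result


def mixture_posterior(counts, weights, alphas):
    """Return (log evidence, component responsibilities, posterior mean frequencies)
    for a mixture-of-Dirichlets prior with the given weights and components."""
    log_terms = [
        math.log(w) + log_dirichlet_marginal(counts, alpha)
        for w, alpha in zip(weights, alphas)
    ]
    # Log-sum-exp for numerical stability.
    m = max(log_terms)
    log_evidence = m + math.log(sum(math.exp(t - m) for t in log_terms))
    responsibilities = [math.exp(t - log_evidence) for t in log_terms]

    total_counts = sum(counts)
    mean = [0.0] * len(counts)
    for r, alpha in zip(responsibilities, alphas):
        total_alpha = sum(alpha)
        for i, (n, a) in enumerate(zip(counts, alpha)):
            mean[i] += r * (a + n) / (total_alpha + total_counts)
    return log_evidence, responsibilities, mean


if __name__ == "__main__":
    # Hypothetical two-component prior: "conserved" (sparse) vs. "degenerate" (flat).
    weights = [0.5, 0.5]
    alphas = [[0.1, 0.1, 0.1, 0.1], [10.0, 10.0, 10.0, 10.0]]
    column_counts = [10, 0, 0, 0]  # ten aligned sites, all showing A
    log_ev, resp, mean = mixture_posterior(column_counts, weights, alphas)
    print("log evidence:", log_ev)
    print("responsibilities:", resp)
    print("posterior mean frequencies:", mean)
```

A strongly conserved column is explained mostly by the sparse component, while a column with mixed bases shifts weight to the flat one; this is how a mixture prior can adapt to both kinds of motif position.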

Getting BayesMD

  • BayesMD was developed with Matlab release 7.3 (2006b). The source code runs on both Unix and Windows XP machines using Matlab 7.0 (R14) and higher.
  • We recommend switching off the Java virtual machine (option: -nojvm) to achieve better performance.
  • We provide the source code for Matlab users and a pre-compiled standalone that requires no software installation.

BayesMD Matlab package

The source code includes Matlab scripts and functions for data handling, sampling and post-analysis. The user-customizable main script file BayesMDinput.m is used to launch the program in the command window. Below, we provide source code, optional tools, trained background and motif mixture models, and example files.

BayesMD standalone

Here we provide precompiled binaries for non-Matlab users. They require the MCR (Matlab Component Runtime) to run.
Binaries are provided below for both x86 and AMD64 architectures.

BayesMD Webserver

  • BayesMD is available as a web service for online motif prediction. The user provides the dataset in FASTA format and sets the input parameters via our interface. Once the analysis is complete, the results are sent to the user by email.
  • Please click here to access the BayesMD webserver
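
Before uploading a dataset, it can help to check locally that the file is well-formed FASTA. The following Python sketch is an illustration only and is not part of BayesMD: the function names are hypothetical, and the accepted alphabet (ACGTN) is an assumption, since the characters accepted by the webserver are not stated here.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples.

    Sequences are upper-cased and multi-line records are joined.
    Raises ValueError if sequence data appears before the first '>' header.
    """
    records = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_dna(records, alphabet="ACGTN"):
    """Return a list of human-readable problems; an empty list means none were found.

    The default alphabet is an assumption, not a documented webserver requirement.
    """
    problems = []
    allowed = set(alphabet)
    for name, seq in records:
        if not seq:
            problems.append(f"{name}: empty sequence")
            continue
        unexpected = sorted(set(seq) - allowed)
        if unexpected:
            problems.append(f"{name}: unexpected characters {''.join(unexpected)}")
    return problems


if __name__ == "__main__":
    example = ">promoter_1\nacgtacgt\nNNAC\n>promoter_2\nACGX\n"
    for problem in check_dna(read_fasta(example)):
        print(problem)
```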

Example datasets

Here we provide the datasets that were used to assess our tool:
  • Sequence files and phastCons tracks for p53-enriched ditag sequences (Wei et al.) click here
  • Synthetic data from assessment of the NestedMICA motif finder (Down et al.) click here
  • Datasets used in the assessment of 13 motif discovery tools by Tompa et al. click here

Acknowledgements

  • Thanks to Albin Sandelin for his valuable comments on the BayesMD manuscript and Thomas Down for sharing datasets. Thanks to Kristoffer Smed and Hanne Munkholm for their help in setting up the webserver. This work was supported by a grant from the Novo Nordisk Foundation to the Bioinformatics Center.

Contact



Last update : 15/04/2009