aeruginosa virulence factors. Proc Natl Acad Sci USA 1999,96(5):2408–2413.
43. Dagley S, Dawes EA, Morrison GA: Inhibition of growth of Aerobacter aerogenes; the mode of action of phenols, alcohols, acetone, and ethyl acetate. J Bacteriol 1950,60(4):369–379.

Authors’ contributions
DS carried out the assays with help from VD and participated in the design of the manuscript. AM designed the study, wrote the manuscript and analyzed most of the data. LM and MH were involved in the in vitro microscopy assays and analysis. XL helped to design and write the manuscript. NO and MF were involved in designing the study. All authors read and approved the final manuscript.
Background
Microbial ecology studies routinely use 454 pyrosequencing of ribosomal RNA gene amplicons to determine the composition and functionality of environmental communities [1–6]. Where it was once costly to generate libraries of a few hundred 16S rRNA gene sequences, so-called next-generation sequencing methods now allow researchers to deeply probe a microbial community at relatively little cost per sequence. Taxonomic classification
is a key part of these studies, as it allows researchers to correlate the relative abundance of particular sequences with taxonomic groupings. Such data can also support hypothesis generation about community function in the context of a given biological or ecological question. A large number of groups [1–6] use the Ribosomal Database Project’s Naïve Bayesian Classifier (RDP-NBC) [7] to classify rRNA sequences into the new higher-order taxonomy, such as that proposed in Bergey’s Taxonomic Outline of the Prokaryotes [8]. Bayesian classifiers apply Bayes’ theorem to assign the most likely class to a given example described by its feature vector. Developing such classifiers can be greatly simplified by assuming that features are independent given the
class (naïve independence assumptions). Because independent variables are assumed, only the variances of the variables for each class need to be determined and not the entire covariance matrix. Despite this unrealistic assumption, the resulting classifier is remarkably successful in practice, often competing with much more sophisticated techniques [9, 10]. The practical advantages of the RDP-NBC are that classification is straightforward (putting sequences in a predetermined taxonomic context) and computationally efficient (building a statistical model based on k-mers in the training set), that it can analyze thousands of sequences, and that it does not require full-length 16S sequences (making it an appropriate tool for studies based on next-generation sequencing). The RDP-NBC relies on an accurate training set: reference sequences used to train the model and a taxonomic designation file used to generate the classification results.
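
To make the idea of a k-mer-based naïve Bayesian classifier concrete, the following is a minimal sketch in Python. It is not the RDP-NBC implementation and does not reproduce its word size, priors, or confidence estimation. The function names, the toy sequences, the genus labels, the word size k = 3, the add-one smoothing, and the equal prior over taxa are all assumptions made for illustration. Each taxon is scored by summing the log-probabilities of the query's k-mers, which is where the independence assumption enters.

```python
import math
from collections import defaultdict


def kmers(seq, k):
    """Return the set of distinct overlapping k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def train(training_set, k):
    """Count, per taxon, how many training sequences contain each k-mer.

    training_set is a list of (sequence, taxon) pairs; this toy format
    stands in for the reference sequences plus taxonomy file.
    """
    seqs_per_taxon = defaultdict(int)
    kmer_counts = defaultdict(lambda: defaultdict(int))
    for seq, taxon in training_set:
        seqs_per_taxon[taxon] += 1
        for word in kmers(seq, k):
            kmer_counts[taxon][word] += 1
    return seqs_per_taxon, kmer_counts


def classify(query, model, k):
    """Return (best_taxon, scores) using log-likelihoods and equal priors."""
    seqs_per_taxon, kmer_counts = model
    scores = {}
    for taxon, n_seqs in seqs_per_taxon.items():
        # Naive independence: the joint likelihood is a product over k-mers,
        # so the log-likelihood is a sum. Add-one smoothing (an assumption)
        # keeps unseen k-mers from producing log(0).
        scores[taxon] = sum(
            math.log((kmer_counts[taxon].get(word, 0) + 1) / (n_seqs + 2))
            for word in kmers(query, k)
        )
    return max(scores, key=scores.get), scores


# Hypothetical toy training set: two made-up genera, two sequences each.
toy_training = [
    ("ACGTACGTAC", "GenusA"),
    ("ACGTACGTTC", "GenusA"),
    ("TTGGCCAATT", "GenusB"),
    ("TTGGCCAAGT", "GenusB"),
]

model = train(toy_training, k=3)
best, scores = classify("ACGTAC", model, k=3)
print(best)  # prints GenusA
```

The query here is a short fragment rather than a full-length sequence. It is still assigned to a taxon because only the k-mers it contains contribute to the score, which is consistent with the point above that full-length 16S sequences are not required.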