/**
* <h2>Introduction</h2>
*
* <p>This package provides a MapReduce-enabled implementation of a Naïve Bayes classifier. The classifier is
* simple: it counts how often each word occurs together with each label, and then uses those counts to estimate
* how likely it is that a new document, given its words, should be assigned a particular label.
* </p>
*
* <h2>Implementation</h2>
*
* <p>The implementation is divided into three parts:</p>
*
* <ol>
* <li>The Trainer -- responsible for counting the words and the labels</li>
* <li>The Model -- responsible for holding the training data in a useful way</li>
* <li>The Classifier -- responsible for using the trainer's output to determine the category of previously unseen
* documents</li>
* </ol>
*
* <h3>The Trainer</h3>
* <p>The trainer is implemented by several classes:</p>
*
* <ol>
* <li>{@link org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesDriver} -- Creates the Hadoop Naive Bayes job and
* outputs the model. This driver encapsulates several intermediate MapReduce classes</li>
* <li>{@link org.apache.mahout.classifier.bayes.mapreduce.common.BayesFeatureDriver}</li>
* <li>{@link org.apache.mahout.classifier.bayes.mapreduce.common.BayesTfIdfDriver}</li>
* <li>{@link org.apache.mahout.classifier.bayes.mapreduce.common.BayesWeightSummerDriver}</li>
* <li>{@link org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesThetaNormalizerDriver}</li>
* </ol>
*
* <p>The trainer assumes that the input files are in the format read by
* {@link org.apache.hadoop.mapred.KeyValueTextInputFormat}, i.e. the first token of each line is the label, and it is
* separated from the remaining tokens on the line by a tab character. The remaining tokens are the unique features
* (words). Thus, input documents, one per line, might look like:</p>
*
* <pre>
* hockey puck stick goalie forward defenseman referee ice checking slapshot helmet
* football field football pigskin referee helmet turf tackle
* </pre>
*
* <p>where hockey and football are the labels and the remaining words are the features associated with those
* particular labels.</p>
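*
* <p>As a minimal sketch of how such a line breaks down into a label and its features, the following standalone
* snippet may help. It is illustrative only and is not part of this package: the class
* <code>TrainingLineSketch</code> and its <code>parse</code> method are hypothetical names. It also assumes that
* features are separated by whitespace and that a line without a tab is treated as a label with no features.</p>
*
```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not a Mahout or Hadoop class.
final class TrainingLineSketch {
  final String label;
  final String[] features;

  private TrainingLineSketch(String label, String[] features) {
    this.label = label;
    this.features = features;
  }

  // Splits on the FIRST tab only: the text before it is the label,
  // the text after it is the whitespace-separated feature list.
  static TrainingLineSketch parse(String line) {
    int tab = line.indexOf('\t');
    if (tab < 0) {
      // Assumption: no tab means the whole line is the label and there are no features.
      return new TrainingLineSketch(line, new String[0]);
    }
    String label = line.substring(0, tab);
    String rest = line.substring(tab + 1).trim();
    String[] features = rest.isEmpty() ? new String[0] : rest.split("\\s+");
    return new TrainingLineSketch(label, features);
  }

  public static void main(String[] args) {
    TrainingLineSketch doc = parse("hockey\tpuck stick goalie");
    // prints: hockey -> [puck, stick, goalie]
    System.out.println(doc.label + " -> " + Arrays.toString(doc.features));
  }
}
```
*
* <p>Splitting on the first tab only means that a feature may repeat the label text, as <code>football</code> does
* in the second example line, without being mistaken for the label.</p>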
*
* <p>The output from the trainer is a {@link org.apache.hadoop.io.SequenceFile}.</p>
*/
package org.apache.mahout.classifier.bayes;