Package weka.classifiers.bayes
Class ComplementNaiveBayes
java.lang.Object
  weka.classifiers.Classifier
    weka.classifiers.bayes.ComplementNaiveBayes

All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, CapabilitiesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler
public class ComplementNaiveBayes extends Classifier implements OptionHandler, WeightedInstancesHandler, TechnicalInformationHandler
Class for building and using a Complement class Naive Bayes classifier.
For more information, see:
Jason D. Rennie, Lawrence Shih, Jaime Teevan, David R. Karger: Tackling the Poor Assumptions of Naive Bayes Text Classifiers. In: ICML, 616-623, 2003.
P.S.: TF, IDF and length normalization transforms, as described in the paper, can be performed through weka.filters.unsupervised.StringToWordVector.
BibTeX:
@inproceedings{Rennie2003,
  author = {Jason D. Rennie and Lawrence Shih and Jaime Teevan and David R. Karger},
  booktitle = {ICML},
  pages = {616-623},
  publisher = {AAAI Press},
  title = {Tackling the Poor Assumptions of Naive Bayes Text Classifiers},
  year = {2003}
}
Valid options are:
-N Normalize the word weights for each class
-S Smoothing value to avoid zero WordGivenClass probabilities (default=1.0).
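To make the effect of -S and -N concrete, the following stand-alone sketch computes complement-class word weights in the way the cited paper describes them, as far as that formulation is understood here: the weight of word i for class c is the log of the smoothed relative frequency of word i in all classes other than c, optionally divided by the sum of absolute weights for that class. The class name `ComplementWeightsSketch`, the method `complementWeights`, the `counts` layout, and the choice to add the smoothing value once per word to the denominator are assumptions made for illustration; this is not Weka's source code and may differ from it in detail.

```java
import java.util.Arrays;

/** Illustrative sketch of complement-class word weights; not Weka's implementation. */
final class ComplementWeightsSketch {

    /**
     * counts[c][i] is the total frequency of word i in training documents of class c.
     * Returns weights[c][i] = log of the smoothed probability of word i in the
     * complement of class c, optionally normalized per class (the -N idea).
     */
    static double[][] complementWeights(double[][] counts, double smoothing, boolean normalize) {
        int numClasses = counts.length;
        int numWords = counts[0].length;

        // Per-word totals over all classes, used to derive complement counts.
        double[] wordTotals = new double[numWords];
        for (double[] classCounts : counts) {
            for (int i = 0; i < numWords; i++) {
                wordTotals[i] += classCounts[i];
            }
        }

        double[][] weights = new double[numClasses][numWords];
        for (int c = 0; c < numClasses; c++) {
            // Total word frequency in the complement of c, plus the smoothing mass
            // (assumed here to be the smoothing value added once per word).
            double complementTotal = smoothing * numWords;
            for (int i = 0; i < numWords; i++) {
                complementTotal += wordTotals[i] - counts[c][i];
            }

            double absSum = 0.0;
            for (int i = 0; i < numWords; i++) {
                double complementCount = wordTotals[i] - counts[c][i];
                // Smoothing keeps the argument of log strictly positive (the -S idea).
                weights[c][i] = Math.log((complementCount + smoothing) / complementTotal);
                absSum += Math.abs(weights[c][i]);
            }

            if (normalize && absSum > 0.0) {
                for (int i = 0; i < numWords; i++) {
                    weights[c][i] /= absSum;
                }
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        // Two classes, two words; word 1 never occurs in class 0.
        double[][] counts = { {3, 0}, {1, 2} };
        System.out.println(Arrays.deepToString(complementWeights(counts, 1.0, false)));
        System.out.println(Arrays.deepToString(complementWeights(counts, 1.0, true)));
    }
}
```

With a smoothing value of 0, the zero count in the example would produce log(0); a positive -S value avoids that.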
Version:
$Revision: 5516 $
Author:
Ashraf M. Kibriya (amk14@cs.waikato.ac.nz)
See Also:
Serialized Form
Constructor Summary
Constructor    Description
ComplementNaiveBayes()
-
Method Summary
Modifier and Type    Method    Description
void buildClassifier(Instances instances)
    Generates the classifier.
double classifyInstance(Instance instance)
    Classifies a given instance.
Capabilities getCapabilities()
    Returns default capabilities of the classifier.
boolean getNormalizeWordWeights()
    Returns true if the word weights for each class are to be normalized.
java.lang.String[] getOptions()
    Gets the current settings of the classifier.
java.lang.String getRevision()
    Returns the revision string.
double getSmoothingParameter()
    Gets the smoothing value to be used to avoid zero WordGivenClass probabilities.
TechnicalInformation getTechnicalInformation()
    Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
java.lang.String globalInfo()
    Returns a string describing this classifier.
java.util.Enumeration listOptions()
    Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
    Main method for testing this class.
java.lang.String normalizeWordWeightsTipText()
    Returns the tip text for this property.
void setNormalizeWordWeights(boolean doNormalize)
    Sets whether the word weights for each class should be normalized.
void setOptions(java.lang.String[] options)
    Parses a given list of options.
void setSmoothingParameter(double val)
    Sets the smoothing value used to avoid zero WordGivenClass probabilities.
java.lang.String smoothingParameterTipText()
    Returns the tip text for this property.
java.lang.String toString()
    Prints out the internal model built by the classifier.
Methods inherited from class weka.classifiers.Classifier
debugTipText, distributionForInstance, forName, getDebug, makeCopies, makeCopy, setDebug
Method Detail
-
listOptions
public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface OptionHandler
- Overrides:
- listOptions in class Classifier
- Returns:
- an enumeration of all the available options.
-
getOptions
public java.lang.String[] getOptions()
Gets the current settings of the classifier.
- Specified by:
- getOptions in interface OptionHandler
- Overrides:
- getOptions in class Classifier
- Returns:
- an array of strings suitable for passing to setOptions
-
setOptions
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses a given list of options.
Valid options are:
-N Normalize the word weights for each class
-S Smoothing value to avoid zero WordGivenClass probabilities (default=1.0).
- Specified by:
- setOptions in interface OptionHandler
- Overrides:
- setOptions in class Classifier
- Parameters:
- options - the list of options as an array of strings
- Throws:
- java.lang.Exception - if an option is not supported
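The shape of the option array can be illustrated with a small stand-alone parser. This is a hypothetical stand-in written for this page, not Weka's `setOptions` code: the class `CnbOptionsSketch` and its fields are invented names, it handles only the two options documented above, and it assumes each token (flag or value) is a separate array element.

```java
/** Hypothetical stand-in for the two documented options; not Weka's parser. */
final class CnbOptionsSketch {
    boolean normalizeWordWeights = false; // -N absent means no normalization
    double smoothingParameter = 1.0;      // documented default for -S

    /** Parses an array such as {"-N", "-S", "2.5"}. */
    static CnbOptionsSketch parse(String[] options) {
        CnbOptionsSketch parsed = new CnbOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("-N")) {
                // -N is a flag and takes no value.
                parsed.normalizeWordWeights = true;
            } else if (options[i].equals("-S")) {
                // -S consumes the next array element as its value.
                if (i + 1 >= options.length) {
                    throw new IllegalArgumentException("-S requires a value");
                }
                parsed.smoothingParameter = Double.parseDouble(options[++i]);
            } else {
                // Only the two options listed on this page are handled in this sketch.
                throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        CnbOptionsSketch parsed = parse(new String[] {"-N", "-S", "2.5"});
        System.out.println(parsed.normalizeWordWeights + " " + parsed.smoothingParameter);
    }
}
```

With the real class, the same array would be passed to setOptions, and getOptions is documented to return an array suitable for passing back to it.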
-
getNormalizeWordWeights
public boolean getNormalizeWordWeights()
Returns true if the word weights for each class are to be normalized.
- Returns:
- true if the word weights are normalized
-
setNormalizeWordWeights
public void setNormalizeWordWeights(boolean doNormalize)
Sets whether the word weights for each class should be normalized.
- Parameters:
- doNormalize - whether the word weights are to be normalized
-
normalizeWordWeightsTipText
public java.lang.String normalizeWordWeightsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getSmoothingParameter
public double getSmoothingParameter()
Gets the smoothing value to be used to avoid zero WordGivenClass probabilities.
- Returns:
- the smoothing value
-
setSmoothingParameter
public void setSmoothingParameter(double val)
Sets the smoothing value used to avoid zero WordGivenClass probabilities.
- Parameters:
- val - the new smoothing value
-
smoothingParameterTipText
public java.lang.String smoothingParameterTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
globalInfo
public java.lang.String globalInfo()
Returns a string describing this classifier.
- Returns:
- a description of the classifier suitable for displaying in the explorer/experimenter gui
-
getTechnicalInformation
public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
- getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
getCapabilities
public Capabilities getCapabilities()
Returns default capabilities of the classifier.
- Specified by:
- getCapabilities in interface CapabilitiesHandler
- Overrides:
- getCapabilities in class Classifier
- Returns:
- the capabilities of this classifier
- See Also:
Capabilities
-
buildClassifier
public void buildClassifier(Instances instances) throws java.lang.Exception
Generates the classifier.
- Specified by:
- buildClassifier in class Classifier
- Parameters:
- instances - set of instances serving as training data
- Throws:
- java.lang.Exception - if the classifier has not been built successfully
-
classifyInstance
public double classifyInstance(Instance instance) throws java.lang.Exception
Classifies a given instance.
The classification rule is:
MinC(forAllWords(ti*Wci))
where
ti is the frequency of word i in the given instance
Wci is the weight of word i in class c.
That is, the predicted class is the class c that minimizes the sum of ti*Wci over all words i.
For more information see section 4.4 of the paper mentioned above in the classifier's description.
- Overrides:
- classifyInstance in class Classifier
- Parameters:
- instance - the instance to classify
- Returns:
- the index of the class the instance is most likely to belong to.
- Throws:
- java.lang.Exception - if the classifier has not been built yet.
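The rule MinC(forAllWords(ti*Wci)) can be sketched in plain Java as follows. This is an illustration only, not the method's actual body: the class `ComplementRuleSketch`, the method `classify`, and the array layout (`termFreqs[i]` for ti, `weights[c][i]` for Wci) are assumptions, and the sketch returns an `int` index where classifyInstance returns the index as a `double`.

```java
/** Illustrative sketch of the minimum-score classification rule; not Weka's code. */
final class ComplementRuleSketch {

    /**
     * Returns the index c that minimizes the sum over words i of
     * termFreqs[i] * weights[c][i], or -1 if weights has no rows.
     */
    static int classify(double[] termFreqs, double[][] weights) {
        int best = -1;
        double bestScore = Double.POSITIVE_INFINITY;
        for (int c = 0; c < weights.length; c++) {
            double score = 0.0;
            for (int i = 0; i < termFreqs.length; i++) {
                score += termFreqs[i] * weights[c][i];
            }
            // Keep the class with the smallest score seen so far.
            if (score < bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Example weights for two classes and two words (log-probabilities of
        // each word in the complement of the class).
        double[][] weights = {
            {Math.log(0.4), Math.log(0.6)},
            {Math.log(0.8), Math.log(0.2)}
        };
        System.out.println(classify(new double[] {2, 0}, weights)); // prints 0
        System.out.println(classify(new double[] {0, 2}, weights)); // prints 1
    }
}
```

The minimum is taken because each weight describes how well a word fits the classes other than c, so the class whose complement fits the instance worst is chosen.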
-
toString
public java.lang.String toString()
Prints out the internal model built by the classifier. In this case it prints out the word weights calculated when building the classifier.
- Overrides:
- toString in class java.lang.Object
-
getRevision
public java.lang.String getRevision()
Returns the revision string.
- Specified by:
- getRevision in interface RevisionHandler
- Overrides:
- getRevision in class Classifier
- Returns:
- the revision
-
main
public static void main(java.lang.String[] argv)
Main method for testing this class.
- Parameters:
- argv - the options