Class Logistic

  • All Implemented Interfaces:
    java.io.Serializable, java.lang.Cloneable, CapabilitiesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler

    public class Logistic
    extends Classifier
    implements OptionHandler, WeightedInstancesHandler, TechnicalInformationHandler
    Class for building and using a multinomial logistic regression model with a ridge estimator.

    There are some modifications, however, compared to the paper of le Cessie and van Houwelingen (1992):

    If there are k classes for n instances with m attributes, the parameter matrix B to be calculated will be an m*(k-1) matrix.

    The probability for class j with the exception of the last class is

    Pj(Xi) = exp(Xi*Bj)/((sum[j=1..(k-1)]exp(Xi*Bj))+1)

    The last class has probability

    1-(sum[j=1..(k-1)]Pj(Xi))
    = 1/((sum[j=1..(k-1)]exp(Xi*Bj))+1)
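
    The two probability formulas above can be sketched in plain Java as follows. This is an illustration only, not the Weka implementation: the class and method names are hypothetical, the coefficient matrix is assumed to be indexed as b[attribute][class], and no safeguards against numerical overflow in exp are included.

```java
import java.util.Arrays;

/** Illustrative sketch only; names are hypothetical and this is not Weka code. */
final class LogisticProbabilitySketch {

    /**
     * Computes the k class probabilities for one instance.
     * x holds the m attribute values; b is an m x (k-1) matrix, b[attribute][class].
     * The last entry of the result belongs to the last (reference) class.
     */
    static double[] classProbabilities(double[] x, double[][] b) {
        int m = x.length;
        int kMinus1 = b[0].length;
        double[] probs = new double[kMinus1 + 1];
        double denom = 1.0; // the "+1" term in the denominator
        for (int j = 0; j < kMinus1; j++) {
            double dot = 0.0;
            for (int a = 0; a < m; a++) {
                dot += x[a] * b[a][j]; // Xi * Bj
            }
            probs[j] = Math.exp(dot);
            denom += probs[j];
        }
        for (int j = 0; j < kMinus1; j++) {
            probs[j] /= denom;
        }
        probs[kMinus1] = 1.0 / denom; // probability of the last class
        return probs;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0};
        double[][] b = {{0.5, -0.5}, {0.25, 0.0}};
        System.out.println(Arrays.toString(classProbabilities(x, b)));
    }
}
```

    With all coefficients set to zero, every class receives the same probability 1/k, which is a quick sanity check for the formulas.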

    The (negative) multinomial log-likelihood is thus:

    L = -sum[i=1..n]{
    sum[j=1..(k-1)](Yij * ln(Pj(Xi)))
    +(1 - (sum[j=1..(k-1)]Yij))
    * ln(1 - sum[j=1..(k-1)]Pj(Xi))
    } + ridge * (B^2)
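
    As a rough illustration of the objective above, the following sketch evaluates L for a small data set. It assumes that Yij is a 0/1 indicator of the observed class, so the inner sums reduce to ln of the probability of the observed class, and that the ridge term penalises the sum of squares of every entry of B, as the formula reads. The class and method names are hypothetical, and the actual implementation may treat some coefficients differently.

```java
/** Illustrative sketch only; names are hypothetical and this is not Weka code. */
final class RidgeNllSketch {

    /** Probability of class c (0-based; c == k-1 is the last class) for instance x. */
    static double classProbability(double[] x, double[][] b, int c) {
        int kMinus1 = b[0].length;
        double denom = 1.0;
        double numer = 1.0; // numerator used when c is the last class
        for (int j = 0; j < kMinus1; j++) {
            double dot = 0.0;
            for (int a = 0; a < x.length; a++) {
                dot += x[a] * b[a][j];
            }
            double e = Math.exp(dot);
            denom += e;
            if (j == c) {
                numer = e;
            }
        }
        return numer / denom;
    }

    /** L = -sum_i ln P_{y_i}(X_i) + ridge * (sum of squared entries of B). */
    static double negativeLogLikelihood(double[][] xs, int[] ys, double[][] b, double ridge) {
        double nll = 0.0;
        for (int i = 0; i < xs.length; i++) {
            nll -= Math.log(classProbability(xs[i], b, ys[i]));
        }
        double penalty = 0.0;
        for (double[] row : b) {
            for (double v : row) {
                penalty += v * v;
            }
        }
        return nll + ridge * penalty;
    }
}
```

    A larger ridge value increases the cost of large coefficients, which pulls the minimiser of L towards zero.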

    To find the matrix B that minimises L, a quasi-Newton method is used to search for the optimal values of the m*(k-1) variables. Note that before the optimization procedure is applied, the matrix B is 'squeezed' into an m*(k-1) vector. For details of the optimization procedure, see the weka.core.Optimization class.
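
    The 'squeezing' step can be pictured as flattening the matrix into a single array that a general-purpose optimizer can work on. The sketch below is an assumption about the layout (one class column after another); the ordering used internally is not stated here, and the class and method names are hypothetical.

```java
/** Illustrative sketch only; the column-by-column layout is an assumption. */
final class SqueezeSketch {

    /** Flattens an m x (k-1) matrix into a vector of length m*(k-1). */
    static double[] squeeze(double[][] b) {
        int m = b.length;
        int kMinus1 = b[0].length;
        double[] v = new double[m * kMinus1];
        for (int j = 0; j < kMinus1; j++) {
            for (int a = 0; a < m; a++) {
                v[j * m + a] = b[a][j]; // class j occupies positions j*m .. j*m + m - 1
            }
        }
        return v;
    }

    /** Inverse of squeeze: rebuilds the m x (k-1) matrix from the vector. */
    static double[][] unsqueeze(double[] v, int m, int kMinus1) {
        double[][] b = new double[m][kMinus1];
        for (int j = 0; j < kMinus1; j++) {
            for (int a = 0; a < m; a++) {
                b[a][j] = v[j * m + a];
            }
        }
        return b;
    }
}
```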

    Although the original logistic regression does not deal with instance weights, the algorithm is modified slightly to handle them.
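
    How the weights enter the computation is not spelled out here. One common approach, shown below purely as an assumption, is to multiply each instance's log-likelihood contribution by its weight, so that an instance with weight 2 counts the same as two identical instances with weight 1. The class and method names are hypothetical.

```java
/** Illustrative sketch only; the weighting scheme shown is an assumption. */
final class WeightedNllSketch {

    /**
     * probs[i][c] is the predicted probability of class c for instance i,
     * labels[i] is the observed class, and weights[i] is the instance weight.
     * Returns the weighted negative log-likelihood (without the ridge term).
     */
    static double weightedNll(double[][] probs, int[] labels, double[] weights) {
        double nll = 0.0;
        for (int i = 0; i < probs.length; i++) {
            nll -= weights[i] * Math.log(probs[i][labels[i]]);
        }
        return nll;
    }
}
```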

    For more information see:

    le Cessie, S., van Houwelingen, J.C. (1992). Ridge Estimators in Logistic Regression. Applied Statistics. 41(1):191-201.

    Note: Missing values are replaced using a ReplaceMissingValuesFilter, and nominal attributes are transformed into numeric attributes using a NominalToBinaryFilter.
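
    To give a feel for what these two preprocessing steps do, the sketch below replaces missing numeric values with the mean of the observed values and turns a nominal value into 0/1 indicator attributes. It is a simplified stand-in written for illustration: NaN is assumed to mark a missing value, the class and method names are hypothetical, and the actual filters may behave differently in details such as the handling of two-valued nominal attributes.

```java
/** Illustrative sketch only; not the Weka filters themselves. */
final class PreprocessSketch {

    /** Replaces NaN entries (standing in for missing values) with the mean of the observed entries. */
    static double[] replaceMissingWithMean(double[] column) {
        double sum = 0.0;
        int count = 0;
        for (double v : column) {
            if (!Double.isNaN(v)) {
                sum += v;
                count++;
            }
        }
        double mean = count > 0 ? sum / count : 0.0;
        double[] out = column.clone();
        for (int i = 0; i < out.length; i++) {
            if (Double.isNaN(out[i])) {
                out[i] = mean;
            }
        }
        return out;
    }

    /** Encodes a nominal value (index in 0..numValues-1) as numValues 0/1 indicator attributes. */
    static double[] toBinaryIndicators(int valueIndex, int numValues) {
        double[] out = new double[numValues];
        out[valueIndex] = 1.0;
        return out;
    }
}
```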

    BibTeX:

     @article{leCessie1992,
        author = {le Cessie, S. and van Houwelingen, J.C.},
        journal = {Applied Statistics},
        number = {1},
        pages = {191-201},
        title = {Ridge Estimators in Logistic Regression},
        volume = {41},
        year = {1992}
     }
     

    Valid options are:

     -D
      Turn on debugging output.
     -R <ridge>
      Set the ridge in the log-likelihood.
     -M <number>
      Set the maximum number of iterations (default -1, until convergence).
    Version:
    $Revision: 5523 $
    Author:
    Xin Xu (xx5@cs.waikato.ac.nz)
    See Also:
    Serialized Form
    • Constructor Summary

      Constructors 
      Constructor Description
      Logistic()  
    • Method Summary

      Modifier and Type Method Description
      void buildClassifier(Instances train)
      Builds the classifier
      double[][] coefficients()
      Returns the coefficients for this logistic model.
      java.lang.String debugTipText()
      Returns the tip text for this property
      double[] distributionForInstance(Instance instance)
      Computes the distribution for a given instance
      Capabilities getCapabilities()
      Returns default capabilities of the classifier.
      boolean getDebug()
      Gets whether debugging output will be printed.
      int getMaxIts()
      Gets the maximum number of iterations (-1 means iterate until convergence).
      java.lang.String[] getOptions()
      Gets the current settings of the classifier.
      java.lang.String getRevision()
      Returns the revision string.
      double getRidge()
      Gets the ridge in the log-likelihood.
      TechnicalInformation getTechnicalInformation()
      Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
      java.lang.String globalInfo()
      Returns a string describing this classifier
      java.util.Enumeration listOptions()
      Returns an enumeration describing the available options
      static void main(java.lang.String[] argv)
      Main method for testing this class.
      java.lang.String maxItsTipText()
      Returns the tip text for this property
      java.lang.String ridgeTipText()
      Returns the tip text for this property
      void setDebug(boolean debug)
      Sets whether debugging output will be printed.
      void setMaxIts(int newMaxIts)
      Sets the maximum number of iterations (-1 means iterate until convergence).
      void setOptions(java.lang.String[] options)
      Parses a given list of options.
      void setRidge(double ridge)
      Sets the ridge in the log-likelihood.
      java.lang.String toString()
      Gets a string describing the classifier.
      • Methods inherited from class java.lang.Object

        equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Constructor Detail

      • Logistic

        public Logistic()
    • Method Detail

      • globalInfo

        public java.lang.String globalInfo()
        Returns a string describing this classifier
        Returns:
        a description of the classifier suitable for displaying in the explorer/experimenter gui
      • getTechnicalInformation

        public TechnicalInformation getTechnicalInformation()
        Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
        Specified by:
        getTechnicalInformation in interface TechnicalInformationHandler
        Returns:
        the technical information about this class
      • listOptions

        public java.util.Enumeration listOptions()
        Returns an enumeration describing the available options
        Specified by:
        listOptions in interface OptionHandler
        Overrides:
        listOptions in class Classifier
        Returns:
        an enumeration of all the available options
      • setOptions

        public void setOptions(java.lang.String[] options)
                        throws java.lang.Exception
        Parses a given list of options.

        Valid options are:

         -D
          Turn on debugging output.
         -R <ridge>
          Set the ridge in the log-likelihood.
         -M <number>
          Set the maximum number of iterations (default -1, until convergence).
        Specified by:
        setOptions in interface OptionHandler
        Overrides:
        setOptions in class Classifier
        Parameters:
        options - the list of options as an array of strings
        Throws:
        java.lang.Exception - if an option is not supported
      • getOptions

        public java.lang.String[] getOptions()
        Gets the current settings of the classifier.
        Specified by:
        getOptions in interface OptionHandler
        Overrides:
        getOptions in class Classifier
        Returns:
        an array of strings suitable for passing to setOptions
      • debugTipText

        public java.lang.String debugTipText()
        Returns the tip text for this property
        Overrides:
        debugTipText in class Classifier
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setDebug

        public void setDebug(boolean debug)
        Sets whether debugging output will be printed.
        Overrides:
        setDebug in class Classifier
        Parameters:
        debug - true if debugging output should be printed
      • getDebug

        public boolean getDebug()
        Gets whether debugging output will be printed.
        Overrides:
        getDebug in class Classifier
        Returns:
        true if debugging output will be printed
      • ridgeTipText

        public java.lang.String ridgeTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setRidge

        public void setRidge(double ridge)
        Sets the ridge in the log-likelihood.
        Parameters:
        ridge - the ridge
      • getRidge

        public double getRidge()
        Gets the ridge in the log-likelihood.
        Returns:
        the ridge
      • maxItsTipText

        public java.lang.String maxItsTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getMaxIts

        public int getMaxIts()
        Gets the maximum number of iterations to perform (-1 means iterate until convergence).
        Returns:
        the maximum number of iterations
      • setMaxIts

        public void setMaxIts(int newMaxIts)
        Sets the maximum number of iterations to perform (-1 means iterate until convergence).
        Parameters:
        newMaxIts - the maximum number of iterations
      • buildClassifier

        public void buildClassifier(Instances train)
                             throws java.lang.Exception
        Builds the classifier
        Specified by:
        buildClassifier in class Classifier
        Parameters:
        train - the training data to be used for generating the logistic regression model.
        Throws:
        java.lang.Exception - if the classifier could not be built successfully
      • distributionForInstance

        public double[] distributionForInstance(Instance instance)
                                         throws java.lang.Exception
        Computes the distribution for a given instance
        Overrides:
        distributionForInstance in class Classifier
        Parameters:
        instance - the instance for which distribution is computed
        Returns:
        the distribution
        Throws:
        java.lang.Exception - if the distribution can't be computed successfully
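
        A distribution of this kind is commonly turned into a single prediction by picking the class with the highest probability. The helper below is a minimal, hypothetical sketch of that step; it operates on a plain double[] and is not part of this class.

```java
/** Illustrative sketch only; a hypothetical helper, not part of the Logistic class. */
final class ArgMaxSketch {

    /** Returns the index of the largest entry; ties resolve to the lowest index. */
    static int mostLikelyClass(double[] distribution) {
        int best = 0;
        for (int c = 1; c < distribution.length; c++) {
            if (distribution[c] > distribution[best]) {
                best = c;
            }
        }
        return best;
    }
}
```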
      • coefficients

        public double[][] coefficients()
        Returns the coefficients for this logistic model. The first dimension indexes the attributes, and the second the classes.
        Returns:
        the coefficients for this logistic model
      • toString

        public java.lang.String toString()
        Gets a string describing the classifier.
        Overrides:
        toString in class java.lang.Object
        Returns:
        a string describing the classifier built.
      • main

        public static void main(java.lang.String[] argv)
        Main method for testing this class.
        Parameters:
        argv - should contain the command line arguments to the scheme (see Evaluation)