concept LDA in category R


This is an excerpt from Manning's book Machine Learning with R, the tidyverse, and mlr.

Figure 5.3. Learning a discriminant function in two dimensions. LDA learns a new axis such that, when the data is projected onto it (dashed lines), it maximizes the difference between class means while minimizing intra-class variance. x̄ and s² are the mean and variance of each class along the new axis, respectively.
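
To make the projection concrete, here is a minimal sketch (not from the book) that fits an LDA model with the MASS package on the built-in iris data and extracts the discriminant scores, that is, the data projected onto the new axes:

    library(MASS)   # provides lda()
    
    # Fit LDA and project the data onto the learned discriminant axes
    ldaModel <- lda(Species ~ ., data = iris)
    
    ldaScores <- predict(ldaModel)$x   # one column per discriminant function (LD1, LD2)
    head(ldaScores)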

LDA performs well if the data within each class is normally distributed across all the predictor variables, and the classes have similar covariances. Covariance simply means how much one variable increases/decreases when another variable increases/decreases. So LDA assumes that for each class in the dataset, the predictor variables covary with each other the same amount.
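
One rough way to eyeball this assumption is to compute the covariance matrix of the predictors separately for each class and compare them. A minimal sketch, again using the built-in iris data as a stand-in for your own dataset:

    library(dplyr)
    
    # One covariance matrix per class; if these differ markedly,
    # LDA's equal-covariance assumption is questionable
    iris %>%
      group_by(Species) %>%
      group_map(~ cov(.x))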

This often isn’t the case, and classes have different covariances. In this situation, QDA tends to perform better than LDA because it doesn’t make this assumption (though it still assumes the data is normally distributed). Instead of learning straight lines that separate the classes, QDA learns curved lines. It is also well suited, therefore, to situations in which classes are best separated by a nonlinear decision boundary. This is illustrated in figure 5.7.
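
When it isn't obvious which assumption holds, a pragmatic option is to cross-validate both learners and compare them. Here is a sketch using mlr's classif.lda and classif.qda learners, assuming the wineTib tibble created earlier in the chapter:

    library(mlr)
    
    # Define the task once, then cross-validate LDA and QDA on it
    wineTask <- makeClassifTask(data = wineTib, target = "Class")
    kFold <- makeResampleDesc("RepCV", folds = 10, reps = 5)
    
    ldaCV <- resample("classif.lda", wineTask, resampling = kFold)
    qdaCV <- resample("classif.qda", wineTask, resampling = kFold)
    
    ldaCV$aggr   # mean misclassification error for LDA
    qdaCV$aggr   # mean misclassification error for QDA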

Figure 5.7. Examples of two classes that have equal covariances (the relationship between variables 1 and 2 is the same for both classes) and different covariances. Ovals represent the distribution of data within each class. Quadratic and linear discriminant functions (QDF and LDF) are shown, along with the projection of the classes with different covariances onto each DF.

In the example on the left in the figure, the two classes are normally distributed across both variables and have equal covariances. We can see that the covariances are equal because, for both classes, as variable 1 increases, variable 2 decreases by the same amount. In this situation, LDA and QDA will find similar DFs, although LDA is slightly less prone to overfitting than QDA because it is less flexible.

5.3. Strengths and weaknesses of LDA and QDA

While it often isn’t easy to tell in advance which algorithms will perform well for a given task, here are some strengths and weaknesses that will help you decide whether LDA and QDA are a good fit for your problem.

  • Use the discriminant scores from the LDA as predictors in a kNN model:
    # LOAD PACKAGES ----
    library(mlr)         # task creation, tuning, resampling, and training
    library(tidyverse)   # mutate(), select(), and ggplot2's theme_bw()
    
    # CREATE TASK ----
    # wineTib (the wine tibble) and ldaPreds (the LDA discriminant scores)
    # were both created earlier in the chapter
    wineDiscr <- wineTib %>%
      mutate(LD1 = ldaPreds[, 1], LD2 = ldaPreds[, 2]) %>%
      select(Class, LD1, LD2)
    
    wineDiscrTask <- makeClassifTask(data = wineDiscr, target = "Class")
    
    # TUNE K ----
    knnParamSpace <- makeParamSet(makeDiscreteParam("k", values = 1:10))   # candidate values of k
    gridSearch <- makeTuneControlGrid()                                    # try every candidate
    cvForTuning <- makeResampleDesc("RepCV", folds = 10, reps = 20)        # 10-fold CV, repeated 20 times
    tunedK <- tuneParams("classif.knn", task = wineDiscrTask,
                         resampling = cvForTuning,
                         par.set = knnParamSpace,
                         control = gridSearch)
    
    knnTuningData <- generateHyperParsEffectData(tunedK)
    plotHyperParsEffect(knnTuningData, x = "k", y = "mmce.test.mean",
                        plot.type = "line") +
        theme_bw()
    
    # CROSS-VALIDATE MODEL-BUILDING PROCESS ----
    inner <- makeResampleDesc("CV")               # inner loop: tunes k (10 folds by default)
    outer <- makeResampleDesc("CV", iters = 10)   # outer loop: estimates performance
    knnWrapper <- makeTuneWrapper("classif.knn", resampling = inner,
                                  par.set = knnParamSpace,
                                  control = gridSearch)
    
    cvWithTuning <- resample(knnWrapper, wineDiscrTask, resampling = outer)
    cvWithTuning
    
    # TRAINING FINAL MODEL WITH TUNED K ----
    tunedKnn <- setHyperPars(makeLearner("classif.knn"), par.vals = tunedK$x)
    
    tunedKnnModel <- train(tunedKnn, wineDiscrTask)
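    
    # PREDICTING NEW DATA (hypothetical sketch, not from the book) ----
    # newWineDiscr stands in for a new data frame of discriminant scores
    # with the same LD1 and LD2 columns the model was trained on
    newPredictions <- predict(tunedKnnModel, newdata = newWineDiscr)
    
    getPredictionResponse(newPredictions)   # extract the predicted Class labels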