evaluation/

directory
v0.0.0-...-361c87b Latest
Published: Jan 4, 2017 License: Apache-2.0

README

Evaluation

Being able to properly evaluate a model is essential. Without evaluation, the model development process is just guesswork. Evaluation helps us find the model that best represents our data and gives us an idea of how well the chosen model will work in the future. A wide variety of evaluation metrics has been developed, and not every metric is relevant to every model. We will explore a sampling of these metrics/scores here, but, as data scientists, we need to evaluate the metrics/scores we are using on a case-by-case basis.

Notes

  • "When building prediction models, the primary goal should be to make a model that most accurately predicts the desired target value for new data. The measure of model error that is used should be one that achieves this goal." from here
  • Let's say we are predicting if people have or don't have a disease (from here):
    • true positives (TP): These are cases in which we predicted yes (they have the disease), and they do have the disease.
    • true negatives (TN): We predicted no, and they don't have the disease.
    • false positives (FP): We predicted yes, but they don't actually have the disease. (Also known as a "Type I error.")
    • false negatives (FN): We predicted no, but they actually do have the disease. (Also known as a "Type II error.")
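
The four outcomes above can be tallied directly from paired labels. The following is a minimal sketch, not one of the programs in this directory: the `confusion` type, the `countConfusion` function, and the sample labels are hypothetical names and data made up for illustration, with `true` standing for "has the disease".

```go
package main

import "fmt"

// confusion holds the four counts for a binary classifier.
type confusion struct {
	TP, TN, FP, FN int
}

// countConfusion compares observed and predicted labels pairwise.
// true is the positive class ("has the disease").
func countConfusion(observed, predicted []bool) (confusion, error) {
	if len(observed) != len(predicted) {
		return confusion{}, fmt.Errorf("length mismatch: %d observed, %d predicted",
			len(observed), len(predicted))
	}
	var c confusion
	for i := range observed {
		switch {
		case predicted[i] && observed[i]:
			c.TP++ // predicted yes, actually yes
		case !predicted[i] && !observed[i]:
			c.TN++ // predicted no, actually no
		case predicted[i] && !observed[i]:
			c.FP++ // predicted yes, actually no (Type I error)
		default:
			c.FN++ // predicted no, actually yes (Type II error)
		}
	}
	return c, nil
}

func main() {
	// Made-up labels for illustration only.
	observed := []bool{true, true, false, false, true, false}
	predicted := []bool{true, false, false, true, true, false}

	c, err := countConfusion(observed, predicted)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("TP=%d TN=%d FP=%d FN=%d\n", c.TP, c.TN, c.FP, c.FN)
}
```

Keeping the four counts in one struct makes it easy to derive accuracy, precision, recall, and similar scores from a single pass over the data.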

Comparison of Evaluation Measures - Wikipedia
Accurately Measuring Model Prediction Errors
Understanding the Bias-Variance Tradeoff
Simple guide to confusion matrix terminology

Code Review

Calculate R2 (Coefficient of Determination)
Calculate Mean Absolute Error
Calculate Accuracy
Calculate Precision
Calculate Recall
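
For orientation before reading the linked programs, here is a standard-library-only sketch of the five metrics listed above. It is not the code from those programs, which may read CSV files or use other packages. The function names and sample values are assumptions made for illustration, and the functions assume equal-length, non-empty inputs.

```go
package main

import (
	"fmt"
	"math"
)

// rSquared returns 1 - SS_res/SS_tot for paired observed and predicted values.
// It assumes the observed values are not all identical (SS_tot != 0).
func rSquared(observed, predicted []float64) float64 {
	var mean float64
	for _, o := range observed {
		mean += o
	}
	mean /= float64(len(observed))

	var ssRes, ssTot float64
	for i, o := range observed {
		ssRes += (o - predicted[i]) * (o - predicted[i])
		ssTot += (o - mean) * (o - mean)
	}
	return 1 - ssRes/ssTot
}

// meanAbsoluteError averages |observed - predicted|.
func meanAbsoluteError(observed, predicted []float64) float64 {
	var sum float64
	for i, o := range observed {
		sum += math.Abs(o - predicted[i])
	}
	return sum / float64(len(observed))
}

// accuracy is the share of all predictions that were correct.
func accuracy(tp, tn, fp, fn int) float64 {
	return float64(tp+tn) / float64(tp+tn+fp+fn)
}

// precision is the share of positive predictions that were correct.
// The result is NaN when there are no positive predictions.
func precision(tp, fp int) float64 {
	return float64(tp) / float64(tp+fp)
}

// recall is the share of actual positives that were predicted positive.
// The result is NaN when there are no actual positives.
func recall(tp, fn int) float64 {
	return float64(tp) / float64(tp+fn)
}

func main() {
	// Made-up values for illustration only.
	observed := []float64{1, 2, 3, 4}
	predicted := []float64{1.5, 2, 2.5, 4}
	fmt.Printf("R^2 = %.3f\n", rSquared(observed, predicted))
	fmt.Printf("MAE = %.3f\n", meanAbsoluteError(observed, predicted))

	// Made-up confusion counts: TP=3, TN=4, FP=1, FN=2.
	fmt.Printf("accuracy  = %.3f\n", accuracy(3, 4, 1, 2))
	fmt.Printf("precision = %.3f\n", precision(3, 1))
	fmt.Printf("recall    = %.3f\n", recall(3, 2))
}
```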

Exercises

Exercise 1

For the labeled.csv results, implement and calculate the evaluation metric called "specificity" as defined here. Think about when we might want to use this as compared with accuracy, precision, or recall.

Template | Answer
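
If you want something to compare your attempt against, the sketch below shows one possible shape for the calculation, using the common definition of specificity as TN / (TN + FP). It is not the linked answer. The layout of labeled.csv is not shown here, so the sketch works on in-memory labels, and the `specificity` function name and sample data are hypothetical.

```go
package main

import "fmt"

// specificity returns TN / (TN + FP): the share of actual negatives
// that were also predicted negative. It assumes equal-length slices
// containing at least one actual negative.
func specificity(observed, predicted []bool) float64 {
	var tn, fp int
	for i, o := range observed {
		if o {
			continue // actual positives do not affect specificity
		}
		if predicted[i] {
			fp++
		} else {
			tn++
		}
	}
	return float64(tn) / float64(tn+fp)
}

func main() {
	// Made-up labels for illustration only.
	observed := []bool{true, false, false, false, false}
	predicted := []bool{true, false, true, false, false}
	fmt.Printf("specificity = %.2f\n", specificity(observed, predicted))
}
```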

Exercise 2

For the continuous.csv results, implement and calculate the evaluation metric called "mean squared error" as defined here. What advantages or disadvantages might this metric have as compared to mean absolute error?

Template | Answer
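
As a point of comparison for your own attempt, here is a minimal sketch of mean squared error next to mean absolute error. It is not the linked answer. The layout of continuous.csv is not shown here, so the sketch uses in-memory values, and the `meanSquaredError` function name and sample data are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// meanSquaredError averages the squared differences between observed and
// predicted values. It assumes both slices have the same non-zero length.
func meanSquaredError(observed, predicted []float64) float64 {
	var sum float64
	for i, o := range observed {
		diff := o - predicted[i]
		sum += diff * diff
	}
	return sum / float64(len(observed))
}

func main() {
	// Made-up values: three exact predictions and one that is off by 4.
	observed := []float64{1, 2, 3, 4}
	predicted := []float64{1, 2, 3, 8}

	// Mean absolute error over the same data, for comparison.
	var absSum float64
	for i, o := range observed {
		absSum += math.Abs(o - predicted[i])
	}

	fmt.Printf("MSE = %.2f\n", meanSquaredError(observed, predicted))
	fmt.Printf("MAE = %.2f\n", absSum/float64(len(observed)))
}
```

With this data the single error of 4 contributes 16 to the squared sum, so the mean squared error is 4 while the mean absolute error is 1. Squaring weights large errors more heavily, which is worth keeping in mind when answering the question above.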


All material is licensed under the Apache License Version 2.0, January 2004.

Directories

Path Synopsis
Sample program to calculate an R^2 value.
Sample program to calculate a mean absolute error.
Sample program to calculate accuracy.
Sample program to calculate precision.
Sample program to calculate recall.
exercises
  exercise1: Sample program to calculate specificity.
  exercise2: Sample program to calculate a mean squared error.
  template1: Sample program to calculate specificity.
  template2: Sample program to calculate a mean squared error.
