
Model Categories

Machine Learning models use mathematical concepts, structures, and algorithms to produce analytic and predictive results. The table and diagrams below provide a framework for understanding the types and varieties of these models along several grouping dimensions.

In the table, x indicates usage and X indicates the most common usage:

Screen Shot 2018-05-14 at 9.32.34 AM.png

Base Models

Base Models are characterized by their fundamentally different approaches to Machine Learning and fall into two groups: graphs and relationships.

Choosing a model can depend on factors such as:

application - type, applicable models
data - magnitude, type
models - white/black box, flexibility, fit to application and data, model capabilities and extensions
libraries - availability, ease of use, programming languages supported

Graphs

These models are based on data graphs.

Artificial Neural Networks

Neural Network.jpg — data flows through graph nodes
ability to handle large data
very flexible architecture and application
black box model
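
The idea of data flowing through graph nodes can be illustrated with a minimal forward pass. This is only a sketch: the layer sizes, the weights, the sigmoid activation, and the helper names `layer_forward` and `forward` are all assumptions chosen for illustration, not part of any particular library.

```python
import math

def sigmoid(x):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """Compute one layer: each node takes a weighted sum of all inputs,
    adds its bias, and applies the sigmoid activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Pass the data through each layer in turn (hypothetical helper)."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Made-up weights for illustration: 2 inputs -> 2 hidden nodes -> 1 output node
layers = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.0, -0.1]),
    ([[1.2, -0.7]], [0.05]),
]
output = forward([1.0, 0.0], layers)
print(output)
```

The individual weights carry no obvious meaning on their own, which is one reason these models are usually described as black boxes.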

Decision Trees

Decision Tree.jpg — tree branching movement through graph nodes
uses decision, chance and end nodes
white box model
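
A small sketch can show why a tree is considered a white box: every prediction comes with the list of decisions that produced it. The tree below, its feature names, and its thresholds are hypothetical, and only decision and end nodes are modeled (chance nodes are left out for brevity).

```python
# Hypothetical tree: decision nodes test a feature, end nodes hold a label.
# "below" means value < threshold; "above" means value >= threshold.
tree = {
    "feature": "temperature",
    "threshold": 20.0,
    "below": {"label": "stay in"},
    "above": {
        "feature": "wind",
        "threshold": 30.0,
        "below": {"label": "go out"},
        "above": {"label": "stay in"},
    },
}

def predict(node, sample, path=None):
    """Walk from the root to an end node, recording each decision taken."""
    path = [] if path is None else path
    if "label" in node:  # end node reached
        return node["label"], path
    value = sample[node["feature"]]
    branch = "below" if value < node["threshold"] else "above"
    path.append((node["feature"], value, branch, node["threshold"]))
    return predict(node[branch], sample, path)

label, path = predict(tree, {"temperature": 25.0, "wind": 10.0})
print(label)
for step in path:
    print(step)
```

The recorded path is the explanation: it lists which feature was tested at each node and which branch was followed.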

Probabilistic Graphical Models

Probabilistic Graphical Network.jpg — probabilistic movement between graph nodes
conditional dependencies between random variables
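
Conditional dependencies between random variables can be sketched with a three-node network in which Rain and Sprinkler both influence WetGrass. The variables and every probability below are invented for illustration; the point is only that the joint distribution factors along the graph edges.

```python
from itertools import product

# Made-up probabilities for a hypothetical network: Rain -> WetGrass <- Sprinkler
p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: 0.1, False: 0.9}
# P(wet = True | rain, sprinkler)
p_wet_given = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}

def joint(rain, sprinkler, wet):
    """P(rain, sprinkler, wet) = P(rain) * P(sprinkler) * P(wet | rain, sprinkler)."""
    p_wet_true = p_wet_given[(rain, sprinkler)]
    return p_rain[rain] * p_sprinkler[sprinkler] * (p_wet_true if wet else 1.0 - p_wet_true)

# Marginal P(wet): sum the joint over the unobserved parent variables.
p_wet = sum(joint(r, s, True) for r, s in product([True, False], repeat=2))

# Posterior P(rain | wet), obtained by conditioning on the observation.
p_rain_given_wet = sum(joint(True, s, True) for s in [True, False]) / p_wet

print(p_wet, p_rain_given_wet)
```

Observing wet grass raises the probability of rain above its prior, which is the kind of probabilistic movement between nodes the model describes.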

Relationships

These models are based on data relationships (correlation & dependence).
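
As a rough illustration of what a data relationship means here, the sketch below computes the Pearson correlation coefficient between two variables. The function name `pearson` is hypothetical, and the code assumes both inputs have the same length and are not constant.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations.
    Assumes equal-length, non-constant inputs (otherwise division by zero)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative values only
rising = pearson([1, 2, 3, 4], [2, 4, 6, 8])
falling = pearson([1, 2, 3, 4], [8, 6, 4, 2])
print(rising, falling)
```

A value near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship.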

Cluster Analysis

Gaussian Processes

Regression Analysis

Model Comparisons

Which type of model to use for a given application can depend on a number of factors. For example, consider the diagram below, which relates selected comparison factors to aspects of modeling:

Accuracy - how close a model's predicted values are to the desired values
Big Data - how well a model performs using large sets of training data
Fitting Efficiency - how efficiently a model handles the Bias-Variance Tradeoff
Image Recognition - how well a model performs image recognition
Interpretability - how easy it is to understand how input values relate to predicted values
Memory Efficiency - how efficient a model is in using memory
Natural Language Processing - how well a model performs natural language processing
Parallel Processing Utility - how well a model leverages parallel processing capabilities
Pattern Recognition - how well a model performs pattern recognition
Performance - how well a model performs using Confusion Matrix measures
Prediction Efficiency - how efficient a model is in prediction processes
Simplicity - a measure of how easy it is to understand a model and its processing
Small Data - how well a model performs using small sets of training data
Training Efficiency - how efficient a model is in training processes
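
For the Performance factor, the following sketch shows how a few common Confusion Matrix measures can be derived for a binary classifier. The helper names and the sample labels are made up for illustration, and returning 0.0 for an undefined ratio is an arbitrary choice.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def measures(tp, fp, tn, fn):
    """Derive accuracy, precision, and recall; 0.0 when a ratio is undefined."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Made-up labels for illustration
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_counts(actual, predicted)
result = measures(*counts)
print(counts, result)
```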

A spreadsheet can be used to compare weighted and unweighted factors across models, as shown in the example below.

To download the spreadsheet for customization, click here.
The factors chosen are based on observations from a variety of articles comparing Machine Learning Models.
The models, factors and their associated weights are examples that should be modified based on the application for which comparisons are being made.
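
The same weighted and unweighted comparison can also be sketched in code. The ratings (on an assumed 1 to 5 scale) and the weights below are placeholders invented for this example, not values taken from the spreadsheet, and should be replaced to suit the application.

```python
# Hypothetical ratings per model and factor (1 = poor, 5 = excellent)
ratings = {
    "Artificial Neural Networks": {"Accuracy": 5, "Interpretability": 1, "Small Data": 2},
    "Decision Trees": {"Accuracy": 3, "Interpretability": 5, "Small Data": 4},
    "Regression Analysis": {"Accuracy": 3, "Interpretability": 4, "Small Data": 4},
}

# Hypothetical weights expressing how much each factor matters to the application
weights = {"Accuracy": 3, "Interpretability": 2, "Small Data": 1}

def unweighted_score(model_ratings):
    """Sum the ratings, treating every factor as equally important."""
    return sum(model_ratings.values())

def weighted_score(model_ratings, factor_weights):
    """Sum the ratings after multiplying each by its factor weight."""
    return sum(factor_weights[factor] * rating for factor, rating in model_ratings.items())

# Rank models from highest to lowest weighted score
ranking = sorted(ratings, key=lambda m: weighted_score(ratings[m], weights), reverse=True)

for model in ranking:
    print(model, unweighted_score(ratings[model]), weighted_score(ratings[model], weights))
```

Changing the weights can reorder the ranking, which is why the weights should reflect the application being evaluated.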