Model Categories

Machine Learning models use mathematical concepts, structures, and algorithms to produce analytic and predictive results. The table and diagrams below provide a framework for understanding the types and varieties of these models along a number of grouping dimensions.

In the table, x indicates usage and X indicates the most common usage.

Base Models

Base Models are characterized by their fundamentally different approaches to Machine Learning and fall into two groups: graphs and relationships.

Choosing a model can depend on factors such as:

  • application - type, applicable models

  • data - magnitude, type

  • models - white/black box, flexibility, fit to application and data, model capabilities and extensions

  • libraries - availability, ease of use, programming languages supported

Graphs

These models are based on data graphs.

Artificial Neural Networks 

  • data flows through graph nodes

  • ability to handle large data

  • very flexible architecture and application

  • black box model
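As a rough illustration of data flowing through graph nodes, the sketch below pushes an input through a tiny feed-forward network. The layer sizes, weights, and the choice of a sigmoid activation are assumptions made for the example, not values from any trained model.

```python
import math

def sigmoid(x):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense_layer(inputs, weights, biases):
    """Each node sums its weighted inputs, adds a bias, and applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Pass the activations through every layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations

# Hypothetical network: 2 inputs -> 2 hidden nodes -> 1 output node.
# The weights are made up for illustration only.
layers = [
    ([[0.5, -0.5], [0.25, 0.75]], [0.0, 0.0]),
    ([[1.0, -1.0]], [0.0]),
]
output = forward([1.0, 1.0], layers)
print(output)
```

The intermediate activations carry no obvious meaning on their own, which is one reason these networks are described as black box models.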

Decision Trees

  • tree branching movement through graph nodes

  • uses decision, chance and end nodes

  • white box model
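A minimal sketch of branching through a tree, assuming a hand-built classification tree with only decision nodes and end nodes (chance nodes are left out for brevity). The feature names, thresholds, and labels are hypothetical and were not fitted to any data.

```python
# Hypothetical hand-built tree: decision nodes test one feature,
# end nodes hold a label. Thresholds are illustrative, not fitted.
tree = {
    "feature": "petal_length", "threshold": 2.5,
    "left": {"label": "setosa"},
    "right": {
        "feature": "petal_width", "threshold": 1.8,
        "left": {"label": "versicolor"},
        "right": {"label": "virginica"},
    },
}

def predict(node, sample, path=None):
    """Walk from the root to an end node, recording each decision taken."""
    path = [] if path is None else path
    if "label" in node:  # end node reached
        return node["label"], path
    goes_left = sample[node["feature"]] <= node["threshold"]
    op = "<=" if goes_left else ">"
    path.append(f"{node['feature']} {op} {node['threshold']}")
    return predict(node["left"] if goes_left else node["right"], sample, path)

label, path = predict(tree, {"petal_length": 4.7, "petal_width": 1.4})
print(label, path)
```

Returning the path alongside the label shows why the model counts as white box: every prediction can be explained as a short list of comparisons.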

Probabilistic Graphical Networks

  • probabilistic movement between graph nodes

  • conditional dependencies between random variables
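To make the idea of conditional dependencies concrete, here is a small sketch of a two-node network, Rain -> WetGrass. The variables and all probabilities are invented for the example; the point is only how the joint distribution factorises along the edge and how evidence updates a belief.

```python
# Hypothetical two-node network: Rain -> WetGrass, with made-up probabilities.
p_rain = 0.2
p_wet_given = {True: 0.9, False: 0.1}  # P(WetGrass=True | Rain)

def joint(rain, wet):
    """P(Rain=rain, WetGrass=wet), factorised along the graph's single edge."""
    p_r = p_rain if rain else 1.0 - p_rain
    p_w = p_wet_given[rain] if wet else 1.0 - p_wet_given[rain]
    return p_r * p_w

def posterior_rain_given_wet():
    """P(Rain=True | WetGrass=True), by enumerating over the Rain variable."""
    evidence = joint(True, True) + joint(False, True)
    return joint(True, True) / evidence

p = posterior_rain_given_wet()
print(round(p, 4))
```

Observing wet grass raises the probability of rain well above its prior of 0.2, which is the kind of inference these networks are built for.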

Relationships

These models are based on data relationships (correlation & dependence).

Cluster Analysis

  • object groupings

  • objects in the same group are more similar to each other than those in other groups

  • cluster visualizations can provide white box analysis
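One common way to form such groupings is k-means, sketched below in plain Python. k-means is only one of many clustering methods, and the points and starting centroids here are made up for illustration.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

Plotting the points colored by cluster, with the centroids marked, is the sort of visualization that makes the result easy to inspect.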

Gaussian Processes

  • continuous function relationships

  • stochastic processes

  • collections of random variables that have a multivariate normal distribution
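The following is a minimal sketch of Gaussian process regression, assuming a zero prior mean, a squared-exponential (RBF) covariance function, and scalar inputs. The training values, the length scale, and the small noise term are assumptions chosen for the example.

```python
import math

def rbf(x1, x2, length_scale=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-((x1 - x2) ** 2) / (2.0 * length_scale ** 2))

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def gp_posterior_mean(x_train, y_train, x_new, noise=1e-8):
    """Zero-mean GP regression: mean = k_*^T (K + noise * I)^-1 y."""
    k = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    alpha = solve(k, y_train)
    return sum(rbf(x_new, a) * w for a, w in zip(x_train, alpha))

# Hypothetical training data.
x_train, y_train = [0.0, 1.0, 2.0], [0.0, 1.0, 0.0]
mean_at_train = gp_posterior_mean(x_train, y_train, 1.0)
mean_between = gp_posterior_mean(x_train, y_train, 0.5)
print(mean_at_train, mean_between)
```

With almost no noise, the posterior mean passes through the training points and interpolates smoothly between them; far from the data it falls back to the prior mean of zero.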

Regression Analysis

  • variable relationships

  • helps understand the change in dependent variables when independent variables change

  • regression visualizations can provide white box analysis
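As a small worked example, the sketch below fits a straight line to one independent and one dependent variable using ordinary least squares. The data values are invented so that the relationship is exactly linear.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

The slope is directly interpretable: it is the change in the dependent variable for a one-unit change in the independent variable.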

Model Comparisons

Choosing a type of model for a given application can involve a number of factors. For example, consider the diagram below, which relates selected comparison factors to aspects of modeling:

  • Accuracy - a measure of how close a model's predicted values are to the desired values

  • Big Data - how well a model performs using large sets of training data

  • Fitting Efficiency - how efficiently a model handles the Bias-Variance Tradeoff

  • Image Recognition - how well a model performs image recognition

  • Interpretability - how easy it is to understand how input values relate to predicted values

  • Memory Efficiency - how efficient a model is in using memory

  • Natural Language Processing - how well a model performs natural language processing

  • Parallel Processing Utility - how well a model leverages parallel processing capabilities

  • Pattern Recognition - how well a model performs pattern recognition

  • Performance - how well a model performs using Confusion Matrix measures

  • Prediction Efficiency - how efficient a model is in prediction processes

  • Simplicity - a measure of how easy it is to understand a model and its processing

  • Small Data - how well a model performs using small sets of training data

  • Training Efficiency - how efficient a model is in training processes
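To illustrate the Accuracy and Performance factors for a binary classifier, the sketch below derives a few standard Confusion Matrix measures from actual and predicted labels. The label lists are made up, and the choice of accuracy, precision, and recall is an assumption about which measures are of interest.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    return tp, tn, fp, fn

def measures(tp, tn, fp, fn):
    """Accuracy, precision, and recall from the four confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical labels for eight samples.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_counts(actual, predicted)
result = measures(*counts)
print(counts, result)
```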

A spreadsheet can be used to compare weighted and unweighted factors across models as shown in the example below.

  • The factors chosen are based on observations from a variety of articles comparing Machine Learning Models.

  • The models, factors and their associated weights are examples that should be modified based on the application for which comparisons are being made.
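The same weighted and unweighted comparison can be sketched in a few lines of code. The models, factors, weights, and 1 to 5 ratings below are placeholders invented for the example and should be replaced with values suited to the application; they are not taken from the spreadsheet.

```python
# Hypothetical factor weights and 1-5 ratings, for illustration only.
weights = {"Accuracy": 3, "Interpretability": 2, "Training Efficiency": 1}
ratings = {
    "Artificial Neural Network": {"Accuracy": 5, "Interpretability": 1, "Training Efficiency": 2},
    "Decision Tree": {"Accuracy": 3, "Interpretability": 5, "Training Efficiency": 4},
}

def unweighted_score(model_ratings):
    """Sum of the raw ratings across all factors."""
    return sum(model_ratings.values())

def weighted_score(model_ratings, weights):
    """Sum of each rating multiplied by the weight of its factor."""
    return sum(weights[factor] * rating for factor, rating in model_ratings.items())

scores = {
    model: (unweighted_score(r), weighted_score(r, weights))
    for model, r in ratings.items()
}
for model, (plain, weighted) in scores.items():
    print(f"{model}: unweighted={plain}, weighted={weighted}")
```

Changing the weights changes the ranking, which is why the weights need to reflect the priorities of the specific application.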