Glossary and Index
A
Abstraction - separates ideas from specific instances of those ideas
Accuracy - the degree of correctness of a quantity, expression or finding - in Machine Learning, it is the number of correct predictions (true positives plus true negatives) divided by the total number of predictions
Activation Function - used in Machine Learning model nodes to produce an output y based on an input x, a change in x activates the y value
Agile Processes - include various methodologies under which requirements and solutions evolve through the collaborative effort of cross-functional teams and their clients
AI Agents - also known as intelligent agents, are autonomous entities designed to: perceive their environment, process information, make decisions, and take actions to achieve specific goals
Algorithm Libraries - repositories for algorithm code storage and access
Analysis - process of performing functions such as probability and statistical analysis, business analysis, model accuracy analysis, prediction results analysis
Applications - applications use Machine Learning to address specific needs
Application Programming Interface (API) - APIs provide a mechanism for interaction between client applications and server systems
Array - a data structure consisting of a collection of elements, each identified with at least one array index or key
Artificial General Intelligence (AGI) - refers to systems that would be capable of reasoning, learning, and adapting to a wide range of tasks and environments, much like humans do
Artificial Narrow Intelligence (ANI) - also known as "Weak AI," refers to AI systems that are designed and trained to perform a specific task or a narrow range of tasks
Artificial Neural Networks - computing systems made of a network of connected graph nodes, the design of which is modeled on the biological neural networks of animal brains
Artificial Superintelligence (ASI) - refers to a hypothetical form of Artificial Intelligence that surpasses human cognitive abilities across all domains
Artificial Universal Intelligence (AUI) - refers to a future state in which superintelligent AI has not only surpassed human intelligence but has transformed society, technology, and even the nature of reality itself. In this era, AI continuously evolves, shaping all aspects of existence, from economics and governance to biology and consciousness.
Attention - mechanisms that let a Machine Learning model directly look at, and draw from, the state at any earlier point in, for example, a sentence
Attribute - defines a property of an object, element, or file
Automated Machine Learning - also referred to as AutoML, is the process of automating the end-to-end process of applying machine learning to real-world problems
A/B Testing - a method of testing and comparing two versions of a single variable
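To make the Accuracy entry above concrete, the following is a minimal sketch in Python. The function name and the sample labels are hypothetical and chosen only for illustration.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels, made up for illustration only.
labels = [1, 0, 1, 1, 0]
predictions = [1, 0, 0, 1, 0]
print(accuracy(labels, predictions))  # 4 of 5 predictions are correct: 0.8
```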
B
Backpropagation - the process of adjusting the weights in the layers of nodes of an Artificial Neural Network in a backward flow manner using activation function derivatives
Bayes Models - Mathematical Model based on probabilistic graphs
Best-first Search - an algorithm which explores a graph by serially expanding the most promising nodes
Bias - the amount by which a statistic systematically differs from the quantity being estimated
Bias-Variance Tradeoff - deals with the problem of minimizing bias and variance errors
Big O Notation - uses order of magnitude as a way to characterize the time or resources needed to solve a computing problem
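Binary Search - finds a value in a sorted collection by repeatedly halving the range being searched; in a binary search tree, the test for membership can be performed efficiently provided that the tree is reasonably balanced, that is, the leaves of the tree are at comparable depths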
Black Box - refers to Machine Learning models, such as Deep Learning Artificial Neural Networks, that produce predictions that cannot be traced specifically through model training and prediction processes
Block - section of code executed as a unit
Branch - sequence of statements triggered by a conditional
Business Model - describes how an organization creates, delivers, and captures value
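As an illustration of the Binary Search entry above, here is a minimal sketch of the sorted-list form of the technique, which halves the search range at each step. The function name is hypothetical, and the list is assumed to be sorted in ascending order.

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid - 1   # target can only be in the lower half
    return -1


print(binary_search([1, 3, 5, 7, 9, 11], 7))  # index 3
```

Because the range is halved on every pass, the number of comparisons grows with the logarithm of the list length, which ties in with the Big O Notation entry.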
C
Calculus - the study and mathematics of continuous change
Callback - pointer to a piece of executable code that is passed as an argument to other code, which is expected to call back (execute) the argument at some appropriate time
Central Limit Theorem - the sum (or mean) of a large number of independent random variables tends toward a normal distribution, even when the variables themselves are not normally distributed
Chief AI Officer - is an executive-level leader responsible for overseeing and guiding an organization’s Artificial Intelligence (AI) strategy, development, and implementation
Class Programming Construct - code template for creating objects; think of an architect's drawings
Classification - identifies to which of a set of categories a new observation belongs
Client-Server Architecture - hardware and software that connects front-end user facing devices with back-end operations and database access
Cloud Computing - offers Machine Learning cloud server based capabilities
Cluster - group of objects that has characteristics differing from other object clusters
Cluster Analysis - grouping a set of objects in a way that differentiates them from other groups
Coding - AI is on an exponential growth curve and the evolution of AI is accelerating; this growth includes automated coding and design capabilities such as OpenAI Codex
Coefficient of Determination - or r squared, yields a high number when observed data points are more closely replicated by a data model
Collaborative Filtering - filtering for information and patterns using elements such as multiple agents, viewpoints and data sources
Collection - an object made up of other objects
Columnar Databases - stores data tables by column rather than by row
Computing Systems - hardware and software that provides a platform for Machine Learning systems
Concatenation in Linear Algebra - the appending of vectors or matrices to form a new vector or matrix
Conditional - performs different actions depending on whether a condition is true or false
Confidence - as in a confidence interval, a type of interval estimate computed from the statistics of the observed data; it proposes a range of plausible values for an unknown parameter, and the interval has an associated confidence level that the true parameter is in the proposed range
Confusion Matrix - a table layout of measurement results that allows visualization of the performance of an algorithm
Constructor - a block of code that's executed when its class object is instantiated
Container/Collection - an object made up of other objects
Convolution - an operation on two functions to produce a third function; used in a class of deep, feed-forward Artificial Neural Networks typically used for image and facial recognition
Convolutional Neural Networks - are a class of Artificial Neural Networks that use convolution along with other processes; they have application in areas such as computer vision and natural language processing
Correlation - the degree to which two or more measurements tend to vary together
Cosine Similarity - is a measure of similarity between two non-zero vectors of an inner product space
Cross Decomposition - finds the correlation between two matrices
Cross Entropy Loss - measures the difference between two probability vectors
CSV Data - data organized by comma separated values
Curve Fitting - construction of a mathematical function that best fits a set of data points
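The Cosine Similarity and Cross Entropy Loss entries above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, vectors are plain lists of equal length, and the small `eps` floor is an added assumption to avoid taking the logarithm of zero.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def cross_entropy(p_true, p_pred, eps=1e-12):
    """Cross entropy between a true and a predicted probability vector."""
    # eps (an assumption of this sketch) guards against log(0).
    return -sum(t * math.log(max(q, eps)) for t, q in zip(p_true, p_pred))


print(cosine_similarity([1, 2, 3], [2, 4, 6]))     # parallel vectors, close to 1
print(cross_entropy([0, 1, 0], [0.1, 0.8, 0.1]))   # equals -log(0.8)
```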
D
Data - characteristics or information that are collected through human observation and automated processes
Data Analysis - process of performing functions such as probability and statistical analysis, business analysis, model accuracy analysis, prediction results analysis
Data Cleaning - detecting and correcting corrupt or inaccurate data
Data Discovery - also known as Business Intelligence (BI), is, in the case of Machine Learning, the process of identifying and understanding the data needed for specific ML applications
Data ETL - data Extract, Transform and Load in order to create a new data subset
Data Flow - a template for understanding and designing a Machine Learning sequence of data movement
Data Lake - a repository of data stored in raw format
Data Lakehouse - is a data management architecture that combines the flexibility and cost-effectiveness of a Data Lake with the structured data management capabilities of a traditional Data Warehouse
Data Management - processes of organizing, storing and manipulating data
Data Pipeline - captures data inputs, retains data for a period of time, and delivers data to subscribers such as databases
Data Reporting - involves collecting and displaying data in an effective manner
Data Visualization - visual representation of data
Data Visualization Diagrams - diagrams include bar charts, histograms, scatter plots, and networks
Data Warehouse - is a centralized repository designed to store and manage large volumes of historical data from multiple sources within an organization; it serves as a single source of truth for business intelligence (BI) and analytics activities, enabling organizations to derive valuable insights from their data to support informed decision-making
Database Tables - collection of related data in a structured row and column format
Decision Tree Algorithms - algorithm code that uses decision trees to solve a problem
Decision Trees - use tree-like graphs to model decisions and possible outcomes
Deep Learning - uses multiple layers of graph nodes to improve model results
Deep Reasoning - combines deep learning with reasoning for solving complex tasks
Deviation - a measure of the difference between the observed value of a variable and some other value, often that variable's mean
Dimensionality Reduction - reduces the number of variables (dimensions) used to represent data
Differential Calculus - study and mathematics of the rates at which quantities change
Differential Calculus Chain Rule - equations for calculating the derivatives of nested functions f(g(h(x) . . .))
Differential Calculus Rules - equations for calculating the derivatives of common functions
Diffusion Models - are a class of generative models that have gained attention in recent years, particularly for their ability to generate high-quality images, audio, and other data types
Discriminant Analysis - finds a combination of features that characterize or separate classes of objects or events
Document Database - is a computer program designed for storing, retrieving and managing document-oriented information, also known as semi-structured data
DOM (Document Object Model) - convention for representation of and interaction with objects in HTML documents
Dynamic Array - variable sized data structure that allows adding and deleting elements
Dynamic Programming - is an algorithmic technique used to solve complex problems by breaking them down into smaller subproblems, solving each subproblem only once, and storing the solutions to subproblems to avoid redundant computation
E
Eigenvalues and Eigenvectors - an eigenvector is a nonzero vector that only changes by a scalar factor when a linear transformation is applied to it; that scalar factor is the corresponding eigenvalue
Encapsulation - packing of data and functions into a single component
Ensemble Learning - uses multiple Machine Learning algorithms to obtain better performance than could be obtained using any single model
Entropy - in the context of information, it is the average amount of information produced by a stochastic source of data
Estimator - a rule for calculating an estimate of a given quantity based on observed data
Euler's Number - a Mathematical Constant very useful for simplifying calculus derivatives, e = lim (1 + 1/n)^n as n approaches infinity = 2.71828...
Exception - anomalous or failure event that may require special handling.
Exponential Growth - is a growth pattern in which a quantity increases at a rate proportional to its current value, so that over time it outpaces any linear growth
Expression - combination of elements that produces a result
Extrapolation - the process of estimating values beyond an original observation range
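The limit in the Euler's Number entry above can be checked numerically. The sketch below uses a hypothetical helper name and evaluates (1 + 1/n)^n for increasing n, comparing each value with Python's built-in constant `math.e`.

```python
import math


def euler_approx(n):
    """Approximate e with (1 + 1/n)**n; the value approaches e as n grows."""
    return (1 + 1 / n) ** n


for n in (1, 10, 1000, 1_000_000):
    approx = euler_approx(n)
    print(n, approx, abs(math.e - approx))  # the error shrinks as n grows
```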
F
Factor Analysis - method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables
Fairness - an algorithm is said to be fair if its results are independent of sensitive variables, such as gender, ethnicity, sexual orientation, or disability
False Positive and False Negative Errors - concepts analogous to type I and type II errors in statistical hypothesis testing, where a positive result corresponds to rejecting the null hypothesis, and a negative result corresponds to not rejecting the null hypothesis
Features - are individually measurable characteristics of a phenomenon being observed
Foundation Models - are a new paradigm in Artificial Intelligence that represent a shift from traditional task-specific models to more general and adaptable models trained on vast amounts of data.
Fourier Analysis - represents functions as sums of simpler trigonometric functions
Function - a function in mathematics is a relation between a set of inputs and a set of permissible outputs with the property that each input is related to exactly one output
Function Programming Construct - a function, or subroutine, is a sequence of program instructions that performs a specific task, packaged as a unit
Functional Groups - ways to organize into groups such as data mining, programming, analysis, reporting, modeling
F1 Score - the harmonic mean of precision and recall
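As a sketch of how the F1 Score entry above relates to precision and recall, the following computes all three from counts of true positives, false positives, and false negatives, using the common definition of F1 as the harmonic mean of precision and recall. The function name and the sample counts are hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 true positives, 2 false positives, 8 false negatives.
print(precision_recall_f1(8, 2, 8))  # precision 0.8, recall 0.5, F1 = 8/13
```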
G
Garbage Collection - reclaiming of unused memory
Gaussian Analysis - Gaussian process methods are supervised learning methods for solving regression and probabilistic classification problems that use lazy learning and a measure of the similarity between points to predict new point values
Generalized Linear Model - or GLM, is a generalization of ordinary linear regression that allows for response variables with error distributions other than a normal distribution
Generative Adversarial Networks - a type of Machine Learning in which two neural networks, a generator and a discriminator, are trained in competition with each other so that the generator learns to produce increasingly realistic data
Generative AI - refers to a class of artificial intelligence models and techniques that are capable of generating new, original content such as text, images, audio, code, and other data types.
Governance - addresses ensuring that AI technologies are developed and deployed in ways that are ethical, safe, and beneficial to society
Gradients - vectors of partial derivatives of a function in multi-dimensional spaces
Gradient Boosting - ensemble of weak prediction models, typically decision trees, that are used in a stage-wise manner to produce a more optimal solution to a differentiable loss function
Graph Activation Node - graph node containing an Activation Function
Graph Backpropagation Data Flow - used in Artificial Neural Networks to calculate data weights moving backward through network nodes
Graph Convolution Node - an Artificial Neural Network node that performs a mathematical operation on two functions to produce a third
Graph Databases - use graph structures for semantic queries with nodes, edges, and properties to represent and store data.
Graph Data Operations Flow - the movement of data between graph nodes
Graph Deep Learning Data Flow - the movement of data between deep learning Artificial Neural Network nodes
Graph Dropout Node - nodes in a graph that are randomly removed during training to reduce overfitting of the model to the training data
Graph Feed Forward Data Flow - the forward movement of data between nodes in a graph
Graph Input Node - a graph node that takes in data and feeds it to other nodes in the graph
Graph Matrix Operation Node - a graph node that performs Matrix Operations
Graph Memory Node - a graph node that stores a value for either long or short time periods
Graph Output Node - a graph node that outputs data from graph operations
Graph Pooling Node - a graph node that combines the output of other nodes
Graph Recurrent Data Flow - data flow in Recurrent Neural Networks that allows dynamic temporal time sequence behavior
Graph Terms - terms related to Graph discrete mathematics
Graphics Processing Unit - a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate processing; GPUs have been a key factor in the increasing power of Machine Learning and AI
Greedy Algorithms - algorithm code that makes the locally optimal choice at each stage
H
Hash - used to provide direct access to data based on a calculated key
Histogram of Oriented Gradients - count of occurrences of gradient orientation in a localized portion of an image
HTML iframe - a way to embed within a web page another web page, which may be hosted on a different server
HTTP Request - indicates the desired action to be performed on an identified resource. Actions include: GET, POST, HEAD, OPTIONS, PUT, DELETE, TRACE, PATCH and CONNECT
Hybrid Cloud Computing - refers to a Cloud Computing environment that combines on-premises private cloud resources with public cloud services from single or multiple third-party providers
I
Identifier - names for objects and entities
Image Processing - includes functions for filtering, interpolation, measurements and morphology
Implementation - of ML/AI in many ways is similar to the implementation of other previously new technologies, such as the Internet, various databases and application systems; however, ML/AI is different in some significant ways
Independent Events - the occurrence of one event does not affect the probability of the occurrence of the other
Individuals - can take steps to prepare for AI and leverage its potential; by staying proactive, embracing continuous learning, and leveraging AI tools in both personal and professional contexts, individuals can not only mitigate the disruptions caused by AI but also harness its potential for career growth and personal improvement
Inference - the process of using a Machine Learning model to determine a result based on model inputs
Information Theory - includes the quantification, storage and communication of information
Inheritance - obtaining the capabilities of another object or class
Inner Class - a class that is declared entirely within the body of another class or interface. An inner class cannot be instantiated without being bound to a top-level class
Instance - specific realization of an object, as in a house vs. the plans
Integral Calculus - mathematical functions that measure the magnitude of the combination of infinitesimal data such as the area under a curve
Integral Calculus Rules - equations for calculating the integrals of common functions
Internet Protocol Suite - is the networking model and protocols used on the Internet. The diagram below shows the flow of control and data for the most common protocol sequence, HTTP-TCP-IP-ARP
Interpolation - a method of constructing new data points within the range of a known set of data points
Isotonic Regression - is regression analysis that fits a non-decreasing free-form line as closely to a set of data points as possible
Iterator - an object that enables traversing a container, such as lists
J
JSON - human-readable text used to transmit data objects consisting of attribute–value pairs; often used to transmit data between a server and a web application
K
Kaggle - Data Science and Machine Learning website that includes competitions and datasets
Keyword - word with a special meaning
K-Means Clustering - partitioning observations into k clusters in which each observation belongs to the cluster with the nearest mean
L
Lambda Function - a simple, lightweight anonymous function definition not using the normal function syntax
Large Language Models - are a type of Artificial Intelligence that use Deep Learning techniques and massive datasets to understand, generate, and process human language in a highly sophisticated manner
Law of Large Numbers - results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed
Libraries - collections of programming code and/or data
Likelihood - is the probability that an event that has already occurred would yield a specific outcome
Linear Algebra - mathematics of linear equations and linear functions, such as those for lines and planes
Linear Equations - are equations in which each term is either a constant or the product of a coefficient and a single variable raised to the first power
Linear Regression - is a linear approach to modeling the relationship between a dependent variable and one or more independent variables
Linked List - a data structure consisting of a group of nodes which together represent a sequence
List - a sequence of values, where the same value may occur more than once
Literal - notation representing a fixed value
Logarithms - the logarithm of a number is the exponent to which another fixed number, the base, must be raised to produce that number
Logistic Regression - a regression model where the dependent variable is categorical
Long Short-term Memory - building blocks of Recurrent Neural Networks that remember values over time intervals
Loss (Cost) Function - maps an event or values of one or more variables onto a real number intuitively representing some "cost" associated with the event
M
Machine Learning - the use of computer algorithms that improve automatically through experience
MapReduce - an ETL process that uses a map-shuffle-reduce sequence to process large data sets using parallel and distributed processing
Markov Chains - a sequence of possible events in which the probability of each event depends only on the state attained in the previous event
Masking - a method of indicating which elements of a matrix or vector should and should not be used
Mathematical Symbols - symbols used to represent formulas, variables, constants and other mathematical concepts
Matrices - the representation and mathematics of rectangular arrays of numbers, symbols and expressions
Maximum - the largest value of a function
Mean Squared Error - measures the average of the squares of the errors, that is, of the differences between estimated values and what is estimated
Metaclass - a class whose instances are classes
Method - code that defines the run time behavior of objects
Minimum - the smallest value of a function
Mining - includes functions such as data ETL, data cleaning, data discovery, data normalization
Mixin - a class which contains a combination of methods from other classes without using inheritance
Model Alignment - refers to the process of ensuring that an artificial intelligence system's behavior aligns with human values, goals, and intentions
Model Self Improvement - refers to techniques that allow AI systems to enhance their own capabilities without direct human intervention.
Modeling Process - a multi-stage methodology for creating trained and tested ML models
Models - configurations of Machine Learning elements focused on particular types of problems
Moment - a specific quantitative measure of the shape of a set of points
Moore’s Law - is the observation that the number of transistors in a dense integrated circuit doubles about every two years
Mutually Exclusive Events - two events that cannot both be true or occur
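The Mean Squared Error entry above can be illustrated with a short sketch. The function name and the sample values are hypothetical; the inputs are assumed to be equal-length lists of numbers.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values: only the last prediction is off, by 2.
print(mean_squared_error([1, 2, 3], [1, 2, 5]))  # (0 + 0 + 4) / 3
```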
N
Naive Bayes - uses Bayes' theorem with strong independence assumptions between features
Nearest Neighbors - a class of functions and graphs for performing cluster analysis
Neural Network Algorithms - are used in the training of neural networks and neural network predictions processing
Nonparametric Statistics - statistical methods that do not assume a fixed parametric form for the data; the term is not meant to imply that such models completely lack parameters but that the number and nature of the parameters are flexible and not fixed in advance
Normal Distribution - is a probability distribution that is symmetric about its mean, with most values situated close to the mean and progressively fewer values farther from it
Normalization - adjusts data values to fit into a prescribed range
Null Hypothesis - is the default position that there is no relationship between two measured phenomena
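One common way to carry out the Normalization entry above is min-max scaling, sketched below. This is only one of several normalization techniques; the function name and default range of 0 to 1 are assumptions made for this illustration.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so they span the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("cannot normalize values that are all identical")
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in values]


print(min_max_normalize([10, 20, 30]))          # smallest maps to 0, largest to 1
print(min_max_normalize([10, 20, 30], -1, 1))   # same data rescaled to [-1, 1]
```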
O
Object - an object is an instance of a class
Operator - performs actions on elements such as variables
Orthogonality - is the generalization into additional dimensions of the concept of two-dimensional perpendicularity
Outliers - observation points that are distant from other observations
Overfitting - a statistical analysis that corresponds too closely to a particular set of data
Overloading - ability to create multiple methods of the same name with different implementations
Overriding - allows a subclass or child class to provide a specific implementation of a method that is already provided by one of its superclasses or parent classes
P
P Versus NP Complexity - deals with whether every problem whose solution can be quickly verified by a computer can also be quickly solved by a computer
Package - a mechanism for organizing code
Parameter - a reference or value passed to a function, procedure, subroutine, command, or program
Performance Tracking - the process of measuring and comparing predicted results with actual results
Periodic Functions - repeat values at regular intervals
Perpendicular - the relationship between two lines which meet at a right angle
Pi - is the ratio of a circle's circumference to its diameter which is the irrational number 3.14159...
Platforms for Machine Learning and AI - are comprehensive suites or frameworks designed to assist developers, data scientists, and researchers in creating, training, and deploying Machine Learning and AI models efficiently.
Poisson Distribution - a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant rate and independently of the time since the last event
Polymorphism - provision of a single interface for entities of different types
Pooling - reduces the dimensions of data by combining the outputs of neuron clusters at one layer into a single neuron in the next layer
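Precision and Recall - measures of the relevance of pattern recognition results; precision is the fraction of retrieved instances that are relevant, and recall is the fraction of relevant instances that are retrieved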
Prediction - the process of using a Machine Learning model to determine a result based on model inputs
Primitive - smallest unit of processing available to a programmer of a particular machine, or can be an atomic element of an expression in a language
Probabilistic Graphical Models - a graph which expresses conditional dependence structures between random variables
Probability - measure of the likelihood an event will occur
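Probability Density Function - describes the relative likelihood that a continuous random variable takes a given value; the probability of the variable falling within a range is the integral of the function over that range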
Probability Measure - a real-valued function defined on a set of events in a probability space that satisfies measure properties such as countable additivity
Prompts and Prompting - Large language model (LLM) prompts and prompting refer to the process and content of inputting specific queries or instructions into a large language model, such as ChatGPT, to generate useful and accurate responses.
P-Value - is the probability for a statistical model that, when the null hypothesis is true, the statistical summary would be equal to or more extreme than the actual observed results
Programming - includes functions such as platform development, algorithm development, db query development
Programming Constructs - common and important programming constructs
Python - an interpreted high-level programming language for general-purpose programming that is commonly used for Machine Learning systems and applications
Q
Quantum Computing - uses qubits, which through superposition can hold a combination of two states simultaneously, together with entanglement between qubits, to perform computations
R
Random Forest - uses multiple decision trees during model training and outputs the class that is the mode of the individual trees' predictions (or their mean for regression), which corrects for the tendency of single decision trees to overfit
Receiver Operating Characteristic (ROC) - a graphical plot showing the diagnostic ability of a binary classifier
Rectifier Function - returns 0 for 0 or negative value inputs and a straight line function for positive value inputs
Recurrent Neural Networks - a class of Artificial Neural Networks that employ a directed graph with back linking data flows
Recursion - a method of solving a problem where the solution depends on solutions to smaller instances of the same problem
Reflection - ability of a program to examine and modify its own structure and behavior, such as an object's properties, at runtime
Regression Analysis - a set of statistical processes for estimating the relationships among variables - regression analysis estimates the conditional expectation of the dependent variable given the independent variables – that is, the average value of the dependent variable when the independent variables are fixed ... the tendency of extreme data values to "regress" to the overall mean value
Regularization - introduction of additional information to solve an ill-posed problem, e.g. to prevent overfitting
Regular Expression - sequence of characters that can be used as a search pattern. Sometimes referred to as Regex or Regexen
Reinforcement Learning - involves finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge) using an iterative process of taking actions and receiving rewards for those actions
Relational Database - organizes data into one or more tables (or "relations") of columns and rows, with a unique key identifying each row
Repeatability - the closeness of the agreement between the results of successive measurements of the same quantity carried out under the same conditions
Reporting - involves collecting and displaying data in an effective manner
Resampling - changing or shuffling data used in mathematical modeling
Reserved Word - a word that cannot be used as an identifier, such as the name of a variable or function
Retrieval Augmented Generation (RAG) - is an approach that aims to enhance the capabilities of Large Language Models (LLMs) by incorporating external knowledge sources during the generation process
Return - a value that is passed from a calculating function to an invoking function
Risks - sophisticated AI raises several critical issues, spanning ethical, technical, societal, and regulatory domains; addressing these issues requires a collaborative approach between governments, industry, and civil society to ensure that AI's deployment benefits humanity while mitigating potential harms
S
Sample Size Determination - choosing the number of observations to include in a statistical sample
Sampling - selection of a subset of individuals from within a statistical population to estimate characteristics of the whole population
Scaling - a transformation that increases or decreases object dimensions
Science - a system of organized knowledge and testable explanations/predictions
Self - Python's counterpart of the "this" parameter, which refers to the object, class, or other entity that the currently running code is part of
Scalar - an element of a field used to define a vector space
Selection Bias - the bias introduced by the selection of individuals, groups or data for analysis in such a way that proper randomization is not achieved
Structured Query Language (SQL) - is used for database CRUD (Create, Read, Update, Delete) operations
Server - a computer program or a device that provides functionality for other programs or devices, called clients
Sigmoid Function - has a somewhat elongated S shape curve
Signal Processing - the analysis, synthesis, and modification of data streams, such as those for sound and images
Softmax Function - for a set of inputs, transforms each input to a positive number, with the sum of all outputs equal to 1
Software Containers - are lightweight, standalone executable packages that include all the necessary components and dependencies required to run an application consistently across different computing environments
Sorting Algorithms - algorithm code that sorts items using a variety of sorting methods
Sort Programming Construct - functions that put elements in a certain order. Sorting is used so often that many programming languages have built-in sort functions
Sparse Matrix - a matrix in which most elements are zero
Standard Deviation - a measure that is used to quantify the amount of variation or dispersion of a set of data values
Statement - expresses some action to be carried out
Statistical Hypothesis Test - statistical tests to determine the degree to which there is a relationship between a mathematical model and measured results
Statistical Power of a Test - the probability that the test correctly rejects the null hypothesis (H0) when a specific alternative hypothesis (H1) is true
Statistics - numerical facts and data, often coupled with the application of probability
Stochastic Gradient Descent - a calculus based method of finding the minima or maxima of differentiable functions, using gradient estimates computed from randomly selected samples of the data
Structure Automation - automation of some of the processes of Machine Learning, also referred to as AutoML
Supervised Learning - Machine Learning model training methodology that uses labeled example data
Support Vector Machines - SVMs represent elements as points in a multidimensional space and separate classes of points with a boundary (hyperplane) chosen to leave the widest possible margin between the classes
Switch - control mechanism used to allow the value of a variable or expression to change the control flow of program execution
System Scaling - scaling Computing Systems to serve large numbers of users while maintaining good response times involves several technical techniques that optimize resource utilization, distribution, and management.
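The Sigmoid Function and Softmax Function entries above can be sketched as follows. The function names are hypothetical, and the branch in `sigmoid` and the subtraction of the maximum in `softmax` are numerical-stability choices added for this sketch; they do not change the mathematical result.

```python
import math


def sigmoid(x):
    """S-shaped function mapping any real number into the interval (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # avoids overflow for large negative x
    return z / (1.0 + z)


def softmax(values):
    """Map a list of numbers to positive numbers that sum to 1."""
    m = max(values)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


print(sigmoid(0))            # 0.5, the midpoint of the S curve
print(softmax([1, 2, 3]))    # larger inputs receive larger shares of the total
```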
T
Table - a collection of related data held in a structured format within a database. It consists of fields (columns), and rows
Tanh Function - has a compressed S shape curve
Tensor Processing Unit - Computing Hardware well suited to performing TensorFlow Machine Learning calculations
TensorFlow - open source machine learning platform from Google
This/Self - refers to the object, class, or other entity that the currently running code is part of
Time Series - a series of data points indexed (or listed or graphed) in time order
Token - an object which represents the right to perform an operation
Transformer Neural Networks - Artificial Neural Networks that use Attention mechanisms and do not require that sequential data be processed in order
Trigonometry - studies relationships involving lengths and angles of triangles
Trigonometric Functions - relate an angle of a right-angled triangle to ratios of two side lengths
Type - collection of rules for constructs such as variables and functions. Untyped languages allow any operation to be performed on any data type
U
Underfitting - a statistical analysis that does not adequately capture the underlying structure of the data
Unsupervised Learning - Machine Learning model training methodology that uses unlabeled example data
V
Variable - a storage location and associated name which contains a modifiable value
Variance - the expectation of the squared deviation of a random variable from its mean
Vector - among other definitions, a one dimensional array
Vector Databases - are a specialized type of database designed to store, manage, and index high-dimensional vector data efficiently; they are particularly useful for handling unstructured data such as text, images, audio, and video, which can be converted into numerical vector representations using machine learning techniques like embeddings
W
Web Crawler - is an Internet bot that systematically browses the World Wide Web
Weights - in the context of Machine Learning graphs, a numerical value, assigned as a label to a vertex or edge of a graph
Word Embedding - the process of mapping words or phrases to vectors of real numbers
X
X Axis - in a cartesian coordinate system, the horizontal axis, often measuring independent variables
Y
Y Axis - in a cartesian coordinate system, the vertical axis, often measuring dependent variables
Z
Z Axis - in a cartesian coordinate system, the third axis, often measuring dependent variables