# Terminology Glossary
Quick reference for AI/ML terms used throughout this book.
| Term | Definition |
|---|---|
| Activation Function | Non-linear function applied to neuron output (ReLU, GELU, sigmoid) |
| Adversarial Example | Input crafted to cause misclassification while appearing normal to humans |
| Alignment | Training a model to behave according to human values and intentions |
| Attention | Mechanism allowing each token to weigh the relevance of every other token |
| Autoregressive | Generating output one token at a time, each conditioned on prior tokens |
| Backpropagation | Algorithm for computing gradients through a neural network |
| BLEU/ROUGE | N-gram overlap metrics for scoring generated text against reference text |
| Chain-of-Thought (CoT) | Prompting technique that elicits step-by-step reasoning |
| Context Window | Maximum number of tokens the model can process at once |
| DPO | Direct Preference Optimization — alternative to RLHF for alignment |
| Embedding | Dense vector representation of a token capturing semantic meaning |
| Epoch | One full pass through the training dataset |
| Few-Shot | Providing examples in the prompt to guide the model |
| Fine-Tuning | Additional training on a specific dataset after pre-training |
| FGSM | Fast Gradient Sign Method — efficient adversarial attack |
| Gradient | Direction and magnitude of steepest ascent in the loss landscape |
| Gradient Descent | Optimization algorithm that follows negative gradients to minimize loss |
| Hallucination | Model generating confident but factually incorrect output |
| Hyperparameter | Training setting not learned from data (learning rate, batch size) |
| Inference | Using a trained model to make predictions |
| In-Context Learning | Model learning from examples provided in the prompt |
| Jailbreak | Technique to bypass model safety training |
| LoRA | Low-Rank Adaptation — efficient fine-tuning method |
| Loss Function | Measures how wrong the model's prediction is |
| LLM | Large Language Model |
| Logits | Raw model output before softmax normalization |
| Membership Inference | Determining if a specific sample was in the training data |
| MLP / FFN | Multi-layer perceptron / Feed-forward network within transformer layers |
| Next-Token Prediction | The training objective: predict the next token given prior context |
| Overfitting | Model memorizes training data, fails to generalize |
| Parameter | A learned weight in the model |
| Perplexity | Metric for how well a model predicts a text sample (lower = better) |
| Positional Encoding | Vector added to embeddings to encode token position in sequence |
| Prompt Injection | Embedding adversarial instructions in model input |
| QLoRA | Quantized LoRA — even more memory-efficient fine-tuning |
| Quantization | Reducing numeric precision of weights (e.g., float32 → int8) to shrink model size and speed up inference |
| RAG | Retrieval-Augmented Generation — model retrieves external docs before responding |
| Reinforcement Learning | Learning through trial and error, guided by a reward signal |
| RLHF | Reinforcement Learning from Human Feedback |
| Self-Attention | Attention mechanism where query, key, value all come from the same sequence |
| Softmax | Function that converts logits to probability distribution summing to 1 |
| System Prompt | Hidden instructions from the developer that set model behavior |
| Temperature | Controls randomness in sampling (0 = deterministic, higher = more random) |
| Token | Sub-word unit that the model processes (not exactly a word or character) |
| Tokenizer | Converts text to token IDs and back |
| Top-k / Top-p | Sampling strategies to control output diversity |
| Transfer Attack | Adversarial example crafted on one model that works on another |
| Transformer | Architecture using self-attention, basis of all modern LLMs |
| Vector Database | Database storing embeddings for similarity search (used in RAG) |
| Weight | Learnable parameter in a neural network |
| Zero-Shot | Model performing a task with no examples, just instructions |
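
Several of the sampling-related terms above (logits, softmax, temperature, top-k) fit together at generation time. A minimal sketch in plain Python illustrates the relationship; the function names here are illustrative, not from any particular library:

```python
# Illustrative sketch: logits -> softmax -> temperature/top-k sampling.
import math
import random

def softmax(logits):
    """Convert raw logits to a probability distribution summing to 1."""
    m = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(logits, temperature=1.0, top_k=None, rng=random):
    """Pick a token index from logits using temperature and top-k sampling."""
    if temperature == 0:                 # temperature 0 -> deterministic argmax
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]   # higher T flattens the distribution
    probs = softmax(scaled)
    candidates = sorted(range(len(probs)), key=lambda i: -probs[i])
    if top_k is not None:                # keep only the k most likely tokens
        candidates = candidates[:top_k]
    weights = [probs[i] for i in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

Note that an autoregressive model would call something like `sample_next_token` once per generated token, appending each result to the context before predicting the next.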