Symbols and Notation

The most common symbols and notation used in this book are presented below. Any deviations will be made clear from context.

Matrices and vectors

A lower-case letter in normal font, \(x\), refers to a single, fixed observation. When in bold font, a lower-case letter, \(\mathbf{x}\), refers to a vector of fixed observations, and an upper-case letter, \(\mathbf{X}\), represents a matrix. Calligraphic letters, \(\mathcal{X}\), are used to denote sets.

A matrix will always be defined with its dimensions using the notation \(\mathbf{X}\in \mathcal{X}^{n \times p}\). If, for example, \(\mathcal{X}\) is the set of real numbers, this may instead be written as ‘\(\mathbf{X}\) is an \(n \times p\) real-valued matrix’, and analogously for matrices of integers, naturals, etc. By default, a ‘vector’ refers to a column vector, which may be thought of as a matrix with \(n\) rows and one column and may be represented as:

\[ \mathbf{x}= \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \]

Vectors are usually defined using transpose notation, for example the vector above may instead be written as \(\mathbf{x}^\top= (x_1 \ x_2 \cdots x_n)\) or \(\mathbf{x}= (x_1 \ x_2 \cdots x_n)^\top\). Vectors may also be defined in a shortened format as \(\mathbf{x}\in \mathcal{X}^n\), which implies a column vector of length \(n\) with elements as represented above.

Let \(\mathbf{X}\in \mathcal{X}^{n \times p}\) be a matrix. A letter in normal font with two subscripts is used to reference a single element of a matrix, for example \(x_{i,j} \in \mathcal{X}\) would be the element in the \(i\)th row and \(j\)th column of \(\mathbf{X}\). A bold-face, lower-case letter with a single subscript refers to a row of a matrix, for example \(\mathbf{x}_i = (x_{i,1} \ x_{i,2} \cdots x_{i,p})^\top\); note that while \(\mathbf{x}_i\) refers to the row of a matrix, the column vector notation is maintained to adhere to the convention that vectors are column vectors. A column is referenced with a dot in place of the row index, for example the \(j\)th column of \(\mathbf{X}\) would be \(\mathbf{x}_{\cdot j} = (x_{1,j} \ x_{2,j} \cdots x_{n,j})^\top\). A letter in normal font with one subscript refers to a single element from a vector, for example given \(\mathbf{x}\in \mathcal{X}^n\), the \(i\)th element is denoted \(x_i\).
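
The indexing conventions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the matrix values are made up, the matrix is stored as a plain list of rows to avoid any dependencies, and the variable names (`x_ij`, `x_i`, `x_dot_j`) are hypothetical stand-ins for \(x_{i,j}\), \(\mathbf{x}_i\), and \(\mathbf{x}_{\cdot j}\).

```python
# Hypothetical 3 x 2 matrix X (n = 3 rows, p = 2 columns), stored as a list of rows.
X = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]
n, p = len(X), len(X[0])

# The notation counts from 1 while Python counts from 0, hence the "- 1" below.
i, j = 2, 1

x_ij = X[i - 1][j - 1]               # element x_{i,j}
x_i = list(X[i - 1])                 # row i: (x_{i,1}, ..., x_{i,p}), length p
x_dot_j = [row[j - 1] for row in X]  # column j: (x_{1,j}, ..., x_{n,j}), length n

print(x_ij)     # 3.0
print(x_i)      # [3.0, 4.0]
print(x_dot_j)  # [1.0, 3.0, 5.0]
```

Note that the row \(\mathbf{x}_i\) has length \(p\) and the column \(\mathbf{x}_{\cdot j}\) has length \(n\); a flat list does not distinguish row from column orientation, so the column-vector convention has to be kept in mind by the reader.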

Functions

A ‘hat’ over a variable, \(\hat{x}\), refers to the prediction or estimate of a variable, \(x\), with bold-face used again to represent vectors. A ‘bar’ over a variable, \(\bar{x}\), refers to the sample mean of the corresponding vector \(\mathbf{x}\). Capital letters in normal font, \(X\), refer to random variables; whether these are scalar or vector valued will be made clear from context. \(\mathbb{E}(X)\) is the expectation of the random variable \(X\).
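
As a short worked example of the ‘bar’ notation, assuming \(\mathbf{x}\in \mathbb{R}^n\) is a vector of real-valued observations, the sample mean written out in full is:

\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]

For instance, \(\mathbf{x}= (2 \ 4 \ 9)^\top\) gives \(\bar{x} = (2 + 4 + 9)/3 = 5\).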

\(g: \mathcal{A} \rightarrow \mathcal{B}\) denotes a function \(g\) from some domain \(\mathcal{A}\) to some codomain \(\mathcal{B}\). While \(f\) is often used to represent abstract functions, \(g\) is primarily used in this book to avoid confusion with the probability density function. Given a random variable \(X\), \(f_X\) denotes its probability density function; other distribution-defining functions are subscripted analogously, for example \(F_X\) for the cumulative distribution function and \(S_X\) for the survival function. In the context of probability distributions, a subscript \(0\) refers to a baseline function, for example, \(S_0\) is the baseline survival function.
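
To illustrate how the subscripted functions relate to one another, the following uses the standard textbook definitions, assuming \(X\) is a continuous random variable; these relations are stated here only as an aid to reading the notation:

\[ F_X(x) = P(X \leq x), \qquad S_X(x) = P(X > x) = 1 - F_X(x), \qquad f_X(x) = \frac{d}{dx} F_X(x) \]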

Let \(B(\omega)\) be a logical statement. Then \(\mathbb{I}(B(\omega))\) denotes the indicator function, defined as:

\[ \mathbb{I}(B(\omega)) = \begin{cases} 1, & \text{if } B(\omega) \text{ is true}, \\ 0, & \text{otherwise}. \end{cases} \tag{1}\]

The exponential function, \(e^x\), is written as \(\exp(x)\), and \(\log(x)\) refers to the natural logarithm \(\ln(x) = \log_e(x)\). When required, equations are referenced as a number in parentheses, for example the indicator function above is (1).
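
The indicator function in (1) and the \(\exp\)/\(\log\) conventions can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `indicator`, the values in `t`, and the cut-off `tau` are all hypothetical, and the logical statement is assumed to be \(t_i \leq \tau\).

```python
import math


def indicator(statement: bool) -> int:
    """Return 1 if the logical statement is true and 0 otherwise, as in (1)."""
    return 1 if statement else 0


# Hypothetical values: evaluate I(t_i <= tau) for each element of t.
t = [2.0, 5.0, 7.5]
tau = 5.0
at_or_before_tau = [indicator(t_i <= tau) for t_i in t]
print(at_or_before_tau)  # [1, 1, 0]

# math.exp(x) computes e^x and math.log(x) with one argument is the natural
# logarithm, matching the exp(x) and log(x) conventions used in the text.
x = 1.5
round_trip = math.log(math.exp(x))
print(math.isclose(round_trip, x))  # True
```

The round trip is compared with `math.isclose` rather than `==` because floating-point arithmetic does not guarantee exact inverses.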

Variables and acronyms

Common variables and acronyms used in the book are given in Table 1 and Table 2, respectively.

Table 1: Common variables used throughout the book.

| Variable | Definition |
|---|---|
| \(\mathbb{R}, \mathbb{R}_{>0}, \mathbb{R}_{\geq 0}, \bar{\mathbb{R}}\) | Set of real numbers, positive real numbers, non-negative real numbers, and extended real numbers (the real numbers together with \(\pm\infty\)). |
| \(\mathbb{N}_0, \mathbb{N}_{> 0}\) | Set of natural numbers (including zero) and positive natural numbers. |
| \((\mathbf{X}, \mathbf{t}, \boldsymbol{\delta})\) | Survival data where \(\mathbf{X}\in \mathbb{R}^{n \times p}\) is a real-valued matrix of \(n\) observations (rows) and \(p\) features (columns), \(\mathbf{t}\in \mathbb{R}_{\geq 0}^n\) is a vector of observed outcome times, and \(\boldsymbol{\delta}\in \mathbb{N}_0^n\) is a vector of observed outcome indicators (generalized here for event history analysis). |
| \(\boldsymbol{\beta}\) | Vector of model coefficients/weights, \(\boldsymbol{\beta}\in \mathbb{R}^p\). |
| \(\boldsymbol{\eta}\) | Vector of linear predictors, \(\boldsymbol{\eta}= (\eta_1 \ \eta_2 \cdots \eta_n)^\top\), where \(\boldsymbol{\eta}= \mathbf{X}\boldsymbol{\beta}\) and \(\eta_i = \mathbf{x}_{i}^\top\boldsymbol{\beta}\). |
| \(\mathcal{D}, \mathcal{D}_{train}, \mathcal{D}_{test}\) | Dataset, training data, and testing data. |

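
To make the survival data and linear predictor entries in Table 1 concrete, the sketch below builds a small dataset \((\mathbf{X}, \mathbf{t}, \boldsymbol{\delta})\) and computes \(\eta_i = \mathbf{x}_{i}^\top\boldsymbol{\beta}\) for each row. All numbers are made up for illustration, plain Python lists are used to keep the example dependency-free, and the coding of the indicator (1 for an observed event, 0 for censoring) is an assumption rather than something fixed by the table.

```python
# Hypothetical survival data (X, t, delta) with n = 3 observations and p = 2 features.
X = [
    [1.0, 0.5],
    [0.0, 2.0],
    [3.0, 1.0],
]
t = [4.2, 1.3, 7.0]  # observed outcome times, each non-negative
delta = [1, 0, 1]    # outcome indicators (assumed coding: 1 = event, 0 = censored)

# Hypothetical coefficient vector beta of length p.
beta = [0.5, -1.0]

# Linear predictors: eta_i is the inner product of row x_i with beta,
# so that the full vector eta equals the matrix-vector product X beta.
eta = [sum(x_ik * b_k for x_ik, b_k in zip(x_i, beta)) for x_i in X]
print(eta)  # [0.0, -2.0, 0.5]
```

The resulting `eta` has one entry per observation, matching \(\boldsymbol{\eta}= (\eta_1 \ \eta_2 \cdots \eta_n)^\top\).
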
Table 2: Common acronyms used throughout the book.

| Acronym | Definition |
|---|---|
| AFT | Accelerated Failure Time |
| CIF | Cumulative Incidence Function |
| CPH | Cox Proportional Hazards |
| GBM | Gradient Boosting Machine |
| IPC(W) | Inverse Probability of Censoring (Weighting) |
| KM | Kaplan-Meier |
| NA | Nelson-Aalen |
| PH | Proportional Hazards |
| RMST | Restricted Mean Survival Time |
| RSF | Random Survival Forest |
| (S)SVM | (Survival) Support Vector Machine |