Symbols and Notation
The most common symbols and notation used in this book are presented below. Any deviations will be made clear from context.
Matrices and vectors
A lower-case letter in normal font, \(x\), refers to a single, fixed observation. When in bold font, a lower-case letter, \(\mathbf{x}\), refers to a vector of fixed observations, and an upper-case letter, \(\mathbf{X}\), represents a matrix. Calligraphic letters, \(\mathcal{X}\), are used to denote sets.
A matrix is always defined together with its dimensions using the notation \(\mathbf{X}\in \mathcal{X}^{n \times p}\). If, for example, \(\mathcal{X}\) is the set of real numbers, this may instead be written as ‘\(\mathbf{X}\) is an \(n \times p\) real-valued matrix’, and analogously for matrices of integers, naturals, etc. By default, a ‘vector’ refers to a column vector, which may be thought of as a matrix with \(n\) rows and one column, and may be represented as:
\[ \mathbf{x}= \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \]
Vectors are usually defined using transpose notation; for example, the vector above may instead be written as \(\mathbf{x}^\top= (x_1 \ x_2 \cdots x_n)\) or \(\mathbf{x}= (x_1 \ x_2 \cdots x_n)^\top\). Vectors may also be defined in a shortened format as \(\mathbf{x}\in \mathcal{X}^n\), which implies a column vector of length \(n\) with elements as represented above.
Let \(\mathbf{X}\in \mathcal{X}^{n \times p}\) be a matrix. A letter in normal font with two subscripts is used to reference a single element of a matrix, for example \(x_{i,j} \in \mathcal{X}\) would be the element in the \(i\)th row and \(j\)th column of \(\mathbf{X}\). A bold-face, lower-case letter with a single subscript refers to a row of a matrix, for example \(\mathbf{x}_i = (x_{i,1} \ x_{i,2} \cdots x_{i,p})^\top\); note that while \(\mathbf{x}_i\) refers to the row of a matrix, the column vector notation is maintained to adhere to the convention that vectors are column vectors. A column is referenced with a dot in place of the row index, for example the \(j\)th column of \(\mathbf{X}\) would be \(\mathbf{x}_{\cdot j} = (x_{1,j} \ x_{2,j} \cdots x_{n,j})^\top\). A letter in normal font with one subscript refers to a single element from a vector, for example given \(\mathbf{x}\in \mathcal{X}^n\), the \(i\)th element is denoted \(x_i\).
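The indexing conventions above can be sketched in plain Python by storing a small matrix as a list of rows. The example values and the helper names `element`, `row`, and `column` are illustrative assumptions rather than part of the book's notation. The notation counts from 1 whereas Python counts from 0, so the helpers shift the indices.

```python
# Illustrative 3 x 2 matrix X (n = 3 rows, p = 2 columns); the values are made up.
X = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]
n, p = len(X), len(X[0])


def element(X, i, j):
    """Return x_{i,j}, using the 1-based indices of the notation."""
    return X[i - 1][j - 1]


def row(X, i):
    """Return x_i, the i-th row of X, as a vector of length p."""
    return list(X[i - 1])


def column(X, j):
    """Return x_{.j}, the j-th column of X, as a vector of length n."""
    return [r[j - 1] for r in X]


print(element(X, 2, 1))  # 3.0
print(row(X, 2))         # [3.0, 4.0]
print(column(X, 2))      # [2.0, 4.0, 6.0]
```

A flat Python list carries no orientation, so the column-vector convention is only implicit here: both `row` and `column` return ordinary vectors, mirroring how \(\mathbf{x}_i\) and \(\mathbf{x}_{\cdot j}\) are both written as column vectors.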
Functions
A ‘hat’ over a variable, \(\hat{x}\), refers to the prediction or estimate of a variable, \(x\), with bold-face used again to represent vectors. A ‘bar’ over a variable, \(\bar{x}\), refers to the sample mean of a corresponding vector \(\mathbf{x}\). Capital letters in normal font, \(X\), refer to scalar- or vector-valued random variables; which of the two is meant will be clear from context. \(\mathbb{E}(X)\) is the expectation of the random variable \(X\).
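For concreteness, and assuming \(\mathbf{x}\in \mathbb{R}^n\) so that the sum is well defined, the sample mean is the usual average of the elements of \(\mathbf{x}\):

\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]

For example, \(\mathbf{x}= (2 \ 4 \ 9)^\top\) gives \(\bar{x} = (2 + 4 + 9)/3 = 5\).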
\(g: \mathcal{A} \rightarrow \mathcal{B}\) denotes a function \(g\) from some domain \(\mathcal{A}\) to some codomain \(\mathcal{B}\). While \(f\) is often used to represent abstract functions, \(g\) is primarily used in this book to avoid confusion with the probability density function. Given a random variable \(X\), \(f_X\) denotes its probability density function; other distribution-defining functions are subscripted analogously, for example \(F_X\) for the cumulative distribution function and \(S_X\) for the survival function. In the context of probability distributions, a subscript \(0\) refers to a baseline function, for example, \(S_0\) is the baseline survival function.
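As a brief illustration of how these subscripted functions relate, assume that \(X\) is a continuous, real-valued random variable (an assumption made only for this example) and let \(P\) denote probability. The standard definitions then give:

\[ F_X(x) = P(X \leq x) = \int_{-\infty}^{x} f_X(u) \, du, \qquad S_X(x) = P(X > x) = 1 - F_X(x) \]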
Let \(B(\omega)\) be a logical statement; then \(\mathbb{I}(B(\omega))\) denotes the indicator function, defined as:
\[ \mathbb{I}(B(\omega))= \begin{cases} 1, & \text{if } B(\omega) \text{ is true}, \\ 0, & \text{otherwise}. \end{cases} \tag{1}\]
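A minimal Python sketch of the indicator function follows. The function name `indicator`, the example times, and the cutoff `tau` are hypothetical and chosen only for illustration.

```python
def indicator(statement):
    """Return 1 if the logical statement is true and 0 otherwise."""
    return 1 if statement else 0


# Hypothetical example: count how many times exceed a cutoff tau.
times = [2.5, 7.0, 4.1, 9.3]
tau = 4.0
n_exceeding = sum(indicator(t > tau) for t in times)
print(n_exceeding)  # 3
```

Summing indicators in this way is a common pattern for counting the observations that satisfy a condition.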
The exponential function, \(e^x\), is written as \(\exp(x)\), and \(\log(x)\) refers to the natural logarithm \(\ln(x) = \log_e(x)\). When required, equations are referenced as a number in parentheses, for example the indicator function above is (1).
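The same convention holds in Python's standard `math` module, where `math.log` with a single argument is the natural logarithm. The short check below is only an illustration and the value of `x` is arbitrary.

```python
import math

x = 2.0

# math.exp(x) computes e^x, and math.log with one argument is the natural logarithm,
# so the two functions invert each other up to floating-point error.
round_trip = math.log(math.exp(x))
print(math.isclose(round_trip, x))  # True

# Other bases must be requested explicitly, for example base 10.
log10_of_x = math.log10(x)
print(log10_of_x)
```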
Variables and acronyms
Common variables and acronyms used in the book are given in Table 1 and Table 2 respectively.
| Variable | Definition |
|---|---|
| \(\mathbb{R}, \mathbb{R}_{>0}, \mathbb{R}_{\geq 0}, \bar{\mathbb{R}}\) | Set of real numbers, positive real numbers, non-negative real numbers, and the extended real numbers (includes \(\pm\infty\)). |
| \(\mathbb{N}_0, \mathbb{N}_{> 0}\) | Set of natural numbers including zero, and set of positive natural numbers. |
| \((\mathbf{X}, \mathbf{t}, \boldsymbol{\delta})\) | Survival data where \(\mathbf{X}\in \mathbb{R}^{n \times p}\) is a real-valued matrix of \(n\) observations (rows) and \(p\) features (columns), \(\mathbf{t}\in \mathbb{R}_{\geq 0}^n\) is a vector of observed outcome times, and \(\boldsymbol{\delta}\in \mathbb{N}_0^n\) is a vector of observed outcome indicators (generalized here for event history analysis). |
| \(\boldsymbol{\beta}\) | Vector of model coefficients/weights, \(\boldsymbol{\beta}\in \mathbb{R}^p\). |
| \(\boldsymbol{\eta}\) | Vector of linear predictors, \(\boldsymbol{\eta}= (\eta_1 \ \eta_2 \cdots \eta_n)^\top\), where \(\boldsymbol{\eta}= \mathbf{X}\boldsymbol{\beta}\) and \(\eta_i = \mathbf{x}_{i}^\top\boldsymbol{\beta}\). |
| \(\mathcal{D}, \mathcal{D}_{train}, \mathcal{D}_{test}\) | Dataset, training data, and testing data. |
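To make the variables in the table above concrete, the sketch below builds a toy survival dataset \((\mathbf{X}, \mathbf{t}, \boldsymbol{\delta})\) and computes the linear predictors \(\boldsymbol{\eta}= \mathbf{X}\boldsymbol{\beta}\) in plain Python. All values, including the coefficients, are made up, and the function name `linear_predictor` is an assumption for this example.

```python
# Illustrative survival data (X, t, delta) with n = 3 observations and p = 2 features.
X = [
    [1.0, 0.0],
    [0.5, 2.0],
    [2.0, 1.0],
]
t = [5.0, 3.2, 8.1]   # observed outcome times, each >= 0
delta = [1, 0, 1]     # observed outcome indicators
beta = [0.5, -1.0]    # hypothetical coefficients, one per feature


def linear_predictor(X, beta):
    """Compute eta = X beta, where eta_i = x_i^T beta for each row x_i."""
    return [sum(x_ij * b_j for x_ij, b_j in zip(x_i, beta)) for x_i in X]


eta = linear_predictor(X, beta)
print(eta)  # [0.5, -1.75, 0.0]
```

Each entry of `eta` corresponds to one row of `X`, so `eta` has length \(n\) while `beta` has length \(p\).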
| Acronym | Definition |
|---|---|
| AFT | Accelerated Failure Time |
| CIF | Cumulative Incidence Function |
| CPH | Cox Proportional Hazards |
| GBM | Gradient Boosting Machine |
| IPC(W) | Inverse Probability of Censoring (Weighting) |
| KM | Kaplan-Meier |
| NA | Nelson-Aalen |
| PH | Proportional Hazards |
| RMST | Restricted Mean Survival Time |
| RSF | Random Survival Forest |
| (S)SVM | (Survival) Support Vector Machine |