torch_openreml.REML

class torch_openreml.REML(v_builder=None, map_theta_to_v=None, map_theta_to_g=None, map_theta_to_dv=None)[source]

Bases: object

REML estimator for linear mixed-effects models.

Fits variance components \(\boldsymbol{\theta}\) by maximising the restricted log-likelihood

\[\ell_R(\boldsymbol{\theta}) = -\frac{1}{2} \left( \log |\symbf{V}(\boldsymbol{\theta})| + \log |\symbf{X}^\top \symbf{V}(\boldsymbol{\theta})^{-1} \symbf{X}| + \symbf{y}^\top \symbf{P} \symbf{y} \right)\]

where \(\symbf{P} = \symbf{V}(\boldsymbol{\theta})^{-1} - \symbf{V}(\boldsymbol{\theta})^{-1}\symbf{X} (\symbf{X}^\top\symbf{V}(\boldsymbol{\theta})^{-1}\symbf{X})^{-1}\symbf{X}^\top\symbf{V}(\boldsymbol{\theta})^{-1}\) is the projection matrix, and \(\symbf{V}(\boldsymbol{\theta}) = \symbf{Z}\symbf{G}(\boldsymbol{\theta})\symbf{Z}^\top + \symbf{R}(\boldsymbol{\theta})\) is the marginal covariance matrix.

Optimisation uses the average information (AI) algorithm, which forms a quasi-Newton step \(\Delta = \symbf{AI}^{-1} \symbf{s}\) from the score vector \(\symbf{s}\) and the AI matrix at each iteration.

The covariance model \(\symbf{V}(\boldsymbol{\theta})\) is supplied either as a Matrix instance via v_builder, or as a pair of callables map_theta_to_v and map_theta_to_dv. If map_theta_to_dv is omitted, the Jacobian is computed automatically via torch.func.jacrev().

Initialize a REML estimator.

Parameters:
  • v_builder (Matrix, optional) – A Matrix instance whose map_theta_to_v() and map_theta_to_dv() methods are used to construct \(\symbf{V}\) and its Jacobian. If supplied alongside map_theta_to_v and map_theta_to_dv, v_builder takes precedence.

  • map_theta_to_v (callable, optional) – Maps a flat parameter tensor \(\boldsymbol{\theta}\) to the covariance matrix \(\symbf{V}\). Required if v_builder is not provided.

  • map_theta_to_g (callable, optional) – Maps \(\boldsymbol{\theta}\) to the random-effect covariance matrix \(\symbf{G}\). Required for blup(), predict(), and residual().

  • map_theta_to_dv (callable, optional) – Maps \(\boldsymbol{\theta}\) to the Jacobian of \(\symbf{V}\), a 3D tensor of shape (num_params, n, n). If None and v_builder is not provided, the Jacobian is computed via automatic differentiation.

Raises:
  • TypeError – If v_builder is not a Matrix instance.

  • ValueError – If neither v_builder nor map_theta_to_v is provided.

Example:

import torch
from torch_openreml import REML
from torch_openreml.covariance import ScalarMatrix

# Simulate a response vector and design matrix
n, p = 50, 2
y = torch.randn(n)
x = torch.randn(n, p)
theta = torch.tensor([0.0])  # initial variance component

# Build V via a Matrix instance and run AI-REML
mat = ScalarMatrix(n)
reml = REML(v_builder=mat)
theta_hat, beta_hat, n_iter = reml.optimize(y, x, theta, verbose=2)
theta_hat, beta_hat

∥∇∥:      13.9393, ∥Δ∥: 0.1125, η: 1.00, ∥Δᶜ∥: 0.1125, log 𝓛: -34.7760
∥∇∥:       1.4572, ∥Δ∥: 0.0147, η: 1.00, ∥Δᶜ∥: 0.0147, log 𝓛: -33.9361 (+0.8399)
∥∇∥:       0.0213, ∥Δ∥: 0.0002, η: 1.00, ∥Δᶜ∥: 0.0002, log 𝓛: -33.9253 (+0.0108)
∥∇∥:       0.0000, ∥Δ∥: 0.0000, η: 1.00, ∥Δᶜ∥: 0.0000, log 𝓛: -33.9253 (+0.0000)

[∇: score, Δ: 𝐉⁻¹∇, η: learning rate, Δᶜ: clip(𝛉 + ηΔ, lb, ub) - 𝛉, 𝓛: restricted likelihood]

✓ Converged at iteration 4
(tensor([0.1275]), tensor([ 0.0941, -0.0798]))

Methods

ai_step(y, x, theta[, require_loglik, ...])

Perform a single average information (AI) algorithm step.

blue(y, x, theta)

Compute the best linear unbiased estimator (BLUE) of fixed effects.

blup(y, x, z, theta[, map_theta_to_g])

Compute the best linear unbiased predictor (BLUP) of random effects.

compute_v_dv(theta)

Compute the covariance matrix and its Jacobian.

get_beta([select, history])

Retrieve a fixed-effect estimate from the optimisation history.

get_theta([select, history])

Retrieve a parameter estimate from the optimisation history.

is_converged([check_score, check_delta, ...])

Check whether the optimisation has converged.

loglik(y, x, theta)

Evaluate the REML log-likelihood.

marginal_predict(y, x, theta)

Compute marginal fitted values from fixed effects only.

marginal_residual(y, x, theta)

Compute marginal residuals from fixed effects only.

optimize(y, x, theta[, max_iter, eta, ...])

Run the AI-REML optimisation loop.

predict(y, x, z, theta[, map_theta_to_g])

Compute conditional fitted values including random effects.

residual(y, x, z, theta[, map_theta_to_g])

Compute conditional residuals including random effects.

update(theta, delta, eta[, lb, ub])

Apply a damped AI step and clip parameters to bounds.

blue(y, x, theta)[source]

Compute the best linear unbiased estimator (BLUE) of fixed effects.

Solves for \(\hat{\boldsymbol{\beta}}\) via generalised least squares:

\[\hat{\boldsymbol{\beta}} = (\symbf{X}^\top \symbf{V}(\boldsymbol{\theta})^{-1} \symbf{X})^{-1} \symbf{X}^\top \symbf{V}(\boldsymbol{\theta})^{-1} \symbf{y}\]
Parameters:
  • y (torch.Tensor) – Response vector of shape (n,).

  • x (torch.Tensor) – Design matrix of shape (n, p).

  • theta (torch.Tensor) – Flat variance component parameter tensor.

Returns:

Fixed-effect estimate \(\hat{\boldsymbol{\beta}}\) of shape (p,).

Return type:

torch.Tensor
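The GLS formula can be transcribed directly in torch. The sketch below is an illustration only, not the library's internals: it uses torch.linalg.solve in place of explicit inverses for numerical stability, and checks that with \(\symbf{V} \propto \symbf{I}\) the estimate reduces to ordinary least squares.

```python
import torch

torch.manual_seed(0)
n, p = 20, 2
x = torch.randn(n, p)
y = torch.randn(n)

# A simple SPD covariance V = sigma^2 * I, for illustration only
v = 0.5 * torch.eye(n)

# beta_hat = (X' V^{-1} X)^{-1} X' V^{-1} y, via solves rather than inverses
vinv_x = torch.linalg.solve(v, x)  # V^{-1} X
vinv_y = torch.linalg.solve(v, y)  # V^{-1} y
beta_hat = torch.linalg.solve(x.T @ vinv_x, x.T @ vinv_y)

# Sanity check: with V proportional to I, GLS reduces to OLS
beta_ols = torch.linalg.lstsq(x, y).solution
print(torch.allclose(beta_hat, beta_ols, atol=1e-4))
```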

marginal_predict(y, x, theta)[source]

Compute marginal fitted values from fixed effects only.

\[\hat{\symbf{y}} = \symbf{X} \hat{\boldsymbol{\beta}}\]
Parameters:
  • y (torch.Tensor) – Response vector of shape (n,).

  • x (torch.Tensor) – Design matrix of shape (n, p).

  • theta (torch.Tensor) – Flat variance component parameter tensor.

Returns:

Marginal fitted values of shape (n,).

Return type:

torch.Tensor

marginal_residual(y, x, theta)[source]

Compute marginal residuals from fixed effects only.

\[\hat{\symbf{e}} = \symbf{y} - \symbf{X}\hat{\boldsymbol{\beta}}\]
Parameters:
  • y (torch.Tensor) – Response vector of shape (n,).

  • x (torch.Tensor) – Design matrix of shape (n, p).

  • theta (torch.Tensor) – Flat variance component parameter tensor.

Returns:

Marginal residuals of shape (n,).

Return type:

torch.Tensor

blup(y, x, z, theta, map_theta_to_g=None)[source]

Compute the best linear unbiased predictor (BLUP) of random effects.

\[\hat{\symbf{u}} = \symbf{G} \symbf{Z}^\top \symbf{V}^{-1}(\boldsymbol{\theta}) \hat{\symbf{e}}\]

where \(\hat{\symbf{e}}\) are the marginal residuals.

Parameters:
  • y (torch.Tensor) – Response vector of shape (n,).

  • x (torch.Tensor) – Fixed-effect design matrix of shape (n, p).

  • z (torch.Tensor) – Random-effect design matrix of shape (n, q).

  • theta (torch.Tensor) – Flat variance component parameter tensor.

  • map_theta_to_g (callable, optional) – Maps theta to the random-effect covariance matrix \(\symbf{G}\). Defaults to map_theta_to_g set at initialisation.

Returns:

Random-effect predictions of shape (q,).

Return type:

torch.Tensor
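A standalone sketch of the BLUP formula under an illustrative model with \(\symbf{G} = \tau^2\symbf{I}\) and \(\symbf{R} = \sigma^2\symbf{I}\). The component values and shapes are assumptions chosen for the example, not library defaults.

```python
import torch

torch.manual_seed(0)
n, p, q = 30, 2, 4
x = torch.randn(n, p)
z = torch.randn(n, q)
y = torch.randn(n)

# Illustrative variance components: G = tau^2 I_q, R = sigma^2 I_n
g = 0.8 * torch.eye(q)
r = 0.5 * torch.eye(n)
v = z @ g @ z.T + r  # marginal covariance V = Z G Z' + R

# Marginal residuals e_hat = y - X beta_hat, with beta_hat from GLS
vinv_x = torch.linalg.solve(v, x)
beta_hat = torch.linalg.solve(x.T @ vinv_x, vinv_x.T @ y)
e_hat = y - x @ beta_hat

# BLUP: u_hat = G Z' V^{-1} e_hat
u_hat = g @ z.T @ torch.linalg.solve(v, e_hat)
print(u_hat.shape)  # torch.Size([4])
```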

predict(y, x, z, theta, map_theta_to_g=None)[source]

Compute conditional fitted values including random effects.

\[\hat{\symbf{y}} = \symbf{X}\hat{\boldsymbol{\beta}} + \symbf{Z}\hat{\symbf{u}}\]
Parameters:
  • y (torch.Tensor) – Response vector of shape (n,).

  • x (torch.Tensor) – Fixed-effect design matrix of shape (n, p).

  • z (torch.Tensor) – Random-effect design matrix of shape (n, q).

  • theta (torch.Tensor) – Flat variance component parameter tensor.

  • map_theta_to_g (callable, optional) – Maps theta to \(\symbf{G}\). Defaults to map_theta_to_g set at initialisation.

Returns:

Conditional fitted values of shape (n,).

Return type:

torch.Tensor

residual(y, x, z, theta, map_theta_to_g=None)[source]

Compute conditional residuals including random effects.

\[\hat{\symbf{e}} = \symbf{y} - \hat{\symbf{y}}\]

where \(\hat{\symbf{y}}\) is the conditional prediction from predict().

Parameters:
  • y (torch.Tensor) – Response vector of shape (n,).

  • x (torch.Tensor) – Fixed-effect design matrix of shape (n, p).

  • z (torch.Tensor) – Random-effect design matrix of shape (n, q).

  • theta (torch.Tensor) – Flat variance component parameter tensor.

  • map_theta_to_g (callable, optional) – Maps theta to \(\symbf{G}\). Defaults to map_theta_to_g set at initialisation.

Returns:

Conditional residuals of shape (n,).

Return type:

torch.Tensor

loglik(y, x, theta)[source]

Evaluate the REML log-likelihood.

\[\ell_R(\boldsymbol{\theta}) = -\frac{1}{2} \left( \log |\symbf{V}(\boldsymbol{\theta})| + \log |\symbf{X}^\top \symbf{V}(\boldsymbol{\theta})^{-1} \symbf{X}| + \symbf{y}^\top \symbf{P} \symbf{y} \right)\]
Parameters:
  • y (torch.Tensor) – Response vector of shape (n,).

  • x (torch.Tensor) – Design matrix of shape (n, p).

  • theta (torch.Tensor) – Flat variance component parameter tensor.

Returns:

Scalar REML log-likelihood value.

Return type:

torch.Tensor
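The three terms of \(\ell_R\) can be assembled by hand. The sketch below is an illustration only (explicit inverses for brevity); it also verifies a defining property of \(\symbf{P}\), namely that it annihilates the fixed-effect design, \(\symbf{P}\symbf{X} = \symbf{0}\).

```python
import torch

torch.manual_seed(0)
n, p = 15, 2
x = torch.randn(n, p)
y = torch.randn(n)
v = 0.7 * torch.eye(n) + 0.1 * torch.ones(n, n)  # SPD covariance, illustrative

vinv = torch.linalg.inv(v)
xtvx = x.T @ vinv @ x
p_mat = vinv - vinv @ x @ torch.linalg.inv(xtvx) @ x.T @ vinv

# loglik = -0.5 * (log|V| + log|X' V^{-1} X| + y' P y)
loglik = -0.5 * (torch.logdet(v) + torch.logdet(xtvx) + y @ p_mat @ y)

# P annihilates the fixed-effect design: P X = 0
print(torch.allclose(p_mat @ x, torch.zeros(n, p), atol=1e-4))
```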

compute_v_dv(theta)[source]

Compute the covariance matrix and its Jacobian.

Calls map_theta_to_v to build \(\symbf{V}(\boldsymbol{\theta})\) and either map_theta_to_dv or torch.func.jacrev() to build the Jacobian \(\partial\symbf{V}(\boldsymbol{\theta})/\partial\boldsymbol{\theta}\).

Parameters:

theta (torch.Tensor) – Flat variance component parameter tensor.

Returns:

(v, dv), where v is the covariance matrix of shape (n, n) and dv is the Jacobian of shape (num_params, n, n).

Return type:

tuple
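When the Jacobian is obtained automatically, torch.func.jacrev() differentiates the covariance map. The toy model below is an assumed example, not part of the library; it shows the mechanics, including the permutation from jacrev's (n, n, num_params) output to the documented (num_params, n, n) layout.

```python
import torch
from torch.func import jacrev

n = 5

# Toy covariance model: V(theta) = exp(theta_0) * I + theta_1 * ones
def map_theta_to_v(theta):
    return torch.exp(theta[0]) * torch.eye(n) + theta[1] * torch.ones(n, n)

theta = torch.tensor([0.3, 0.1])
v = map_theta_to_v(theta)

# jacrev returns shape (n, n, num_params); move params to the front
dv = jacrev(map_theta_to_v)(theta).permute(2, 0, 1)
print(dv.shape)  # torch.Size([2, 5, 5])

# Analytic check: dV/dtheta_0 = exp(theta_0) I, dV/dtheta_1 = ones
print(torch.allclose(dv[0], torch.exp(theta[0]) * torch.eye(n)))
print(torch.allclose(dv[1], torch.ones(n, n)))
```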

ai_step(y, x, theta, require_loglik=True, require_beta=True)[source]

Perform a single average information (AI) algorithm step.

Computes the score vector \(\symbf{s}\), AI matrix, and optionally the REML log-likelihood and fixed-effect estimate at the current \(\boldsymbol{\theta}\).

The score vector and AI matrix are:

\[\begin{split}s_k &= \frac{1}{2}\left( \symbf{y}^\top \symbf{P} \frac{\partial\symbf{V}(\boldsymbol{\theta})}{\partial\theta_k} \symbf{P} \symbf{y} - \mathrm{tr}\!\left(\symbf{P} \frac{\partial\symbf{V}(\boldsymbol{\theta})}{\partial\theta_k}\right) \right) \\ \mathrm{AI}_{kj} &= \frac{1}{2} \symbf{y}^\top \symbf{P} \frac{\partial\symbf{V}(\boldsymbol{\theta})}{\partial\theta_k} \symbf{P} \frac{\partial\symbf{V}(\boldsymbol{\theta})}{\partial\theta_j} \symbf{P} \symbf{y}\end{split}\]
Parameters:
  • y (torch.Tensor) – Response vector of shape (n,).

  • x (torch.Tensor) – Design matrix of shape (n, p).

  • theta (torch.Tensor) – Flat variance component parameter tensor.

  • require_loglik (bool, optional) – Whether to evaluate the REML log-likelihood. Defaults to True.

  • require_beta (bool, optional) – Whether to compute the fixed-effect estimate. Defaults to True.

Returns:

(beta, score, ai, loglik), where beta is of shape (p,), score is of shape (num_params,), ai is of shape (num_params, num_params), and loglik is a scalar tensor. beta and loglik are torch.nan if their respective require_* flag is False.

Return type:

tuple
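The score and AI expressions can be transcribed directly from the formulas above. This standalone sketch (illustration only, with explicit inverses for brevity) also checks that the AI matrix comes out symmetric, as it must, since \(\symbf{P}\) and each \(\partial\symbf{V}/\partial\theta_k\) are symmetric.

```python
import torch

torch.manual_seed(0)
n, p, k = 12, 2, 2
x = torch.randn(n, p)
y = torch.randn(n)

# Two-component toy model: V = theta_0 * I + theta_1 * ones, dV stacked as (k, n, n)
theta = torch.tensor([1.0, 0.2])
dv = torch.stack([torch.eye(n), torch.ones(n, n)])
v = theta[0] * dv[0] + theta[1] * dv[1]

vinv = torch.linalg.inv(v)
xtvx_inv = torch.linalg.inv(x.T @ vinv @ x)
p_mat = vinv - vinv @ x @ xtvx_inv @ x.T @ vinv
py = p_mat @ y

# Score: s_k = 0.5 * (y' P dV_k P y - tr(P dV_k))
score = 0.5 * torch.stack([py @ dv_k @ py - torch.trace(p_mat @ dv_k) for dv_k in dv])

# AI matrix: AI_kj = 0.5 * y' P dV_k P dV_j P y
ai = 0.5 * torch.stack(
    [torch.stack([py @ dv_k @ p_mat @ dv_j @ py for dv_j in dv]) for dv_k in dv]
)
print(torch.allclose(ai, ai.T, atol=1e-4))  # AI is symmetric
```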

get_theta(select='last', history=None)[source]

Retrieve a parameter estimate from the optimisation history.

Parameters:
  • select (str, optional) – "last" returns the final iterate; any other value returns the iterate with the highest log-likelihood. Defaults to "last".

  • history (dict, optional) – History dictionary to query. Defaults to history populated by optimize().

Returns:

Selected parameter tensor \(\boldsymbol{\theta}\).

Return type:

torch.Tensor

get_beta(select='last', history=None)[source]

Retrieve a fixed-effect estimate from the optimisation history.

Parameters:
  • select (str, optional) – "last" returns the final iterate; any other value returns the iterate with the highest log-likelihood. Defaults to "last".

  • history (dict, optional) – History dictionary to query. Defaults to history populated by optimize().

Returns:

Selected fixed-effect estimate \(\hat{\boldsymbol{\beta}}\).

Return type:

torch.Tensor

is_converged(check_score=True, check_delta=True, check_loglik=True, tol_score=0.0001, tol_delta=0.0001, tol_loglik=0.0001)[source]

Check whether the optimisation has converged.

Convergence is declared when all enabled criteria fall below their respective tolerances. At least two iterations must have completed before any criterion can be satisfied.

Parameters:
  • check_score (bool, optional) – Check the norm of the score vector. Defaults to True.

  • check_delta (bool, optional) – Check the norm of the parameter update \(\Delta\). Defaults to True.

  • check_loglik (bool, optional) – Check the absolute change in log-likelihood between successive iterates. Defaults to True.

  • tol_score (float, optional) – Score norm tolerance. Defaults to 1e-4.

  • tol_delta (float, optional) – Parameter update norm tolerance. Defaults to 1e-4.

  • tol_loglik (float, optional) – Log-likelihood change tolerance. Defaults to 1e-4.

Returns:

True if all enabled criteria are satisfied, False otherwise.

Return type:

bool
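The all-criteria rule can be sketched as a plain predicate. The function name is illustrative, and the two-iteration guard and the per-criterion enable flags are omitted for brevity.

```python
# Minimal sketch: converged only when every criterion is under its tolerance
def converged(score_norm, delta_norm, loglik_change,
              tol_score=1e-4, tol_delta=1e-4, tol_loglik=1e-4):
    return (score_norm < tol_score
            and delta_norm < tol_delta
            and abs(loglik_change) < tol_loglik)

print(converged(1e-5, 1e-5, 1e-5))  # True
print(converged(1e-2, 1e-5, 1e-5))  # False: score norm still too large
```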

update(theta, delta, eta, lb=-inf, ub=inf)[source]

Apply a damped AI step and clip parameters to bounds.

\[\boldsymbol{\theta} \leftarrow \mathrm{clip}(\boldsymbol{\theta} + \eta \Delta,\, \text{lb},\, \text{ub})\]
Parameters:
  • theta (torch.Tensor) – Current parameter tensor.

  • delta (torch.Tensor) – AI step \(\Delta = \symbf{AI}^{-1}\symbf{s}\).

  • eta (float) – Step size (learning rate).

  • lb (float, optional) – Lower bound for clipping. Defaults to -inf.

  • ub (float, optional) – Upper bound for clipping. Defaults to inf.

Returns:

(theta, update), where theta is the updated parameter tensor and update is the actual change after clipping.

Return type:

tuple
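The damped, clipped update is essentially a one-liner with torch.clamp; the helper name below is illustrative. Note that the returned change is the realised change after clipping, which is what the \(\Delta^c\) diagnostic in the verbose output reports.

```python
import torch

def damped_clipped_update(theta, delta, eta, lb=-float("inf"), ub=float("inf")):
    # theta <- clip(theta + eta * delta, lb, ub); also return the realised change
    theta_new = torch.clamp(theta + eta * delta, min=lb, max=ub)
    return theta_new, theta_new - theta

theta = torch.tensor([0.5, 2.0])
delta = torch.tensor([1.0, 1.0])
theta_new, change = damped_clipped_update(theta, delta, eta=0.5, ub=2.2)
print(theta_new)  # tensor([1.0000, 2.2000])
print(change)     # tensor([0.5000, 0.2000]): second component hit the bound
```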

optimize(y, x, theta, max_iter=200, eta=1.0, require_loglik=True, lb=-inf, ub=inf, verbose=0, check_score=True, check_delta=True, check_loglik=True, tol_score=0.0001, tol_delta=0.0001, tol_loglik=0.0001)[source]

Run the AI-REML optimisation loop.

Iterates ai_step() and update() until convergence or max_iter is reached. Optimisation history is stored in history and can be queried afterwards via get_theta() and get_beta().

Parameters:
  • y (torch.Tensor) – Response vector of shape (n,).

  • x (torch.Tensor) – Design matrix of shape (n, p).

  • theta (torch.Tensor) – Initial variance component parameter tensor.

  • max_iter (int, optional) – Maximum number of iterations. Defaults to 200.

  • eta (float, optional) – Step size applied to each AI update. Defaults to 1.0.

  • require_loglik (bool, optional) – Whether to evaluate the REML log-likelihood at each iteration. Defaults to True.

  • lb (float, optional) – Lower bound for parameter clipping. Defaults to -inf.

  • ub (float, optional) – Upper bound for parameter clipping. Defaults to inf.

  • verbose (int, optional) – Verbosity level. 0 suppresses all output, 1 shows a progress bar, 2 additionally prints per-iteration diagnostics. Defaults to 0.

  • check_score (bool, optional) – Include score norm in convergence check. Defaults to True.

  • check_delta (bool, optional) – Include parameter update norm in convergence check. Defaults to True.

  • check_loglik (bool, optional) – Include log-likelihood change in convergence check. Defaults to True.

  • tol_score (float, optional) – Score norm tolerance. Defaults to 1e-4.

  • tol_delta (float, optional) – Parameter update norm tolerance. Defaults to 1e-4.

  • tol_loglik (float, optional) – Log-likelihood change tolerance. Defaults to 1e-4.

Returns:

(theta, beta, n_iter), where theta is the final parameter estimate, beta is the corresponding fixed-effect estimate, and n_iter is the number of iterations completed.

Return type:

tuple