torch_openreml.covariance.DiagonalMatrix

class torch_openreml.covariance.DiagonalMatrix(n, param_names=None, trans=None, no_grad_index=None)

Bases: Matrix

Diagonal covariance matrix with one variance parameter per entry.

\[\symbf{V} = \mathrm{diag}(\sigma^2_0, \ldots, \sigma^2_{n-1})\]

Each diagonal entry is parameterised by a single unconstrained scalar, which by default is mapped to a positive variance by TransformExpPow2. Off-diagonal entries are always zero.
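The parameterisation can be sketched with the standard library alone. The snippet assumes that TransformExpPow2 maps an unconstrained value theta to exp(theta) squared, which agrees with the example output on this page (0.5 maps to 2.7183) but is not stated explicitly here. exp_pow2 and diagonal_matrix are hypothetical helpers, not part of torch_openreml.

```python
import math

def exp_pow2(theta):
    """Assumed form of the default transform: exp(theta) squared."""
    return math.exp(theta) ** 2

def diagonal_matrix(params):
    """Build an n x n nested-list matrix with transformed params on the diagonal."""
    n = len(params)
    return [[exp_pow2(params[i]) if i == j else 0.0 for j in range(n)]
            for i in range(n)]

V = diagonal_matrix([0.0, 0.5, 1.0])
for row in V:
    print([round(x, 4) for x in row])
```

Because the transform is strictly positive for every real input, any unconstrained parameter vector yields a valid diagonal covariance matrix.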

Initialize a diagonal covariance matrix of size n x n.

Parameters:
  • n (int) – Matrix dimension.

  • param_names (list of str, optional) – Names for the n variance parameters. Defaults to ["sigma^2_0", ..., "sigma^2_{n-1}"].

  • trans (list of Transform, optional) – Transforms applied to each parameter. Defaults to [TransformExpPow2()], broadcast across all parameters.

  • no_grad_index (list of int, optional) – Indices of parameters to exclude from gradient computation.
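As a rough illustration of the defaults listed above, the following sketch fills in parameter names and broadcasts a single transform to all n parameters. resolve_defaults is a hypothetical helper, and the string "exp_pow2" only stands in for a TransformExpPow2 instance; the actual constructor may resolve its arguments differently.

```python
def resolve_defaults(n, param_names=None, trans=None):
    """Fill in default names and broadcast a single transform to n entries."""
    if param_names is None:
        param_names = [f"sigma^2_{i}" for i in range(n)]
    if trans is None:
        trans = ["exp_pow2"]  # placeholder for [TransformExpPow2()]
    if len(trans) == 1:
        trans = trans * n  # one transform shared by every parameter
    if len(param_names) != n or len(trans) != n:
        raise ValueError("expected n parameter names and 1 or n transforms")
    return param_names, trans

names, transforms = resolve_defaults(3)
print(names)
print(transforms)
```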

Example:

import torch
from torch_openreml.covariance import DiagonalMatrix

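mat = DiagonalMatrix(3)
# Unconstrained parameters; each one maps to a positive variance.
params = torch.tensor([0.0, 0.5, 1.0])
print(mat(params))       # 3 x 3 diagonal covariance matrix
print(mat.grad(params))  # (Jacobian tensor, parameter names)

Output:
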
tensor([[1.0000, 0.0000, 0.0000],
        [0.0000, 2.7183, 0.0000],
        [0.0000, 0.0000, 7.3891]])
(tensor([[[ 2.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000]],

        [[ 0.0000,  0.0000,  0.0000],
         [ 0.0000,  5.4366,  0.0000],
         [ 0.0000,  0.0000,  0.0000]],

        [[ 0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000, 14.7781]]]), ['sigma^2_0', 'sigma^2_1', 'sigma^2_2'])
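
The numbers in this output are consistent with the following relations, assuming that TransformExpPow2 computes the square of exp(theta); that form is inferred from the output rather than stated on this page. Here \(\theta_i\) denotes the i-th unconstrained parameter and \(\symbf{e}_k\) the k-th standard basis vector.

\[\sigma^2_i = \exp(\theta_i)^2 = \exp(2\theta_i), \qquad \frac{\partial \symbf{V}}{\partial \theta_k} = 2\exp(2\theta_k)\,\symbf{e}_k \symbf{e}_k^{\top}\]

For \(\theta = (0, 0.5, 1)\) the diagonal entries are \(1\), \(e \approx 2.7183\) and \(e^2 \approx 7.3891\), and the only non-zero entry of each Jacobian slice is \(2\), \(2e \approx 5.4366\) and \(2e^2 \approx 14.7781\) respectively.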

Methods

__call__(params)

Construct the matrix from a flat parameter tensor.

auto_grad(params)

Compute the Jacobian of __call__() with respect to trainable parameters using automatic differentiation.

check_params(params)

Validate a parameter tensor and return its device and dtype.

from_param_dict(param_dict)

Extract parameter tensors from a dictionary into a flat 1D tensor.

get_intermediates(params)

Retrieve cached intermediate computation results if still valid.

grad(params)

Compute the Jacobian of __call__() with respect to trainable parameters.

manual_grad(params)

Compute the Jacobian of __call__() with respect to trainable parameters using a closed-form analytic expression.

map_theta_to_dv(theta)

An interface compatible with torch_openreml.REML that maps parameters to the matrix Jacobian.

map_theta_to_v(theta)

An interface compatible with torch_openreml.REML that maps parameters to a matrix.

reset_intermediates()

Clear the intermediate computation cache.

set_intermediates(params, intermediates)

Cache intermediate computation results keyed by parameter hash.

set_no_grad([index, param_name])

Set the indices of parameters to exclude from gradient computation.

to_param_dict(params)

Convert a flat parameter tensor to a parameter dictionary.

trans_grad(params)

Compute the element-wise derivative of the parameter transforms.

trans_params(params)

Apply parameter transforms to a flat parameter tensor.

Attributes

no_grad_index

Indices of parameters excluded from gradient computation.

num_params

Total number of parameters.

param_names

Ordered parameter names.

repr_dict

Key-value pairs used to build the string representation.

shape

Output matrix shape.

trans

Parameter transforms.

__call__(params)

Construct the matrix from a flat parameter tensor.

Must be implemented by subclasses. Implementations should convert params via from_param_dict() or to_param_dict(), then call check_params() to validate and trans_params() to apply transforms before any computation.

Parameters:

params (torch.Tensor or dict) – Flat 1D parameter tensor or parameter dictionary.

Returns:

Constructed matrix whose shape is given by the shape attribute.

Return type:

torch.Tensor
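
The sequence described above (accept a tensor or a dictionary, validate, transform, then build) might look roughly like this in plain Python. to_flat and build are hypothetical stand-ins for the library's conversion and construction steps, the names in PARAM_NAMES follow the documented defaults, and the transform is assumed to be exp(2 * theta).

```python
import math

PARAM_NAMES = ["sigma^2_0", "sigma^2_1", "sigma^2_2"]

def to_flat(params):
    """Accept a flat sequence or a name -> value dict; return a flat list."""
    if isinstance(params, dict):
        missing = [name for name in PARAM_NAMES if name not in params]
        if missing:
            raise KeyError(f"missing parameters: {missing}")
        # Order follows PARAM_NAMES, not the dict's insertion order.
        return [float(params[name]) for name in PARAM_NAMES]
    return [float(p) for p in params]

def build(params):
    """Convert, validate, transform, then assemble the diagonal matrix."""
    flat = to_flat(params)
    if len(flat) != len(PARAM_NAMES):
        raise ValueError(f"expected {len(PARAM_NAMES)} parameters, got {len(flat)}")
    variances = [math.exp(2.0 * t) for t in flat]  # assumed default transform
    n = len(variances)
    return [[variances[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

from_dict = build({"sigma^2_2": 1.0, "sigma^2_0": 0.0, "sigma^2_1": 0.5})
from_flat = build([0.0, 0.5, 1.0])
print(from_dict == from_flat)
```

Resolving dictionary entries by name keeps the result independent of the order in which the caller supplies them.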

manual_grad(params)

Compute the Jacobian of __call__() with respect to trainable parameters using a closed-form analytic expression.

Parameters:

params (torch.Tensor or dict) – Flat 1D parameter tensor or parameter dictionary.

Returns:

(grad, grad_names), where grad is a 3D tensor of shape (num_params - len(no_grad_index), *shape) and grad_names is a list of the corresponding parameter names. Returns (None, []) if all parameters are excluded from gradient computation.
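
A minimal sketch of such a closed-form Jacobian, including the exclusion of parameters listed in no_grad_index, is given below. It assumes the default transform is exp(2 * theta) and uses nested lists in place of tensors; this manual_grad is a hypothetical free function written for illustration, not the library's implementation.

```python
import math

def manual_grad(params, names, no_grad_index=()):
    """Closed-form Jacobian of V = diag(exp(2 * theta)) w.r.t. trainable theta."""
    n = len(params)
    excluded = set(no_grad_index)
    keep = [k for k in range(n) if k not in excluded]
    if not keep:
        return None, []  # every parameter is excluded
    grad = []
    for k in keep:
        slab = [[0.0] * n for _ in range(n)]
        slab[k][k] = 2.0 * math.exp(2.0 * params[k])  # d/dtheta of exp(2 * theta)
        grad.append(slab)
    return grad, [names[k] for k in keep]

names = [f"sigma^2_{i}" for i in range(3)]
theta = [0.0, 0.5, 1.0]
grad, grad_names = manual_grad(theta, names, no_grad_index=[1])
print(len(grad), grad_names)
print(round(grad[1][2][2], 4))
```

With one of three parameters excluded, the result has two slices of shape 3 x 3, matching the documented shape (num_params - len(no_grad_index), *shape).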

Return type:

tuple