torch_openreml.covariance.DiagonalMatrix

class torch_openreml.covariance.DiagonalMatrix(n, param_names=None, trans=None, no_grad_index=None)

Bases: Matrix

Diagonal covariance matrix with one variance parameter per entry.

\[\symbf{V} = \mathrm{diag}(\sigma^2_0, \ldots, \sigma^2_{n-1})\]

Each diagonal entry is parameterised by a single unconstrained scalar, which is mapped to a positive variance by a transform (TransformExpPow2 by default). Off-diagonal entries are always zero.
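The construction can be sketched in plain PyTorch. This is a minimal sketch, not the library's implementation: it assumes TransformExpPow2 maps an unconstrained value θ to exp(θ)² = exp(2θ), a reading inferred from the transform's name and consistent with the example output on this page. The helper name diagonal_from_unconstrained is hypothetical and not part of torch_openreml.

```python
import torch


def diagonal_from_unconstrained(theta: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: build diag(exp(theta)^2) from unconstrained theta."""
    # Assumed form of the default transform: variance_i = exp(theta_i) ** 2
    variances = torch.exp(theta) ** 2
    # Place the variances on the diagonal; off-diagonal entries stay zero
    return torch.diag(variances)


theta = torch.tensor([0.0, 0.5, 1.0])
V = diagonal_from_unconstrained(theta)
print(V)
```

Because the exponential is positive for every real input, the optimiser can work on θ without any positivity constraint on the variances.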

Initialize a diagonal covariance matrix of size n x n.

Parameters:

n (int) – Matrix dimension.
param_names (list of str, optional) – Names for the n variance parameters. Defaults to ["sigma^2_0", ..., "sigma^2_{n-1}"].
trans (list of Transform, optional) – Transforms applied to each parameter. Defaults to [TransformExpPow2()], broadcast across all parameters.
no_grad_index (list of int, optional) – Indices of parameters to exclude from gradient computation.

Example:

import torch
from torch_openreml.covariance import DiagonalMatrix

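mat = DiagonalMatrix(3)
# Unconstrained parameters; the default transform maps each one to a positive variance
params = torch.tensor([0.0, 0.5, 1.0])
print(mat(params))       # 3 x 3 diagonal matrix

print(mat.grad(params))  # tuple: (Jacobian tensor, parameter names)

Output: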

tensor([[1.0000, 0.0000, 0.0000],
        [0.0000, 2.7183, 0.0000],
        [0.0000, 0.0000, 7.3891]])
(tensor([[[ 2.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000]],

        [[ 0.0000,  0.0000,  0.0000],
         [ 0.0000,  5.4366,  0.0000],
         [ 0.0000,  0.0000,  0.0000]],

        [[ 0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000, 14.7781]]]), ['sigma^2_0', 'sigma^2_1', 'sigma^2_2'])

Methods

`__call__`(params)	Construct the matrix from a flat parameter tensor.
`auto_grad`(params)	Compute the Jacobian of `build()` with respect to trainable parameters using automatic differentiation.
`check_params`(params)	Validate a parameter tensor and return its device and dtype.
`from_param_dict`(param_dict)	Extract parameter tensors from a dictionary into a flat 1D tensor.
`get_intermediates`(params)	Retrieve cached intermediate computation results if still valid.
`grad`(params)	Compute the Jacobian of `__call__()` with respect to trainable parameters.
`manual_grad`(params)	Compute the Jacobian of `__call__()` with respect to trainable parameters using a closed-form analytic expression.
`map_theta_to_dv`(theta)	An interface compatible with `torch_openreml.REML` that maps parameters to the matrix Jacobian.
`map_theta_to_v`(theta)	An interface compatible with `torch_openreml.REML` that maps parameters to a matrix.
`reset_intermediates`()	Clear the intermediate computation cache.
`set_intermediates`(params, intermediates)	Cache intermediate computation results keyed by parameter hash.
`set_no_grad`([index, param_name])	Set the indices of parameters to exclude from gradient computation.
`to_param_dict`(params)	Convert a flat parameter tensor to a parameter dictionary.
`trans_grad`(params)	Compute the element-wise derivative of the parameter transforms.
`trans_params`(params)	Apply parameter transforms to a flat parameter tensor.

Attributes

`no_grad_index`	Indices of parameters excluded from gradient computation.
`num_params`	Total number of parameters.
`param_names`	Ordered parameter names.
`repr_dict`	Key-value pairs used to build the string representation.
`shape`	Output matrix shape.
`trans`	Parameter transforms.

__call__(params)

Construct the matrix from a flat parameter tensor.

Must be implemented by subclasses. Implementations should convert params via from_param_dict() or to_param_dict(), then call check_params() to validate and trans_params() to apply transforms before any computation.

Parameters: params (torch.Tensor or dict) – Flat 1D parameter tensor or parameter dictionary.
Returns: Constructed matrix of shape `shape`.
Return type: torch.Tensor

manual_grad(params)

Compute the Jacobian of __call__() with respect to trainable parameters using a closed-form analytic expression.

Parameters: params (torch.Tensor or dict) – Flat 1D parameter tensor or parameter dictionary.
Returns: (grad, grad_names), where grad is a 3D tensor of shape (num_params - len(no_grad_index), *shape) and grad_names is a list of the corresponding parameter names. Returns (None, []) if all parameters are excluded from gradient computation.
Return type: tuple
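For a diagonal matrix the closed form is simple. Under the assumed default transform σ²_i = exp(θ_i)², the derivative of V with respect to θ_i is zero everywhere except entry (i, i), which equals 2·exp(2θ_i). The sketch below is an illustration in plain PyTorch, not the library's code: the names build and analytic_grad are hypothetical. It checks the closed form against torch.autograd.functional.jacobian and shows how excluding an index shrinks the leading dimension.

```python
import torch
from torch.autograd.functional import jacobian


def build(theta):
    # Assumed default transform: variance_i = exp(theta_i) ** 2
    return torch.diag(torch.exp(theta) ** 2)


def analytic_grad(theta, no_grad_index=()):
    """Hypothetical closed-form Jacobian, one (n, n) slice per kept parameter."""
    n = theta.shape[0]
    excluded = set(no_grad_index)
    keep = [i for i in range(n) if i not in excluded]
    grad = torch.zeros(len(keep), n, n, dtype=theta.dtype)
    for row, i in enumerate(keep):
        # Only entry (i, i) depends on theta_i
        grad[row, i, i] = 2.0 * torch.exp(2.0 * theta[i])
    return grad


theta = torch.tensor([0.0, 0.5, 1.0])
grad = analytic_grad(theta)

# jacobian() returns shape (n, n, n) indexed [row, col, param];
# move the parameter axis to the front to match the layout above
auto = jacobian(build, theta).permute(2, 0, 1)
print(torch.allclose(grad, auto, atol=1e-5))

# Excluding parameter 1 leaves two slices: shape (2, 3, 3)
grad_partial = analytic_grad(theta, no_grad_index=[1])
print(grad_partial.shape)
```

The closed form avoids building an autograd graph, which is the usual motivation for supplying an analytic Jacobian next to an automatic one.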