Solve AD Notes
Conventions
Unless noted otherwise, Linearization and Transpose are written for the raw-output-space solve map before any DB observable projection. For complex tensors, Transpose means the adjoint under the real Frobenius inner product
\langle X, Y \rangle_{\mathbb{R}} = \operatorname{Re}\operatorname{tr}(X^\dagger Y).
Forward
The raw operator is the solution map
(A, B) \mapsto X, \qquad A X = B.
Linearization
Differentiating the defining equation gives
\dot{A} X + A \dot{X} = \dot{B}, \qquad \dot{X} = A^{-1}(\dot{B} - \dot{A}X).
JVP
The JVP is the same tangent solve:
\operatorname{jvp}(\operatorname{solve})(A,B;\dot{A},\dot{B}) = A^{-1}(\dot{B} - \dot{A}X).
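As a concrete check, the tangent formula can be compared against jax.jvp applied to jnp.linalg.solve. This is a minimal sketch with illustrative shapes; the diagonal shift just keeps A well conditioned.

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
kA, kB, kdA, kdB = jax.random.split(key, 4)
A = jax.random.normal(kA, (4, 4)) + 4.0 * jnp.eye(4)  # shift keeps A well conditioned
B = jax.random.normal(kB, (4, 3))
dA = jax.random.normal(kdA, (4, 4))
dB = jax.random.normal(kdB, (4, 3))

# JVP of the solve map at (A, B) along (dA, dB)
X, dX = jax.jvp(jnp.linalg.solve, (A, B), (dA, dB))

# Closed form from the linearization: dX = A^{-1} (dB - dA X)
dX_ref = jnp.linalg.solve(A, dB - dA @ X)
print(jnp.allclose(dX, dX_ref, atol=1e-5))
```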
Transpose
For a raw output cotangent \bar{X}, define
G = A^{-\mathsf{H}}\bar{X}.
Then the transpose map is
(\bar{A}, \bar{B}) = (-G X^{\mathsf{H}}, G).
VJP (JAX convention)
JAX exposes the same raw transpose on the solution output. Right solves, triangular solves, and tensorsolve share the same cotangent geometry, applied to the matching primal structure.
VJP (PyTorch convention)
PyTorch uses the same raw adjoint formulas in linalg_solve_backward and its triangular variants. For solve_ex, the status output remains nondifferentiable metadata.
Forward Definition
For the left solve
A X = B, \qquad A \in \mathbb{C}^{N \times N}, \qquad B \in \mathbb{C}^{N \times K},
the primal solution is
X = A^{-1} B.
Forward Rule
Differentiate the defining equation:
\dot{A} X + A \dot{X} = \dot{B}.
Therefore
\dot{X} = A^{-1}(\dot{B} - \dot{A} X).
Reverse Rule
Given a cotangent \bar{X}:
\delta \ell = \langle \bar{X}, \dot{X} \rangle = \langle A^{-\mathsf{H}} \bar{X}, \dot{B} \rangle - \langle A^{-\mathsf{H}} \bar{X} X^{\mathsf{H}}, \dot{A} \rangle.
Define
G = A^{-\mathsf{H}} \bar{X}.
Then
\bar{B} = G, \qquad \bar{A} = -G X^{\mathsf{H}}.
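These identities can be checked numerically against jax.vjp. A real-valued sketch (so the adjoint H reduces to the transpose), with illustrative shapes:

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(1)
kA, kB, kC = jax.random.split(key, 3)
A = jax.random.normal(kA, (4, 4)) + 4.0 * jnp.eye(4)
B = jax.random.normal(kB, (4, 3))
Xbar = jax.random.normal(kC, (4, 3))  # cotangent on the solution X

X, vjp = jax.vjp(jnp.linalg.solve, A, B)
Abar, Bbar = vjp(Xbar)

# G = A^{-H} Xbar (real case: solve with the transpose)
G = jnp.linalg.solve(A.T, Xbar)
print(jnp.allclose(Bbar, G, atol=1e-5))
print(jnp.allclose(Abar, -G @ X.T, atol=1e-5))
```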
Triangular Solve
When A is triangular, the same formulas apply with triangular solves replacing the generic solve.
For lower-triangular A:
\bar{A} = \mathrm{tril}(-G X^{\mathsf{H}}).
For upper-triangular A:
\bar{A} = \mathrm{triu}(-G X^{\mathsf{H}}).
For unit-triangular matrices, the diagonal of \bar{A} is additionally zeroed.
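One way to see the triangular projection without relying on a library triangular primitive is to model the operator as solve(tril(L), B); the transpose of tril then projects the generic cotangent -G X^H onto the lower triangle automatically. A real-valued sketch:

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(2)
kL, kB, kC = jax.random.split(key, 3)
L = jnp.tril(jax.random.normal(kL, (4, 4))) + 4.0 * jnp.eye(4)
B = jax.random.normal(kB, (4, 2))
Xbar = jax.random.normal(kC, (4, 2))

# Model: the operator only reads the lower triangle of L
tri_solve = lambda L, B: jnp.linalg.solve(jnp.tril(L), B)
X, vjp = jax.vjp(tri_solve, L, B)
Lbar, Bbar = vjp(Xbar)

G = jnp.linalg.solve(jnp.tril(L).T, Xbar)
print(jnp.allclose(Lbar, jnp.tril(-G @ X.T), atol=1e-5))  # cotangent is lower triangular
```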
Right-side solve
By transposition symmetry, the right solve X A = B obeys
\dot{X} A = \dot{B} - X \dot{A},
\bar{B} = \bar{X} A^{-\mathsf{H}}, \qquad \bar{A} = -X^{\mathsf{H}} \bar{B}.
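The right-solve adjoints can be checked the same way by expressing X A = B as a transposed left solve. A real-valued sketch:

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(3)
kA, kB, kC = jax.random.split(key, 3)
A = jax.random.normal(kA, (4, 4)) + 4.0 * jnp.eye(4)
B = jax.random.normal(kB, (3, 4))
Xbar = jax.random.normal(kC, (3, 4))

right_solve = lambda A, B: jnp.linalg.solve(A.T, B.T).T  # X with X A = B
X, vjp = jax.vjp(right_solve, A, B)
Abar, Bbar = vjp(Xbar)

Bbar_ref = jnp.linalg.solve(A, Xbar.T).T  # Xbar A^{-H}
Abar_ref = -X.T @ Bbar_ref                # -X^H Bbar
print(jnp.allclose(Bbar, Bbar_ref, atol=1e-5))
print(jnp.allclose(Abar, Abar_ref, atol=1e-5))
```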
Structured Variants
- solve_ex shares the same derivative on the solution output; status outputs are nondifferentiable metadata.
- solve_triangular uses the same formulas with triangular projection.
- lu_solve reuses the solve cotangent while taking LU factors and pivots as primal inputs.
- tensorsolve is the indexed tensor analogue of the same implicit-system rule.
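For tensorsolve, the flattened problem is exactly the matrix left solve, so its cotangents should match the matrix formulas after reshaping. A sketch assuming jnp.linalg.tensorsolve (real case):

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(5)
kA, kB, kC = jax.random.split(key, 3)
Amat = jax.random.normal(kA, (6, 6)) + 4.0 * jnp.eye(6)
A = Amat.reshape(2, 3, 2, 3)  # tensor operator contracting the last two axes
B = jax.random.normal(kB, (2, 3))
Xbar = jax.random.normal(kC, (2, 3))

X, vjp = jax.vjp(jnp.linalg.tensorsolve, A, B)
Abar, Bbar = vjp(Xbar)

# Flattened matrix formulas: G = Amat^{-T} xbar, Bbar = G, Abar = -G x^T
G = jnp.linalg.solve(Amat.T, Xbar.reshape(6))
print(jnp.allclose(Bbar.reshape(6), G, atol=1e-5))
print(jnp.allclose(Abar.reshape(6, 6), -jnp.outer(G, X.reshape(6)), atol=1e-5))
```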
Verification
Forward residual
A X \approx B.
Backward checks
- perturb A and compare the VJP against finite differences
- perturb B and compare the VJP against finite differences
- for triangular solves, confirm the cotangent respects the triangular and unit-triangular structure
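The backward checks amount to comparing the pairing <Abar, dA> + <Bbar, dB> against a central difference of the scalar l(A, B) = <Xbar, solve(A, B)>. A sketch in float64 so the difference quotient is stable:

```python
import jax
import jax.numpy as jnp

jax.config.update("jax_enable_x64", True)  # float64 for a clean finite-difference check

key = jax.random.PRNGKey(4)
kA, kB, kC, kdA, kdB = jax.random.split(key, 5)
A = jax.random.normal(kA, (4, 4)) + 4.0 * jnp.eye(4)
B = jax.random.normal(kB, (4, 2))
Xbar = jax.random.normal(kC, (4, 2))
dA = jax.random.normal(kdA, (4, 4))
dB = jax.random.normal(kdB, (4, 2))

loss = lambda A, B: jnp.vdot(Xbar, jnp.linalg.solve(A, B))
_, vjp = jax.vjp(jnp.linalg.solve, A, B)
Abar, Bbar = vjp(Xbar)
analytic = jnp.vdot(Abar, dA) + jnp.vdot(Bbar, dB)

eps = 1e-6
fd = (loss(A + eps * dA, B + eps * dB) - loss(A - eps * dA, B - eps * dB)) / (2 * eps)
print(jnp.allclose(analytic, fd, rtol=1e-5, atol=1e-8))
```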
References
- M. B. Giles, "An extended collection of matrix derivative results for forward and reverse mode AD," 2008.
DB Families
- solve: the DB publishes the solution tensor directly.
- solve_ex: the DB validates the differentiable solution output; auxiliary execution-status fields are treated as metadata.
- solve_triangular: the DB applies the same solve differential, with the triangular structure enforced by the primal operator.
- lu_solve: the DB uses the solution observable for factor-backed solves as well.
- tensorsolve: the DB treats tensorsolve as the indexed tensor analogue of linear solve.