Norm AD Notes
1. Vector p-norm
\|x\|_p = \left(\sum_i |x_i|^p\right)^{1/p}, \qquad x \in \mathbb{C}^N
Forward Rule
\dot{n} = \frac{\sum_i |x_i|^{p-2}\operatorname{Re}(\bar{x}_i \dot{x}_i)} {\|x\|_p^{p-1}}.
Reverse Rule
\bar{x}_i = \bar{n} \cdot \frac{x_i |x_i|^{p-2}}{\|x\|_p^{p-1}}.
Special cases
| p | Reverse rule | Notes |
|---|---|---|
| 0 | 0 | Piecewise constant |
| 1 | \bar{n}\,\operatorname{sgn}(x) | Subgradient at x_i = 0 |
| 2 | \bar{n}\,x / \|x\|_2 | Masked at \|x\| = 0 |
| \infty | uniform average over active maximizers | Tie-sensitive |
2. Frobenius norm
\|A\|_F = \sqrt{\operatorname{tr}(A^\dagger A)}.
Forward Rule
\dot{n} = \frac{\operatorname{Re}\operatorname{tr}(A^\dagger \dot{A})}{\|A\|_F}.
Reverse Rule
\bar{A} = \bar{n} \cdot \frac{A}{\|A\|_F}.
3. Nuclear norm
\|A\|_* = \sum_i \sigma_i(A).
If A = U S V^\dagger is an SVD, then
Forward Rule
\dot{n} = \operatorname{Re}\operatorname{tr}(U^\dagger \dot{A} V).
Reverse Rule
\bar{A} = \bar{n} \cdot U V^\dagger.
This norm inherits the same smoothness caveats as the SVD.
4. Spectral norm
\|A\|_2 = \sigma_{\max}(A).
For a simple top singular value:
Forward Rule
\dot{n} = \operatorname{Re}(u_1^\dagger \dot{A} v_1).
Reverse Rule
\bar{A} = \bar{n} \cdot u_1 v_1^\dagger.
For multiplicity k > 1, the subgradient is the average over the active singular-vector dyads.
Implementation Correspondence
tenferro-rs/docs/AD/norm.mdseparates vector norms, Frobenius norm, nuclear norm, and spectral norm explicitly. This note preserves that structure.- PyTorch’s
norm_backwardandnorm_jvpimplement the scalar/vector p-norm cases directly, including the tie-handling for p = \infty. linalg_vector_norm_backwardis a thin wrapper around the same formulas.- Matrix nuclear and spectral norms are implemented in PyTorch by decomposition into SVD-derived primitives rather than a dedicated manual formula.
Numerical Notes
- Nonsmooth points, especially zero inputs and repeated top singular values, require subgradient conventions.
- The DB excludes upstream norm families that are classified as unsupported subgradient cases.
Verification
- compare primal norm values against direct evaluation
- compare JVP/VJP against finite differences away from nonsmooth points
- for nuclear and spectral norms, cross-check against SVD-based observables
References
- M. B. Giles, “An extended collection of matrix derivative results for forward and reverse mode automatic differentiation,” 2008.
- G. A. Watson, “Characterization of the subdifferential of some matrix norms,”
DB Families
The DB publishes the chosen norm value directly.
The DB publishes the matrix-norm observable directly.
The DB publishes the vector-norm observable directly.
The DB treats condition-number families as scalar spectral observables derived from the same singular-value sensitivities.