Scalar And Tensor Wrapper AD Notes
Scope
This note records the shared scalar AD formulas implemented in chainrules-scalarops together with the tensor-level wrappers used by tenferro-dyadtensor.
PyTorch Baseline
The local comparison baseline is PyTorch’s manual autograd formulas:
- `tools/autograd/derivatives.yaml`
- `torch/csrc/autograd/FunctionsManual.cpp`
- `docs/source/notes/autograd.rst`
In particular, the scalar wrappers here follow the same `handle_r_to_c` real-input projection convention used by PyTorch.
Complex Gradient Convention
For real-valued losses:
- gradients follow the conjugate-Wirtinger convention
- VJP formulas include complex conjugation where required
- real inputs project complex intermediates back to the real domain (`handle_r_to_c`)
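The convention can be checked numerically. The sketch below is illustrative only (plain Python, not repo code): for a real-valued loss L, the conjugate-Wirtinger gradient at z equals dL/d(Re z) + i · dL/d(Im z), so a finite-difference probe on the real and imaginary parts should reproduce the analytic gradient.

```python
def numeric_cw_grad(loss, z, h=1e-6):
    """Finite-difference conjugate-Wirtinger gradient of a real-valued
    loss: dL/d(Re z) + 1j * dL/d(Im z)."""
    dldx = (loss(z + h) - loss(z - h)) / (2 * h)
    dldy = (loss(z + 1j * h) - loss(z - 1j * h)) / (2 * h)
    return dldx + 1j * dldy

# Real-valued loss L(z) = |z|^2 = z * conj(z); its conjugate-Wirtinger
# gradient is 2z.
loss = lambda z: (z * z.conjugate()).real

z = 1.5 - 0.5j
assert abs(numeric_cw_grad(loss, z) - 2 * z) < 1e-4
```

For real losses this gradient is the steepest-ascent direction, so a plain gradient-descent step on z decreases the loss, which is the point of the convention.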
Current Complex Support Boundary
This note groups both complex-capable wrappers and wrappers whose AD coverage remains float-only in the pinned PyTorch upstream. For this repository phase, those float-only families are tracked as explicitly unsupported for complex in `docs/math/complex-support.json` rather than being promoted to repo-specific complex extensions.
Scalar Basis Rules
Let g be the output cotangent, x the primal input, and y = f(x) the primal output.
Core arithmetic
- `add`: (dx_1, dx_2) = (g, g)
- `sub`: (dx_1, dx_2) = (g, -g)
- `mul`: (dx_1, dx_2) = (g \cdot \overline{x_2}, g \cdot \overline{x_1})
- `div`: quotient rule with conjugated denominator factors
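The conjugated `mul` rule can be validated against finite differences. The following is a standalone sketch (not the repo's wrapper API): with a real loss L = |x_1 x_2|^2 and cotangent g = 2y at the output, each input's VJP is g times the conjugate of the other input.

```python
def fd_grad(f, z, h=1e-6):
    # Finite-difference conjugate-Wirtinger gradient: dL/d(Re z) + 1j*dL/d(Im z)
    return ((f(z + h) - f(z - h)) + 1j * (f(z + 1j * h) - f(z - 1j * h))) / (2 * h)

def mul_vjp(g, x1, x2):
    # VJP of y = x1 * x2: cotangent times the conjugated partner input
    return g * x2.conjugate(), g * x1.conjugate()

x1, x2 = 1.0 + 2.0j, 0.5 - 1.0j
y = x1 * x2
g = 2 * y  # cotangent of y for the real loss L = |y|^2

dx1, dx2 = mul_vjp(g, x1, x2)
loss = lambda a, b: abs(a * b) ** 2
assert abs(dx1 - fd_grad(lambda z: loss(z, x2), x1)) < 1e-4
assert abs(dx2 - fd_grad(lambda z: loss(x1, z), x2)) < 1e-4
```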
Analytic unary wrappers
- `conj`: dx = \overline{g}
- `sqrt`: dx = g / (2 \overline{\sqrt{x}})
- `exp`: dx = g \cdot \overline{y}
- `log`: dx = g / \overline{x}
- `expm1`: derivative factor exp(x)
- `log1p`: derivative factor 1 / (1 + x)
- `sin`: derivative factor cos(x)
- `cos`: derivative factor -sin(x)
- `tanh`: derivative factor 1 - y^2
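The `exp` and `log` rules can be spot-checked the same way. Again a hedged, standalone sketch (not repo code), using a real loss L = |y|^2 so the cotangent at the output is g = 2y:

```python
import cmath

def fd_grad(f, z, h=1e-6):
    # Finite-difference conjugate-Wirtinger gradient: dL/d(Re z) + 1j*dL/d(Im z)
    return ((f(z + h) - f(z - h)) + 1j * (f(z + 1j * h) - f(z - 1j * h))) / (2 * h)

x = 0.3 + 0.7j

# exp rule: with y = exp(x), dx = g * conj(y)
y = cmath.exp(x)
g = 2 * y  # cotangent of y for the real loss L = |y|^2
assert abs(g * y.conjugate() - fd_grad(lambda z: abs(cmath.exp(z)) ** 2, x)) < 1e-4

# log rule: with y = log(x), dx = g / conj(x)
y = cmath.log(x)
g = 2 * y
assert abs(g / x.conjugate() - fd_grad(lambda z: abs(cmath.log(z)) ** 2, x)) < 1e-4
```

Note that the test point is kept away from the negative real axis so the principal branch of `log` is smooth under the finite-difference perturbations.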
Parameterized wrappers
- `atan2`: standard real partials over a^2 + b^2
- `powf`: fixed scalar-exponent rule
- `powi`: integer-exponent specialization of `powf`
- `pow`:
  - base path: dx = g \cdot \overline{a x^{a-1}}
  - exponent path: da = g \cdot \overline{x^a \log(x)}
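Both `pow` paths follow the same pattern: conjugate the partial derivative of y = x^a with respect to the differentiated argument, then multiply by the cotangent. A hedged standalone sketch (not repo code), checked against finite differences on the real loss L = |x^a|^2:

```python
import cmath

def fd_grad(f, z, h=1e-6):
    # Finite-difference conjugate-Wirtinger gradient: dL/d(Re z) + 1j*dL/d(Im z)
    return ((f(z + h) - f(z - h)) + 1j * (f(z + 1j * h) - f(z - 1j * h))) / (2 * h)

x, a = 1.2 + 0.4j, 1.7 + 0.2j
y = x ** a
g = 2 * y                                # cotangent for the real loss L = |x**a|^2

dx = g * (a * x ** (a - 1)).conjugate()  # base path: dx = g * conj(a * x**(a-1))
da = g * (y * cmath.log(x)).conjugate()  # exponent path: da = g * conj(x**a * log(x))

assert abs(dx - fd_grad(lambda z: abs(z ** a) ** 2, x)) < 1e-4
assert abs(da - fd_grad(lambda z: abs(x ** z) ** 2, a)) < 1e-4
```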
Tensor-Composite Rules
Tensor-level wrappers built on top of the scalar basis include:
- pointwise unary analytic families
- broadcasted binary analytic families
- small tensor wrappers such as `cross`, `diagonal`, `matrix_power`, `multi_dot`, `vander`, `vecdot`, and `householder_product`
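For the broadcasted binary families, the one extra ingredient beyond the scalar basis is that an input which was broadcast must sum the cotangent over every position it reached. A minimal one-dimensional sketch (illustrative only, not the repo's wrapper API), using a scalar broadcast across a vector:

```python
# Forward: y[i] = x[i] + b, with the scalar b broadcast across the vector x.
def badd(x, b):
    return [xi + b for xi in x]

def badd_vjp(g):
    # x is used elementwise, so dx = g unchanged; the broadcast scalar b
    # accumulates the cotangent over every output position it reached.
    dx = list(g)
    db = sum(g)
    return dx, db

g = [0.1, 0.2, 0.3]
dx, db = badd_vjp(g)
assert dx == [0.1, 0.2, 0.3]
assert abs(db - 0.6) < 1e-12
```

The same reduce-over-broadcast-axes step applies to any of the broadcasted binary analytic families; the scalar rule supplies the per-element factor.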
Tensor Reduction Wrappers
sum_ad
Every element receives the same cotangent.
mean_ad
Every element receives the cotangent divided by the number of reduced entries.
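The two rules above can be sketched directly (a standalone illustration; `sum_vjp`/`mean_vjp` are hypothetical names, not the repo's `sum_ad`/`mean_ad` signatures):

```python
def sum_vjp(g, n):
    # sum: every input element receives the scalar cotangent unchanged
    return [g] * n

def mean_vjp(g, n):
    # mean: every input element receives the cotangent divided by n
    return [g / n] * n

assert sum_vjp(0.5, 4) == [0.5, 0.5, 0.5, 0.5]
assert mean_vjp(2.0, 4) == [0.5, 0.5, 0.5, 0.5]
```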
var_ad
Differentiate through the centered residual x - \operatorname{mean}(x).
std_ad
Combine the variance rule with the derivative of sqrt.
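The variance and standard-deviation rules can be sketched together (a hedged standalone illustration with hypothetical names and a `ddof` divisor assumption, not the repo's `var_ad`/`std_ad` signatures). The key simplification is that the derivative through the inner mean cancels, because the centered residuals sum to zero:

```python
import math

def var_vjp(g, x, ddof=0):
    # d var / d x_i = 2 * (x_i - mean(x)) / (n - ddof); the contribution
    # through mean(x) cancels since the centered residuals sum to zero.
    n = len(x)
    m = sum(x) / n
    return [g * 2.0 * (xi - m) / (n - ddof) for xi in x]

def std_vjp(g, x, ddof=0):
    # chain the variance rule through sqrt: d std = d var / (2 * std)
    n = len(x)
    m = sum(x) / n
    std = math.sqrt(sum((xi - m) ** 2 for xi in x) / (n - ddof))
    return var_vjp(g / (2.0 * std), x, ddof)

# Finite-difference check on the first element.
def var(xs, ddof=0):
    m = sum(xs) / len(xs)
    return sum((v - m) ** 2 for v in xs) / (len(xs) - ddof)

x = [1.0, 2.0, 3.0, 4.0]
h = 1e-6
fd_var = (var([x[0] + h] + x[1:]) - var([x[0] - h] + x[1:])) / (2 * h)
fd_std = (math.sqrt(var([x[0] + h] + x[1:]))
          - math.sqrt(var([x[0] - h] + x[1:]))) / (2 * h)
assert abs(var_vjp(1.0, x)[0] - fd_var) < 1e-6
assert abs(std_vjp(1.0, x)[0] - fd_std) < 1e-6
```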
Published DB Families Using This Note
Reflected and arithmetic wrappers
Unary analytic, sign, rounding, and casts
`abs`, `acos`, `acosh`, `angle`, `asin`, `asinh`, `atan`, `atan2`, `atanh`, `cdouble`, `ceil`, `clamp_max`, `clamp_min`, `complex`, `conj`, `conj_physical`, `copysign`, `cos`, `cosh`, `deg2rad`, `digamma`, `double`, `erf`, `erfc`, `erfinv`, `exp`, `exp2`, `expm1`, `fill`, `floor`, `fmax`, `fmin`, `frac`, `frexp`, `i0`, `imag`, `ldexp`, `lgamma`, `log`, `log10`, `log1p`, `log2`, `logaddexp`, `logit`, `nan_to_num`, `neg`, `positive`, `polar`, `rad2deg`, `real`, `reciprocal`, `round`, `round_decimals_0`, `round_decimals_3`, `round_decimals_neg_3`, `rsqrt`, `sgn`, `sigmoid`, `sign`, `sin`, `sinc`, `sinh`, `special_entr`, `special_erfcx`, `special_i0e`, `special_i1`, `special_i1e`, `special_log_ndtr`, `special_ndtr`, `special_ndtri`, `special_polygamma`, `special_polygamma_n_0`, `special_xlog1py`, `sqrt`, `square`, `tan`, `tanh`, `trunc`
Reductions and statistics
Neural-network functional wrappers
`nn_functional_celu`, `nn_functional_elu`, `nn_functional_hardshrink`, `nn_functional_hardsigmoid`, `nn_functional_hardtanh`, `nn_functional_logsigmoid`, `nn_functional_mish`, `nn_functional_prelu`, `nn_functional_relu`, `nn_functional_relu6`, `nn_functional_rrelu`, `nn_functional_selu`, `nn_functional_silu`, `nn_functional_softplus`, `nn_functional_softshrink`, `nn_functional_softsign`, `nn_functional_tanhshrink`, `nn_functional_threshold`
Special-function parameter families
Small tensor wrappers currently grouped here
Notes On Future Splits
This shared note is intentionally broad in the first migration pass. Operations that later grow heavier derivation detail can be split into dedicated note files without changing the DB schema; only the central registry needs to move.