Scalar And Tensor Wrapper AD Notes

Scope

This note records the shared scalar AD formulas implemented in chainrules-scalarops together with the tensor-level wrappers used by tenferro-dyadtensor.

PyTorch Baseline

The local comparison baseline is PyTorch’s manual autograd formulas:

  • tools/autograd/derivatives.yaml
  • torch/csrc/autograd/FunctionsManual.cpp
  • docs/source/notes/autograd.rst

In particular, the scalar wrappers here follow the same handle_r_to_c real-input projection convention used by PyTorch.
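A minimal sketch of the projection (assuming, as in PyTorch's handle_r_to_c, that a real primal input keeps only the real part of a complex candidate gradient; the helper name mirrors the upstream convention but is hypothetical here):

```python
def handle_r_to_c(input_is_real: bool, grad: complex):
    """Project a candidate gradient back to the input's domain.

    When the primal input is real but the VJP formula produced a
    complex value, only the real part is a valid gradient.
    """
    return grad.real if input_is_real else grad

# Example: d/dx of |x + i|^2 = x^2 + 1 at real x = 3.0 is 2x = 6.
# Treating x as complex would give the candidate gradient
# 2 * (x + 1j) = 6 + 2j; the real input keeps only 6.0.
x = 3.0
candidate = 2 * (x + 1j)
print(handle_r_to_c(True, candidate))   # 6.0
```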

Complex Gradient Convention

For real-valued losses:

  • gradients follow the conjugate-Wirtinger convention
  • VJP formulas include complex conjugation where required
  • real inputs project complex intermediates back to the real domain (handle_r_to_c)
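Under this convention the stored gradient of a real loss equals dL/dRe(z) + i · dL/dIm(z). A pure-Python spot-check for L(z) = |z|^2, whose conjugate-Wirtinger gradient is 2z:

```python
def loss(z: complex) -> float:
    # Real-valued loss: L(z) = |z|^2 = z * conj(z).
    return (z * z.conjugate()).real

z = 1.5 - 0.5j
h = 1e-6

# Central differences in the (Re, Im) coordinates.
d_re = (loss(z + h) - loss(z - h)) / (2 * h)
d_im = (loss(z + 1j * h) - loss(z - 1j * h)) / (2 * h)
fd_grad = d_re + 1j * d_im

# Conjugate-Wirtinger gradient of |z|^2 is 2z.
analytic = 2 * z
assert abs(fd_grad - analytic) < 1e-5
```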

Current Complex Support Boundary

This note covers both complex-capable wrappers and wrappers that remain float-only in the pinned PyTorch upstream AD coverage.

For this repository phase, those float-only families are tracked as explicitly unsupported for complex in docs/math/complex-support.json rather than being promoted to repo-specific complex extensions.

Scalar Basis Rules

Let g be the output cotangent, x the primal input, and y = f(x) the primal output.

Core arithmetic

  • add: (dx_1, dx_2) = (g, g)
  • sub: (dx_1, dx_2) = (g, -g)
  • mul: (dx_1, dx_2) = (g \cdot \overline{x_2}, g \cdot \overline{x_1})
  • div: (dx_1, dx_2) = (g / \overline{x_2}, -g \cdot \overline{x_1 / x_2^2}), i.e. the quotient rule with conjugated denominator factors
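A sketch of the binary rules above (hypothetical helper names), with the mul path verified by central differences on the real loss L(x1, x2) = |x1 * x2|^2, whose output cotangent is g = 2 * y:

```python
def mul_vjp(g, x1, x2):
    # mul: (dx1, dx2) = (g * conj(x2), g * conj(x1))
    return g * x2.conjugate(), g * x1.conjugate()

def div_vjp(g, x1, x2):
    # div: dx1 = g / conj(x2), dx2 = -g * conj(x1 / x2**2)
    return g / x2.conjugate(), -g * (x1 / x2 ** 2).conjugate()

x1, x2 = 1.0 + 2.0j, 0.5 - 1.0j
y = x1 * x2
g = 2 * y                         # cotangent of |y|^2
dx1, _ = mul_vjp(g, x1, x2)

h = 1e-6
L = lambda a: abs(a * x2) ** 2    # real loss in the base argument
fd = ((L(x1 + h) - L(x1 - h)) / (2 * h)
      + 1j * (L(x1 + 1j * h) - L(x1 - 1j * h)) / (2 * h))
assert abs(dx1 - fd) < 1e-4
```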

Analytic unary wrappers

Entries listed only as a derivative factor f'(x) use the VJP dx = g \cdot \overline{f'(x)}.

  • conj: dx = \overline{g}
  • sqrt: dx = g / (2 \overline{\sqrt{x}})
  • exp: dx = g \cdot \overline{y}
  • log: dx = g / \overline{x}
  • expm1: derivative factor exp(x)
  • log1p: derivative factor 1 / (1 + x)
  • sin: derivative factor cos(x)
  • cos: derivative factor -sin(x)
  • tanh: derivative factor 1 - y^2
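A sketch of how the listed factors assemble into VJPs, with conjugation applied per the convention above (the table and helper names are illustrative, not the repo's API); tanh is spot-checked against central differences on a real input:

```python
import cmath
import math

# dx = g * conj(f'(x)), written in terms of primal input x and output y.
UNARY_VJPS = {
    "exp":  lambda g, x, y: g * y.conjugate(),             # f'(x) = exp(x) = y
    "log":  lambda g, x, y: g / x.conjugate(),
    "sqrt": lambda g, x, y: g / (2 * y.conjugate()),
    "sin":  lambda g, x, y: g * cmath.cos(x).conjugate(),
    "tanh": lambda g, x, y: g * (1 - y * y).conjugate(),   # 1 - y^2
}

x = 0.3
y = math.tanh(x)
dx = UNARY_VJPS["tanh"](1.0, complex(x), complex(y)).real

h = 1e-6
fd = (math.tanh(x + h) - math.tanh(x - h)) / (2 * h)
assert abs(dx - fd) < 1e-8
```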

Parameterized wrappers

  • atan2: for y = atan2(a, b), (da, db) = (g \cdot b / (a^2 + b^2), -g \cdot a / (a^2 + b^2)); real-only partials
  • powf: fixed scalar-exponent rule
  • powi: integer-exponent specialization of powf
  • pow:
    • base path: dx = g \cdot \overline{a x^{a-1}}
    • exponent path: da = g \cdot \overline{x^a \log(x)}
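The two pow paths as a minimal sketch (hypothetical helper name), spot-checked on real values where both closed forms are easy to confirm:

```python
import cmath
import math

def pow_vjp(g, x, a):
    """VJP for y = x ** a with both arguments differentiable.

    base path:     dx = g * conj(a * x**(a - 1))
    exponent path: da = g * conj(x**a * log(x))
    """
    dx = g * (a * x ** (a - 1)).conjugate()
    da = g * (x ** a * cmath.log(x)).conjugate()
    return dx, da

# Real spot-check at x = 2, a = 3 with g = 1:
#   a * x**(a-1) = 3 * 4 = 12,  x**a * log(x) = 8 * ln 2.
dx, da = pow_vjp(1.0, complex(2.0), complex(3.0))
assert abs(dx.real - 12.0) < 1e-9
assert abs(da.real - 8.0 * math.log(2.0)) < 1e-9
```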

Tensor-Composite Rules

Tensor-level wrappers built on top of the scalar basis include:

  • pointwise unary analytic families
  • broadcasted binary analytic families
  • small tensor wrappers such as cross, diagonal, matrix_power, multi_dot, vander, vecdot, and householder_product
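For the broadcasted binary families, the operand that was broadcast must have its cotangent summed over the broadcast dimension. A 1-D sketch for the scalar-against-vector case of mul (hypothetical helper, not the repo's tensor API):

```python
def broadcast_mul_vjp(g, x1, s):
    """VJP for y_i = x1_i * s, where scalar s was broadcast against
    the vector x1: s's cotangent sums over the broadcast dimension."""
    dx1 = [gi * s.conjugate() for gi in g]
    ds = sum(gi * xi.conjugate() for gi, xi in zip(g, x1))
    return dx1, ds

g = [1.0, 1.0, 1.0]
x1 = [1.0, 2.0, 3.0]
dx1, ds = broadcast_mul_vjp(g, x1, 2.0)
print(dx1, ds)   # [2.0, 2.0, 2.0] 6.0
```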

Tensor Reduction Wrappers

sum_ad

Every element receives the same cotangent.

mean_ad

Every element receives the cotangent divided by the number of reduced entries.
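The sum_ad and mean_ad rules as a minimal flat 1-D sketch (hypothetical helper names):

```python
def sum_vjp(g, n):
    # sum: every element receives the same cotangent.
    return [g] * n

def mean_vjp(g, n):
    # mean: every element receives the cotangent divided by the count.
    return [g / n] * n

print(sum_vjp(1.0, 4))    # [1.0, 1.0, 1.0, 1.0]
print(mean_vjp(1.0, 4))   # [0.25, 0.25, 0.25, 0.25]
```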

var_ad

Differentiate through the centered residual x_i - \operatorname{mean}(x); the contributions through the mean cancel because the residuals sum to zero, leaving dx_i = g \cdot 2 (x_i - \operatorname{mean}(x)) / (N - \text{correction}).

std_ad

Combine the variance rule with the derivative of sqrt.
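The var_ad and std_ad rules as a 1-D sketch (hypothetical helper names; correction = 1 is assumed here to correspond to the unbiased var_unbiased/std_unbiased families and correction = 0 to the biased ones):

```python
import math

def var_vjp(g, x, correction=1):
    """VJP for var(x) = sum((x_i - m)^2) / (N - correction), m = mean(x).

    Differentiating the centered residual x_i - m also touches m, but
    those terms cancel because sum_i (x_i - m) = 0, leaving
    dx_i = g * 2 * (x_i - m) / (N - correction).
    """
    n = len(x)
    m = sum(x) / n
    return [g * 2 * (xi - m) / (n - correction) for xi in x]

def std_vjp(g, x, correction=1):
    # std = sqrt(var); chain the sqrt rule: d(std) = d(var) / (2 * std).
    n = len(x)
    m = sum(x) / n
    var = sum((xi - m) ** 2 for xi in x) / (n - correction)
    return var_vjp(g / (2 * math.sqrt(var)), x, correction)
```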

Published DB Families Using This Note

Reflected and arithmetic wrappers

  • __radd__
  • __rdiv__
  • __rmod__
  • __rmul__
  • __rpow__
  • __rsub__
  • add
  • div_no_rounding_mode
  • float_power
  • hypot
  • max_binary
  • maximum
  • min_binary
  • minimum
  • mul
  • pow
  • rsub
  • sub
  • true_divide
  • xlogy

Unary analytic, sign, rounding, and casts

  • abs
  • acos
  • acosh
  • angle
  • asin
  • asinh
  • atan
  • atan2
  • atanh
  • cdouble
  • ceil
  • clamp_max
  • clamp_min
  • complex
  • conj
  • conj_physical
  • copysign
  • cos
  • cosh
  • deg2rad
  • digamma
  • double
  • erf
  • erfc
  • erfinv
  • exp
  • exp2
  • expm1
  • fill
  • floor
  • fmax
  • fmin
  • frac
  • frexp
  • i0
  • imag
  • ldexp
  • lgamma
  • log
  • log10
  • log1p
  • log2
  • logaddexp
  • logit
  • nan_to_num
  • neg
  • positive
  • polar
  • rad2deg
  • real
  • reciprocal
  • round
  • round_decimals_0
  • round_decimals_3
  • round_decimals_neg_3
  • rsqrt
  • sgn
  • sigmoid
  • sign
  • sin
  • sinc
  • sinh
  • special_entr
  • special_erfcx
  • special_i0e
  • special_i1
  • special_i1e
  • special_log_ndtr
  • special_ndtr
  • special_ndtri
  • special_polygamma_special_polygamma_n_0
  • special_xlog1py
  • sqrt
  • square
  • tan
  • tanh
  • trunc

Reductions and statistics

  • amax
  • amin
  • mean
  • nanmean
  • nansum
  • prod
  • std
  • std_unbiased
  • sum
  • var
  • var_unbiased

Neural-network functional wrappers

  • nn_functional_celu
  • nn_functional_elu
  • nn_functional_hardshrink
  • nn_functional_hardsigmoid
  • nn_functional_hardtanh
  • nn_functional_logsigmoid
  • nn_functional_mish
  • nn_functional_prelu
  • nn_functional_relu
  • nn_functional_relu6
  • nn_functional_rrelu
  • nn_functional_selu
  • nn_functional_silu
  • nn_functional_softplus
  • nn_functional_softshrink
  • nn_functional_softsign
  • nn_functional_tanhshrink
  • nn_functional_threshold

Special-function parameter families

  • mvlgamma_mvlgamma_p_1
  • mvlgamma_mvlgamma_p_3
  • mvlgamma_mvlgamma_p_5
  • polygamma_polygamma_n_0
  • polygamma_polygamma_n_1
  • polygamma_polygamma_n_2
  • polygamma_polygamma_n_3
  • polygamma_polygamma_n_4

Small tensor wrappers currently grouped here

  • cross
  • diagonal
  • householder_product
  • matrix_power
  • multi_dot
  • vander
  • vecdot

Notes On Future Splits

This shared note is intentionally broad in the first migration pass. Operations whose derivations later accumulate more detail can be split into dedicated note files without changing the DB schema; only the central registry entry needs to move.