Module bgemm_faer


Batched GEMM backend using the faer library, operating on strided views.

Uses faer::linalg::matmul::matmul for SIMD-optimized matrix multiplication. When dimension groups cannot be fused into 2D matrices (because their strides are non-contiguous), the operands are copied to contiguous column-major buffers before faer is called.
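The copy fallback can be pictured with a minimal sketch. It assumes a hypothetical `StridedView` type over a flat `f64` slice with strides measured in elements; neither the type nor its methods come from this module, and the real code may detect and pack layouts differently.

```rust
/// Hypothetical strided 2D view over a flat slice (illustration only,
/// not a type exported by this module). Strides are in elements.
struct StridedView<'a> {
    data: &'a [f64],
    rows: usize,
    cols: usize,
    row_stride: usize,
    col_stride: usize,
}

impl<'a> StridedView<'a> {
    /// Element at row `i`, column `j`.
    fn get(&self, i: usize, j: usize) -> f64 {
        self.data[i * self.row_stride + j * self.col_stride]
    }

    /// True when the view is already dense column-major, so no copy is needed.
    /// (Simplified check: degenerate shapes such as a single column are
    /// treated conservatively.)
    fn is_col_major_contiguous(&self) -> bool {
        self.row_stride == 1 && self.col_stride == self.rows
    }

    /// Copy the view into a freshly allocated dense column-major buffer.
    fn to_col_major(&self) -> Vec<f64> {
        let mut out = Vec::with_capacity(self.rows * self.cols);
        for j in 0..self.cols {
            for i in 0..self.rows {
                out.push(self.get(i, j));
            }
        }
        out
    }
}
```

For a 2x3 matrix stored row-major (`row_stride = 3`, `col_stride = 1`), `is_col_major_contiguous` returns `false`, and `to_col_major` yields the same elements reordered column by column.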

Functions

bgemm_contiguous_into
Batched GEMM on pre-contiguous operands.
bgemm_strided_into
Batched strided GEMM using faer: C = alpha * A * B + beta * C
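To make the formula concrete, here is a naive reference sketch of the operation on dense column-major data with batches stored back to back. The function name `bgemm_reference`, its signature, and the memory layout are assumptions for illustration; they are not the signature of `bgemm_strided_into`, and the sketch does not call faer.

```rust
/// Naive reference for batched GEMM: C[p] = alpha * A[p] * B[p] + beta * C[p].
/// Hypothetical helper, not part of this module. Each A[p] is m x k,
/// each B[p] is k x n, each C[p] is m x n, all dense column-major.
fn bgemm_reference(
    batch: usize,
    m: usize,
    k: usize,
    n: usize,
    alpha: f64,
    a: &[f64],
    b: &[f64],
    beta: f64,
    c: &mut [f64],
) {
    assert_eq!(a.len(), batch * m * k);
    assert_eq!(b.len(), batch * k * n);
    assert_eq!(c.len(), batch * m * n);

    for p in 0..batch {
        // Slice out the p-th matrix of each operand.
        let a_p = &a[p * m * k..(p + 1) * m * k];
        let b_p = &b[p * k * n..(p + 1) * k * n];
        let c_p = &mut c[p * m * n..(p + 1) * m * n];

        for j in 0..n {
            for i in 0..m {
                // Dot product of row i of A[p] with column j of B[p].
                let mut acc = 0.0;
                for l in 0..k {
                    acc += a_p[i + l * m] * b_p[l + j * k];
                }
                c_p[i + j * m] = alpha * acc + beta * c_p[i + j * m];
            }
        }
    }
}
```

A reference like this is mainly useful for checking an optimized backend on small inputs: with A the 2x2 identity, B = [1, 2, 3, 4], C filled with ones, alpha = 2, and beta = 3, the result is [5, 7, 9, 11].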