Module bgemm_naive

Naive batched GEMM kernel on strided views, implemented with explicit loops and used as a fallback.

Operates on N-dimensional permuted views where dimensions are grouped as:

  • A: [lo…, sum…, batch…]
  • B: [sum…, ro…, batch…]
  • C: [lo…, ro…, batch…]
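The grouping above can be illustrated with a minimal sketch that uses exactly one dimension per group. It reads `lo` and `ro` as the output dimensions that come from A and B respectively and `sum` as the contracted dimension; that reading is an interpretation of the notation, not something this page states. The function name `naive_bgemm_sketch`, its signature, and the use of element strides passed as arrays are all assumptions made for illustration and are not this module's API.

```rust
/// Hypothetical helper (not part of this module): one dimension per group.
/// Shapes: A is [lo, sum, batch], B is [sum, ro, batch], C is [lo, ro, batch].
/// Strides are in elements and listed in the same order as the dimensions.
fn naive_bgemm_sketch(
    a: &[f64],
    a_strides: [usize; 3],
    b: &[f64],
    b_strides: [usize; 3],
    c: &mut [f64],
    c_strides: [usize; 3],
    lo: usize,
    sum: usize,
    ro: usize,
    batch: usize,
) {
    for n in 0..batch {
        for i in 0..lo {
            for j in 0..ro {
                // Contract over the shared `sum` dimension.
                let mut acc = 0.0;
                for k in 0..sum {
                    let a_idx = i * a_strides[0] + k * a_strides[1] + n * a_strides[2];
                    let b_idx = k * b_strides[0] + j * b_strides[1] + n * b_strides[2];
                    acc += a[a_idx] * b[b_idx];
                }
                // This sketch overwrites C; it does not apply alpha or beta.
                let c_idx = i * c_strides[0] + j * c_strides[1] + n * c_strides[2];
                c[c_idx] = acc;
            }
        }
    }
}
```

With column-major 2×2 matrices and two batch entries, every operand would use strides `[1, 2, 4]`. Because offsets are computed from strides alone, a permuted view needs only a different stride array, not a copy of the data.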

Functions

bgemm_strided_into
Batched strided GEMM: C = alpha * A * B + beta * C
bgemm_strided_into_with_map
Batched strided GEMM with closure-based element mapping: C = alpha * map_a(A) * map_b(B) + beta * C
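The two update rules above can be sketched for a single batch entry as follows. This is only an illustration of the formula `C = alpha * map_a(A) * map_b(B) + beta * C`, where the maps are applied element by element before multiplication; passing identity closures reduces it to the plain `C = alpha * A * B + beta * C` form. The name `gemm_with_map_sketch`, its parameters, and the contiguous row-major layout are assumptions and do not reflect the real signatures of `bgemm_strided_into` or `bgemm_strided_into_with_map`.

```rust
/// Hypothetical sketch of the update rule for one batch entry.
/// A is m x k, B is k x n, C is m x n, all contiguous and row-major.
fn gemm_with_map_sketch<FA, FB>(
    alpha: f64,
    a: &[f64],
    b: &[f64],
    beta: f64,
    c: &mut [f64],
    m: usize,
    k: usize,
    n: usize,
    map_a: FA,
    map_b: FB,
) where
    FA: Fn(f64) -> f64,
    FB: Fn(f64) -> f64,
{
    for i in 0..m {
        for j in 0..n {
            // Apply the element maps while accumulating the product.
            let mut acc = 0.0;
            for p in 0..k {
                acc += map_a(a[i * k + p]) * map_b(b[p * n + j]);
            }
            // Scale the product by alpha and blend in the old C scaled by beta.
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}
```

One use for such element maps would be folding an operation like negation or complex conjugation into the kernel, so that no mapped temporary copy of A or B has to be materialized first.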