Module kernel

Block-based iteration engine for strided permutation operations.
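
To make "block-based iteration" concrete, here is a minimal sketch of the general technique for the simplest strided permutation, a 2-D transpose. It is not this module's implementation: `transpose_blocked` and its parameters are hypothetical names chosen for illustration, and the real engine presumably handles arbitrary rank and strides.

```rust
/// Hypothetical helper (not part of this module's API): write the transpose
/// of a row-major `rows x cols` matrix into `dst`, visiting the data in
/// `block x block` tiles so that reads and writes both stay cache-local.
pub fn transpose_blocked(src: &[f64], dst: &mut [f64], rows: usize, cols: usize, block: usize) {
    assert!(block > 0, "block size must be nonzero");
    assert_eq!(src.len(), rows * cols);
    assert_eq!(dst.len(), rows * cols);

    // Outer loops step over tile origins.
    for i0 in (0..rows).step_by(block) {
        for j0 in (0..cols).step_by(block) {
            // Clamp the tile so edge tiles never run past the matrix.
            let i1 = (i0 + block).min(rows);
            let j1 = (j0 + block).min(cols);
            // Inner loops visit every element of the current tile.
            for i in i0..i1 {
                for j in j0..j1 {
                    // Source strides for (i, j) are (cols, 1);
                    // destination strides for (i, j) are (1, rows).
                    dst[j * rows + i] = src[i * cols + j];
                }
            }
        }
    }
}
```

Tiling matters because one of the two operands is always walked with a large stride; keeping each tile small bounds how much of that operand is touched before the data is reused.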

Structs§

KernelPlan

Constants§

SMALL_TENSOR_THRESHOLD
Maximum total elements for the small tensor fast path.

Functions§

build_plan_fused
Build an execution plan with dimension fusion.
build_plan_fused_small
Simplified plan for small tensors that fit in L1 cache.
for_each_inner_block_preordered
Iterate over blocks with pre-ordered dimensions and initial offsets.
total_len
Utility: total number of elements.
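
The function summaries above mention dimension fusion and an element count without showing signatures. The sketch below illustrates those two ideas in general terms; `fuse_dims` and `total_len_sketch` are hypothetical names, and their signatures are assumptions rather than the actual API of `build_plan_fused` or `total_len`. Dimensions are listed outermost first, with strides measured in elements.

```rust
/// Hypothetical stand-in for an element-count utility: the product of all
/// extents (1 for a zero-dimensional shape).
pub fn total_len_sketch(dims: &[usize]) -> usize {
    dims.iter().product()
}

/// Hypothetical fusion pass: merge adjacent dimensions whenever stepping the
/// outer one once equals walking the whole inner one, in BOTH source and
/// destination. Fused dimensions can then be iterated as a single loop.
pub fn fuse_dims(
    dims: &[usize],
    src_strides: &[isize],
    dst_strides: &[isize],
) -> (Vec<usize>, Vec<isize>, Vec<isize>) {
    assert_eq!(dims.len(), src_strides.len());
    assert_eq!(dims.len(), dst_strides.len());

    let mut out_dims: Vec<usize> = Vec::new();
    let mut out_src: Vec<isize> = Vec::new();
    let mut out_dst: Vec<isize> = Vec::new();

    for k in 0..dims.len() {
        let (d, s, t) = (dims[k], src_strides[k], dst_strides[k]);
        if let Some(last) = out_dims.len().checked_sub(1) {
            // Fusion condition: outer_stride == inner_extent * inner_stride.
            let fits_src = out_src[last] == d as isize * s;
            let fits_dst = out_dst[last] == d as isize * t;
            if fits_src && fits_dst {
                // Absorb this dimension into the previous one.
                out_dims[last] *= d;
                out_src[last] = s;
                out_dst[last] = t;
                continue;
            }
        }
        // Not fusable: start a new output dimension.
        out_dims.push(d);
        out_src.push(s);
        out_dst.push(t);
    }
    (out_dims, out_src, out_dst)
}
```

For example, with dims `[2, 3, 4]`, contiguous source strides `[12, 4, 1]`, and destination strides `[1, 8, 2]`, the last two dimensions fuse and the first does not, leaving a two-dimensional problem with dims `[2, 12]`. Fusion never changes the element count, so `total_len_sketch` returns 24 before and after.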