Crate strided_kernel

Cache-optimized kernels for strided multidimensional array operations.

This crate is a Rust port of Julia’s Strided.jl and StridedViews.jl libraries, providing efficient operations on strided multidimensional array views.

§Core Types

§Primary API (view-based, Julia-compatible)

§Map Operations

§Reduce Operations

§Basic Operations

§Example

use strided_kernel::{StridedArray, map_into};

// Create a column-major array (Julia default)
let src = StridedArray::<f64>::from_fn_col_major(&[2, 3], |idx| {
    (idx[0] * 10 + idx[1]) as f64
});
let mut dest = StridedArray::<f64>::col_major(&[2, 3]);

// Map with view-based API
map_into(&mut dest.view_mut(), &src.view(), |x| x * 2.0).unwrap();
assert_eq!(dest.get(&[1, 2]), 24.0); // (1*10 + 2) * 2

§Cache Optimization

The library uses Julia’s blocking strategy for cache efficiency:

  • Dimensions are sorted by stride magnitude for optimal memory access
  • Operations are blocked into tiles fitting L1 cache (BLOCK_MEMORY_SIZE = 32KB)
  • Contiguous arrays use fast paths bypassing the blocking machinery
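The first two points above can be illustrated with a small standalone sketch. It does not use this crate's internals: `axes_by_stride`, `blocked_transpose`, and `BLOCK_BYTES` are hypothetical names, the square-tile heuristic is an assumption made for illustration, and only the 32KB budget is taken from the text above.

```rust
/// Assumed L1 budget in bytes, mirroring the 32KB figure quoted above.
const BLOCK_BYTES: usize = 32 * 1024;

/// Hypothetical helper: order axes so the smallest-magnitude stride comes
/// first, which makes it the natural candidate for the innermost loop.
fn axes_by_stride(strides: &[isize]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..strides.len()).collect();
    order.sort_by_key(|&axis| strides[axis].unsigned_abs());
    order
}

/// Hypothetical helper: tiled transpose of a column-major `rows x cols`
/// matrix into a column-major `cols x rows` matrix.
fn blocked_transpose(src: &[f64], dest: &mut [f64], rows: usize, cols: usize) {
    assert_eq!(src.len(), rows * cols);
    assert_eq!(dest.len(), rows * cols);

    // Pick a square tile so that one source tile plus one destination tile
    // fit in the assumed budget.
    let elems = BLOCK_BYTES / (2 * std::mem::size_of::<f64>());
    let tile = ((elems as f64).sqrt() as usize).max(1);

    for j0 in (0..cols).step_by(tile) {
        for i0 in (0..rows).step_by(tile) {
            for j in j0..(j0 + tile).min(cols) {
                for i in i0..(i0 + tile).min(rows) {
                    // src[i, j] lives at i + j * rows;
                    // dest[j, i] lives at j + i * cols.
                    dest[j + i * cols] = src[i + j * rows];
                }
            }
        }
    }
}
```

Without tiling, one of the two arrays in a transpose is always walked with a large stride. Tiling keeps both working sets small enough to stay in cache while a tile is processed.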

Modules§

view
Julia-like dynamic-rank strided view types.

Structs§

Adjoint
Adjoint operation: f(x) = adjoint(x) = conj(transpose(x)). For scalar numbers, this is conj.
Conj
Complex conjugate operation: f(x) = conj(x)
Identity
Identity operation: f(x) = x
StridedArray
Owned strided multidimensional array.
StridedView
Dynamic-rank immutable strided view with lazy element operations.
StridedViewMut
Dynamic-rank mutable strided view.
Transpose
Transpose operation: f(x) = transpose(x). For scalar numbers, this is identity. For matrix elements, this would transpose each element.
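
The scalar behaviour of the four element operations listed above can be sketched without the crate. Everything in the block is hypothetical and for illustration only: `C64` is a minimal stand-in for a complex number, and `ScalarOp`, `IdentityOp`, `ConjOp`, `TransposeOp`, `AdjointOp`, and `apply_all` are not the crate's own definitions.

```rust
/// Minimal stand-in complex number (hypothetical, not a crate type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct C64 {
    re: f64,
    im: f64,
}

/// Hypothetical trait: an element operation selected at the type level.
trait ScalarOp {
    fn apply(x: C64) -> C64;
}

struct IdentityOp;
struct ConjOp;
struct TransposeOp;
struct AdjointOp;

impl ScalarOp for IdentityOp {
    fn apply(x: C64) -> C64 {
        x
    }
}

impl ScalarOp for ConjOp {
    fn apply(x: C64) -> C64 {
        C64 { re: x.re, im: -x.im }
    }
}

impl ScalarOp for TransposeOp {
    // Transposing a scalar leaves it unchanged.
    fn apply(x: C64) -> C64 {
        x
    }
}

impl ScalarOp for AdjointOp {
    // adjoint(x) = conj(transpose(x)), which reduces to conj for scalars.
    fn apply(x: C64) -> C64 {
        ConjOp::apply(TransposeOp::apply(x))
    }
}

/// Apply the chosen operation to every element as it is read.
fn apply_all<Op: ScalarOp>(data: &[C64]) -> Vec<C64> {
    data.iter().map(|&x| Op::apply(x)).collect()
}
```

Selecting the operation through a type parameter lets the compiler inline it into the element loop, so an identity operation costs nothing at run time.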

Enums§

StridedError
Errors that can occur during strided array operations.

Constants§

BLOCK_MEMORY_SIZE
Block memory size for cache-optimized iteration (L1 cache target).
CACHE_LINE_SIZE
Cache line size in bytes.

Traits§

ComposableElementOp
Trait for element operations that support type-level composition.
Compose
Helper trait for composing two ElementOp types.
ElementOp
Trait for element-wise operations applied to strided views.
ElementOpApply
Trait for types that support element operations (conj, transpose, adjoint).
MaybeSend
Equivalent to Send when the parallel feature is enabled; implemented for every type otherwise.
MaybeSendSync
Equivalent to Send + Sync when the parallel feature is enabled; implemented for every type otherwise.
MaybeSimdOps
Trait for types that may have SIMD-accelerated sum/dot operations.
MaybeSync
Equivalent to Sync when the parallel feature is enabled; implemented for every type otherwise.

Functions§

add
Element-wise addition: dest[i] += src[i].
axpy
AXPY: dest[i] = alpha * src[i] + dest[i].
col_major_strides
Compute column-major strides (Julia default: first index varies fastest).
copy_conj
Copy with complex conjugation: dest[i] = conj(src[i]).
copy_into
Copy elements from source to destination: dest[i] = src[i].
copy_scale
Copy with scaling: dest[i] = scale * src[i].
copy_transpose_scale_into
Copy with transpose and scaling: dest[j,i] = scale * src[i,j].
dot
Dot product: sum(OpA::apply(a[i]) * OpB::apply(b[i])).
fma
Fused multiply-add: dest[i] += OpA::apply(a[i]) * OpB::apply(b[i]).
map_into
Apply a function element-wise from source to destination.
mul
Element-wise multiplication: dest[i] *= src[i].
reduce
Full reduction with map function: reduce(init, op, map.(src)).
reduce_axis
Reduce along a single axis, returning a new StridedArray.
row_major_strides
Compute row-major strides (C default: last index varies fastest).
sum
Sum all elements: sum(src).
symmetrize_conj_into
Conjugate-symmetrize a square matrix: dest = (src + conj(src^T)) / 2.
symmetrize_into
Symmetrize a square matrix: dest = (src + src^T) / 2.
zip_map2_into
Binary element-wise operation: dest[i] = f(a[i], b[i]).
zip_map3_into
Ternary element-wise operation: dest[i] = f(a[i], b[i], c[i]).
zip_map4_into
Quaternary element-wise operation: dest[i] = f(a[i], b[i], c[i], e[i]).
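
The two stride layouts named in the list above differ only in which index varies fastest. The standalone sketch below shows the arithmetic. The function names and signatures (`sketch_col_major_strides`, `sketch_row_major_strides`, `linear_offset`) are hypothetical and are not claimed to match the crate's `col_major_strides` and `row_major_strides`.

```rust
/// Hypothetical helper: column-major strides, first index varies fastest.
fn sketch_col_major_strides(shape: &[usize]) -> Vec<isize> {
    let mut strides = Vec::with_capacity(shape.len());
    let mut step = 1isize;
    for &dim in shape {
        strides.push(step);
        step *= dim as isize;
    }
    strides
}

/// Hypothetical helper: row-major strides, last index varies fastest.
fn sketch_row_major_strides(shape: &[usize]) -> Vec<isize> {
    let mut strides = vec![0isize; shape.len()];
    let mut step = 1isize;
    for (axis, &dim) in shape.iter().enumerate().rev() {
        strides[axis] = step;
        step *= dim as isize;
    }
    strides
}

/// Hypothetical helper: position of a multi-index in the flat buffer,
/// computed as the sum of index[k] * strides[k].
fn linear_offset(index: &[usize], strides: &[isize]) -> isize {
    index
        .iter()
        .zip(strides)
        .map(|(&i, &s)| i as isize * s)
        .sum()
}
```

For a shape of `[2, 3, 4]`, the column-major strides are `[1, 2, 6]` and the row-major strides are `[12, 4, 1]`, so the index `[1, 0, 0]` maps to offset 1 in the first layout and offset 12 in the second.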

Type Aliases§

Result
Result type for strided array operations.