tenferro-rs Internal Design

Repo: tenferro-rs | Parent: ../index.md | Related: computegraph.md, chainrules.md, tidu.md, ../spec/backend-contract.md, ../spec/primitive-catalog.md


I. Purpose

This document defines the internal crate structure and type design of tenferro-rs. The key design driver is that all computation is graph-based: every operation (einsum, linalg, elementwise) produces nodes in a Fragment<Op>, and execution is always lazy through materialize_merge -> compile -> eval.


II. Architecture Migration

Removed crates

The previous architecture organized around eager execution families and tape-based AD. These were replaced by the graph + fragment model:

Previous crate              Current                            Reason
internal/ad-core            deleted                            Fragment replaces tape
internal/ad-ops             tenferro-ops (PrimitiveOp impl)    AD rules live on TensorOp
internal/ad-linalg          tenferro-ops (PrimitiveOp impl)    AD rules in ops/ad/linalg.rs
internal/ad-surface         tidu-rs (differentiate/transpose)  External crate
internal/frontend-core      tenferro (TracedTensor)            Lazy, not eager
internal/runtime            tenferro (Engine)
tenferro-dynamic-compute    deleted                            Always graph
tenferro-tensor-compute     tenferro-ops
tenferro-linalg-prims       tenferro-ops                       No need to separate
tenferro-capi               deferred                           Phase 4+
extension/*                 deferred

Retained crates

Previous crate      Current crate            Notes
tenferro-device     tenferro-device          Mostly unchanged
tenferro-algebra    tenferro-algebra         Mostly unchanged
tenferro-tensor     tenferro-tensor          Simplified
tenferro-prims      tenferro-ops             Rewritten: single TensorOp enum
tenferro-einsum     tenferro-einsum          Rewritten: graph builder
tenferro-linalg     tenferro-ops + tenferro  AD rules → tenferro-ops, LAPACK kernels → tenferro backend
tenferro (facade)   tenferro                 TracedTensor, Engine, backends

29 crates → 6 crates (plus 3 external: computegraph-rs, chainrules-rs, tidu-rs).


III. Crate Dependency Graph

tenferro-device
    |
tenferro-algebra
    |
tenferro-tensor ──── tensor runtime crate (data types, kernels, TensorBackend, backends)
    |
tenferro-ops ─────── computegraph-rs (GraphOp, Fragment)
    |                 chainrules-rs   (PrimitiveOp)
    |
    ├── tenferro-einsum (SemiringOps → Fragment construction)
    |
tenferro ──────────── tidu-rs (differentiate, transpose)
    (TracedTensor, Engine, backends)

IV. Two Op Types

The fundamental design constraint is that GraphOp::Operand is an associated type, so a single Op type can only serve one Operand type. Since standard algebra (Tensor) and custom algebras (TropicalTensor, etc.) have different Operand types, tenferro provides two Op types:

StdTensorOp — standard algebra, full vocabulary, AD-capable

StdTensorOp is a flat enum whose variants mostly mirror StableHLO ops 1:1 (documented exceptions: composite lowerings like Conj, multi-output linalg ops like Svd). It implements GraphOp (but not EvalGraphOp), PrimitiveOp, and SemiringOps. There is no GraphOp::eval; all execution flows through the backend pipeline.

Canonical definition: spec/primitive-catalog.md (Section IV – Tenferro IR Vocabulary). AD trait (PrimitiveOp): spec/ad-contract.md.

SemiringOp<T> — custom algebra, semiring subset, no AD

SemiringOp<T> is a generic wrapper around SemiringOpKind that implements GraphOp only (not EvalGraphOp). It delegates algebraic ops to free functions in host_ops (dispatched through SemiringBackend<Alg>) and structural ops to algebra-independent free functions. PrimitiveOp is not implemented; there is no AD for custom algebras.

Canonical definition: spec/primitive-catalog.md (Section IV).

Users extend tenferro by implementing SemiringBackend<Alg> (algebraic ops + kernel dispatch) for their algebra type, then use SemiringOp<Alg> as the op type. Structural ops (transpose, reshape, broadcast_in_dim) are provided automatically by the execution engine.


V. SemiringOpKind — Shared Vocabulary

SemiringOpKind is the minimal vocabulary that every algebra must support: Add, Mul, DotGeneral, ReduceSum, Transpose, Reshape, BroadcastInDim. It is used only inside SemiringOp<T>, the generic custom-algebra op type. StdTensorOp does not wrap SemiringOpKind; it has its own flat variants that mostly mirror StableHLO 1:1 (with documented exceptions for composite lowerings and multi-output linalg ops).

Canonical definition: spec/primitive-catalog.md (Section IV – AD-closed graph core + structural ops).

SemiringOp<T> wraps it as a newtype. The SemiringOps trait bridges both worlds: StdTensorOp implements it by mapping to flat variants, SemiringOp<T> implements it by mapping to SemiringOpKind variants.
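
A minimal sketch of that bridging pattern, with invented two-method signatures (the real SemiringOps constructors carry dimension and config arguments), showing how one generic builder serves both op types:

```rust
use std::marker::PhantomData;

// Hypothetical, heavily simplified vocabularies.
#[derive(Debug, PartialEq)]
enum StdTensorOp { Add, DotGeneral }

#[derive(Debug, PartialEq, Clone, Copy)]
enum SemiringOpKind { Add, DotGeneral }

// Newtype wrapper parameterized by the algebra type.
struct SemiringOp<T>(SemiringOpKind, PhantomData<T>);

// The bridge: node constructors for the shared vocabulary.
trait SemiringOps {
    fn add() -> Self;
    fn dot_general() -> Self;
}

impl SemiringOps for StdTensorOp {
    // Maps directly to flat variants.
    fn add() -> Self { StdTensorOp::Add }
    fn dot_general() -> Self { StdTensorOp::DotGeneral }
}

impl<T> SemiringOps for SemiringOp<T> {
    // Maps to SemiringOpKind variants.
    fn add() -> Self { SemiringOp(SemiringOpKind::Add, PhantomData) }
    fn dot_general() -> Self { SemiringOp(SemiringOpKind::DotGeneral, PhantomData) }
}

// A generic fragment-builder step is written once for both worlds.
fn contract_step<Op: SemiringOps>() -> Op {
    Op::dot_general()
}
```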


VI. SemiringOps Trait — Generic Einsum

SemiringOps is implemented by both StdTensorOp and SemiringOp<T>, making einsum Fragment construction algebra-agnostic.

Canonical definition: spec/primitive-catalog.md.

Einsum is algebra-agnostic:

fn build_einsum_fragment<Op: SemiringOps>(
    builder: &mut FragmentBuilder<Op>,
    path: &ContractionPath,
    inputs: &[ValRef<Op>],
) -> LocalValId {
    // Constructs DotGeneral, Transpose, Reshape, etc. nodes
    // Does not know which algebra is in use
}

The contraction path optimization is also algebra-agnostic (it only depends on shapes and subscripts):

fn optimize_contraction_path(
    subscripts: &Subscripts,
    shapes: &[&[usize]],
) -> ContractionPath;
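
As an illustration of shape-only cost modeling, the following toy chooses between the two orderings of a three-matrix chain; the real optimizer handles arbitrary subscripts and operand counts, and the function names here are invented:

```rust
// Cost of contracting "ij,jk->ik" in multiply-adds: i * j * k.
// Only shapes matter; the algebra never appears.
fn pairwise_cost(left: (usize, usize), right: (usize, usize)) -> usize {
    left.0 * left.1 * right.1
}

// Compare (AB)C against A(BC) for a chain of 2-D operands.
fn best_order(a: (usize, usize), b: (usize, usize), c: (usize, usize)) -> &'static str {
    let ab_first = pairwise_cost(a, b) + pairwise_cost((a.0, b.1), c);
    let bc_first = pairwise_cost(b, c) + pairwise_cost(a, (b.0, c.1));
    if ab_first <= bc_first { "(AB)C" } else { "A(BC)" }
}
```

For a small-times-large chain the saving is dramatic, which is why path optimization runs before any fragment is built.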

VII. Einsum: N-ary to Graph

N-ary einsum is decomposed into a graph of binary operations:

einsum("ij,jk,kl->il", A, B, C)
    |
    | optimize_contraction_path (shape-based, algebra-agnostic)
    v
ContractionPath: [(A,B) -> T, (T,C) -> result]
    |
    | build_einsum_fragment<Op: SemiringOps>
    v
Fragment<Op>:
    t0 = DotGeneral(A, B, {contract=[j]})    // "ij,jk->ik"
    t1 = DotGeneral(t0, C, {contract=[k]})   // "ik,kl->il"

Each binary contraction step may insert Transpose, Reshape, or BroadcastInDim nodes as needed to align axes for DotGeneral.

For standard algebra, the resulting Fragment<StdTensorOp> can be differentiated and transposed by tidu-rs. For custom algebras, Fragment<SemiringOp<T>> goes directly to materialize_merge -> compile -> eval.


VIII. Backend Architecture — Single Execution IR

Design principle

All execution flows through a single in-process execution IR:

CompiledProgram<StdTensorOp>
    │
    │ compile_std_to_exec()
    │   - infer per-instruction dtype + output_shapes
    │   - run DotDimensionSorter + TransposeFolding
    ↓
ExecProgram / ExecOp / ExecInstruction
    │
    ├── eval_exec_ir() / segmented dispatch → TensorBackend
    └── eval_semiring_ir()                  → SemiringBackend + shared structural helpers

compile_std_to_exec() and compile_semiring_to_exec() lower directly from computegraph’s CompiledProgram into ExecProgram. There is no in-process StableHloProgram / StableHloOp layer anymore. StableHLO remains a useful semantic reference for naming and op design, but it is not part of the current runtime pipeline.

For custom algebras (SemiringOp<T>), the same direct lowering applies: compile_semiring_to_exec() produces ExecProgram, and eval_semiring_ir() dispatches algebra-dependent ops through SemiringBackend<Alg> while reusing the same structural helpers and pass pipeline.

Execution IR

ExecInstruction carries the ExecOp, slot wiring, inferred dtype, inferred output_shapes, and liveness metadata (last_use). The compiler lowers StdTensorOp and SemiringOpKind almost 1:1 into ExecOp, with structured linalg variants (Svd, Qr, Lu, Eigh, Eig, TriangularSolve, ValidateNonsingular) represented directly instead of via stringly-typed custom calls.
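
An illustrative sketch of an instruction with the fields named above; the concrete types are defined in spec/backend-contract.md and are richer than this:

```rust
// Hypothetical, simplified shapes; field names follow the prose above.
#[derive(Debug, PartialEq)]
enum ExecOp {
    DotGeneral,
    Transpose { perm: Vec<usize> }, // structured, not a stringly-typed custom call
    Svd,                            // structured multi-output linalg variant
}

#[derive(Debug)]
struct ExecInstruction {
    op: ExecOp,
    inputs: Vec<usize>,             // operand slot wiring
    outputs: Vec<usize>,            // result slots (Svd writes several)
    dtype: String,                  // inferred per-instruction dtype
    output_shapes: Vec<Vec<usize>>, // inferred shape per output
    last_use: Vec<usize>,           // slots dead after this instruction (liveness)
}

// A transpose of a 2x3 input into a 3x2 output, freeing the input afterwards.
fn example_transpose() -> ExecInstruction {
    ExecInstruction {
        op: ExecOp::Transpose { perm: vec![1, 0] },
        inputs: vec![0],
        outputs: vec![1],
        dtype: "f64".to_string(),
        output_shapes: vec![vec![3, 2]],
        last_use: vec![0],
    }
}
```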

Pass pipeline

The optimizing compiler now runs directly on ExecProgram. The active passes are DotDimensionSorter and TransposeFolding; DotDecomposer is deferred to issue #729 now that per-instruction shape tracking is available.

Canonical ExecOp definition and pass list: spec/backend-contract.md. Pass algorithms: spec/optimizer-passes.md.

Generic execution engine

The generic engine interprets ExecProgram by dispatching each instruction to TensorBackend methods, standard kernels, or common infrastructure, depending on the dispatch category.

(illustrative, non-normative – see spec/backend-contract.md for canonical definition)
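
A toy version of that interpretation loop, assuming a two-input vector ALU in place of real TensorBackend dispatch; it shows slot wiring and last_use-driven buffer release, which is the part the generic engine owns regardless of backend:

```rust
#[derive(Clone, Copy)]
enum ExecOp { Add, Mul }

struct ExecInstruction {
    op: ExecOp,
    inputs: [usize; 2],   // operand slots
    output: usize,        // result slot
    last_use: Vec<usize>, // slots dead after this instruction
}

// Interpret the program over slot-based storage; None marks a freed buffer.
fn eval_exec_ir(program: &[ExecInstruction], slots: &mut Vec<Option<Vec<f64>>>) {
    for inst in program {
        let a = slots[inst.inputs[0]].as_ref().expect("live input").clone();
        let b = slots[inst.inputs[1]].as_ref().expect("live input");
        let out = match inst.op {
            ExecOp::Add => a.iter().zip(b).map(|(x, y)| x + y).collect(),
            ExecOp::Mul => a.iter().zip(b).map(|(x, y)| x * y).collect(),
        };
        if slots.len() <= inst.output {
            slots.resize(inst.output + 1, None);
        }
        slots[inst.output] = Some(out);
        // Liveness metadata lets the engine release dead buffers eagerly.
        for &dead in &inst.last_use {
            slots[dead] = None;
        }
    }
}
```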

Backend trait

TensorBackend (defined in tenferro-tensor) is the single backend trait that encapsulates standard tensor kernel dispatch. TensorExec is the session-scoped companion trait used for batches of backend ops inside one execution context. CpuBackend lives in tenferro-tensor and implements TensorBackend; the CubeCL GPU backend is partial and feature-gated.

Canonical trait signatures: spec/backend-contract.md.

Standard and custom algebra backends

The standard backend path is CompiledProgram<StdTensorOp> -> compile_std_to_exec() -> ExecProgram -> eval_exec_ir(). Custom algebra backends implement SemiringBackend<Alg> and follow the analogous compile_semiring_to_exec() -> eval_semiring_ir() path.

Engine<B: TensorBackend> is the top-level entry point that orchestrates lowering, compilation, and execution. TensorBackend (in tenferro-tensor) is the kernel-level trait that backend authors implement.

See spec/backend-contract.md for the canonical trait signatures.

Backend dispatch in Engine

struct Engine<B: TensorBackend> {
    backend: B,
    compile_cache: CompileCache,
    einsum_cache: EinsumCache,
}

For custom algebras, users construct their own evaluation pipeline:

let path = optimize_contraction_path(&subscripts, &shapes);
let fragment = build_einsum_fragment::<TropicalOp>(&mut builder, &path, &inputs);
let view = resolve(vec![fragment]);
let graph = materialize_merge(&view, &outputs);
let prog = compile(&graph);

// Choose backend
let mut backend = TropicalGpuBackend::new(cuda_ctx);
let result = backend.eval_program(&prog, &input_tensors);

IX. TracedTensor and Engine

TracedTensor is the user-facing lazy type for standard algebra:

struct TracedTensor {
    id: TracedTensorId,
    rank: usize,
    dtype: DType,
    fragment: Arc<Fragment<StdTensorOp>>,
    val: LocalValId,
    data: Option<Tensor>,
    // ... internal fields omitted
}

Key operations:

impl TracedTensor {
    /// Create from concrete data
    fn from(tensor: Tensor) -> Self;

    /// Lazy evaluation (single output, no intermediate sharing)
    fn eval<B: TensorBackend>(&mut self, engine: &mut Engine<B>) -> Result<&Tensor>;

    /// VJP: differentiate → transpose (via tidu-rs), still lazy
    fn grad(&self, wrt: &TracedTensor) -> TracedTensor;

    /// JVP: differentiate only (via tidu-rs), still lazy
    fn jvp(&self, wrt: &TracedTensor, tangent: &TracedTensor) -> TracedTensor;
}

/// Evaluate multiple outputs together.
/// All fragments are resolved into one MaterializedGraph, so shared
/// intermediate nodes (primal values needed by both output and gradient)
/// are computed only once via GlobalValKey deduplication.
fn eval_all<B: TensorBackend>(
    engine: &mut Engine<B>,
    outputs: &mut [&mut TracedTensor],
) -> Result<Vec<Tensor>>;

eval_all is the recommended API when primal outputs and their derivatives are needed together. Single-output eval is a convenience wrapper.

For custom algebras, users work with Fragment<SemiringOp<T>> and CompiledProgram<SemiringOp<T>> directly through the computegraph-rs API, without TracedTensor.


X. User Extension Points

Goal What to implement
New scalar algebra for einsum (CPU) Semiring (4 methods) + SemiringBackend<Alg>::gemm (1 method)
Custom GPU backend for custom algebra impl SemiringBackend<Alg> for MyGpuBackend (gemm + overrides)
Custom standard backend impl TensorBackend for MyBackend
AD for custom algebra Define own Op enum, impl PrimitiveOp (advanced)

The minimal extension path (CPU, e.g., tropical semiring):

  1. Define algebra type, impl Semiring — zero(), one(), add(), mul()
  2. impl SemiringBackend<MyAlgebra> for CpuBackend — only gemm() required (e.g., call tropical_gemm). batched_gemm, add, mul, reduce_sum have defaults using strided-kernel + Semiring trait.
  3. Use SemiringOp<MyAlgebra> as the Op type — einsum + compile + eval work immediately via eval_semiring_ir.
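
The first two steps can be sketched end to end with a simplified Semiring trait and a naive generic gemm; the real SemiringBackend signatures differ, and this tropical kernel stands in for an optimized one like tropical_gemm:

```rust
// Simplified stand-in for the Semiring trait described above.
trait Semiring: Copy {
    fn zero() -> Self;
    fn one() -> Self;
    fn add(self, other: Self) -> Self;
    fn mul(self, other: Self) -> Self;
}

/// Max-plus (tropical) algebra: add = max, mul = +, zero = -inf, one = 0.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Tropical(f64);

impl Semiring for Tropical {
    fn zero() -> Self { Tropical(f64::NEG_INFINITY) }
    fn one() -> Self { Tropical(0.0) }
    fn add(self, other: Self) -> Self { Tropical(self.0.max(other.0)) }
    fn mul(self, other: Self) -> Self { Tropical(self.0 + other.0) }
}

/// Naive generic gemm: c[i][j] = sum_p a[i][p] * b[p][j] in the semiring.
/// a is m x k, b is k x n, both contiguous row-major.
fn semiring_gemm<T: Semiring>(m: usize, k: usize, n: usize, a: &[T], b: &[T]) -> Vec<T> {
    let mut c = vec![T::zero(); m * n];
    for i in 0..m {
        for j in 0..n {
            let mut acc = T::zero();
            for p in 0..k {
                acc = acc.add(a[i * k + p].mul(b[p * n + j]));
            }
            c[i * n + j] = acc;
        }
    }
    c
}
```

In the max-plus algebra this gemm is a longest-path / Viterbi-style DP step, which is exactly the kind of workload custom-algebra einsum targets.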

Adding a GPU backend for custom algebra:

  1. impl SemiringBackend<MyAlgebra> for MyGpuBackend — provide gemm() with GPU kernels. Override batched_gemm, add, mul, reduce_sum if optimized GPU versions exist.
  2. Use the same CompiledProgram<SemiringOp<MyAlgebra>> — graph construction and compilation are backend-agnostic.

XI. Backend Traits

The Operand trait has been removed from computegraph-rs entirely.

TensorBackend – standard algebra, full op set

TensorBackend (defined in tenferro-tensor) covers all ops for standard algebra. Operates on Tensor (type-erased). CpuBackend implements this trait. CudaBackend is a partial stub (feature-gated).

Canonical definition: spec/backend-contract.md.

SemiringBackend<Alg: Semiring> – custom algebra, semiring ops only

SemiringBackend<Alg> (defined in tenferro-tensor) covers semiring ops for custom algebra. Operates on TypedTensor<Alg::Scalar> (typed). User provides only gemm() (single GEMM); batched_gemm, add, mul, reduce_sum have default implementations using strided-kernel + Semiring trait methods.

The two traits are independent (no supertrait relationship).

Canonical definition: spec/backend-contract.md.

Structural ops – algebra-independent free functions

transpose, reshape, broadcast_in_dim, extract_diagonal, embed_diagonal are free functions in tenferro-tensor::cpu::structural. They are the same for all algebras. For standard algebra, TensorBackend methods delegate to these. For custom algebra, eval_semiring_ir calls them directly.
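
A self-contained sketch of one such free function: a permutation transpose over a contiguous row-major buffer. It only moves elements, never interprets them, which is why one implementation serves every algebra (the real version in tenferro-tensor uses strided-kernel rather than this naive loop):

```rust
/// Transpose a contiguous row-major buffer by an axis permutation.
/// Returns the new buffer and the permuted shape.
fn transpose<T: Copy>(data: &[T], shape: &[usize], perm: &[usize]) -> (Vec<T>, Vec<usize>) {
    let rank = shape.len();
    let out_shape: Vec<usize> = perm.iter().map(|&p| shape[p]).collect();

    // Row-major strides of the input.
    let mut strides = vec![1usize; rank];
    for d in (0..rank.saturating_sub(1)).rev() {
        strides[d] = strides[d + 1] * shape[d + 1];
    }

    let mut out = Vec::with_capacity(data.len());
    let mut idx = vec![0usize; rank]; // multi-index over the OUTPUT shape
    for _ in 0..data.len() {
        // Map the output multi-index back to an input offset via perm.
        let offset: usize = (0..rank).map(|d| idx[d] * strides[perm[d]]).sum();
        out.push(data[offset]);
        // Odometer increment over out_shape.
        for d in (0..rank).rev() {
            idx[d] += 1;
            if idx[d] < out_shape[d] {
                break;
            }
            idx[d] = 0;
        }
    }
    (out, out_shape)
}
```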


XII. Per-Crate Contents

tenferro-device

Defines the placement vocabulary and shared runtime errors. Placement contains memory_kind plus resident_device, while ComputeDevice remains a separate notion for execution. Public memory kinds follow JAX/XLA-style names: Device, PinnedHost, UnpinnedHost, and Other(String).

tenferro-algebra

Provides SemiringAlgebra trait, StandardAlgebra, scalar type constraints.

tenferro-tensor

Tensor runtime crate. No AD-related code. Organized by backend target.

  • types.rs — TypedTensor<T> (contiguous-only, no strides), Tensor enum, Buffer<T>, DType, Placement, MemoryKind, ComputeDevice
  • config.rs — DotGeneralConfig, CompareDir, GatherConfig, ScatterConfig, SliceConfig, PadConfig (moved from tenferro-ops to avoid a dependency cycle)
  • backend.rs — TensorBackend trait, SemiringBackend<Alg> trait
  • cpu/ — CPU backend:
    • backend.rs — CpuBackend: impl TensorBackend
    • elementwise.rs — strided-kernel: add, mul, neg, conj, div, abs, exp, log, …
    • reduction.rs — strided-kernel: reduce_sum, reduce_prod, reduce_max, reduce_min
    • structural.rs — strided-kernel: transpose, broadcast_in_dim, extract_diagonal; dedicated: reshape (metadata only), embed_diagonal
    • indexing.rs — gather, scatter, slice, pad, concatenate, reverse
    • gemm/faer_gemm.rs (cpu-faer), blas_gemm.rs (cpu-blas)
    • linalg/faer_linalg.rs (cpu-faer), lapack_linalg.rs (cpu-blas)
  • cuda/ — CUDA backend (feature-gated)
  • rocm/ — ROCm backend (feature-gated, future)

No naive CPU loop fallbacks: all CPU kernels use strided-kernel (elementwise, reduction, structural), faer or BLAS (GEMM), and faer or LAPACK (linalg). Exactly one of cpu-faer or cpu-blas must be enabled (enforced via compile_error!).
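
The mutual-exclusion gate can be sketched as a pair of cfg-guarded compile_error! invocations (illustrative config fragment; the actual messages live in the crate):

```rust
#[cfg(all(feature = "cpu-faer", feature = "cpu-blas"))]
compile_error!("enable exactly one of `cpu-faer` or `cpu-blas`, not both");

#[cfg(not(any(feature = "cpu-faer", feature = "cpu-blas")))]
compile_error!("enable exactly one of `cpu-faer` or `cpu-blas`");
```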

tenferro-ops

The core crate:

  • SemiringOpKind enum (shared vocabulary, used only in SemiringOp<T>)
  • SemiringOps trait
  • SemiringOp<T> generic wrapper + impl GraphOp (graph construction only, no eval — execution is dispatched through TensorBackend)
  • StdTensorOp enum — flat, most variants mirror a StableHLO op 1:1 (documented exceptions: Conj, multi-output linalg)
  • impl GraphOp for StdTensorOp (graph construction only, no eval)
  • impl PrimitiveOp for StdTensorOp (linearize + transpose_rule)
  • impl SemiringOps for StdTensorOp — maps to flat variants directly
  • TensorInputKey + impl ADKey

Depends on: computegraph-rs, chainrules-rs, tenferro-tensor.

tenferro-einsum

Graph builder for N-ary einsum:

  • Subscripts parsing and validation
  • ContractionPath optimization
  • build_einsum_fragment<Op: SemiringOps> (algebra-agnostic)

Depends on: computegraph-rs, tenferro-ops.

tenferro

Top-level facade:

  • TracedTensor (lazy graph-aware wrapper)
  • Engine (compilation cache, backend dispatch via TensorBackend from tenferro-tensor, einsum cache)
  • Public API: einsum(), grad(), jvp(), eval(), eval_all()
  • compile_std_to_exec() (CompiledProgram<StdTensorOp> → ExecProgram)
  • compile_semiring_to_exec() (CompiledProgram<SemiringOp<T>> → ExecProgram)
  • Optimizing compiler passes on ExecProgram
    • DotDimensionSorter
    • TransposeFolding
    • DotDecomposer deferred to issue #729
  • ExecProgram, ExecOp, ExecInstruction
  • Generic execution engine: eval_exec_ir() — interprets ExecProgram, dispatches to TensorBackend methods (from tenferro-tensor)
  • Generic semiring execution engine: eval_semiring_ir() — interprets ExecProgram, dispatches to SemiringBackend<Alg> plus shared structural helpers
  • Standard backend:
    • CpuBackend (in tenferro-tensor) — ExecProgram → generic engine → faer/BLAS/LAPACK

Depends on: all of the above + tidu-rs.


XIII. Implementation Status

Phases 1–3 (scalar fragment AD, tensor primitives + einsum, linalg + backends) are implemented and tested. Current work focuses on:

  • Custom algebra end-to-end (tropical semiring)
  • GPU backend expansion (CUDA kernels)
  • C-API (FFI for Julia/Python)
  • Logical-DAG-aware checkpoint scheduling
  • Operator fusion in compiled IR