Struct Tensor

pub struct Tensor<T: Scalar> { /* private fields */ }

Multi-dimensional dense tensor.

Tensor<T> is the primary data type in tenferro. It owns its data via DataBuffer and carries shape, strides, and memory space information.

§Zero-copy views

Operations like permute, broadcast, and diagonal return new Tensor values that share the same underlying data buffer, modifying only the dims/strides/offset metadata.
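
For example, transposing with permute leaves the buffer in place and only reorders the shape and stride metadata. A minimal sketch, using only the constructors and accessors documented below:

use tenferro_tensor::{Tensor, MemoryOrder};
use tenferro_device::LogicalMemorySpace;

// A column-major [3, 4] tensor has strides [1, 3]
let a = Tensor::<f64>::zeros(&[3, 4],
    LogicalMemorySpace::MainMemory, MemoryOrder::ColumnMajor);
let at = a.permute(&[1, 0]).unwrap();
// The view shares the same buffer; only dims/strides are reordered
assert_eq!(at.dims(), &[4, 3]);
assert_eq!(at.strides(), &[3, 1]);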

§Accessing raw data

Use DataBuffer::as_slice via Tensor::buffer combined with dims, strides, and offset to construct backend-specific views (e.g., StridedView in tenferro-prims).
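
A sketch of that recipe, assuming here that DataBuffer::as_slice returns a plain &[T] for a CPU-resident buffer (check the DataBuffer docs for the exact signature):

use tenferro_tensor::{Tensor, MemoryOrder};

// 2×3 column-major: element (i, j) lives at offset + i*strides[0] + j*strides[1]
let t = Tensor::<f64>::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0, 6.0], &[2, 3],
    MemoryOrder::ColumnMajor).unwrap();
assert_eq!(t.dims(), &[2, 3]);
let strides = t.strides();
let offset = t.offset();
let data = t.buffer().as_slice(); // assumed: &[f64] for a MainMemory buffer
let linear = |i: usize, j: usize| {
    (offset + i as isize * strides[0] + j as isize * strides[1]) as usize
};
assert_eq!(data[linear(1, 2)], 6.0);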

§GPU async support

The event field tracks pending GPU computation via CompletionEvent. When a GPU operation produces a tensor, event is set to Some(...). Passing that tensor to another GPU operation chains the work on the device via stream dependencies, with no CPU synchronization. Methods that access data from the CPU call wait internally. For CPU tensors, event is always None, with zero overhead.

See tenferro-einsum crate docs for async chaining examples.

Implementations§


impl<T: Scalar> Tensor<T>


pub fn zeros( _dims: &[usize], _memory_space: LogicalMemorySpace, _order: MemoryOrder, ) -> Self

Create a tensor filled with zeros.

§Arguments
  • dims — Shape of the tensor (e.g., &[3, 4] for a 3×4 matrix)
  • memory_space — Logical memory space for the allocation
  • order — Memory layout for the new allocation
§Examples
use tenferro_tensor::{Tensor, MemoryOrder};
use tenferro_device::LogicalMemorySpace;

let a = Tensor::<f64>::zeros(
    &[3, 4],
    LogicalMemorySpace::MainMemory,
    MemoryOrder::ColumnMajor,
);

pub fn ones( _dims: &[usize], _memory_space: LogicalMemorySpace, _order: MemoryOrder, ) -> Self

Create a tensor filled with ones.

§Arguments
  • dims — Shape of the tensor
  • memory_space — Logical memory space for the allocation
  • order — Memory layout for the new allocation
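§Examples
A minimal usage sketch, mirroring the zeros example above:

use tenferro_tensor::{Tensor, MemoryOrder};
use tenferro_device::LogicalMemorySpace;

let a = Tensor::<f64>::ones(
    &[2, 2],
    LogicalMemorySpace::MainMemory,
    MemoryOrder::ColumnMajor,
);
assert_eq!(a.dims(), &[2, 2]);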

pub fn from_slice( _data: &[T], _dims: &[usize], _order: MemoryOrder, ) -> Result<Self>

Create a tensor from a data slice.

The slice length must equal the product of dims. Data is copied into owned storage with the specified memory order. Memory space is set to LogicalMemorySpace::MainMemory.

§Errors

Returns an error if data.len() does not match the product of dims.
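
§Examples
A small sketch following the other constructor examples on this page:

use tenferro_tensor::{Tensor, MemoryOrder};

// 2×3 matrix stored column-major: columns [1, 2], [3, 4], [5, 6]
let data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
let t = Tensor::<f64>::from_slice(&data, &[2, 3], MemoryOrder::ColumnMajor).unwrap();
assert_eq!(t.dims(), &[2, 3]);

// Length mismatch: 6 elements cannot fill a [4, 2] tensor
assert!(Tensor::<f64>::from_slice(&data, &[4, 2], MemoryOrder::ColumnMajor).is_err());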


pub fn from_vec( _data: Vec<T>, _dims: &[usize], _strides: &[isize], _offset: isize, ) -> Result<Self>

Create a tensor from an owned Vec<T> with explicit layout.

Takes ownership of the data. The caller specifies the dims, strides, and offset that describe how the data is laid out.

§Errors

Returns an error if the layout is inconsistent with the data length.

§Examples
use tenferro_tensor::Tensor;

// 2×3 column-major: strides [1, 2], offset 0
let data = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
let t = Tensor::<f64>::from_vec(data, &[2, 3], &[1, 2], 0).unwrap();

pub fn eye( _n: usize, _memory_space: LogicalMemorySpace, _order: MemoryOrder, ) -> Self

Create an identity matrix.

Returns a 2D tensor of shape [n, n] with ones on the diagonal and zeros elsewhere.

§Examples
use tenferro_tensor::{Tensor, MemoryOrder};
use tenferro_device::LogicalMemorySpace;

let id = Tensor::<f64>::eye(3,
    LogicalMemorySpace::MainMemory, MemoryOrder::ColumnMajor);
assert_eq!(id.dims(), &[3, 3]);

pub fn dims(&self) -> &[usize]

Returns the shape (size of each dimension).


pub fn strides(&self) -> &[isize]

Returns the strides (in units of T).


pub fn offset(&self) -> isize

Returns the element offset into the data buffer.


pub fn buffer(&self) -> &DataBuffer<T>

Returns a reference to the underlying data buffer.


pub fn buffer_mut(&mut self) -> &mut DataBuffer<T>

Returns a mutable reference to the underlying data buffer.


pub fn ndim(&self) -> usize

Returns the number of dimensions (rank).


pub fn len(&self) -> usize

Returns the total number of elements.


pub fn is_empty(&self) -> bool

Returns true if the tensor has zero elements.


pub fn logical_memory_space(&self) -> LogicalMemorySpace

Returns the logical memory space where this tensor’s data resides.


pub fn preferred_compute_device(&self) -> Option<ComputeDevice>

Returns the preferred compute device override, if set.


pub fn set_preferred_compute_device(&mut self, device: Option<ComputeDevice>)

Set the preferred compute device override.

When set, this device will be used for operations on this tensor instead of the default device selected by preferred_compute_devices. Pass None to clear the override and revert to automatic selection.


pub fn effective_compute_devices( &self, _op_kind: OpKind, ) -> Result<Vec<ComputeDevice>>

Return the effective compute devices for a given operation kind.

If a preferred compute device is set, returns a single-element vector containing that device. Otherwise, delegates to preferred_compute_devices.

§Errors

Returns an error if no compatible compute device is found.


pub fn tensor_view(&self) -> TensorView<'_, T>

Returns a TensorView for data inspection.

Waits for any pending accelerator computation before returning. The returned view has event = None (data is ready to read).


pub fn permute(&self, _perm: &[usize]) -> Result<Tensor<T>>

Permute (reorder) the dimensions of the tensor.

This is a zero-copy operation that only modifies dims and strides. Waits for any pending accelerator computation before returning.

§Arguments
  • perm — Permutation of dimension indices (e.g., &[1, 0] to transpose)
§Errors

Returns an error if perm is not a valid permutation of 0..ndim().
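
§Examples
A minimal sketch using the column-major constructor shown above:

use tenferro_tensor::{Tensor, MemoryOrder};
use tenferro_device::LogicalMemorySpace;

let a = Tensor::<f64>::zeros(&[2, 3, 4],
    LogicalMemorySpace::MainMemory, MemoryOrder::ColumnMajor);
// Move the last axis to the front
let b = a.permute(&[2, 0, 1]).unwrap();
assert_eq!(b.dims(), &[4, 2, 3]);

// Not a valid permutation of 0..3
assert!(a.permute(&[0, 0, 1]).is_err());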


pub fn broadcast(&self, _target_dims: &[usize]) -> Result<Tensor<T>>

Broadcast the tensor to a larger shape.

Dimensions of size 1 are expanded to the target size by setting their stride to 0. This is a zero-copy metadata operation; no data is moved.

§Errors

Returns an error if target_dims is incompatible with the current shape.
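
§Examples
A minimal sketch of the size-1 expansion rule described above:

use tenferro_tensor::{Tensor, MemoryOrder};
use tenferro_device::LogicalMemorySpace;

// A [3, 1] column is expanded across 4 columns without copying
let a = Tensor::<f64>::zeros(&[3, 1],
    LogicalMemorySpace::MainMemory, MemoryOrder::ColumnMajor);
let b = a.broadcast(&[3, 4]).unwrap();
assert_eq!(b.dims(), &[3, 4]);
assert_eq!(b.strides()[1], 0); // broadcast dimension has stride 0

// Incompatible target shape (3 cannot become 2)
assert!(a.broadcast(&[2, 4]).is_err());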


pub fn diagonal(&self, _axes: &[(usize, usize)]) -> Result<Tensor<T>>

Extract a diagonal view by merging pairs of axes.

For each (axis_i, axis_j) pair, the two dimensions are replaced by a single diagonal dimension. This is a zero-copy stride trick.

§Errors

Returns an error if any axis is out of range or the paired dimensions have different sizes.
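
§Examples
A minimal sketch, pairing the two axes of a square matrix to take its main diagonal:

use tenferro_tensor::{Tensor, MemoryOrder};
use tenferro_device::LogicalMemorySpace;

let a = Tensor::<f64>::eye(4,
    LogicalMemorySpace::MainMemory, MemoryOrder::ColumnMajor);
// Merge axes 0 and 1 into a single diagonal axis
let d = a.diagonal(&[(0, 1)]).unwrap();
assert_eq!(d.dims(), &[4]);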


pub fn reshape(&self, _new_dims: &[usize]) -> Result<Tensor<T>>

Reshape the tensor to a new shape.

The total number of elements must remain the same. Requires contiguous data; returns an error if the tensor is not contiguous.

§Errors

Returns an error if the tensor is not contiguous or the new shape has a different total element count.
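
§Examples
A minimal sketch starting from a contiguous tensor:

use tenferro_tensor::{Tensor, MemoryOrder};
use tenferro_device::LogicalMemorySpace;

let a = Tensor::<f64>::zeros(&[3, 4],
    LogicalMemorySpace::MainMemory, MemoryOrder::ColumnMajor);
let b = a.reshape(&[2, 6]).unwrap();
assert_eq!(b.dims(), &[2, 6]);

// Element count must match: 3 * 4 != 5
assert!(a.reshape(&[5]).is_err());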


pub fn select(&self, _dim: usize, _index: usize) -> Result<Tensor<T>>

Select a single index along a dimension, removing that dimension.

Returns a tensor with ndim() - 1 dimensions. This is a zero-copy operation that adjusts the offset and removes the selected dimension.

§Errors

Returns an error if dim >= ndim() or index >= dims()[dim].

§Examples
use tenferro_tensor::{Tensor, MemoryOrder};
use tenferro_device::LogicalMemorySpace;

// Batched matrices [m, n, batch] = [3, 4, 10]
let a = Tensor::<f64>::zeros(&[3, 4, 10],
    LogicalMemorySpace::MainMemory, MemoryOrder::ColumnMajor);
// Select batch index 5 → [3, 4]
let mat = a.select(2, 5).unwrap();
assert_eq!(mat.dims(), &[3, 4]);

pub fn narrow( &self, _dim: usize, _start: usize, _length: usize, ) -> Result<Tensor<T>>

Narrow (slice) a dimension to a sub-range.

Returns a tensor with the same number of dimensions, but dims()[dim] reduced to length. Zero-copy: only offset and dim size change.

§Errors

Returns an error if dim >= ndim() or start + length > dims()[dim].

§Examples
use tenferro_tensor::{Tensor, MemoryOrder};
use tenferro_device::LogicalMemorySpace;

let a = Tensor::<f64>::zeros(&[3, 10],
    LogicalMemorySpace::MainMemory, MemoryOrder::ColumnMajor);
// Take columns 2..5 → [3, 3]
let sub = a.narrow(1, 2, 3).unwrap();
assert_eq!(sub.dims(), &[3, 3]);

pub fn contiguous(&self, _order: MemoryOrder) -> Tensor<T>

Return a contiguous copy of this tensor in the given memory order.

If the tensor is already contiguous in the requested order, this may avoid copying (implementation-defined).
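
§Examples
A minimal sketch repacking a column-major tensor into row-major order:

use tenferro_tensor::{Tensor, MemoryOrder};
use tenferro_device::LogicalMemorySpace;

let a = Tensor::<f64>::zeros(&[3, 4],
    LogicalMemorySpace::MainMemory, MemoryOrder::ColumnMajor);
// `a` itself is left untouched; `b` is a contiguous row-major copy
let b = a.contiguous(MemoryOrder::RowMajor);
assert!(b.is_contiguous());
assert_eq!(b.dims(), a.dims());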


pub fn into_contiguous(self, _order: MemoryOrder) -> Tensor<T>

Consume this tensor and return a contiguous version.

If the tensor is already contiguous in the requested order, returns self without copying or allocating. Otherwise, copies data into a new contiguous buffer.

Prefer this over contiguous when you no longer need the original tensor, as it avoids unnecessary allocation and reference-count overhead.

§Examples
use tenferro_tensor::{Tensor, MemoryOrder};
use tenferro_device::LogicalMemorySpace;

let a = Tensor::<f64>::zeros(
    &[2, 3, 4],
    LogicalMemorySpace::MainMemory,
    MemoryOrder::ColumnMajor,
);

// Permuting the axes creates a non-contiguous view.
// (A plain 2D transpose of a column-major matrix would still be
// row-major contiguous, so a 3D permutation is used here.)
let at = a.permute(&[2, 0, 1]).unwrap();
assert!(!at.is_contiguous());

// into_contiguous copies only when necessary
let at_contig = at.into_contiguous(MemoryOrder::ColumnMajor);
assert!(at_contig.is_contiguous());

// Already contiguous: zero-cost passthrough
let b = Tensor::<f64>::zeros(
    &[3, 4],
    LogicalMemorySpace::MainMemory,
    MemoryOrder::RowMajor,
);
let b2 = b.into_contiguous(MemoryOrder::RowMajor); // no copy

pub fn is_contiguous(&self) -> bool

Returns true if the tensor data is contiguous in memory.

A tensor is contiguous if its elements occupy a dense block of memory with no gaps, in either column-major or row-major order.


pub fn conj(&self) -> Tensor<T>
where T: Conjugate,

Return a tensor with complex-conjugated elements.

For real types (f32, f64), returns a copy unchanged. For complex types (Complex32, Complex64), negates the imaginary part.

§Examples
use tenferro_tensor::{Tensor, MemoryOrder};
use num_complex::Complex64;

let data = vec![Complex64::new(1.0, 2.0), Complex64::new(3.0, -4.0)];
let a = Tensor::from_slice(&data, &[2], MemoryOrder::ColumnMajor).unwrap();
let a_conj = a.conj();
// a_conj contains [1.0 - 2.0i, 3.0 + 4.0i]

pub fn into_conj(self) -> Tensor<T>
where T: Conjugate,

Consume this tensor and return one with complex-conjugated elements.

Like conj but consumes self, potentially reusing the buffer if no other references exist.


pub fn tril(&self, _diagonal: isize) -> Tensor<T>

Extract the lower triangular part of a matrix.

Returns a new tensor with elements above the diagonal-th diagonal set to zero. For batched tensors (m, n, *), applies independently to each batch element.

  • diagonal = 0: main diagonal (default)
  • diagonal > 0: above main diagonal
  • diagonal < 0: below main diagonal
§Examples
use tenferro_tensor::{Tensor, MemoryOrder};
use tenferro_device::LogicalMemorySpace;

let a = Tensor::<f64>::ones(&[3, 3],
    LogicalMemorySpace::MainMemory, MemoryOrder::ColumnMajor);
let lower = a.tril(0);
// [[1, 0, 0],
//  [1, 1, 0],
//  [1, 1, 1]]

pub fn triu(&self, _diagonal: isize) -> Tensor<T>

Extract the upper triangular part of a matrix.

Returns a new tensor with elements below the diagonal-th diagonal set to zero. For batched tensors (m, n, *), applies independently to each batch element.

  • diagonal = 0: main diagonal (default)
  • diagonal > 0: above main diagonal
  • diagonal < 0: below main diagonal
§Examples
use tenferro_tensor::{Tensor, MemoryOrder};
use tenferro_device::LogicalMemorySpace;

let a = Tensor::<f64>::ones(&[3, 3],
    LogicalMemorySpace::MainMemory, MemoryOrder::ColumnMajor);
let upper = a.triu(0);
// [[1, 1, 1],
//  [0, 1, 1],
//  [0, 0, 1]]

pub fn to_memory_space_async( &self, _target: LogicalMemorySpace, ) -> Result<Tensor<T>>

Asynchronously transfer this tensor to a different memory space.

Returns a new tensor in the target memory space. If the source and destination spaces are the same, returns a zero-copy no-op. Otherwise, data is copied (potentially asynchronously for GPU).

§Errors

Returns an error if the transfer is not supported.
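
§Examples
A minimal sketch of the same-space no-op case; transferring to an accelerator space works the same way with the corresponding LogicalMemorySpace variant:

use tenferro_tensor::{Tensor, MemoryOrder};
use tenferro_device::LogicalMemorySpace;

let a = Tensor::<f64>::zeros(&[3, 4],
    LogicalMemorySpace::MainMemory, MemoryOrder::ColumnMajor);
// Source and target spaces match, so this is a zero-copy no-op
let b = a.to_memory_space_async(LogicalMemorySpace::MainMemory).unwrap();
assert_eq!(b.dims(), a.dims());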


pub fn wait(&self)

Wait for any pending GPU computation to complete.

No-op for CPU tensors or when GPU computation has already completed. Methods that access tensor data from CPU call this internally, so explicit calls are only needed when the caller wants to ensure completion at a specific point.

§Examples
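// Sketch only (not runnable as-is): `einsum` is assumed to come from the
// tenferro-einsum crate, and `a_gpu`, `b_gpu`, `e_gpu` are assumed to be
// GPU-resident tensors created elsewhere.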
// GPU einsum returns immediately with pending event
let c = einsum("ij,jk->ik", &[&a_gpu, &b_gpu]).unwrap();
assert!(!c.is_ready());

// Explicit wait
c.wait();
assert!(c.is_ready());

// Chaining: implicit sync via stream dependencies, no CPU wait
let d = einsum("ij,jk->ik", &[&c, &e_gpu]).unwrap();
//  → detects c.event → chains on GPU → returns immediately

pub fn is_ready(&self) -> bool

Check if tensor data is ready without blocking.

Returns true for CPU tensors (always ready) and for GPU tensors whose computation has completed. Returns false if a GPU operation is still in progress.

Trait Implementations§

impl<T: Scalar> Clone for Tensor<T>

fn clone(&self) -> Self

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl<T: Scalar> Differentiable for Tensor<T>

type Tangent = Tensor<T>

The tangent type for this value.

fn zero_tangent(&self) -> Tensor<T>

Returns the zero tangent for this value (additive identity).

fn accumulate_tangent(_a: Tensor<T>, _b: &Tensor<T>) -> Tensor<T>

Accumulates (adds) two tangents: a + b.

Auto Trait Implementations§

impl<T> Freeze for Tensor<T>

impl<T> !RefUnwindSafe for Tensor<T>

impl<T> Send for Tensor<T>

impl<T> Sync for Tensor<T>

impl<T> Unpin for Tensor<T>
where T: Unpin,

impl<T> !UnwindSafe for Tensor<T>

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬 This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.