Research

Pluggable Memory Pooling for Distributed AI Workloads

CXL-style memory aggregation across heterogeneous nodes, with allocation policies tuned for the activation-tensor traffic that AI workloads actually generate.

Lab

TsugiFabric

Status

Research

Filed patent

None

Companion

64/055,093

Mechanism

The CXL pool primitive is referenced in Infinity (US Prov. 64/055,093) as an apparatus-layer element of the gradient consensus substrate, where the temporally coherent CXL.mem pool is indexed by a 64-bit regime identifier. This research line extends that substrate primitive to a pluggable memory pool, aggregated CXL-style across heterogeneous nodes, whose allocation policy is workload-aware. Activation tensors with high re-read frequency are pinned to lower-latency tiers, while gradient checkpoints with low re-read frequency are eligible for far-tier pooling. The textbook CXL allocator is workload-agnostic. The research question is what changes when allocation is informed by the activation-tensor reuse profile rather than by a generic LRU or interleave heuristic.
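To make the placement rule above concrete, here is a minimal sketch of a reuse-aware tier policy. It is illustrative only and is not the TsugiFabric allocator: the names `Buffer`, `Tier`, `place`, the `rereads_per_step` metric, and the `hot_threshold` value are all assumptions introduced for this example, and the greedy hottest-first strategy is one simple choice among many.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    NEAR = "near"  # lower-latency tier (assumed: node-local memory)
    FAR = "far"    # pooled far tier (assumed: CXL-attached pool)


@dataclass(frozen=True)
class Buffer:
    name: str
    size_bytes: int
    rereads_per_step: float  # hypothetical reuse-profile metric


def place(buffers, near_capacity_bytes, hot_threshold=1.0):
    """Greedy sketch: visit buffers hottest-first and pin hot ones to the
    near tier while capacity remains; everything else goes to the far tier."""
    placement = {}
    remaining = near_capacity_bytes
    for buf in sorted(buffers, key=lambda b: b.rereads_per_step, reverse=True):
        is_hot = buf.rereads_per_step >= hot_threshold
        if is_hot and buf.size_bytes <= remaining:
            placement[buf.name] = Tier.NEAR
            remaining -= buf.size_bytes
        else:
            placement[buf.name] = Tier.FAR
    return placement


GIB = 2**30
buffers = [
    Buffer("attn_activations", 4 * GIB, rereads_per_step=6.0),
    Buffer("mlp_activations", 3 * GIB, rereads_per_step=2.0),
    Buffer("grad_checkpoint_0", 2 * GIB, rereads_per_step=0.1),
]
plan = place(buffers, near_capacity_bytes=8 * GIB)
for name, tier in plan.items():
    print(name, tier.value)
```

With these made-up sizes and reuse figures, both activation buffers fit in the near tier and the rarely re-read checkpoint falls through to the far tier. A workload-agnostic LRU or interleave policy would not make that distinction, which is the contrast the research question is about.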

Why this matters

Activation-tensor traffic dominates training memory pressure at sub-datacenter scale. Allocator decisions made without modeling reuse frequency leave significant headroom unused, particularly for adaptation passes where checkpoint placement and activation recomputation trade against each other.
The release of the CXL 3.0 and 3.1 specifications has opened up the pluggable-pool design point, but the policy layer remains under-explored relative to the protocol layer. This is the surface where there is room for technical differentiation that does not depend on winning a protocol-stack race.
Status is Research. A filing decision is deferred until the allocation-policy work is empirically validated against a heterogeneous test bed. We do not want to anchor a claim on policy performance numbers we have not yet measured.

Status and what's next

Active research. Allocation-policy characterization is in progress against representative activation-tensor reuse profiles. Honest disclosure: there is no published benchmark on this surface at the time of writing. Expected output is a workshop preprint within the next 12 to 18 months, at which point the filing question will be re-opened with measured policy curves in hand rather than projections.