SPRIND · Next Frontier AI Challenge

Submission Form

Recurrent Intelligence: SPRIND Next Frontier AI Challenge, working submission document.

Color coding (tag each field/answer):


1. Your Personal Information ⚪

Field Value
Salutation (Ms. / Mr. / not specified)
Title (Dr. / Prof. / Prof. Dr.)
Last name
First name
Email
Phone number
Your position (e.g. project manager, CEO)
Institution (dropdown)
Institution (other)
Legal Form (None / Ltd. / GmbH / UG / GbR / KG / AG / Individual Enterprise / Other)
Legal Form (other)
Address line 1
Address line 2
Postal Code
City
Country
Country (other)
Did you know about SPRIND before learning about the Next Frontier AI Initiative? (Yes/No)
Where did you first learn that SPRIND exists? (dropdown)
Please specify where you heard about us.
What made you decide to actually apply to this Challenge? (dropdown)
Please specify why you decided to actually apply to this Challenge.

2. Your Solution

Project Title (text, 50 chars) 🔵

Recurrent Intelligence

Short Description (textarea, 500 chars) 🔵

Recurrent networks are the efficient path to frontier AI. We make recurrent models frontier-grade, with constant per-step compute, no growing context window and memory in the state, at order-of-magnitude better efficiency. We already show structured sparsity beating dense on RNNs, and predict active neuron blocks to run up to 26x faster on GPUs. We target frontier-scale LLMs (Nemotron-class) and systems as a modality, always-on prediction of dynamic systems, as in-house SOTA on small servers.

Frontier Dimension (textarea, 500 chars) 🔵

Our bet is that the world is continuous, so it needs continuous AI, and that no single method wins: the answer is a combination of proven components and new mechanisms. We want to make recurrent networks frontier-grade and efficient, with history in a fixed state, not a growing context window, for bounded memory, predictable per-step compute and unbounded streaming. The capability this unlocks is systems as a modality, the always-on understanding and prediction of dynamic systems.

Core Idea & Architecture (textarea, 3000 chars) 🔵

The system is a recurrent, state-based architecture for efficient long-horizon inference under always-on, low-SWaP constraints. Its showcase capability is systems as a modality, the continuous understanding and prediction of a dynamic system from its multi-sensor stream. The central substrate is a fixed-size latent state updated at each timestep. Unlike transformer inference, history is not represented by an expanding context window or repeated attention over stored tokens. Information is compressed into the evolving state, which enables streaming-native execution with bounded per-step memory and compute, independent of the processed sequence length.

The data flow runs from input stream, to recurrent state update, to a state-conditioned router, to the selected computation blocks, to the updated state and a task-specific output. At each timestep the model receives the current input and the previous latent state. A selective recurrent core, based on SSM/Mamba, S4/S5, HiPPO, spiking or xLSTM-style dynamics, updates the persistent state. A router then selects a bounded, hardware-compatible subset of computation blocks as a function of both current content and accumulated state. This is state-conditioned structured sparsity: only the active sub-network runs, and the selected path can change within a sequence as the inferred context changes.

The computation blocks can include expressive recurrent neurons, multi-timescale cells, recursive or hierarchical reasoning modules, structured sparse operators, and, where empirically justified, optimised attention or MLP blocks. These modules are not a fixed catalogue. Each is kept only if controlled ablations show measurable gains in latency, accuracy, stability, long-horizon retention or energy efficiency.

The key interaction is that recurrence not only stores information but also controls computation. The evolving state is both compressed memory and the control variable for routing, so prior trajectory affects which blocks activate later. This yields path-dependent computation and a content-addressed memory interface.
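The data flow described above (input, recurrent state update, state-conditioned router, selected blocks, updated state) can be illustrated with a minimal, untrained toy sketch. Everything in it is an assumption made for illustration: the names `RoutedRecurrentCell` and `run_stream` are hypothetical, a plain tanh RNN stands in for the selective recurrent core, routing is a simple top-k over linear scores, and block outputs are averaged. It is not the project's implementation. It only shows that the state stays a fixed size over any stream length and that block selection depends on the accumulated state.

```python
import math
import random

def make_matrix(rows, cols, rng, scale=0.5):
    """Random dense matrix as a list of rows (toy initialisation)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

class RoutedRecurrentCell:
    """Hypothetical toy cell: fixed-size state, top-k routing on state and input."""

    def __init__(self, input_dim, state_dim, num_blocks, top_k, seed=0):
        rng = random.Random(seed)
        self.state_dim = state_dim
        self.top_k = top_k
        self.w_in = make_matrix(state_dim, input_dim, rng)
        self.w_rec = make_matrix(state_dim, state_dim, rng)
        # The router scores every block from the concatenation [state, input].
        self.w_route = make_matrix(num_blocks, state_dim + input_dim, rng)
        # Each block is a small state-to-state map; only selected blocks run.
        self.blocks = [make_matrix(state_dim, state_dim, rng) for _ in range(num_blocks)]

    def step(self, x, state):
        # 1. Recurrent update (assumption: a tanh RNN stands in for the selective core).
        pre = [a + b for a, b in zip(matvec(self.w_in, x), matvec(self.w_rec, state))]
        state = [math.tanh(p) for p in pre]
        # 2. State-conditioned routing: scores depend on accumulated state and input.
        scores = matvec(self.w_route, state + x)
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        active = ranked[: self.top_k]
        # 3. Run only the active blocks and fold their mean output into the state.
        update = [0.0] * self.state_dim
        for i in active:
            out = matvec(self.blocks[i], state)
            update = [u + o / self.top_k for u, o in zip(update, out)]
        new_state = [math.tanh(s + u) for s, u in zip(state, update)]
        return new_state, sorted(active)

def run_stream(cell, stream):
    """Process a stream of any length while the state keeps a constant size."""
    state = [0.0] * cell.state_dim
    paths = []
    for x in stream:
        state, active = cell.step(x, state)
        paths.append(active)
    return state, paths

# Usage with random inputs (illustrative only, no trained weights).
rng = random.Random(1)
stream = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(50)]
cell = RoutedRecurrentCell(input_dim=3, state_dim=8, num_blocks=6, top_k=2)
final_state, paths = run_stream(cell, stream)
print(len(final_state), paths[:5])
```

Per-step work in this sketch is bounded by `top_k` block evaluations regardless of how many steps came before, which is the property the routing utility and realised-sparsity tests are meant to check on real models.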

The output layer maps the updated state and the selected block activations to task-specific predictions, such as detection, classification, correction or control signals. Because labelled system data is scarce, datasets are produced to defined requirements, using small targeted sets and design partners, rather than assumed to exist. The architecture targets dynamic edge and mobile deployment, where real-time inference, bounded memory, predictable compute and energy efficiency are required. Core claims are evaluated by benchmarking against dense transformers, SSMs, Mamba, MoE-Mamba, BlackMamba-style sparse recurrent models and hybrid baselines. Falsification tests target long-context retention, routing utility, realised sparsity, compute scaling, latency, energy per token and failure-prevention performance.

Technical Novelty (textarea, 2000 chars) 🔵

The novelty is a frontier-grade recurrent network. We want to solve the long-standing problems of recurrent models, such as recall, stability and efficiency at scale, so that a fixed-state recurrent model reaches SOTA quality at order-of-magnitude better efficiency than dense transformers, with constant per-step compute, bounded memory and unbounded streaming. The capability this unlocks is systems as a modality, the continuous understanding and prediction of a dynamic system from its endless multi-sensor stream, which transformers structurally cannot serve.

We aim to reach this by composing proven components with new mechanisms, not by any single trick, and we use attention or MLP blocks where they measurably help rather than avoiding them. The substrate is selective state-space and recurrent dynamics (SSM, S4/S5, Mamba, Mamba-2, xLSTM, StateX), which carry history in a fixed evolving state. Onto this we add expressive and multi-timescale units (ELM-style), recursive or hierarchical reasoning (GRAM, HRM, TRM, Titans), and conditional computation and routing (MoE, Switch, capsule routing, MoE-Mamba, BlackMamba, Routing-Mamba).

One new mechanism with early evidence is state-conditioned, path-dependent routing, where the active sub-network is selected from the accumulated recurrent state, so the computation path can change mid-sequence and the state doubles as a content-addressed memory. On the cue-switch task: 99% with stateful routing versus below 70% with stateless routing. It is one tool in the kit, not the thesis.

The capability gap is efficient long-horizon inference at bounded memory and predictable compute. Transformers rely on growing context windows, KV-caches and repeated attention; pure SSMs stream efficiently but compress history and show recall gaps. By combining recurrent state, content-addressed memory and accelerator-friendly sparse activation, we target lower latency, lower energy per token and sub-quadratic compute versus length, for always-on, in-house performance on small servers.

Technical Novelty Citation (textarea, 1000 chars) 🔵

Representative prior art. Selective and state-space recurrence: Mamba, Mamba-2, xLSTM, StateX [1–4]. Routing and conditional computation: Capsules, Sparsely-Gated MoE, RIMs, MoE-Mamba, BlackMamba, Routing-Mamba, Swimba [5–11]. Expressive recurrent units: ELM [12]. Recursive and memory-augmented reasoning: HRM, TRM, GRAM, Titans [13–16]. Hybrid recurrent-attention systems: Jamba and Nemotron-H [17–18]. Neuromorphic and event-driven systems: Loihi 2, SpiNNaker2, SpikingBrain [19–21]. Our delta is state-conditioned routing, where the accumulated recurrent state, rather than a token-local embedding alone, selects the active computation path and the memory address.

[1] Mamba; [2] Mamba-2; [3] xLSTM; [4] StateX. [5] Capsules; [6] Sparsely-Gated MoE; [7] RIMs; [8] MoE-Mamba; [9] BlackMamba; [10] Routing-Mamba; [11] Swimba. [12] ELM; [13] HRM; [14] TRM; [15] GRAM; [16] Titans. [17] Jamba; [18] Nemotron-H; [19] Loihi 2; [20] SpiNNaker2; [21] SpikingBrain.

Capability Gap Addressed (textarea, 1000 chars) 🔵

The new functionality comes from efficiency, not from copying transformers. A fixed-state recurrent model gives always-on streaming at bounded per-step compute and memory, which is far leaner than an attention or KV-cache cost that grows with sequence length [16]. SSM and Mamba models stream cheaply but compress history and show recall gaps [1,4], which we want to fix. This unlocks systems as a modality, the continuous understanding and prediction of dynamic systems, as in-house performance on small servers for security and critical infrastructure, which transformers cannot serve structurally. Path-dependent computation lets the active sub-network switch mid-sequence as the inferred state changes (cue-switch: 99% stateful versus below 70% stateless). Recursive models also show that reasoning can arise from architectural depth at small parameter counts [13–15].
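The contrast between a KV-cache that grows with sequence length and a fixed recurrent state can be made concrete with a small back-of-envelope sketch. The KV-cache formula is the standard one (keys and values per token, per layer). The model shape and the state size are hypothetical values chosen only for illustration, not figures from this project.

```python
def kv_cache_bytes(seq_len, num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Keys and values are stored for every processed token in every layer."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value

def fixed_state_bytes(num_layers, state_values_per_layer, bytes_per_value=2):
    """Recurrent state: size does not depend on how many tokens were processed."""
    return num_layers * state_values_per_layer * bytes_per_value

# Hypothetical model shape and state size, chosen only for illustration.
for seq_len in (1_000, 100_000, 1_000_000):
    kv = kv_cache_bytes(seq_len, num_layers=32, num_kv_heads=8, head_dim=128)
    st = fixed_state_bytes(num_layers=32, state_values_per_layer=1_000_000)
    print(f"{seq_len:>9} tokens: KV-cache {kv / 1e9:.2f} GB, fixed state {st / 1e9:.2f} GB")
```

The KV-cache term scales linearly with `seq_len`, while the state term is constant, which is the structural difference the paragraph above relies on.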

Existing Artifacts (textarea, 2000 chars) 🔵

These artifacts are the validated tools and R&D machinery we use to build the target system, not the product. The product is a novel frontier-grade recurrent model, showcased by systems as a modality. The items below are proven components, evidence and tooling we compose with new mechanisms (stateful routing among them).

Live dashboard: routed-SSM, kernels, ablations. Versioned JSONs in the repo. [GAP: GitHub repo and preprint links.]

Technology Readiness Level (TRL) Assessment (textarea, 1000 chars) 🔵

Overall system TRL 2–3. The concept is formulated and backed by component-level precedents, but the integrated architecture is experimental and needs validation under controlled benchmarks. Sub-components:

Open Research Risks (textarea, 1000 chars) 🔵

The research-grade risks centre on the routing mechanism, the memory interface and integrated-system scaling. Stateful routing may not beat stateless MoE or dense recurrent baselines beyond toy tasks. The routed state may be too compressed or unstable to act as a reliable content-addressed memory. Training may suffer routing collapse, gradient instability or state drift.

At systems level, sparse state-conditioned execution may not yield real latency or energy gains if routing irregularity causes poor GPU utilisation, and the sub-quadratic compute versus length hypothesis is empirical and may fail under realistic hardware.

On benchmarks and data, gains on cue-switch or synthetic long-context may not transfer to real streaming or critical-system workloads, and labelled system data is scarce. We mitigate with defined data requirements, small targeted datasets and design partners, and test everything with ablations, dense and stateless baselines, scaling curves and hardware profiling.

Compute Requirements (textarea, 1000 chars) 🔵🟢

Frontier-grade capability needs substantial compute, scaled across the three stages. Stage 1 funds validation, ablation and progressively larger demonstrators. We budget roughly 300,000 to 500,000 H100-hours, about 96 to 160 H100 GPUs over the seven-month phase, which is about €1.0 to 1.6M at competitive cloud or European pricing. This is a major share of the €3M envelope, alongside personnel and engineering, with a 2 to 3x reserve for failed runs and sweeps.

We avoid a single brute-force run. Instead we establish controlled scaling curves and several model families up to about 13B parameters, benchmarked against transformer, SSM, Mamba and hybrid baselines, to prove architectural leverage. Full frontier scale needs an order of magnitude more compute, secured in Stages 2 and 3 through long-term procurement and SPRIND bulk allocations.

Hardware is H100-class via cloud, European providers or a bulk SPRIND deal, plus a one-time local workstation GPU spend of about €30 to 100k.
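The Stage 1 figures above (H100-hours, GPU count, cost range) can be cross-checked with a few lines of arithmetic. This sketch assumes a 730-hour average month, and the helper name `implied_budget` is hypothetical. Under that assumption both ends of the range come to 3,125 hours per GPU, roughly 61% of the seven-month wall clock, at about €3.20 to €3.33 per H100-hour.

```python
HOURS_PER_MONTH = 730  # assumption: average month length (8760 h / 12)

def implied_budget(gpu_hours, num_gpus, cost_eur, months=7):
    """Derive per-GPU hours, implied utilisation and price from the stated totals."""
    wall_clock_hours = months * HOURS_PER_MONTH
    hours_per_gpu = gpu_hours / num_gpus
    return {
        "hours_per_gpu": hours_per_gpu,
        "utilisation": hours_per_gpu / wall_clock_hours,
        "eur_per_gpu_hour": cost_eur / gpu_hours,
    }

# Low and high ends of the Stage 1 range stated in the text.
low = implied_budget(gpu_hours=300_000, num_gpus=96, cost_eur=1_000_000)
high = implied_budget(gpu_hours=500_000, num_gpus=160, cost_eur=1_600_000)

for label, b in (("low", low), ("high", high)):
    print(f"{label}: {b['hours_per_gpu']:.0f} h/GPU, "
          f"{b['utilisation']:.0%} utilisation, "
          f"EUR {b['eur_per_gpu_hour']:.2f} per H100-hour")
```

The implied utilisation leaves headroom inside the seven-month window for queueing, debugging and restarts, separate from the stated reserve for failed runs and sweeps.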

KPIs, Benchmarks and Potential Impact (textarea, 1000 chars) 🔵🟢

KPIs follow efficiency, performance and long-horizon operation: accuracy parity/improvement of routed-sparse vs dense at matched FLOPs; decode latency and throughput (tokens/s) vs compiled dense transformer and SSM baselines; energy per token; max stable stream length at bounded memory; realised sparsity and active-block count; and compute vs sequence length (sub-quadratic test).
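One way to operationalise the sub-quadratic test named above is to fit the slope of log(cost) against log(sequence length): a slope near 1 indicates linear scaling, a slope near 2 quadratic scaling. The sketch below is a minimal illustration of that idea. The function names and the 0.1 margin are assumptions, and the two input curves are synthetic, not measurements.

```python
import math

def scaling_exponent(seq_lengths, costs):
    """Least-squares slope of log(cost) against log(sequence length)."""
    xs = [math.log(n) for n in seq_lengths]
    ys = [math.log(c) for c in costs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

def is_sub_quadratic(seq_lengths, costs, margin=0.1):
    """Pass only if the fitted exponent is clearly below 2 (margin is an assumption)."""
    return scaling_exponent(seq_lengths, costs) < 2.0 - margin

# Synthetic curves for illustration only: one linear, one quadratic in length.
lengths = [1024, 2048, 4096, 8192]
linear_cost = [3.0 * n for n in lengths]
quadratic_cost = [1e-3 * n ** 2 for n in lengths]

print(scaling_exponent(lengths, linear_cost), is_sub_quadratic(lengths, linear_cost))
print(scaling_exponent(lengths, quadratic_cost), is_sub_quadratic(lengths, quadratic_cost))
```

In practice the costs would be measured wall-clock time or energy per sequence on the target hardware, since a FLOP count alone can hide the GPU-utilisation effects listed under the open research risks.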

Benchmarks include long-context retrieval, associative recall, cue-switching, streaming classification and real-time prediction; the final suite is chosen in Stage 1.

The lead use case and highest impact is systems as a modality, the continuous understanding and prediction of dynamic systems (power-grid, factory, medical), as always-on, in-house performance on small servers, plus infinite-context assistants and frontier-scale language, code and reasoning under European energy and compute constraints. Impact is assessed by matching benchmarked strengths to use cases, then validating with a design partner.

Work Plan (textarea, 4000 chars) 🟢

The project covers research and commercialisation across all three stages.

Stage 1 (7 months, €3M): validate the training and kernel approach. Deliver proof with a first model, and identify potential use cases plus a partner for co-development.

Stage 2 (8 months, €8M): scale to larger recurrent models. Integration tests run inside co-development projects.

Stage 3 (9 months, €15.5M): application and use case. Real-time AI on the edge (audio and sensor streams), benchmarks against Transformer SOTA and neuromorphic hardware, and a demo of the use case with a partner.

Approach (research method, all stages). Three intertwined modes. First, build on the field by adapting SSMs, conditional computation, routing and capsule networks, expressive neurons, recursive reasoning and sparsity. Second, do original research to develop new mechanisms, such as state- and path-dependent computation, new recurrent cells, routing schemes and efficient sparse compute. A specific bet is to attack the memory and recall problem with feedback cycles (iterative, top-down recurrence) and inference-time selective weight adjustments (test-time weight updates and fast weights). Third, run rigorous empirical science: implement, benchmark on the triad of efficiency, performance and unbounded context against a compiled baseline, profile the real bottleneck, try to falsify, and keep only what survives. This is accelerated by custom AI research tooling for automated experimentation, benchmarking and rapid small-model prototyping.

Implementation and commercialisation. The lead use case is systems as a modality, the continuous understanding, prediction and observation of dynamic systems (power-grid, factory, medical). Each validated approach is benchmarked, matched to use-case requirements, and taken to co-development partners to build one real, validated use case rather than a generic LLM replacement.

Hardware path. GPU-efficient models are the Stage-1 priority. Neuromorphic-chip partnerships are a longer-term lever, and chip-design itself is out of Stage-1 scope.

Team and resourcing. Stage 1 runs on a small, focused team of 5 FTE: the core team full-time, plus experts who support specific technical topics, business operations and go-to-market as part-time or temporary hires, subcontractors or advisors. For Stage 2 we grow the team and extend technical, operational and go-to-market skills, which scales R&D with parallel commercialisation.

Business operations from day one. We form a European legal entity (UG minimum) early in Stage 1, budget at least half an FTE for business operations (finance, controlling, HR), use external providers for HR, accounting and IP-law, and include an IP-protection budget and a post-grant follow-up and financing plan. (Operations and Economic-Viability prose: Jana.)

Collaboration and subcontracting. Research advisory (Uni Lübeck, S. Otte), co-development partners for use-case integration in Stages 2 and 3, compute procurement, and legal and IP. Specific subcontract work packages are defined in the Stage-2 roadmap.

Stage 1 milestones.

Financial Cost Estimate for Stage 1 (numeric, max 3,000,000 EUR) 🟢

Rough estimate (main cost drivers):

Cost driver | EUR
Compute (HW infrastructure & operation) | ~1.0–1.6M
Personnel (5 FTE) | 500K+
Overhead (100%) | 500K+
Total | ~2.0–2.6M (within 3,000,000 cap)

Team (textarea, 2000 chars) 🟢

Why best suited. The two founders already pursue this thesis, sparsity and recurrence for efficient scale, in published research and shipped production systems. Together they cover research, GPU and kernel engineering, and real-time systems, backed by business, strategic and research advisors covering IP, commercialisation and state-based-model science. They have a real entrepreneurial track record and fast execution. Networks include MIT, Numenta, Mercedes-Benz, Merantix AI Campus, Uni Lübeck, Bitkom and Antler.

Founders / core team:

Business and Strategic Advisor: Jana Lehner. Physics PhD; Director of IP and former CBO at a quantum deep-tech scale-up; 19 years at IBM; Bitkom board; managed two €10M grants; covers IP, commercial and partnerships.

Research Advisor: Sebastian Otte. Professor at Uni Lübeck and Geoffrey's PhD supervisor; works on state-based and recurrent models (Active Tuning, recurrent spiking control).

Additional researcher (possible): Johann Machemer. DNN pruning research (Calprune); 2 peer-reviewed papers and a FLAIRS Best Student Paper; Uni Lübeck.

What is missing and how we close it. The core is deliberately lean. We deepen hands-on coverage of specific methods and models with targeted senior research and engineering hires as funding grows, and we outsource HR, accounting and IP-law, drawing on the networks above.

# | Name, role, % FTE (250 chars) | Track-record links (250 chars)
Team Member 1 | Tebjan Halm, Founder, [% FTE GAP] | tebjan.de, GitHub
Team Member 2 | Geoffrey Kasenbacher, Founder, [% FTE GAP] | GitHub
Team Member 3 | Jana Lehner, Business & Strategic Advisor, [% FTE GAP] | [GAP]

3. Attachments (PDF only) ⚪


4. Legal & Declarations ⚪


5. Spam Protection ⚪

CAPTCHA: "What letter is the second last letter of the alphabet?" answer Y


6. Submit

→ Send SPRIND Challenge submission


Changelog

2026-06-01 - Continuous-AI bet, momentum proofs, memory research bet

2026-06-01 - Efficiency-first reframe, plain prose, compute scaled up

2026-06-01 - Reframe groundwork + Google-Doc sync + artifact reframe

2026-05-31 22:30 CEST - Prioritise SSM/Mamba artifacts

2026-05-31 22:05 CEST - Integrate Jana's field texts

2026-05-31 21:50 CEST - Submission form goes live as main entry point

2026-05-31 - Form drafted