โ† All Updates

SubQ: First Sub-Quadratic LLM with 12M Token Context

A startup out of Miami claims to have escaped the quadratic attention bottleneck that has defined transformers since 2017, with a 12-million-token context window to prove it.

June 29, 2026 · 5 min read · Architecture Analysis

📋 In This Article

  • 🔑 The Quadratic Problem: Every transformer wastes compute on token-to-token relationships that don't matter
  • ⚙️ Sparse Attention (SSA): Content-based token selection instead of position-based. Linear scaling, not quadratic.
  • 📊 Efficiency Numbers: 64.5x less compute at 1M tokens. 52x faster than Flash Attention 2. 12M context window.
  • ✅ Quality Benchmarks: RULER 95.6%, needle-in-haystack 100% at 2M tokens, SWE-bench 81.8%
  • ⚠️ Honest Caveats: Vendor-reported, no open weights, no independent reproduction yet

The Problem

Every transformer LLM since 2017 has been bottlenecked by quadratic attention: doubling the input length quadruples the attention compute. This is why 1M-token contexts cost $5-25 per query and why RAG pipelines exist: not because retrieval is better, but because feeding full documents is too expensive.
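
To make the scaling concrete, here is a minimal sketch that counts query-key pairs in dense self-attention for a single head and layer. The function name `attention_pair_count` is illustrative, and the count ignores causal masking, which roughly halves the work but leaves it quadratic.

```python
def attention_pair_count(n_tokens: int) -> int:
    """Query-key score entries in dense self-attention (one head, one layer)."""
    return n_tokens * n_tokens


if __name__ == "__main__":
    baseline = attention_pair_count(128_000)
    for n in (128_000, 256_000, 512_000, 1_024_000):
        pairs = attention_pair_count(n)
        # Each doubling of the context multiplies the pair count by four.
        print(f"{n:>9,} tokens: {pairs:.2e} pairs, {pairs / baseline:.0f}x the 128K cost")
```

Going from 128K to roughly 1M tokens is three doublings, so the pair count grows by a factor of 4 × 4 × 4 = 64.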

The Architecture: Subquadratic Sparse Attention (SSA)

SubQ, built by Miami startup Subquadratic (founded by ex-Meta AI leads, $29M seed), replaces dense attention with content-based sparse selection. Each token learns to select a small subset of other tokens that are semantically relevant, then full attention math runs only on those pairs.
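
The article gives no implementation details for SSA, so the following is only a generic top-k sparse attention sketch, not Subquadratic's method. It illustrates the general idea of "select a few relevant tokens, then run ordinary attention on just those". The name `topk_sparse_attention` and the budget `k` are assumptions made for illustration. This toy version still scores every key to pick the top `k`, so its selection step remains quadratic; a genuinely sub-quadratic design would need a cheaper selector, which the source does not describe.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def topk_sparse_attention(queries, keys, values, k):
    """For each query, attend only to the k keys with the highest scores.

    Hypothetical illustration: the selector below scores all keys, so only
    the softmax and the weighted sum are restricted to k tokens.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        scores = [dot(q, key) * scale for key in keys]
        # Content-based selection: keep the indices of the k best-matching keys.
        selected = sorted(range(len(keys)), key=lambda j: scores[j], reverse=True)[:k]
        weights = softmax([scores[j] for j in selected])
        outputs.append(
            [sum(w * values[j][d] for w, j in zip(weights, selected)) for d in range(dim)]
        )
    return outputs


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
    print(topk_sparse_attention([[2.0, 0.0]], keys, values, k=2))
```

With `k` equal to the number of keys, the function reduces to ordinary dense attention, which makes it easy to compare the two on small inputs.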

This differs from prior approaches:

Efficiency Gains

Context Length | Compute vs Dense | Speed vs FA2
128K tokens    | 8x less          | 8x faster
512K tokens    | 31x less         | 31x faster
1M tokens      | 64.5x less       | 52x faster
12M tokens     | ~1000x less      | n/a
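
One way to sanity-check the compute column in the table above is a simple cost model. It is an assumption of this sketch, not something the vendor states: if dense attention costs about n × n and sparse attention costs about n × k for a fixed per-token budget k, the reduction factor is n / k. The code below backs out the implied k from each reported row, taking 1K as 1,000 tokens.

```python
# Reported compute reductions from the table (context tokens -> reduction factor).
# Assumes 1K = 1,000 tokens; the source does not say whether it means 1,000 or 1,024.
reported = {
    128_000: 8.0,
    512_000: 31.0,
    1_000_000: 64.5,
    12_000_000: 1000.0,  # the table gives this figure as approximate
}


def implied_budget(context_tokens: int, reduction: float) -> float:
    """Under the assumed model (dense ~ n*n, sparse ~ n*k), reduction = n/k, so k = n/reduction."""
    return context_tokens / reduction


if __name__ == "__main__":
    for n, r in reported.items():
        print(f"{n:>10,} tokens, {r:g}x less: implied budget of {implied_budget(n, r):,.0f} tokens")
```

Under this model the implied budget falls between roughly 12,000 and 16,500 tokens in every row, which is what near-linear scaling would look like. It does not confirm the figures themselves.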

Quality Benchmarks

Subquadratic reports 95.6% on RULER, 100% needle-in-haystack retrieval at 2M tokens, and 81.8% on SWE-bench.

Honest caveat: Most benchmarks are vendor-reported. The model weights are not public, and independent reproduction is pending. The architecture is genuinely novel, but real-world performance at 12M tokens remains unverified by third parties. SubQ is available in private beta with SubQ Code (CLI agent) and SubQ Search (long-context research tool).

Official Announcement · Technical Blog