std::execution: A Case Study in Institutional Dynamics

Great Founder Theory in C++

Dec 17, 2025

Abstract

This paper examines the standardization of P2300 (“std::execution”) through the lens of Great Founder Theory (GFT), Samo Burja’s framework for analyzing institutional dynamics. The executor/sender-receiver saga, which spanned over a decade and generated more than 100 papers, raises questions about how WG21 processes handle competing corporate interests, domain-specific requirements, and fundamental design disagreements.

1. Origins: The Executor Discussions (2012-2016)

The trajectory of P2300 traces back to 2012, when executor discussions began in earnest (N3378). The initial unified proposal, P0443 (“A Unified Executors Proposal for C++”), emerged in late 2016 as a compromise among “such organisations and groups as Google, Sandia National Labs, Codeplay, Facebook, Nasdaq, Clearpool.io, Nvidia, Stellar Group, Microsoft, RedHat, HPX, HPC, and domains such as Ultra Low Latency Finance, Embedded Systems, APU, GPU, SYCL, Networking, Machine Learning, Library Development and Big Data.”

One historical account notes that “more than 100 papers and revisions have been produced that either directly or indirectly have significantly impacted the consensus position represented by P0443.” This extraordinary paper volume, arguably unprecedented in WG21 history, raises a question: does it represent productive iteration, or does it reflect structural difficulty in reconciling domain-specific requirements?

Chris Kohlhoff, the author of Boost.Asio and the Networking TS, had been contributing executor proposals since at least 2014 (N4046). P0443 was shaped significantly by the Networking TS’s requirements. The proposal achieved consensus in SG1 (the Concurrency study group) and appeared on track for C++20 by early 2018, until a design concern was raised.

2. The Lazy Execution Question (2018)

At the June 2018 Rapperswil meeting, Eric Niebler and Kirk Shoop (then at Facebook) identified a limitation: P0443 provided “poor support for lazy execution, whereby executors participate in the efficient construction and optimal composition of tasks prior to those tasks being enqueued for execution.” Their solution, presented in P1055, introduced “senders” (originally called “deferred”) and “receivers”: abstractions that could be “freely composed into richer senders and receivers, and only when a receiver was ‘submitted’ to a sender would that task be passed to an execution context for actual execution.”
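The lazy model described above can be sketched in a few lines. This is a toy illustration only, not the P1055 or std::execution API: the names just_sender, then_sender, connect_and_start, and lazy_demo are hypothetical, and a "receiver" here is simply any callable. The point it tries to show is that composing the pipeline runs nothing; work starts only once a receiver is attached.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model, NOT the std::execution API: a "sender" is any
// object with connect_and_start(receiver), and a "receiver" is any callable.

// just(v): describes work that produces v; nothing runs at construction.
template <class T>
struct just_sender {
    T value;
    template <class Receiver>
    void connect_and_start(Receiver r) const { r(value); }
};

template <class T>
just_sender<T> just(T v) { return {std::move(v)}; }

// then(s, f): wraps a sender so that f is applied to its result later.
template <class Sender, class F>
struct then_sender {
    Sender upstream;
    F fn;
    template <class Receiver>
    void connect_and_start(Receiver r) const {
        upstream.connect_and_start(
            [&](auto&& v) { r(fn(std::forward<decltype(v)>(v))); });
    }
};

template <class Sender, class F>
then_sender<Sender, F> then(Sender s, F f) { return {std::move(s), std::move(f)}; }

// Builds a two-stage pipeline and logs the order in which things happen.
inline std::vector<std::string> lazy_demo() {
    std::vector<std::string> log;
    auto pipeline = then(just(20), [&log](int x) {
        log.push_back("stage ran");
        return x * 2 + 2;
    });
    log.push_back("pipeline built");  // composition alone executed nothing
    pipeline.connect_and_start([&log](int result) {
        log.push_back("result " + std::to_string(result));
    });
    return log;
}
```

Calling lazy_demo() in this sketch yields the log entries "pipeline built", "stage ran", "result 42" in that order, which is the property the lazy design was after: the whole task graph exists as a value before any of it is handed to an execution context.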

P1055 represented what one paper called “a radical departure from the P0443 design.” LEWG (Library Evolution Working Group), seeing promise in the approach, deferred the decision to merge P0443. This decision moved executors out of C++20 and initiated a multi-year effort to reconcile two fundamentally different visions of C++ asynchrony.

In September 2018, an ad hoc Executors meeting in Bellevue, WA polled on direction: “For the long-term direction for executors we like senders/receivers.” P1194 (“The Compromise Executors Proposal”) by Niebler, Shoop, and Lewis Baker (author of cppcoro and contributor to Facebook’s folly::coro) attempted to synthesize the approaches.

3. Implementation Investment (2019-2021)

Facebook’s investment became concrete with libunifex, a “prototype implementation of the C++ sender/receiver async programming model” authored primarily by Lewis Baker, Kirk Shoop, and Lee Howes. As Eric Niebler later acknowledged: “most of it has been written by my team, not by me: Lewis Baker, Kirk Shoop, and Lee Howes.” This implementation provided the “existing practice” argument that WG21 values, though it was experimental, internal to Facebook, and lacked the decades of production deployment that Boost.Asio could claim.

By 2021, when P2300 (the formal proposal evolving from the sender/receiver work) appeared, its author list included Michał Dominiak (then NVIDIA), Lewis Baker (Facebook), Lee Howes (Facebook), Michael Garland (NVIDIA), Eric Niebler (then Facebook, later NVIDIA), and Bryce Adelstein Lelbach (NVIDIA). The NVIDIA involvement reflected P2300’s explicit goal of supporting GPU and heterogeneous parallelism, use cases that the networking-centric design had not prioritized.

4. The Design Disagreement (2021)

The disagreement became explicit in late 2021 with competing papers. Ville Voutilainen (representing Finland’s national body) authored P2464 (“Ruminations on networking and executors”), arguing: “This paper makes the plain observation that we should not standardize the Networking TS as it’s currently designed. The problem is the use of a P0443 executor.” Voutilainen argued that the P0443 executor was “not an executor. It’s a work-submitter. It doesn’t provide any means of finding out whether the work submitted succeeded, or whether an error occurred.”
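The “work-submitter” objection can be made concrete with a small sketch. The types below are hypothetical and are not taken from P0443 or P2300; they only illustrate the argument as P2464 frames it, which the Networking TS proponents disputed. A fire-and-forget execute call gives the submitter no defined route to learn the outcome, whereas a receiver-style interface names separate completion channels for value, error, and cancellation.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical illustration; these types are not from P0443 or P2300.

// Fire-and-forget submission: the caller hands over a callable and gets
// nothing back, so a failure has no defined route to the submitter.
struct inline_work_submitter {
    void execute(const std::function<void()>& work) const {
        try { work(); } catch (...) { /* outcome is lost to the caller */ }
    }
};

// Receiver-style completion: the operation ends in one of three named
// channels, each visible to the code that started the work.
struct recording_receiver {
    std::string* outcome;
    void set_value(int v) const { *outcome = "value " + std::to_string(v); }
    void set_error(std::exception_ptr) const { *outcome = "error"; }
    void set_stopped() const { *outcome = "stopped"; }
};

// Runs the work and reports through exactly one channel of the receiver.
template <class Work, class Receiver>
void run_with_receiver(Work work, Receiver r) {
    int result = 0;
    try {
        result = work();
    } catch (...) {
        r.set_error(std::current_exception());
        return;
    }
    r.set_value(result);
}

inline std::string outcome_via_submitter() {
    std::string outcome = "unknown";
    inline_work_submitter{}.execute([] { throw std::runtime_error("disk full"); });
    return outcome;  // still "unknown": nothing was reported back
}

inline std::string outcome_via_receiver() {
    std::string outcome = "unknown";
    run_with_receiver([]() -> int { throw std::runtime_error("disk full"); },
                      recording_receiver{&outcome});
    return outcome;  // "error": the failure arrived on the error channel
}
```

In practice a submitted callable can smuggle results out through captured state, so the sketch overstates the gap somewhat; the P2464 argument is that the interface itself does not specify where outcomes go.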

The Networking TS proponents responded with P2469 (“Response to P2464: The Networking TS is baked, P2300 Sender/Receiver is not”), authored by Jamie Allsop, Vinnie Falco, Richard Hodges, Christopher Kohlhoff, and Klemens Morgenstern. They argued that “the Asio/Net.TS model is a superset of the capabilities of P2300” and that “Evidence suggests that P2300 has not fully explored the design space with respect to the placement of per-operation stable-location memory.”

The October 2021 LEWG polling revealed the depth of division:

Poll 1: “Continue considering shipping P2300 senders/receivers in C++23” — Consensus in favor
Poll 2: “We must have a single async model for the C++ Standard Library” — No consensus
Poll 3: “Stop pursuing the Networking TS/Asio design as the C++ Standard Library’s answer for networking” — No consensus
Poll 4: “Networking in the C++ Standard Library should be based on the sender/receiver model” — Weak consensus in favor

As one observer summarized: “Translation of the above polls: ‘Do we want Networking TS in C++? No. Do we want P2300 in C++? Also no.’” The committee had reached an impasse.

5. Resolution (2022-2024)

The impasse eventually resolved in P2300’s favor through several factors:

Performance demonstration: The NVIDIA authors showed concrete parallel computing results: “The NVIDIA coauthors of P2300 report that parallel performance is on par with CUDA code.” A December 2022 HPCwire article by Niebler, Georgy Evtushenko, and Jeff Larkin demonstrated cross-platform performance “across different parallel programming models, using both distributed-memory and shared-memory, and across different computer architectures.”

Use case breadth: LEWG commentary noted that one approach had broader applicability. One committee member observed: “On one (I think the first) of three LEWG sessions on the Networking TS and Asio, the Asio proponent began a 90-minute talk on the Asio asynchronous programming model. They used that phrase, ‘asynchronous programming model’, yet had no examples from parallel computing...”

Theoretical foundations: The P2300 proponents argued that senders/receivers provided composable error handling. As one poll respondent noted: “The Networking TS model doesn’t facilitate composable error (or payload value) handling. Worse, it’s fundamentally incompatible with such a model.”
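What “composable error handling” means here can be shown with a deliberately simplified model. The sketch below is eager and value-based, so it is not how senders work internally; the names outcome, error, and run_pipeline are hypothetical, while then and upon_error merely borrow the names of the corresponding std::execution algorithms. It illustrates the property being claimed: intermediate stages never see an error, which travels past them untouched and is handled once at the end of the chain.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// Hypothetical toy: an outcome carries either a value or an error.
struct error { std::string what; };
using outcome = std::variant<int, error>;

// then: applies f on the value channel; errors pass through untouched.
template <class F>
outcome then(outcome in, F f) {
    if (auto* v = std::get_if<int>(&in)) return outcome{f(*v)};
    return in;
}

// upon_error: applies f on the error channel; values pass through untouched.
template <class F>
outcome upon_error(outcome in, F f) {
    if (auto* e = std::get_if<error>(&in)) return outcome{f(*e)};
    return in;
}

// Two value stages followed by a single error handler for the whole chain.
inline int run_pipeline(outcome start) {
    outcome end = upon_error(
        then(then(std::move(start), [](int x) { return x + 1; }),
             [](int x) { return x * 2; }),
        [](const error&) { return -1; });
    return std::get<int>(end);  // always an int after upon_error
}
```

With these definitions, run_pipeline(outcome{20}) produces 42, while run_pipeline(outcome{error{"timeout"}}) skips both value stages and produces -1 from the single handler.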

In February 2022, although those favoring C++23 inclusion outnumbered opponents “by more than 2-to-1,” the chairs decided to wait. P2300 was tagged for C++26. In June 2024 at St. Louis, P2300R10 was formally adopted into the C++26 working draft. The paper had reached ten major revisions.

6. The Foundational Abstraction Tradeoff

The P2300 outcome represents a choice to prioritize a foundational abstraction over a proven, deployed facility. This tradeoff merits examination.

The Networking TS case: Boost.Asio represents over two decades of production deployment across thousands of applications, from high-frequency trading to embedded systems. The Networking TS derived from it was stable, complete (a developer could write a TCP server using only its abstractions), and addressed a universal need: virtually every networked application requires network I/O.

The senders/receivers case: P2300 offered a theoretical framework promising composability across domains, including networking, parallel computing, GPU execution, and distributed systems. But libunifex was experimental and Facebook-internal. The abstraction had not accumulated comparable deployment history or failure-mode documentation.

What was prioritized: The committee chose the broader theoretical abstraction over the proven specialized facility. The reasoning: a unified async model serving networking, GPU computing, and HPC would be more valuable long-term than standardizing networking alone.

The consequences:

Networking remains unstandardized after twenty years of effort. Developers still use Boost.Asio, third-party libraries, or platform-specific APIs.
Heterogeneous computing gains a standard abstraction, but one that ships incomplete: no thread pool, no coroutine task type, a “paltry set” of algorithms.
The Networking TS is in limbo: neither merged nor formally abandoned. Its executor model is now considered incompatible with the standard direction.
Users requiring GPU/HPC facilities already have domain-specific libraries (CUDA, Kokkos, HPX, RAJA) that have iterated for years. Whether std::execution will displace them remains unclear.

The question this raises: when a proven, deployed facility addresses a universal need, should it wait for a broader theoretical framework that may take years to mature? Or does the theoretical framework’s long-term value justify the delay? Reasonable people disagree.

7. What Ships and What Doesn’t

P2300’s adoption leaves significant gaps:

The Networking TS remains in limbo, neither merged nor formally abandoned
No standard thread pool ships (users “will need to go to third party libraries for thread-pools, or write their own, in order to write async code that utilizes multiple CPU cores”)
No standard task coroutine type interoperating with senders until at least C++29
P3109 (“A plan for std::execution for C++26”) acknowledges: “We expect the most common use of senders will actually be from within coroutines. It would be a real disservice for users to have to wait until C++29 before they have an async coroutine task type out of the box.”

The tag_invoke customization mechanism was replaced with member-function-based customization (P2855) after concerns about “compile-time scalability.” Algorithm customization went through multiple revisions (P2999). The ensure_started and start_detached algorithms were removed because “all of the options are generally bad options” for handling orphaned async work.
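The difference between the two customization styles can be sketched as follows. All names here (toy::describe, toy::describe_member, and the app widgets) are hypothetical, and the code is a simplified picture rather than the wording of P2300 or P2855. With tag_invoke, every customization point funnels through one shared function name that is found by argument-dependent lookup, so the candidate overload set can grow with the number of customizations in scope; that is one commonly cited source of the compile-time concern. With member customization, the customization point simply calls a named member on the object.

```cpp
#include <cassert>
#include <string>

// Hypothetical names throughout; a simplified picture of the two styles.

namespace toy {

// Style 1: tag_invoke. The customization point object calls the shared
// name tag_invoke, which is found by argument-dependent lookup.
struct describe_t {
    template <class T>
    std::string operator()(const T& t) const {
        return tag_invoke(*this, t);
    }
};
inline constexpr describe_t describe{};

// Style 2: member customization. The customization point calls a
// member function with a fixed name on the argument itself.
struct describe_member_t {
    template <class T>
    std::string operator()(const T& t) const {
        return t.describe();
    }
};
inline constexpr describe_member_t describe_member{};

}  // namespace toy

namespace app {

// Opts in via a hidden friend overload of tag_invoke keyed on the tag type.
struct tagged_widget {
    friend std::string tag_invoke(toy::describe_t, const tagged_widget&) {
        return "tagged_widget via tag_invoke";
    }
};

// Opts in by providing the expected member function.
struct member_widget {
    std::string describe() const { return "member_widget via member"; }
};

}  // namespace app
```

Both toy::describe(app::tagged_widget{}) and toy::describe_member(app::member_widget{}) dispatch to the user's code; the member form trades some of the non-intrusiveness of tag_invoke for simpler lookup and, usually, shorter diagnostics.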

The feature that ships in C++26 differs substantially from any of the 2016-2018 proposals. As Eric Niebler acknowledged, “The P2300 crew have collectively done a terrible job of making this work accessible.”

8. GFT Analysis

The P2300 history illustrates several institutional dynamics:

Resource concentration: The 12-year effort involved competition among well-resourced groups (Facebook/Meta, NVIDIA, and the Boost.Asio ecosystem) with different async requirements. How should WG21 weigh domain-specific needs (networking vs. GPU computing vs. HPC) when they point toward incompatible designs?
Experience vs. theory: Chris Kohlhoff’s Boost.Asio represents decades of production deployment and iterative refinement. The P2300 faction’s libunifex was newer and less deployed, but had explicit theoretical foundations. How should the committee weigh deployment history against theoretical arguments about composability?
Working group dynamics: SG1 achieved internal consensus on P0443 by 2018, but LEWG deferred the merger after P1055. Neither group could impose resolution. Does the current structure facilitate or impede decision-making?
Process overhead: The hundred-plus papers, 10+ revisions of P2300, and 14+ dedicated LEWG meetings represent extraordinary effort. Papers like P1791 explicitly warned that “specific needs will inadvertently be given precedence over the existing and extensive compromises.” Did the process produce design improvement proportional to its cost?
Implementation momentum: The sender/receiver faction accumulated implementation experience through libunifex and CUDA integration. The Networking TS had published TS status and Kohlhoff’s reputation but less recent implementation investment at comparable scale. How much should implementation reality influence standardization decisions?

9. Questions for Reflection

The P2300 history raises questions about WG21’s process:

When domain-specific requirements conflict fundamentally, what mechanisms exist to reach resolution in less than a decade?
How should the committee balance deployed production experience against theoretical foundations?
When working groups reach incompatible conclusions, who decides?
Does the current paper-driven process scale to proposals of this complexity?
What would it take to ship complete features (with thread pools, task types, networking integration) rather than minimal primitives?

These questions deserve attention from those who care about WG21’s effectiveness.

References

N3378, N4046: Early executor proposals (2012-2014)
P0443: Kohlhoff et al. A Unified Executors Proposal for C++ (2016-2020)
P1055: Niebler, Shoop. A Modest Executor Proposal (2018)
P1194: Niebler, Shoop, Baker. The Compromise Executors Proposal (2018)
P2300R0-R10: Dominiak, Baker, Howes, Garland, Niebler, Lelbach. std::execution (2021-2024)
P2464: Voutilainen. Ruminations on networking and executors (2021)
P2469: Allsop, Falco, Hodges, Kohlhoff, Morgenstern. Response to P2464 (2021)
P2855: Member-function-based customization
P2999: Algorithm customization revisions
P3109: A plan for std::execution for C++26
Burja, S. (2020). Great Founder Theory.

Revisions

2025-12-23: Added section on foundational abstraction tradeoff (prioritizing senders/receivers over Networking TS).
2025-12-23: Adjusted tone to focus on institutional dynamics and raise questions rather than assign blame; preserved all facts and quotes.

We welcome corrections and additional perspectives from committee participants.

