In this section we describe the programming concepts and existing implementations that serve as the foundation for the hybrid AsyncSHMEM model: Habanero Tasking and OpenSHMEM.
The Habanero task-parallel programming model offers an async-finish API for exploiting intra-node parallelism. The Habanero-C Library (HClib) is a native library-based implementation of the Habanero programming model that offers C and C++ APIs. Here we briefly describe relevant features of both the abstract Habanero programming model and its HClib implementation. More details can be found in .
The Habanero async construct creates an asynchronous child task of the current task that executes a user-defined computation. The finish construct joins all child async tasks (including any transitively spawned tasks) created within a logical scope. The forasync construct offers a parallel loop implementation that can be used to efficiently create many parallel tasks.
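To illustrate the async-finish pattern described above, the following is a minimal C++ sketch that emulates a finish scope with standard threads. It does not use the HClib API: FinishScope and parallel_sum are hypothetical names introduced here. Each task is mapped to its own OS thread rather than to a work-stealing runtime, and only tasks spawned directly on the scope are joined (transitively spawned tasks are not tracked).

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a finish scope (not the HClib API): every task
// spawned through async() is joined before the scope is destroyed.
class FinishScope {
public:
    FinishScope() = default;
    FinishScope(const FinishScope&) = delete;
    FinishScope& operator=(const FinishScope&) = delete;

    // Spawn a child task. One OS thread per task, for simplicity only.
    void async(std::function<void()> body) {
        tasks_.emplace_back(std::move(body));
    }

    // Leaving the scope joins all directly spawned children.
    ~FinishScope() {
        for (std::thread& t : tasks_) t.join();
    }

private:
    std::vector<std::thread> tasks_;
};

// Sum `data` using `chunks` child tasks, in the spirit of a forasync loop.
long parallel_sum(const std::vector<int>& data, std::size_t chunks) {
    if (chunks == 0) chunks = 1;
    std::vector<long> partial(chunks, 0);
    const std::size_t n = data.size();
    {
        FinishScope finish;
        for (std::size_t c = 0; c < chunks; ++c) {
            const std::size_t begin = c * n / chunks;
            const std::size_t end = (c + 1) * n / chunks;
            // Each task writes only its own slot of `partial`.
            finish.async([&data, &partial, c, begin, end] {
                long sum = 0;
                for (std::size_t i = begin; i < end; ++i) sum += data[i];
                partial[c] = sum;
            });
        }
    }  // all child tasks have been joined here
    return std::accumulate(partial.begin(), partial.end(), 0L);
}
```

Because the partial results are read only after the scope closes, no further synchronization is needed in this sketch. Calling parallel_sum on the integers 1 through 1000 with four chunks yields 500500.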
The Habanero model also supports defining dependencies between tasks using standard parallel programming constructs: promises and futures. A promise is a write-only value container that is initially empty. In the Habanero model, a promise can be satisfied only once, when any task places a value inside it. Every promise has an associated future, which can be used to read the value stored in the promise. At creation time, a task can be declared dependent on the satisfaction of a promise by registering on its future, which ensures that the task will not execute until that promise has been satisfied. In Habanero, the asyncAwait construct launches a task whose execution is predicated on a user-defined set of futures. User-created tasks can also explicitly block on futures while executing.
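The promise and future semantics above can be sketched with the C++ standard library, which provides constructs of the same name. This is an analogy under stated assumptions, not HClib code: run_dependent_task and second_put_is_rejected are hypothetical helper names. Unlike asyncAwait, the consumer below is started immediately and blocks on the future, rather than being deferred by the runtime until the future is ready.

```cpp
#include <future>
#include <thread>

// Launch a consumer whose result depends on a future, then satisfy the
// corresponding promise from a separate producer task.
int run_dependent_task(int input) {
    std::promise<int> p;                                  // initially empty
    std::shared_future<int> f = p.get_future().share();   // read side

    // Rough analogue of a dependent task: the body cannot make progress
    // until the promise behind `f` has been satisfied.
    std::future<int> consumer = std::async(std::launch::async, [f] {
        return f.get() * 2;  // blocks until a value is available
    });

    // Any task may satisfy the promise.
    std::thread producer([&p, input] { p.set_value(input); });
    producer.join();

    return consumer.get();
}

// Returns true if a second write to an already satisfied promise is rejected.
bool second_put_is_rejected() {
    std::promise<int> p;
    p.set_value(1);
    try {
        p.set_value(2);  // a promise can be satisfied only once
    } catch (const std::future_error& e) {
        return e.code() == std::future_errc::promise_already_satisfied;
    }
    return false;
}
```

In this sketch, run_dependent_task(21) evaluates to 42, and second_put_is_rejected() returns true because the standard library reports the second write as an error.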
In the Habanero model, a place can be used to specify a hardware node within a hierarchical, intra-node place tree. The asyncAt construct accepts a place argument and creates a task that must be executed at that place.
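One way to picture a place tree is as a tree of nodes, each with its own queue of tasks pinned to it. The C++ sketch below is a single-threaded toy model under that assumption. Place, async_at, and run_all_at are hypothetical names and do not reflect how HClib represents places or schedules tasks.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node of a hierarchical place tree (e.g. node -> socket -> core).
struct Place {
    std::string name;
    Place* parent = nullptr;
    std::vector<std::unique_ptr<Place>> children;
    std::deque<std::function<void()>> ready;  // tasks pinned to this place

    // Attach a new child place and return a non-owning pointer to it.
    Place* add_child(std::string child_name) {
        children.push_back(std::make_unique<Place>());
        Place* child = children.back().get();
        child->name = std::move(child_name);
        child->parent = this;
        return child;
    }
};

// Analogue of asyncAt: the task is queued at `place` and nowhere else.
void async_at(Place& place, std::function<void()> task) {
    place.ready.push_back(std::move(task));
}

// Run every task queued at `place`, as a worker bound to it would.
// Returns the number of tasks executed.
std::size_t run_all_at(Place& place) {
    std::size_t executed = 0;
    while (!place.ready.empty()) {
        std::function<void()> task = std::move(place.ready.front());
        place.ready.pop_front();
        task();
        ++executed;
    }
    return executed;
}
```

The point of the sketch is the placement constraint: a task queued at one place is never picked up when the tasks of a sibling place are run.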
HClib is a C/C++ library that implements the abstract Habanero programming model. HClib sits on top of a multi-threaded, work-stealing, task-based runtime. HClib uses lightweight, runtime-managed stacks from the Boost Fibers library to support blocking tasks without blocking the underlying runtime worker threads. Past work has shown HClib to be competitive in performance with industry-standard multi-threaded runtimes for a variety of workloads.
HClib serves as the foundation for the intra-node tasking implementation of AsyncSHMEM described in this paper.
SHMEM is a communication library used for Partitioned Global Address Space (PGAS) style programming. The SHMEM communications library was originally developed as a proprietary application interface by Cray for their T3D systems. Since then, different vendors have developed variations of the SHMEM library implementation to match their individual requirements. Over the years, these implementations have diverged because of the lack of a standard specification. OpenSHMEM is an open-source community effort to unify SHMEM library development.