Combining Distributed Programming Models with Task-Parallel Programming
The Partitioned Global Address Space (PGAS) programming model strikes a balance between the shared-memory and distributed-memory models. It combines the programming ease of a global address space with the performance benefits of locality awareness. PGAS languages include Co-Array Fortran, Titanium, UPC, X10, and Chapel. These languages rely on compiler transformations to convert user code to native code. Some of them, such as Titanium, X10, and Chapel, also use code transformations to provide dynamic tasking, with a work-stealing scheduler that load-balances the dynamically spawned asynchronous tasks.
Another related piece of work is HCMPI, a language-based implementation that combines MPI communication with Habanero tasking by using a dedicated communication worker (similar to the Offload approach).
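The dedicated-communication-worker idea can be illustrated with a minimal C++ sketch. It is not HCMPI's implementation: the names `CommWorker`, `submit`, and `offload_demo` are hypothetical, and appending to a vector stands in for a real MPI call. Computation threads only enqueue communication requests, and a single worker thread drains the queue, so all communication is issued from one thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical dedicated communication worker: one thread executes every
// queued communication operation, in submission order.
class CommWorker {
public:
    CommWorker() : thread_([this] { run(); }) {}

    // Drains any remaining operations, then stops the worker thread.
    ~CommWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        ready_.notify_one();
        thread_.join();
    }

    // Called by computation threads: enqueue an operation and return at once.
    void submit(std::function<void()> op) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push(std::move(op));
        }
        ready_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            ready_.wait(lock, [this] { return done_ || !pending_.empty(); });
            if (pending_.empty()) return;  // shutdown requested and queue drained
            std::function<void()> op = std::move(pending_.front());
            pending_.pop();
            lock.unlock();  // run the operation without holding the queue lock
            op();
            lock.lock();
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> pending_;
    bool done_ = false;
    std::thread thread_;  // declared last so it starts after the other members exist
};

// Assumed usage: `senders` computation threads each offload `messages_each`
// "sends". Only the communication worker ever touches `log`.
std::vector<int> offload_demo(int senders, int messages_each) {
    assert(senders >= 0 && messages_each >= 0);
    std::vector<int> log;  // stand-in for messages handed to the network
    {
        CommWorker comm;
        std::vector<std::thread> compute;
        for (int s = 0; s < senders; ++s) {
            compute.emplace_back([&comm, &log, s, messages_each] {
                for (int m = 0; m < messages_each; ++m) {
                    const int id = s * messages_each + m;
                    comm.submit([&log, id] { log.push_back(id); });
                }
            });
        }
        for (auto& t : compute) t.join();
    }  // CommWorker destructor drains the queue before log is returned
    return log;
}
```

Because `log` is modified only by the communication worker, it needs no lock of its own; the computation threads synchronize only on the request queue.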
Language-based approaches to hybrid multi-node, multi-threaded programming have some inherent disadvantages relative to library-based techniques. Users must first learn a new language, which often lacks mature debugging and performance analysis tools. Language-based approaches also carry significant development and maintenance costs. To avoid these shortcomings, HabaneroUPC++ introduced a compiler-free PGAS library that supports the integration of intra-node and inter-node parallelism. It uses the UPC++ library to provide PGAS communication and function shipping, and the C++ interface of the HClib library to provide intra-rank task scheduling. HabaneroUPC++ uses C++11 lambda-based user interfaces for launching asynchronous tasks.
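A lambda-based tasking interface of this kind can be sketched in standard C++11 alone. The sketch below does not use the HClib or HabaneroUPC++ API: `FinishScope`, its `async` member, and `parallel_sum` are hypothetical names, and each task runs on its own `std::thread` rather than on a work-stealing scheduler. It shows only the programming style, in which tasks are spawned as lambdas and joined at the end of an enclosing scope.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical finish scope: every task spawned through async() is joined
// when the scope object is destroyed.
class FinishScope {
public:
    FinishScope() = default;
    FinishScope(const FinishScope&) = delete;
    FinishScope& operator=(const FinishScope&) = delete;

    // Launch a lambda (or any callable) as an asynchronous task.
    template <typename F>
    void async(F&& task) {
        tasks_.emplace_back(std::forward<F>(task));
    }

    ~FinishScope() {
        for (auto& t : tasks_) {
            if (t.joinable()) t.join();
        }
    }

private:
    std::vector<std::thread> tasks_;
};

// Assumed usage: sum a vector by giving each task one contiguous chunk.
long parallel_sum(const std::vector<int>& data, std::size_t chunks) {
    assert(chunks > 0);
    std::vector<long> partial(chunks, 0);
    const std::size_t n = data.size();
    {
        FinishScope finish;
        for (std::size_t c = 0; c < chunks; ++c) {
            const auto begin = static_cast<std::ptrdiff_t>(c * n / chunks);
            const auto end = static_cast<std::ptrdiff_t>((c + 1) * n / chunks);
            // Each task writes only its own slot of partial, so no lock is needed.
            finish.async([&data, &partial, c, begin, end] {
                partial[c] = std::accumulate(data.begin() + begin,
                                             data.begin() + end, 0L);
            });
        }
    }  // all tasks have completed once the scope closes
    return std::accumulate(partial.begin(), partial.end(), 0L);
}
```

A library interface lets the user express this pattern in ordinary C++ source compiled by a stock compiler, which is the advantage the paragraph above attributes to library-based approaches.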