 
OpenSHMEM and Related Technologies. Enhancing OpenSHMEM for Hybrid Environments: Third Workshop, OpenSHMEM 2016, Baltimore, MD, USA, August 2 – 4, 2016, Revised Selected Papers



OpenSHMEM Extensions

  Integrating Asynchronous Task Parallelism with OpenSHMEM
    Introduction
    Background
    Habanero Tasking
    OpenSHMEM
    AsyncSHMEM
    API Extensions
    Fork-Join Implementation
    Offload Implementation
    Experimental Methodology
    Benchmarks
    Experimental Infrastructure and Measurements
    Results
    ISx
    UTS
    Related Work
    Combining Distributed Programming Models with Task-Parallel Programming
    Thread-Safe OpenSHMEM Proposals
    Conclusion
    References

  Evaluating OpenSHMEM Explicit Remote Memory Access Operations and Merged Requests
    Introduction
    Motivation
    Use Case 1: OpenSHMEM Threads
    Use Case 2: Defining Patterns
    Use Case 3: Defining New Collectives
    API and Semantics for RMA Operations with Requests
    Explicit Non-blocking RMA Operations
    Merging RMA Request Handles
    Implementation Using UCX
    Evaluation
    Experimental TestBed
    System
    Application Kernels and Benchmarks
    Performance Evaluation of RMA Operations with Requests and Merged Requests Using Micro-Benchmarks
    Latency of Get Operations
    Ping-Pong Latency
    Message Rate Evaluation
    Performance Evaluation with Scalable Synthetic Compact Applications (SSCA) #1 Kernel
    Discussion
    Related Work
    Future Work
    References

  Increasing Computational Asynchrony in OpenSHMEM with Active Messages
    Introduction
    Overview of Active Messages
    Active Message v/s Tasking Models
    Proposed Extension for Active Messages Support
    Design of an AM Handler
    Registration of AM Handlers
    Initiating Active Messages
    The Completion of Active Messages
    Handler Safe Locking
    Prototype Evaluation
    Implementation Design
    Experimental Setup
    Performance Study
    The Traveling Salesman Problem (TSP)
    Experiment Methodology
    Related Work
    Conclusion and Future Work
    References

  System-Level Transparent Checkpointing for OpenSHMEM
    Introduction
    Review of Checkpointing
    Design Modification of DMTCP to Support OpenSHMEM
    Related Work
    Experimental Evaluation
    Experimental Setup
    Scalability
    Conclusion and Future Work
    References

  Surviving Errors with OpenSHMEM
    Introduction
    Background
    Scope and Locality of Error Reporting
    Local Versus Global Error Reporting
    Non-uniform Error Reporting
    Error Reporting Interface
    Error Propagation
    Post-error Stabilization
    Related Work
    Conclusions and Future Work
    References

  On Synchronisation and Memory Reuse in OpenSHMEM
    Introduction
    Related Work
    Design Considerations
    Design
    Additional Synchronisation
    Unlock on User Barrier
    Pairwise Synchronisation
    Evaluation
    Theoretical Analysis
    SHOC
    Conclusion and Future Work
    References

OpenSHMEM Implementation and Use Cases

  Design and Implementation of OpenSHMEM Using OFI on the Aries Interconnect
    Introduction
    Background and Related Work
    Fabric Interfaces
    OpenFabrics Interfaces
    Other Fabric APIs
    OpenSHMEM
    Design of OpenSHMEM for OFI
    Launch, Wire-Up, and Memory Registration
    One-Sided Communication Operations
    Ordering and Remote Completion Operations
    Notification API
    Libfabric for the Aries Network
    Addressing and Memory Registration
    Issuing and Completing Communication Operations
    Evaluation
    Latency Results Using SOS Microbenchmarks
    Bandwidth Results Using SOS Microbenchmarks
    Random Access Benchmark (GUPs)
    Scalable Integer Sort Benchmark (ISx)
    Conclusions and Future Work
    References

  OpenSHMEM-UCX: Evaluation of UCX for Implementing OpenSHMEM Programming Model
    Introduction
    Related Work
    OpenSHMEM Reference Implementation
    UCX Design
    UCX Core Components
    Workers
    Interfaces
    Arbiter
    Design and Implementation of uGNI TL
    Initialization and Connection Setup
    Short Data Transfers
    Large Data Transfers
    Active Messages
    Atomic Operations
    Experiments
    Evaluation
    Short Message Latency
    Message Rate
    The SHOMS Benchmark
    Evaluation Using HPCS SSCA 1
    Analysis
    Conclusion
    References

  SHMemCache: Enabling Memcached on the OpenSHMEM Global Address Model
    Introduction
    Background
    Memcached
    OpenSHMEM
    SHMemCache: Enabling Memcached on OpenSHMEM
    Overview of SHMemCache Design
    OpenSHMEM Delegator
    Transparent Networking for Server/Client
    Shared Memory Pool
    Symmetric Ring Buffer
    Operation Stages and Pipelined Processing
    Server and Client
    Implementation
    Evaluation
    Experimental Environment
    Operation Latency
    Latency Dissection
    Discussion
    Related Works
    Conclusion
    References

  An OpenSHMEM Implementation for the Adapteva Epiphany Coprocessor
    Introduction and Motivation
    Background
    Epiphany Architecture
    Implementation and Performance Evaluation
    Library Setup, Exit, Query Routines
    Memory Management Routines
    Remote Memory Access Routines
    Non-blocking Remote Memory Access Routines
    Atomic Memory Operations
    Collective Routines
    Distributed Locking Routines
    Future Work and Discussion of Extensions for Embedded Architectures
    Conclusion
    References

Hybrid Programming and Benchmarking with OpenSHMEM

  An Evaluation of Thread-Safe and Contexts-Domains Features in Cray SHMEM
    Introduction
    Background
    Cray SHMEM
    Overview of Thread-Safe Proposal
    Overview of Contexts-Domains Proposal
    Usage Details Using Thread-Safe and Contexts-Domains Extensions
    Implementation Details
    DMAPP
    Thread-Safe Implementation Details
    Context Implementation Details
    Compare and Contrast Thread-Safe and Contexts-Domains Features
    Experimental Setup
    Impact of Thread Local Storage (TLS)
    Usage of Explicit and Implicit Non-blocking Operations
    Hierarchy of Threading Support
    Efficient Usage of Network Resources
    Memory Ordering Extensions
    Experiments
    Related Work
    PAMI
    MPI-3 RMA
    Future Work
    Conclusion
    References

  OpenCL + OpenSHMEM Hybrid Programming Model for the Adapteva Epiphany Architecture
    Introduction and Motivation
    Background
    Epiphany Architecture
    OpenCL for Epiphany
    OpenSHMEM for Epiphany
    Hybrid OpenCL + OpenSHMEM Programming Model
    Application and Results
    Conclusions and Future Work
    References

  OpenSHMEM Implementation of HPCG Benchmark
    Introduction
    Implementation Details
    Cray XK7 Titan
    SGI Turing Cluster
    Cray XC30 Eos
    Summary
    References

  Using Hybrid Model OpenSHMEM + CUDA to Implement the SHOC Benchmark Suite
    Introduction: Programming Models for Hardware Accelerated Parallel Systems
    OpenSHMEM as an Alternative to MPI
    Paper Organization
    Background and Related Work
    SHOC Benchmark Suite
    Previous SHMEM Work with Teams, Collectives, and Hardware Accelerators
    Porting MPI Communication Structures in SHOC
    Process Teams for Gradual Reduction of Devices
    Synchronization Collectives
    Performance Demonstration
    Hardware Platform: Cray XK7
    Scaling of Collectives
    Portable Teams Implementation
    Benchmark Scaling
    Conclusion and Future Work
    References

OpenSHMEM Tools

  Profiling Production OpenSHMEM Applications
    Introduction
    Approach
    Symbol Wrapping
    Library Preloading
    Automatic Wrapper Library Generation
    Conclusions and Future Work
    References

Short Papers

  SHMEM-MT: A Benchmark Suite for Assessing Multi-threaded SHMEM Performance
    Introduction
    SHMEM-MT Benchmarking Approach
    Benchmark Conversion
    Threading Support
    Initial Results
    Related Work
    References

  Investigating Data Motion Power Trends to Enable Power-Efficient OpenSHMEM Implementations
    Introduction
    OpenSHMEM Power Trend Analysis
    OSU Micro-Benchmarks Power Analysis
    HPCG Benchmark Power Analysis
    Insights and Prospective Research
    References
 