|
Legacy protocols such as
TCP/IP carry large overheads. They neither exploit hardware
features nor allow experimentation with new protocols and
flow control mechanisms.
The KSHIPRA communication substrate of the HPCC software provides lightweight
communication protocols: AM, which conforms to the Active Messages
II specification of the University of California, Berkeley,
and VIA, which conforms to the Virtual Interface Architecture
specification jointly authored by Intel, Compaq and Microsoft.
The implementation relies on a mechanism that decouples the
communication path from the operating system, which eliminates
operating system overheads in frequent operations such as
send and receive. It provides a communication abstraction
that lets applications fully exploit
the low latency and high bandwidth of the underlying
high-performance network. The HPCC software also provides
the Message Passing Interface (MPI) application programming
interface for parallel computing, layered over the low-level
communication substrate.
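The active-message idea behind the AM protocol can be illustrated with a toy model: each message names a handler, and the receiver runs that handler on arrival instead of copying the payload through intermediate buffers. The sketch below is only an in-process illustration; the `Endpoint` class, its methods and `put_handler` are hypothetical names, not the Active Messages II or KSHIPRA API.

```python
from collections import deque


class Endpoint:
    """Toy endpoint: a handler table plus a receive queue (illustrative names)."""

    def __init__(self):
        self.handlers = []      # handler table, indexed by small integers
        self.inbox = deque()    # stands in for the network receive queue
        self.state = {}         # application data the handlers operate on

    def register(self, fn):
        """Add a handler and return its index in the table."""
        self.handlers.append(fn)
        return len(self.handlers) - 1

    def send(self, dest, handler_index, *args):
        """Deliver a message that names the handler to run at the destination."""
        dest.inbox.append((handler_index, args))

    def poll(self):
        """Run the named handler for every queued message; return the count."""
        handled = 0
        while self.inbox:
            handler_index, args = self.inbox.popleft()
            self.handlers[handler_index](self, *args)
            handled += 1
        return handled


def put_handler(endpoint, key, value):
    # Runs on the receiver: store the payload directly in application state.
    endpoint.state[key] = value


sender, receiver = Endpoint(), Endpoint()
PUT = receiver.register(put_handler)
sender.send(receiver, PUT, "x", 42)
sender.send(receiver, PUT, "y", 7)
handled = receiver.poll()
```

In a real user-level implementation the queue would be a region shared with the network interface, so that `send` and `poll` complete without a system call.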
To facilitate simultaneous
communication among many parallel processes, MPI provides
collective communication functions. The performance
of these functions can be improved by choosing algorithms
suited to the architecture. C-MPI tunes the MPI collective algorithms for a cluster
of SMP nodes. In addition, C-MPI gains performance
by layering MPI over the lightweight communication protocols.
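One common way to adapt a collective to a cluster of SMP nodes is to combine values inside each node first and exchange only one partial result per node over the network. The simulation below compares that two-level scheme with a deliberately naive flat reduce by counting inter-node messages. It is a sketch under assumptions: the algorithms C-MPI actually uses are not described here, and the function names and the 16-rank, 4-per-node layout are illustrative.

```python
def flat_reduce(values, ranks_per_node):
    """Every rank sends its value straight to rank 0."""
    total = values[0]
    inter_node_messages = 0
    for rank in range(1, len(values)):
        total += values[rank]
        if rank // ranks_per_node != 0:   # sender lives on another node
            inter_node_messages += 1
    return total, inter_node_messages


def two_level_reduce(values, ranks_per_node):
    """Reduce inside each node first, then combine one partial sum per node."""
    partial_sums = []
    for start in range(0, len(values), ranks_per_node):
        # Intra-node step: on a real SMP node this could use shared memory.
        partial_sums.append(sum(values[start:start + ranks_per_node]))
    # Inter-node step: every node leader except the root sends one message.
    inter_node_messages = len(partial_sums) - 1
    return sum(partial_sums), inter_node_messages


values = list(range(16))          # one contribution per rank, 16 ranks
flat = flat_reduce(values, ranks_per_node=4)
tiered = two_level_reduce(values, ranks_per_node=4)
```

Both variants produce the same sum, but in this layout the two-level version crosses the network 3 times instead of 12.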
To scale, parallel applications
also require an efficient file system that
provides high throughput to a running parallel job.
The HPCC software addresses this need with C-PFS,
a high-performance parallel file system that exports the MPI-IO
interface, the I/O interface defined in the MPI-2 specification.
Conventional distributed
file systems optimize the aggregate I/O throughput, whereas
parallel applications require the full I/O throughput
to be delivered to a single application. C-PFS relies
on an end-to-end user-level
implementation to provide high performance. C-PFS also
has features that sustain high I/O throughput for the typical
I/O workloads of parallel programs.
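A typical parallel I/O pattern is for every process to write its own disjoint region of one shared file at an explicit offset, which is what MPI-IO calls such as `MPI_File_write_at` express. The sketch below imitates that pattern on a local temporary file with POSIX positional writes. It assumes a Unix-like system, does not use C-PFS or an MPI library, and `BLOCK` and `write_block` are illustrative names.

```python
import os
import tempfile

BLOCK = 8  # bytes written by each simulated rank (arbitrary choice)


def write_block(fd, rank):
    """Write this rank's block at its own offset, with no shared file pointer."""
    payload = (str(rank) * BLOCK).encode("ascii")
    os.pwrite(fd, payload, rank * BLOCK)


fd, path = tempfile.mkstemp()
try:
    # Ranks may arrive in any order; explicit offsets keep the regions disjoint.
    for rank in (2, 0, 3, 1):
        write_block(fd, rank)
    contents = os.pread(fd, 4 * BLOCK, 0)
finally:
    os.close(fd)
    os.remove(path)
```

Because each write carries its own offset, the blocks end up in rank order regardless of the order in which the writes are issued.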

|