There is a strong consensus among computer professionals that the
greatest gains in price/performance can be achieved only through
multiple-processor parallel systems. Parallel computers are characterized
by two or more processing elements and memory, tied together by
some interconnection network. An abundance of relatively slow processors,
working together to solve one problem, provides the necessary
performance.
The trend in parallel computing is to move away from specialized
traditional supercomputing platforms, such as the Cray/SGI T3E, to
cheaper, general-purpose systems consisting of loosely coupled components
built up from single or multiprocessor PCs or workstations. This
approach has a number of advantages, including the ability to build
a platform for a given budget that is suitable for a large class
of applications and workloads.
The hardware technology and economic forces are right for an explosion
of parallel processing into the market at all levels. But if parallel
computing is so wonderful, why aren’t we doing it on a large scale? The
main problem lies in the lack of software for parallel machines.
Parallel processing, or concurrent computing as it is sometimes termed,
is not conceptually new. Jobs that can be broken into multiple tasks,
which can in turn be handed out to individual workers for simultaneous
execution, are the most suitable for parallel machines.
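
The idea of breaking a job into tasks and handing them to workers
can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the function names are
hypothetical, the job (a sum of squares) is chosen for simplicity,
and a thread pool stands in for the processors of a real parallel
machine, so it shows the decomposition rather than any actual speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, n_tasks):
    """Cut a list into n_tasks nearly equal contiguous chunks."""
    size, extra = divmod(len(items), n_tasks)
    chunks, start = [], 0
    for i in range(n_tasks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The task each worker runs on its own chunk (hypothetical job)."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, n_workers=4):
    """Hand one chunk to each worker, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(sum_of_squares, split(items, n_workers)))
    return sum(partials)
```

The job suits this treatment because no chunk needs anything from
another chunk; only the final combination step is sequential.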
What is different about parallel programming?
Software development is intrinsically difficult and time-consuming
for both sequential and parallel computing applications. However,
designing and writing software for parallel computers is even more
difficult because parallel programmers must keep in mind the details
of non-determinism, synchronization and scheduling, as well as the
traditional details of sequential programming.
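
A small sketch may make the synchronization concern concrete. It
assumes a shared-memory setting with Python threads; the function
name and the counts are hypothetical. Several workers update one
shared counter, and a lock keeps their updates from interleaving.

```python
import threading


def run_counter(n_threads=4, n_increments=10_000):
    """Increment a shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(n_increments):
            # Without the lock, the read-modify-write below could interleave
            # with another thread's and silently lose updates.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # synchronization point: wait for every worker to finish
    return counter
```

With the lock in place the result is always n_threads * n_increments.
Remove it and the answer can depend on how the threads happen to be
scheduled, which is the kind of non-determinism a sequential
programmer never has to think about.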
It is much easier to program sequentially because humans tend to
think sequentially rather than concurrently. Unfortunately, sequential
programming cannot directly make effective use of parallel
computers. Fully automatic parallelizing compilers do not exist in any
practical sense, and even if they did, the greatest performance is often
achieved by rethinking the underlying algorithm. Tools that aid in
“thinking in parallel” offer the greatest prospect for improvement.
What options do we have for parallel software development? Either we
must discard decades of sequential software development and embark
on a long journey to (re)write all software in parallel form,
or devise tools to convert existing applications into a form that
can exploit the power of parallel computers.
In
order to effectively exploit the power of parallel computers or
a cluster of workstations, good programming environments and development
tools are a must. This is particularly necessary for parallel
processing and distributed computing to become the preferred programming
model for a typical programmer, as opposed to being limited to
a group of experts. The last two decades have seen significant
development of various kinds of programming environments, together
with a plethora of associated programming aids, including parallel
debuggers and monitoring tools. Computer professionals are zeroing
in on a standard, which in turn will lead to portability of code.
Presently, only a small population of programmers has the knowledge
to use parallel and distributed systems for executing large production
codes. Parallel programming technology is still not popular with
the average sequential programmer. These programmers lack
enthusiasm for moving into a different programming environment
with increased difficulties, though they are aware of the potential
performance gains.
Portability and scheduling
Two further major concerns for parallel programmers are portability
and scheduling. In the past, the development of parallel programs was
architecture dependent. For example, low-level synchronization was done
using locks in a shared-memory architecture and via message passing
in a distributed-memory architecture. Thanks to the development
of MPI (Message Passing Interface), a standardized library of function
calls, it is now possible to address the problem of portability to a
great extent. MPI is an architecture-independent, higher-level
abstraction that allows program designers to express their algorithms in
a high-level structure without having to worry about machine-specific
details of communication and synchronization.
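
The message-passing style can be sketched without any special
hardware. The example below is not MPI and does not use its API: it
is a stand-in, assuming Python threads as "processes" and standard
library queues as "channels", and every name in it is hypothetical.
What it shares with a message-passing program is the shape: a
coordinator sends each worker its data, each worker computes on what
it received, and results come back as messages.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Receive a chunk of data, compute on it, send the result back."""
    chunk = inbox.get()                 # blocking receive
    outbox.put((rank, sum(chunk)))      # send (rank, partial result)


def scatter_and_gather(data, n_workers=3):
    """Coordinator: send one chunk to each worker, then collect replies."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [threading.Thread(target=worker, args=(r, inboxes[r], results))
               for r in range(n_workers)]
    for t in threads:
        t.start()
    for r in range(n_workers):
        inboxes[r].put(data[r::n_workers])   # strided chunk for worker r
    # Replies may arrive in any order, so each carries its sender's rank.
    partials = dict(results.get() for _ in range(n_workers))
    for t in threads:
        t.join()
    return sum(partials.values())
```

The program logic never touches a shared variable directly; all
coordination is expressed as sends and receives, which is what lets
the same algorithm move between machines with different memory
architectures.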
The second major problem is scheduling: the process of allocating
tasks to physical processors and specifying the order of execution
of these tasks. How does a parallel programmer schedule tasks
onto a particular parallel computer in a pattern that guarantees
the shortest execution time? This problem is mathematically complex,
often requiring exponential time to find the absolute best
schedule. Spending more time on scheduling an application than
on running it can defeat the purpose of parallel computing.
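
The trade-off can be seen in a simplified version of the problem.
The sketch below assumes independent tasks with known durations and
identical processors, which real applications rarely offer, and its
function names are hypothetical. One routine finds the best schedule
by trying every assignment; the other uses a quick rule of thumb
(longest task first, onto the least-loaded processor).

```python
from itertools import product


def makespan(durations, assignment, n_procs):
    """Finish time of the busiest processor for a given task assignment."""
    load = [0] * n_procs
    for d, p in zip(durations, assignment):
        load[p] += d
    return max(load)


def best_schedule(durations, n_procs):
    """Exhaustive search: tries all n_procs ** len(durations) assignments."""
    return min(makespan(durations, a, n_procs)
               for a in product(range(n_procs), repeat=len(durations)))


def greedy_schedule(durations, n_procs):
    """Heuristic: longest task first, each to the least-loaded processor."""
    load = [0] * n_procs
    for d in sorted(durations, reverse=True):
        load[load.index(min(load))] += d
    return max(load)
```

For five tasks of lengths 3, 3, 2, 2, 2 on two processors, the
exhaustive search finds a finish time of 6 while the heuristic
settles for 7. The search examined 2 ** 5 = 32 assignments to get
there; at 50 tasks it would need 2 ** 50, which is why practical
schedulers accept a slightly worse answer obtained quickly.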
Thus parallel programming has all the problems associated with
traditional programming, and the programmer must also be concerned with
the architectural details and with tuning through scheduling of the
parallel tasks onto the multiple processors. Parallel programming also
requires assistance at all levels: debuggers, performance analyzers
and reusable components.
Implicit and explicit parallelism
There are two approaches to programming parallel computers: (1) implicit
parallelization and (2) explicit parallelization. Each has its merits
and disadvantages. Implicit parallelism uses existing languages
and conceals the underlying parallel computer from the programmer.
Intelligent, high-level compilers are required to automatically
translate the application into parallel form. Some research is
being done on parallelizing compilers and parallel languages, but
their functionality is still very limited. Parallelizing compilers
are only useful for applications that exhibit regular parallelism,
such as computations in loops. For shared-memory multiprocessor
systems, parallelizing compilers have proven to be relatively
successful. However, for distributed memory machines they are
largely unproven. Thus automatic parallelization is very limited
in scope and only rarely provides adequate speedup.
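
What "regular parallelism in loops" means can be shown with two
small loops. This is an illustrative sketch with hypothetical
function names, not a description of how any particular compiler
works. In the first loop every iteration is independent, so the
index range can be cut into blocks; in the second, each iteration
needs the result of the one before it.

```python
def square_all(values):
    """Regular parallelism: each iteration touches only its own element."""
    out = [0] * len(values)
    for i in range(len(values)):
        out[i] = values[i] * values[i]   # no dependence on other iterations
    return out


def running_total(values):
    """Loop-carried dependence: iteration i needs the result of i - 1."""
    out = [0] * len(values)
    total = 0
    for i in range(len(values)):
        total += values[i]               # depends on the previous iteration
        out[i] = total
    return out


def square_all_in_chunks(values, n_chunks=4):
    """Split the first loop's index range into blocks. The blocks run one
    after another here, but each could go to a separate processor."""
    step = max(1, -(-len(values) // n_chunks))   # ceiling division
    out = []
    for start in range(0, len(values), step):
        out.extend(square_all(values[start:start + step]))
    return out
```

The chunked version gives the same answer as the original loop
whatever the chunk boundaries, which is the property an automatic
tool has to prove before it can parallelize. The running total
cannot be split this way without restructuring the algorithm.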
In
explicit parallelization, the programming language incorporates
all the explicit parallel control statements in its syntax and
the programmer must know about parallelism. Therefore, the explicit
approach requires a clever programmer. Since it is easier to develop
tools and techniques to help the programmer be clever than to
develop a smart compiler, most of the progress has been made in
this direction.
Some parallel languages came into existence, but never gained popularity
because users were not willing to learn a completely new language
for parallel processing. They would rather use traditional
languages such as C and Fortran. Low-level communication libraries
such as MPI and PVM have now become popular with parallel software
developers because they provide an interface for C and Fortran.
Programmers can now write efficient parallel programs in a
traditional language of their choice, using MPI or PVM.
Interconnection network
Parallel computers require interprocessor communication to perform
sufficiently well that multiple processors can execute an application
more quickly than a single processor acting alone. There are cases
where 32 processors are slower than 16 processors working on the same
problem. This is not because the problem is insufficiently parallel,
but because the interprocessor communication overhead is too high.
Communication speeds have not grown in the same proportion as
processor speeds. However, if an application has the right ratio of
computation to communication, good performance gains can be obtained.
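
A toy cost model shows how 32 processors can lose to 16. It is an
assumption made purely for illustration, not a measurement of any
machine: compute time is taken to shrink as work / p, while
communication time is taken to grow linearly with the number of
processors p. The function names and default values are hypothetical.

```python
def run_time(p, work=256.0, comm_per_proc=1.0):
    """Toy model: compute time shrinks as work / p, while communication
    time is assumed to grow linearly with the processor count p."""
    return work / p + comm_per_proc * p


def speedup(p, work=256.0, comm_per_proc=1.0):
    """Speedup over a single processor that pays no communication cost."""
    return work / run_time(p, work, comm_per_proc)
```

With these defaults, run_time(16) is 32.0 and run_time(32) is 40.0:
doubling the processors halves the computation from 16 to 8 units but
doubles the communication from 16 to 32. Lowering comm_per_proc, or
raising the amount of work per message, moves the break-even point
toward larger machines.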
Parallel Programming on PARAM 10000
The National Param Supercomputer Facility (NPSF), located at Pune,
is well equipped with the hardware, programming tools,
optimized message passing libraries and compilers necessary for
writing parallel applications. C-DAC is now in the process of installing
PARAM 10000 at 12 premier institutes within the country. The researchers
at these institutes can now develop applications on these smaller
configurations and later run large problems at NPSF. In order
to exploit the full potential of PARAM 10000, a user needs to
overcome inertia and start “thinking in parallel”. Both
industry and academic institutions can derive benefits from parallel
programming and solve their large problems.
Some parting thoughts
The
above description points to the fact that programming parallel
computers is relatively difficult and requires more expertise
than programming uniprocessor machines. Parallel machines address
the big problems of their time. Because they are expensive, they
need a computationally intensive application to warrant their
use. Fundamental problems in science and engineering with broad
economic and scientific impact, the so-called Grand Challenge
Applications (GCAs), require parallel machines. Typical examples of GCAs
include applications from the areas of Meteorology, Computational
Fluid Dynamics (CFD), Chemistry, Biotechnology, Seismic Data Processing
and Optimization.
Sequential programming evolved from architecture-specific low-level
languages. With the development of architecture-independent languages,
programmers no longer need to worry about portability. Extensions to
these high-level languages made them more structured, leading to
programs that are easier to develop, test and maintain. Parallel
programming is, I believe, following the same direction.
Most parallel programming problems can be solved by a clever programmer,
but they require intimate knowledge of both the programming language
and the machine hardware. Some form of automatic or semi-automatic
assistance is desirable for development purposes. The final goal
is to have an architecture-independent tool to “glue together”
sequential code segments into synchronized, highly parallel,
machine-efficient programs. This will ensure portability
across a diversity of parallel computer architectures. Considerable
effort has already gone into the development of tools and programming
environments, and we are moving in the right direction. Until then,
we must learn to enjoy parallel programming with the existing
tools and message passing libraries.
Dr. Suhas Phadke is the Group Coordinator, Scientific & Engineering
Computing Group. He has 15 years of research and industrial experience
in the areas of seismic data processing and high performance computing
applications, including experience with ONGC, TOTAL (France) and
Western Geophysical (USA).

