Dated September 24, 2001
The Telegraph
India's PARAM (PARAM
10000) Supercomputer leaps into the DNA horizon
For a fleeting
glimpse of a DNA molecule in action inside a human cell,
a peep that lasts a billionth of a second,
biochemist Rajendra Joshi sits at a workstation linked
to the most powerful computer developed in India.
Scientists at
the Centre for Development of Advanced Computing (C-DAC),
Pune are preparing to deploy India's PARAM supercomputer
for bioinformatics,
a research arena that involves analyzing and interpreting
genomic data with the help of computers.
PARAM is a parallel
processing machine in which computation-intensive problems
are distributed across several processors that work
in parallel to slash computation time. C-DAC researchers
have tailored popular bioinformatics software packages
to run on the supercomputer. They are also trying to
develop new software that will speed up the routine
processing of the torrent of genomic sequence data that
is almost doubling every 15 months.
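To put the doubling rate in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the 15-month figure is the one quoted above, while the function name and the one-year and five-year horizons are assumptions chosen for the example.

```python
# Illustrative arithmetic only: how fast data grows if it doubles every 15 months.
DOUBLING_MONTHS = 15  # figure quoted in the article ("almost doubling")


def growth_factor(months, doubling_months=DOUBLING_MONTHS):
    """Return the factor by which the data volume grows after `months` months."""
    return 2 ** (months / doubling_months)


if __name__ == "__main__":
    print(round(growth_factor(12), 2))  # growth over one year
    print(growth_factor(60))            # growth over five years
```

At that rate the data grows by roughly three quarters in a year and sixteen-fold in five years, which is why faster routine processing matters.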
"Our primary
task at C-DAC is to get PARAM ready for Indian scientists
to use on bioinformatics projects that will require
high computing power," says Joshi, Team Coordinator
with C-DAC's Bioinformatics
Applications Group.
"This kind of
a high-end computing resource coupled with bioinformatics
software packages is currently not available in India.
We're trying to fill in that gap," Joshi told KnowHOW.
As new genomic
sequence data floods into gene and protein databanks,
a key task that will keep bioinformaticists busy for
years to come is comparing different sequences to look
for similarities between them. The discovery of similarities
among sequences can be used to assign functions either
to unidentified genes or to proteins without known functions.
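As a rough illustration of what comparing two sequences involves, the sketch below scores a pairwise global alignment with the textbook Needleman-Wunsch recurrence. It is not the software running on PARAM, and the scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions made for the example.

```python
# A minimal sketch of pairwise sequence comparison (Needleman-Wunsch scoring).
# Scoring values are illustrative assumptions, not those of any C-DAC package.

def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the best global alignment score of sequences a and b."""
    # prev[j] holds the best score for aligning the processed prefix of a
    # with the first j characters of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("ACGT", "ACT"))
```

A higher score means the two sequences are more alike. The work grows with the product of the two sequence lengths, and comparing many sequences at once multiplies it further, which is where a parallel machine helps.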
C-DAC researchers
L.A. Anbarasu and V. Sunderarajan have developed what
is known as a parallel genetic algorithm that they have
used for the alignment and comparison of multiple sequences.
This technique produces better results than several
conventional techniques of simultaneously comparing
multiple genetic or protein sequences.
The C-DAC scientists
also hope to use PARAM for protein structure optimization,
a process that can predict the three-dimensional structure
of a protein from its sequence. This is one of biology's
biggest unsolved problems. While the sequence of amino
acids that make up a protein can be identified by studying
the genome sequence, it is the three-dimensional
structure of the protein that is crucially important
in determining how the protein behaves in the body.
The three-dimensional structure depends on how the amino
acids arrange themselves in space. The idea is to find
the shape they are most likely to adopt by
looking at the forces of interaction between the molecules.
Software packages for
molecular dynamics originally developed abroad are also
now available on PARAM. C-DAC scientists have tailored
two popular packages called AMBER and CHARMM, both developed
by independent researchers in US universities, to get
them to work on PARAM.
Molecular dynamics
can be used for several applications in biology - to
understand the structure and dynamics of a DNA sequence,
to understand how a drug binds to a cellular component,
or to see how proteins interact with DNA molecules.
"When performing
such complex simulations, we have to mimic conditions
so that they come as close to the real environment inside
a cell as possible," says Joshi. "Biological molecules
are always surrounded by water and ions and other intracellular
components and a realistic simulation should take into
account such a cellular neighbourhood."
Joshi himself
has used the AMBER software package, originally developed
at the University of California, San Francisco, to simulate
the behavior of different DNA sequences. The simulation
has to take into account interactions between several
thousand atoms in the cell.
"A typical simulation
of a biological molecule for just about one nanosecond -
a billionth of a second - in the presence of water molecules
and the myriad ions could involve up to 20,000 atoms
and on a single processor workstation could take several
months," says Joshi. "On the PARAM supercomputer, we
would have our simulation in less than a week."
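Some back-of-the-envelope arithmetic suggests why such a run is so costly. The sketch below counts the distinct atom pairs in a 20,000-atom system and the number of time steps in one nanosecond. The 1 femtosecond time step is an assumption (a commonly used order of magnitude in molecular dynamics, not a figure from C-DAC), and real packages reduce the pair count with distance cutoffs.

```python
# Illustrative arithmetic only; the time step is an assumed value.
N_ATOMS = 20_000               # upper figure quoted by Joshi
SIMULATED_TIME_FS = 1_000_000  # one nanosecond expressed in femtoseconds
TIMESTEP_FS = 1                # assumption: 1 fs per integration step


def pair_count(n_atoms):
    """Number of distinct atom pairs among n_atoms atoms."""
    return n_atoms * (n_atoms - 1) // 2


def step_count(total_fs, timestep_fs):
    """Number of integration steps needed to cover total_fs femtoseconds."""
    return total_fs // timestep_fs


if __name__ == "__main__":
    print(pair_count(N_ATOMS))                         # distinct atom pairs
    print(step_count(SIMULATED_TIME_FS, TIMESTEP_FS))  # steps in one nanosecond
```

Under these assumptions there are close to 200 million atom pairs to consider, and the forces must be recomputed at each of a million steps.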
As C-DAC has
already demonstrated, bioinformatics on PARAM may yield
spin-offs in the understanding of complex diseases. Joshi
has used molecular dynamics on PARAM to throw new light
on certain sequences that tend to get repeated in the
human genome and have long been associated with certain
neurodegenerative diseases such as Huntington's disease.
He concentrated
on the sequence C-A-G that repeats in a certain section
of the genome 6 to 35 times in normal people, but is
found multiplied 36 to 121 times in people with Huntington's
disease, an incurable neurodegenerative disorder. Molecular
simulation of this repeat sequence shows that the
three-dimensional structure of the zone of DNA that makes up
these repeats is highly flexible. While the neighbouring
sequences are relatively rigid, the C-A-G section is
flexible. This could be why C-A-G repeats itself so
many times.
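The repeat counts quoted above can be made concrete with a short Python sketch that finds the longest uninterrupted run of C-A-G in a DNA string and compares it with the two ranges given in this article. The function names and the test sequence are hypothetical, and the sketch is an illustration of counting repeats, not a diagnostic tool or C-DAC's method.

```python
# A minimal sketch: count consecutive CAG repeats in a DNA string.
# The ranges (6-35 and 36-121) are the ones quoted in the article.
import re


def longest_cag_run(sequence):
    """Return the largest number of back-to-back CAG units in the sequence."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


def describe_repeat_count(count):
    """Place a repeat count in one of the ranges quoted in the article."""
    if 6 <= count <= 35:
        return "within the range quoted for normal people"
    if 36 <= count <= 121:
        return "within the range quoted for Huntington's disease"
    return "outside the ranges quoted in the article"


if __name__ == "__main__":
    example = "TTT" + "CAG" * 40 + "CCG"  # made-up sequence for illustration
    count = longest_cag_run(example)
    print(count, describe_repeat_count(count))
```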
This simulation
of the DNA repeat zone has provided a new insight into
the molecular mechanisms that underlie highly repeated
sequences linked to this neurodegenerative disorder.
by G.S. Mudur