USERS AT NPSF - BASIC SCIENCES

The National PARAM Supercomputing Facility (NPSF) of C-DAC is available to researchers for scientific and engineering applications. A number of prestigious organisations have used the facility. Some of the key research they have undertaken, along with typical results, is reproduced here.

Coupled Cluster Calculations

Scientists at the Indian Institute of Astrophysics, Bangalore, have developed a FORTRAN code to perform large-scale Coupled Cluster (CC) calculations on heavy atoms in order to investigate the effects of parity non-conservation (PNC) in atomic systems. This code has been tested on smaller systems and on heavy systems with small basis sets. To reach the required accuracy, heavy systems have to be evaluated with a large basis set, which is possible only on a supercomputer. The parallel version of this code was adapted to use the shared memory on the PARAM machine.

Scaling

A couple of test runs using a small system (Na+) and a small basis set (Nc = 4 core orbitals, Nv = 21 virtual orbitals) were performed to test how the execution time scales with the number of processors. The results are shown in Table 1.

Table 1: Number of processors vs. time

No. of processors    Time (seconds)
4                    259
8                    142
16                   68.8
32                   37.5

The nodes used for these experiments were not in dedicated mode, so other processes may have influenced the timings. Even so, excellent scaling behavior is visible. For larger basis sets the scaling is likely to improve further: the size of the data to be broadcast scales with (Nv * Nc)^2, whereas the number of operations scales with (Nv * Nc)^4, so for large systems the communication cost becomes negligible.
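The scaling claim can be checked directly from the Table 1 timings. The following is a minimal sketch, not part of the original study: it computes the speedup and parallel efficiency relative to the 4-processor run, and a relative communication-to-computation ratio under the assumption that communication grows as (Nv * Nc)^2 and computation as (Nv * Nc)^4 with unknown constant prefactors. The function names are illustrative.

```python
# Timings from Table 1: number of processors -> execution time in seconds.
timings = {4: 259.0, 8: 142.0, 16: 68.8, 32: 37.5}


def relative_scaling(timings):
    """Return {procs: (speedup, efficiency)} relative to the smallest run."""
    base_procs = min(timings)
    base_time = timings[base_procs]
    result = {}
    for procs, seconds in sorted(timings.items()):
        speedup = base_time / seconds
        ideal = procs / base_procs  # speedup expected for perfect scaling
        result[procs] = (speedup, speedup / ideal)
    return result


def comm_to_compute_ratio(n_core, n_virtual):
    """Relative communication/computation ratio, up to an unknown constant.

    Assumes communication ~ (Nv*Nc)^2 and computation ~ (Nv*Nc)^4.
    """
    size = n_core * n_virtual
    return size ** 2 / size ** 4


scaling = relative_scaling(timings)
for procs, (speedup, efficiency) in scaling.items():
    print(f"{procs:3d} procs: speedup {speedup:5.2f}, efficiency {efficiency:6.1%}")
```

Because the ratio falls off as 1 / (Nv * Nc)^2, doubling the product Nv * Nc reduces the relative weight of communication by a factor of four under these assumptions.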


Runs using Linear CCSD (LCCSD) approximation

Three systems have been evaluated in the LCCSD approximation. Each calculation ran on a moderate number of processors (15 - 30) and finished within a few hours. However, the memory requirements scale with (Nv * Nc)^4 because of the number of Coulomb integrals that have to be stored. The nonlinear CCSD calculations are much more time consuming than the LCCSD approximation. It is therefore essential to use as many processors as possible on each node to reduce the execution time. To achieve this, a shared-memory version of the CCSD code has been developed.
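To give a feel for how quickly storage grows with the basis size, here is a small sketch that compares two basis sets under the stated (Nv * Nc)^4 scaling. Only the relative growth is computed, since the prefactor (bytes per integral, symmetry savings) is not given in the text. The function name is hypothetical, and the basis sizes are the two 13-core Ba2+ entries from Table 2.

```python
def relative_memory(n_core, n_virtual, ref_core, ref_virtual):
    """Memory growth factor relative to a reference basis.

    Assumes storage scales as (Nv * Nc)^4; the absolute prefactor is unknown.
    """
    return ((n_core * n_virtual) / (ref_core * ref_virtual)) ** 4


# Ba2+ with 13 core orbitals: going from 58 to 66 virtual orbitals.
growth = relative_memory(13, 66, 13, 58)
print(f"Estimated memory growth factor: {growth:.2f}")
```

Under this assumption, a modest increase in the number of virtual orbitals already raises the integral storage noticeably, which is in line with the motivation for sharing memory among the processors of a node.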

Nonlinear run

For the first time, a nonlinear calculation for the Tl3 system has been performed. It ran on 28 nodes in dedicated mode, with the shared-memory CCSD code using 112 processors (4 per node). This computation shows that the MPI runs remain stable across many nodes and over long run times, and that the shared-memory approach made it possible to run the code on 4 processors per node despite the large memory requirements. The run also showed that further optimization of the nonlinear part will be required to reduce the execution time. An estimated speedup of 30% can be achieved by optimizing the load sharing among the processes. Another major speedup is expected from storing the 3j and 6j symbols in RAM instead of evaluating them on the fly. With the estimated time and memory usage, the other systems Ca+, Ba2+ and Tl+ were also tested. The results are summarized below, together with the time taken on PARAM 10000.

Table 2: Results of linear and non-linear systems

System   Basis                  No. of unknowns   No. of nodes   Time (linear)   Time (non-linear)
Ca2+     7 core, 71 virtual     42705             1              54 min          7.7 hr
Ba2+     13 core, 58 virtual    126653            14             15 min          20.9 hr
Ba2+     13 core, 66 virtual    155333            15             22 min          37.4 hr
Ba2+     16 core, 58 virtual    183462            5              66 min          -----
Ba2+     17 core, 58 virtual    201197            2              156 min         -----
Tl+      13 core, 60 virtual    147917            15             31 min          35.7 hr
Tl+      18 core, 60 virtual    278368            15             1 hr            60 hr*
Tl+      22 core, 60 virtual    381390            4              8.9 hr          -----

* Only 2 nonlinear iterations have been completed for this system


Multiple sequence alignment

In collaboration with C-DAC, scientists from Anna University, Chennai, have developed a parallel genetic algorithm (PGA) code on PARAM 10000 for the multiple sequence alignment problem. The algorithm searches for an alignment among independent, isolated, evolving populations by optimizing a weighted sum-of-pairs objective function, which measures the alignment quality. Using isolated subpopulations of alignments in a quasi-evolutionary manner, this approach gradually improves the fitness of the subpopulations as measured by the objective function. It performs better than both the sequential approach and an alternative method, CLUSTAL W.
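A weighted sum-of-pairs objective can be sketched as follows. This is an illustration only: the page does not give the substitution costs, gap penalties, or sequence weights used by the authors, so the mismatch cost, gap cost, and the convention that a lower value is better are all assumptions, and the function names are hypothetical.

```python
from itertools import combinations

GAP = "-"


def pair_cost(row_a, row_b, mismatch=1.0, gap=2.0):
    """Column-by-column cost of two aligned rows (hypothetical cost scheme)."""
    cost = 0.0
    for a, b in zip(row_a, row_b):
        if a == GAP and b == GAP:
            continue  # a column holding two gaps carries no cost
        if a == GAP or b == GAP:
            cost += gap  # residue aligned against a gap
        elif a != b:
            cost += mismatch  # two different residues
    return cost


def weighted_sum_of_pairs(alignment, weights=None):
    """Sum the weighted pair costs over all pairs of rows in the alignment."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of an alignment must have the same length")
    total = 0.0
    for i, j in combinations(range(len(alignment)), 2):
        weight = 1.0 if weights is None else weights[(i, j)]
        total += weight * pair_cost(alignment[i], alignment[j])
    return total


alignment = ["ACG-T", "A-GTT", "ACGTT"]
score = weighted_sum_of_pairs(alignment)
weighted = weighted_sum_of_pairs(
    alignment, weights={(0, 1): 0.5, (0, 2): 1.0, (1, 2): 1.0}
)
print(score, weighted)
```

In a genetic algorithm, a function of this kind would serve as the fitness measure for each candidate alignment in a subpopulation.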

Results

To assess how efficiently the parallel genetic algorithm optimizes the weighted sum-of-pairs function, 8 test cases were designed. They are based on sequences with known three-dimensional structures, for which a structural alignment is available. All of the test cases were extracted from 3D_ali Release 2.0. We chose test cases of varying length (52 - 248 residues) and various numbers of sequences (6 - 37). The results of PGA are compared with the best-known results of other algorithms for multiple alignment test cases. PGA was executed 110 times per test case with varying parameters. Table 3 presents the best results for all the algorithms. Figure 1 shows the behavior in the context of all subpopulations, that is, it presents the convergence behavior of the best individuals in each of the parallel evolving subpopulations. It indicates the importance of migration in avoiding premature stagnation by introducing new genetic material into a stagnating subpopulation. Furthermore, the plot points out the "stabilizing" effect of migration, as expressed in the limited variation among the best subpopulations.
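The role of migration between subpopulations can be illustrated with a minimal sketch of one migration step. The page does not describe the migration policy that was actually used, so the ring topology, the "best replaces worst" rule, and the treatment of the objective as a cost to minimise are assumptions, and `migrate_ring` is a hypothetical name.

```python
def migrate_ring(islands, cost):
    """Copy each island's best individual over the worst one of the next island.

    `islands` is a list of subpopulations; `cost` maps an individual to a
    number, where lower is better (an assumption of this sketch).
    """
    # Pick all migrants first so every island exports its pre-migration best.
    migrants = [min(population, key=cost) for population in islands]
    new_islands = []
    for idx, population in enumerate(islands):
        incoming = migrants[(idx - 1) % len(islands)]  # from the previous island
        worst = max(range(len(population)), key=lambda k: cost(population[k]))
        updated = list(population)  # leave the input untouched
        updated[worst] = incoming
        new_islands.append(updated)
    return new_islands


# Toy subpopulations where each individual is simply its own cost.
islands = [[5, 9, 7], [1, 8, 6], [4, 3, 10]]
after = migrate_ring(islands, cost=lambda individual: individual)
print(after)
```

Each island keeps its own best individual and gains a good one from its neighbour, which is one simple way fresh genetic material can reach a stagnating subpopulation.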

Table 3: Comparison of PGA with other methods

Test case   Nseq   Length   Clustal W score   SGA score   PGA score   CPU time (secs)   N.G.
ac prot     14     248      18239760          18052396    17905996    4563              485
Gcr         8      52       1201472           1187558     1186959     93                48
Globin A    15     183      10898878          10857787    10832517    2176              320
Globin B    17     183      14050372          14030690    13930367    2799              360
Igb         37     194      44756936          43984368    43306467    62744             545
Lzm         6      213      1724257           1694431     1677190     262               120
Sbt         7      331      2357581           2349134     2320023     670               160
s prot      15     229      21284220          21005754    20915316    6959              660

Test case: generic name of the test case, taken from 3D_ali (ac prot: acid proteases, gcr: crystallins, globin A and globin B: globins/phycocyanins/colicins, igb: immunoglobulin fold, lzm: lysozymes/lactalbumin, sbt: subtilisin, s prot: serine protease fold). Nseq: number of sequences in the alignment. Length: length of the reference alignment. Score: score measured with the weighted sum-of-pairs objective function. CPU time: CPU time in seconds on a Sun Ultra 450 workstation. N.G.: number of generations needed by PGA to find the solution.


Catalyst Modeling

Scientists from the National Chemical Laboratory, Pune, have used GAMESS to design reforming catalysts (worldwide capacity = 8.7 m bbl/day) through accurate quantum chemical calculations that need the high-end computing power of PARAM 10000, and to identify the active sites in zeolites so that catalysts with higher conversion and selectivity become possible. GAMESS is a program for ab initio quantum chemistry. Briefly, GAMESS can compute wave functions ranging from RHF, ROHF, UHF, and GVB to MCSCF, with CI and MP2 energy corrections available for some of these. Analytic gradients are available for these SCF functions, for automatic geometry optimization, transition state searches, or reaction path following. Computation of the energy Hessian permits prediction of vibrational frequencies. A variety of molecular properties, ranging from simple dipole moments to frequency-dependent hyperpolarizabilities, may be computed. Many basis sets are stored internally, and together with effective core potentials, all elements up to radon may be included in molecules. Many of the computational functions can be performed using direct techniques, or in parallel on appropriate hardware.

GAMESS BENCHMARK

GAMESS is a public domain package that is very widely used by quantum chemists. The parallel version of GAMESS has been ported and tested on PARAM 10000 for various molecular systems. The table below gives the wall-clock timings for this package:

S.No.   System      Basis set   No. of contractions   No. of processors   Wall-clock time (secs)
1       FM-8H2O     6-31G**     260                   6                   500
2       FM-12H2O    6-31G**     360                   7                   1099
3       FM-16H2O    6-31G**     460                   13                  1584
4       T-6H2O      6-31G**     315                   27                  376

Pt-LTY system

The catalysis group at the National Chemical Laboratory has performed Hartree-Fock calculations on Ptn (where n = 1 to 12) cluster models using the parallel version of GAMESS on PARAM. It was observed that 3-d Ptn clusters are more stable than 2-d Ptn clusters for fixed 'n' values. The 'binding energy of the cluster per Pt atom' for the Ptn cluster models showed that Pt5 and Pt6 are the most stable among the small clusters. Hence, a Pt atom and a Pt5 cluster deposited over the zeolite LTY catalyst were modeled in this study.

LTY has hexagonal prisms, sodalite cages and super cages. There are three types of non-framework cationic sites, namely sites I, II and III. In a unit cell of LTY, [MxSi192-xAlxO384], there are 96 possible locations for non-framework cations. The 96 locations are distributed as 16 sites I, 32 sites II and 48 sites III (refer to the Figure). We have represented the active site in LTY by a 6-membered ring (6-m ring) cluster, which represents two non-framework cation sites, namely I and II. Site I is inside the hexagonal prism, whereas site II is in the window connecting the sodalite and super cages. The 6-m ring cluster represents such a window.

In the zeolite cluster models (with Si/Al = 2), the binding energy of alkali metals decreases with increasing size of the alkali metal from Li to K. However, Rb and Cs are more strongly bonded than Li. The net charge on the various metals shows that smaller cations prefer the MI site, whereas larger cations prefer the MII site (refer to Table 4).

When a single Pt5 cluster is deposited over the LTY cluster model, the binding energy correlates with the binding energy of the alkali metal to the zeolite. The electron density on Pt and Pt5 increases as the extra-framework cation varies from H to Rb, indicating the catalytic influence.


Table 4: The electronic properties of the cluster model [MIMIISi4Al2O18H12], where MI is the alkali metal present in cationic site I and MII is the alkali metal present in site II.

Cluster                 MI   MII   Total Energy (a.u.)   Binding Energy# (kcal/mol)   Net electron charge on MI   Net electron charge on MII
[MIMIISi4Al2O18H12]     H    H     -309.3972             -510.8953                    0.64                        0.74
[MIMIISi4Al2O18H12]     Li   Li    -309.2468             -369.4493                    0.66                        0.66
[MIMIISi4Al2O18H12]     Na   Na    -309.1949             -336.9668                    0.85                        0.75
[MIMIISi4Al2O18H12]     K    K     -309.1270             -294.4704                    0.92                        0.88
[MIMIISi4Al2O18H12]     H    Rb    -309.3449             -470.9649                    0.62                        0.94
[MIMIISi4Al2O18H12]     H    Cs    -309.3155             -453.4407                    0.62                        0.93
[MIMIISi4Al2O18H12]     Na   Li    -309.2107             -346.8556                    0.82                        0.66
[MIMIISi4Al2O18H12]     Na   Na    -309.1949             -336.9668                    0.85                        0.88
[MIMIISi4Al2O18H12]     Na   K     -309.1622             -316.5010                    0.81                        0.89
[MIMIISi4Al2O18H12]     Na   Rb    -309.1854             -331.0211                    0.82                        0.93
[MIMIISi4Al2O18H12]     Na   Cs    -309.1537             -311.1811                    0.81                        0.94

# B.E. = T.E.[MIMIISi4Al2O18H12] - {T.E.[Si4Al2O18H12]2- + T.E.[MIMII]}
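The binding energy definition in the footnote can be written as a short calculation. The sketch below assumes, following the table headers, that total energies are in atomic units (hartree) and the binding energy is reported in kcal/mol. The fragment energies are not listed on this page, so the numbers passed in are placeholders chosen only to show the arithmetic, and the function name is hypothetical.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # standard conversion factor


def binding_energy_kcal(te_cluster, te_framework, te_cations):
    """B.E. = T.E.[cluster] - (T.E.[framework] + T.E.[cations]).

    Inputs are total energies in a.u.; the result is in kcal/mol.
    A negative value means the cation-loaded cluster is bound.
    """
    difference_au = te_cluster - (te_framework + te_cations)
    return difference_au * HARTREE_TO_KCAL_PER_MOL


# Placeholder energies in a.u. (NOT values from the study).
be = binding_energy_kcal(te_cluster=-10.5, te_framework=-8.0, te_cations=-2.0)
print(f"Binding energy: {be:.2f} kcal/mol")
```

With the same framework fragment in every row, differences in binding energy between rows reflect both the cluster total energy and the energy of the isolated cation pair.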

© 2010 C-DAC. All rights reserved.