Disaster Management

Network monitoring, resource monitoring, operation, services, administration and maintenance of the GARUDA grid are essential for its continuous availability and smooth functioning. Many tools are available for this purpose, and many activities take place. Various monitoring tools generate alert mails on link failure, on cluster failure, and when Globus services are not running. Weekly operation meetings are conducted with other centers. All these activities come under a single umbrella called GARUDA Grid Operation and Administration (GGOA).

The mission of GGOA is network maintenance, troubleshooting and status reporting. It operates on a daily basis and periodically conducts meetings and training. GGOA operates and coordinates from CDAC-KP, Bangalore.

Functions of GGOA

Various network parameters, such as bandwidth utilization, packet loss, latency, link status and cluster status, are monitored. Any abnormalities are reported to ERNET.
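As an illustration of how monitored parameters might be checked for abnormalities, the following is a minimal Python sketch. It is not the tooling used by GGOA; the names LinkSample, THRESHOLDS and find_abnormalities, the site names and all threshold values are assumptions chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class LinkSample:
    """One monitoring sample for a link (hypothetical structure)."""
    name: str
    bandwidth_utilization_pct: float
    packet_loss_pct: float
    latency_ms: float
    link_up: bool


# Assumed limits for illustration only; real limits would be set by the operators.
THRESHOLDS = {
    "bandwidth_utilization_pct": 90.0,
    "packet_loss_pct": 1.0,
    "latency_ms": 150.0,
}


def find_abnormalities(sample: LinkSample) -> list:
    """Return a list of human-readable problems found in one sample."""
    if not sample.link_up:
        # A down link makes the other measurements meaningless.
        return [f"{sample.name}: link is down"]
    problems = []
    for field, limit in THRESHOLDS.items():
        value = getattr(sample, field)
        if value > limit:
            problems.append(f"{sample.name}: {field}={value} exceeds {limit}")
    return problems


if __name__ == "__main__":
    sample = LinkSample("site-a", 95.0, 0.2, 40.0, True)
    for problem in find_abnormalities(sample):
        print(problem)
```

A non-empty result would be the trigger for a report; how the report is delivered is left out of the sketch.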

All the activities are grouped on a daily, weekly, monthly, quarterly and yearly basis.

All GARUDA links are monitored. If any link fails, an alert mail is generated and the status is reported to ERNET/SIFY accordingly. All resource provider clusters are monitored. If a cluster is down, an alert mail is generated automatically. Cluster troubleshooting is then carried out; if required, the respective local system administrators are contacted to bring the cluster back to a healthy state.

Globus services are monitored. If any service is not running, an alert mail is generated and the service is restarted automatically. If the problem persists, the respective local system administrators are contacted. Scheduling planned downtime is a very important activity. Downtime is announced to all registered GARUDA users and local system administrators well in advance. After the installation or maintenance activity is completed, users are notified again. A GGOA web site is available to support all these activities.
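The service-handling sequence described above (detect, alert, restart automatically, escalate if the problem persists) can be sketched in Python as follows. This is only an assumed outline, not the GGOA implementation: handle_service is a hypothetical name, and the status check, restart, alert mail and administrator contact are passed in as placeholder callables because the actual mechanisms are not described here.

```python
from typing import Callable


def handle_service(
    name: str,
    is_running: Callable[[str], bool],
    restart: Callable[[str], None],
    send_alert: Callable[[str], None],
    contact_admin: Callable[[str], None],
) -> str:
    """Check one service and return 'ok', 'restarted' or 'escalated'."""
    if is_running(name):
        return "ok"
    # Service is down: generate the alert, then try one automatic restart.
    send_alert(f"{name} is not running; attempting automatic restart")
    restart(name)
    if is_running(name):
        return "restarted"
    # Restart did not help: hand over to the local system administrators.
    contact_admin(f"{name} is still down after automatic restart")
    return "escalated"


if __name__ == "__main__":
    # Simulated service that comes back after a restart (illustrative only).
    state = {"up": False}
    outcome = handle_service(
        "example-service",
        is_running=lambda n: state["up"],
        restart=lambda n: state.update(up=True),
        send_alert=print,
        contact_admin=print,
    )
    print(outcome)
```

Passing the actions in as functions keeps the decision logic separate from the site-specific commands and mail setup, which also makes the flow easy to test without touching a real cluster.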