Programming on Multi-Core Processors Using Pthreads (POSIX Threads)
Pthreads

Pthreads are defined as a set of C-language programming types and procedure calls, implemented with a pthread.h header/include file and a thread library. Solaris threads are easily understood by someone familiar with POSIX threads, while Java threads and the multi-threading in the Win32 and OS/2 APIs are a little different. The subroutines that comprise the Pthreads API can be grouped into three classes: Thread Management, Mutex Variables, and Condition Variables. Threaded applications offer potential performance gains and practical advantages over non-threaded applications, as the example programs illustrate.

The example programs below use different thread APIs. They cover compilation and execution of Pthread programs for numerical and non-numerical computations, with the aim of understanding performance issues on multi-core processors.

Example 2.1 Pthread program to compute the value of PI by the numerical integration method.
Example 2.2 Write Pthread code to perform vector-vector multiplication using block striped partitioning.
Example 2.3 Write Pthread code to find the infinity norm of a matrix using block striped partitioning (row-wise partitioning of the matrix).
Example 2.4 Write Pthread code to find the infinity norm of a matrix using block striped partitioning (column-wise partitioning of the matrix).
Example 2.5 Write a Pthreads program to solve a system of linear equations Ax=b using the parallel Jacobi method.


(Source - References: Books, Multi-threading - [MCMTh-01], [MCMTh-02], [MCMTh-I03], [MCMTh-05], [MCMth-09], [MCMth-11], [MCMTh-15], [MCMTh-21], [MCBW-44])



Description of Pthread Programs

Example 2.1: Write a Pthreads program to compute the value of PI by numerical integration. (Download source code: pthread-numerical-integration.c)


  • Objective
  • Write a Pthreads program to compute the value of PI by numerical integration.

  • Description
  • This program computes the value of PI over a given interval using numerical integration. The main thread distributes the given interval uniformly over the threads. Each thread integrates over its part of the interval and then adds its partial sum to the shared result variable. Each thread locks a mutex before this update to guarantee the atomicity of the operation.

    Threaded APIs provide support for implementing critical sections and atomic operations using mutex-locks (mutual exclusion locks). Mutex-locks have two states: locked and unlocked. At any point in time, only one thread can hold a given mutex-lock.

    Acquiring a mutex-lock is an atomic operation, and the lock is generally associated with a piece of code that manipulates shared data. To access the shared data, a thread must first try to acquire the mutex-lock. If the mutex-lock is already locked, the thread trying to acquire it is blocked until the lock is released.

  • Input
  • Number of Threads.

  • Output
  • Calculated Value of PI

Example 2.2: Write a Pthreads program to compute the vector-vector multiplication using block striped partitioning for uniform data distribution.
(Download source code: pthread-vectorvector-multi.c)


  • Objective
  • Write a Pthreads program to compute the vector-vector multiplication using block striped partitioning for uniform data distribution.

  • Description
  • This is an implementation of vector-vector multiplication using the block striped partitioning algorithm. Each thread multiplies the corresponding elements of its block and adds the products to the shared result. A mutex protects the shared result to guarantee atomicity. Each thread selects its elements based on its id, which the main thread assigns in the order of thread creation. Because the number of threads and the number of elements are known, the range of elements for each thread can easily be computed.

  • Input
  • Vector size and number of threads. The number of threads should be a factor of the vector size.

  • Output
  • Dot Product of the given vectors.

Example 2.3: Write Pthread code to find the infinity norm of a matrix using block striped partitioning with row-wise distribution. (Download source code: pthread-infinitynorm-rowwise.c)


  • Objective
  • Write Pthread code to find the infinity norm of a matrix using block striped partitioning with row-wise distribution.

  • Description
  • Infinity Norm of a Matrix: The infinity norm of a matrix is defined as the maximum, over all rows, of the sum of the absolute values of the elements in a row. In the row-wise distribution, each thread computes the sum of the absolute values of the elements in each of its rows and compares it with the result variable, which is initialized to zero. If the sum is greater than the current value of the result, the thread updates the result variable. A mutex protects the result variable. Rows are assigned to threads using the total number of threads and the id of each thread, which the main thread assigns in the order of thread creation.

  • Input
  • Number of threads and the input file. The number of threads must be a factor of the number of rows.

  • Output
  • Infinity Norm of the given Matrix.

Example 2.4: Write Pthread code to find the infinity norm of a matrix using block striped partitioning with column-wise distribution. (Download source code: pthread-infinitynorm-colwise.c)


  • Objective
  • Write Pthread code to find the infinity norm of a matrix using block striped partitioning with column-wise distribution.

  • Description
  • In this method, the columns are distributed among the child threads. Each thread adds the absolute values of its elements to the corresponding entries of a vector whose length is equal to the number of rows of the matrix. Each element of this vector, which holds the row-wise sums of the matrix, is protected by a mutex to guarantee atomicity of the update. When all the threads are done, the vector holds the sum of each row of the matrix. The main thread then picks the maximum of the sums and prints it as the infinity norm.

  • Input
  • Number of threads and the input file. The number of threads must be a factor of the number of columns.

  • Output
  • Infinity Norm of the given Matrix.

Example 2.5: Write a Pthreads program to solve a system of linear equations Ax=b using the parallel Jacobi method. (Download source code: pthread-jacobi.c)


  • Objective
  • Write a Pthreads program to solve a system of linear equations Ax=b using the parallel Jacobi method.

  • Description
  • This program solves the system of linear equations [A]{x} = {b} using the Jacobi iterative method. Assume that A is a symmetric positive definite dense matrix of size n. You may assume that n is evenly divisible by p, the number of threads.

  • Input
  • Number of threads, and the size of the real symmetric positive definite matrix given in terms of a class, where

    Class A : 1024
    Class B : 2048
    Class C : 4096

  • Output
  • The solution of Ax=b, the number of iterations needed for convergence, and the execution time.