Using SIRAF


Programming

SIRAF User Meetings

10-02-09 SIRAF Users Meeting

Topics: Introduction to Statistical Computing using R on SIRAF (R. Zur)

Using R on SIRAF

R is a programming language and software environment for data analysis, and it is the de facto standard tool among statisticians. Here are some reasons why R is used over Matlab:

There are versions of R for Windows, Mac, and Linux. The Windows version has a GUI; on Linux, everything must be done from the command line.

R can function as:

To get help on a function (for example, one called "anova"), type: help(anova)

For more help, visit: www.r-project.org.

R has pretty good random number generators: rnorm, rlnorm, rbinom, rpois. However, R is slow at running code that contains explicit loops. One suggestion is to write the loop-heavy code in C and build libraries that R can call. Lorenzo has a C library for proproc.

Although parallel programming may exist for R, it sounds hard! Instead, you can use array jobs to run many R programs at once. Because R requires no license, you can run as many copies as you like.

Questions:

What is the difference between R and Matlab?

Where can I get help?

Adding packages to SIRAF?

Linux commands

We are making a list of Linux commands for people who are new to the command line. Please email any questions you have about Linux commands (or suggestions for ones to add to the handbook) to Beverly or Ingrid. Thanks!

08-07-09 SIRAF Users Meeting

Topics: Planning ahead - talks from SIRAF Users about why and how they're using the Cluster

Parallel Programming using CILK ++ (C. Chan)

Parallel Programming using CILK ++

Computers have become faster not by increasing processor speed, but mainly by adding more processors. This is why parallel programming is an ideal tool for reducing your program's run time. Most programs are written for serial execution. Parallel execution can cut run time many times over; however, parallel programming (using OpenMP or MPI) requires writing new code and verifying it. CILK ++ is a wrapper for parallel programming that lets the user easily convert serial code into parallel code. Although CILK may not deliver the optimum speed-up, it delivers a speed-up with very little time invested in changing your existing code. A speed-up of about 4-16x can be expected from CILK, which is considered acceptable.

History

CILK development was started in 1994 at MIT. It was originally written for SGI IRIX, but it was later ported to Linux, Mac OS, and Windows, and it was renamed CILK ++ and commercialized. The commercial company is called CILK Arts. Academic licenses for CILK are free for use and distribution. CILK ++ is compatible with C/C++ libraries in both directions.

CILK

Pthreads

OpenMP

Convolution Example

Important Notes

Always check that your code achieves a sufficient speed-up before allocating CPUs to run it. This is done using the following command:

>> cilkview filename

07-01-09 SIRAF Users Meeting

Topics:

GPU Programming for Matlab (J Bryant)

GPU Programming for Python and IDL (C Chan)

GPU Programming for Matlab

GPU - Graphics processing unit. A highly specialized processor built to perform certain computations very quickly, for instance those needed to display graphics. Many of these units are packed onto a single board, which can be very efficient for linear operations.

AccelerEyes Jacket for Matlab: links the Matlab interface to CUDA and the NVIDIA libraries.

Add jacket engine path to Matlab path.

Basic Linear Algebra Subprograms (BLAS) example: blas_example runs on both the CPU and the GPU and prints out the speed-up ratio.

GPU is 15x faster for simple matrix operations (for example, matrix multiply, rotate, add, FFT).

Exporting commands to GPU adds a large overhead in computation time.

gfor - parallel for loop restricted to GPU computing. If each loop iteration requires a lot of memory, then gfor is not worth it (that is, it will not speed up the computation). gfor is faster the second time you run it.

Speed-up:

gfor (GPU) vs. for (GPU): 1.5x

gfor (GPU) vs. for (CPU): 8x

If the matrices are small, the overhead of exporting data to the GPU dominates and the GPU offers no advantage over the CPU. As long as it's a decent-sized problem, the GPU will give a speed-up.
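The trade-off between export overhead and problem size can be sketched with a toy cost model. This is only an illustration, not a measurement from SIRAF: the function name `effective_speedup` and the 10 ms overhead are assumptions, and the 15x raw advantage is borrowed from the figure quoted above for simple matrix operations.

```python
def effective_speedup(cpu_seconds, kernel_speedup, overhead_seconds):
    """Speed-up once a fixed export overhead is added to the GPU run time."""
    gpu_seconds = overhead_seconds + cpu_seconds / kernel_speedup
    return cpu_seconds / gpu_seconds


KERNEL_SPEEDUP = 15.0  # assumed raw GPU advantage (the 15x quoted in these notes)
OVERHEAD = 0.010       # assumed 10 ms export cost, purely illustrative

# Small problems are dominated by the overhead; large ones approach 15x.
for cpu_seconds in (0.001, 0.010, 0.100, 1.0, 10.0):
    speedup = effective_speedup(cpu_seconds, KERNEL_SPEEDUP, OVERHEAD)
    print(f"CPU time {cpu_seconds:7.3f} s -> speed-up {speedup:5.2f}x")
```

Under these assumptions, a job that takes 1 ms on the CPU comes out slower on the GPU, while a 10 s job gets close to the full raw advantage.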

Jacket costs: $100 for a one-node student license, $500 for a one-node academic (non-student) license.

GPU for IDL and Python

GPU libraries

>> ssh username@siraf-login.bsd.uchicago.edu

>> cd /opt/gpulib

>> ls (tells you about how to get started)

>> cd IDL

>> cd lib (.pro are procedures)

gpuDiv.pro: has commands on how to program with it in IDL.

>> cd /opt/gpulib/MATLAB/lib

>> less gpuArray.m (Documentation is source code and comments, look at function comments and try it out.)

>> less gputest (testing of each operation and tells the speed-up)

Important note: MOST GPUs only handle single-precision (32-bit) floating-point operations. We are waiting for the GTX 64-bit floating-point GPUs to become cheaper to buy.
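To see why the single-precision limitation matters, here is a minimal sketch that emulates 32-bit arithmetic on the CPU by rounding every intermediate result to 32 bits with the standard `struct` module. The helper names `to_float32` and `accumulate` are made up for this illustration, and no GPU is involved.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step, count, single_precision):
    """Add `step` to a running total `count` times."""
    if single_precision:
        step = to_float32(step)
    total = 0.0
    for _ in range(count):
        total += step
        if single_precision:
            total = to_float32(total)  # emulate storing the result in 32 bits
    return total


COUNT = 100_000
EXPECTED = 10000.0  # 100,000 steps of 0.1

single = accumulate(0.1, COUNT, single_precision=True)
double = accumulate(0.1, COUNT, single_precision=False)
print("single-precision error:", abs(single - EXPECTED))
print("double-precision error:", abs(double - EXPECTED))
```

The single-precision total drifts much further from 10000 than the double-precision one, so long accumulations are worth checking before moving them to a single-precision GPU.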

Question: How can we check what kind of GPU card we have on our computers?

Answer: >> /sbin/lspci (lists video cards, Ethernet controllers, and peripheral buses). There is no video card on the master node.

The Big nodes on the cluster have two NVIDIA cards, each one with 1GB of memory.

SUMMARY

GPU is good for:

GPU is not good for:

06-03-09 SIRAF Users Meeting

Topics:

Parallel Programming (R Tomek)

Wiki Update (I Reiser)

PARALLEL PROGRAMMING

Parallel computing vs. parallel programming: parallel computing is taking one program and running many copies of it, each copy on a different processor. Parallel programming is taking the code of a single program and running parts of it on multiple processors. For instance, you can take a for loop and split its iterations across 8 processors to reduce the total time the loop takes. Note that this is not possible if each iteration requires results from the previous iteration.
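The point about loop dependence can be sketched as follows. This is an illustration only: the helper names are made up, and Python threads are used just to show how the work is divided among 8 workers (they will not actually speed up CPU-bound Python code).

```python
from concurrent.futures import ThreadPoolExecutor


def split(n, parts):
    """Split range(n) into at most `parts` contiguous (start, stop) chunks."""
    step = max(1, -(-n // parts))  # ceiling division
    return [(lo, min(lo + step, n)) for lo in range(0, n, step)]


def square_chunk(bounds):
    # Independent work: element i depends only on i.
    lo, hi = bounds
    return [i * i for i in range(lo, hi)]


def running_total_chunk(bounds):
    # Dependent work: each value needs the previous total, so a chunk that
    # restarts from 0 produces wrong values.
    lo, hi = bounds
    total, out = 0, []
    for i in range(lo, hi):
        total += i
        out.append(total)
    return out


def run_in_chunks(chunk_fn, n, workers=8):
    """Hand each chunk to its own worker and stitch the results together."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pieces = list(pool.map(chunk_fn, split(n, workers)))
    return [x for piece in pieces for x in piece]


n = 32
serial_squares = [i * i for i in range(n)]
serial_totals = running_total_chunk((0, n))

# Independent loop: chunked result matches the serial result.
print(run_in_chunks(square_chunk, n) == serial_squares)
# Dependent loop: naive chunking gives a different (wrong) result.
print(run_in_chunks(running_total_chunk, n) == serial_totals)
```

The first loop can be divided freely because no iteration reads another iteration's result; the second cannot be divided this way without restructuring the algorithm.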

Common libraries:

Library: OpenMP

Pros: Easier to code and debug; gradual parallelization

Cons: Shared memory only; loop parallelization only

Library: Open MPI

Pros: Each processor can have its own local variables

Cons: Harder to code

How to code OpenMP for C++: put the following line directly before a for loop: #pragma omp parallel for

How to code OpenMP for Fortran, type: !$OMP PARALLEL ... !$OMP DO ... !$OMP END PARALLEL

How to code Open MPI: MPI_Init(&argc, &argv); MPI_Comm_rank(MPI_COMM_WORLD, ...) etc. Also, make sure you specify the MPI datatypes: MPI_CHAR, MPI_INT, etc.

Submitting to SIRAF:

shm.pe - Shared memory parallel environment

>> echo appname filename > ~/script && qsub -q all.q -pe shm 8 ~/script

dist.pe - Distributed memory parallel environment

>> echo appname filename > ~/script && qsub -q all.q -pe dist 16 ~/script

For MATLAB:

>> qrsh -q small.q -pe shm 4 -now n matlab

This requests 4 threads for the computation on one of the small nodes. The "-now n" option tells the scheduler not to abort the request if enough processors are not immediately available; instead, the job waits until enough processors are free to run it.

Useful websites:

SIRAF WIKI


05-20-09 SIRAF Users Meeting

Topics:

Queue submission (I Reiser)

Job monitoring (A Jamieson)

Processes and threads (C Chan)

QUEUE SUBMISSION

Logging in: Log in to the master node.

>> ssh username@siraf-login.bsd.uchicago.edu

Submitting jobs: Do not run jobs on the login node. Instead, use qlogin (interactive submissions) or qsub (batch submissions).

Deleting jobs: >> qdel JOB_ID

Submitting array jobs: Array jobs are used to submit many jobs that share the same shell script but use different input files.
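One common way for each task of an array job to pick its own input file is to read the SGE_TASK_ID environment variable, which Grid Engine sets for every task. The sketch below assumes that approach; the file names and helper names are hypothetical and were not shown at the meeting.

```python
import os

# Hypothetical input files, one per array task.
INPUT_FILES = ["case_01.dat", "case_02.dat", "case_03.dat"]


def current_task_id(environ=os.environ):
    """Return the 1-based array task ID, or 1 when not run as an array task."""
    raw = environ.get("SGE_TASK_ID", "1")
    # Grid Engine reports a non-numeric value for jobs that are not array jobs.
    return int(raw) if raw.isdigit() else 1


def input_for_task(input_files, task_id):
    """Map a 1-based task ID to its input file."""
    if not 1 <= task_id <= len(input_files):
        raise IndexError(f"task {task_id} has no matching input file")
    return input_files[task_id - 1]


if __name__ == "__main__":
    task_id = current_task_id()
    print("task", task_id, "processes", input_for_task(INPUT_FILES, task_id))
```

With three input files, the shell script that runs this program would be submitted with the qsub option "-t 1-3", so that tasks 1, 2, and 3 each process a different file.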

JOB MONITORING

Ways to monitor jobs:

Qmon - a program for monitoring the cluster. The two most heavily used features in Qmon are Queue Control and Job Control, the two buttons at the upper left corner of the panel. These tell you who is running jobs on the cluster and how much memory is being used on each node of the cluster.

PROCESSES AND THREADS

Concurrency - program can be divided into subtasks that can be run independently.

Parallelism - program can be divided into subtasks that can be run simultaneously.

SIRAFUserPages (last edited 2009-10-06 15:33:05 by BeverlyLau)