SpeedIT 2.4 (OpenCL backend)


SpeedIT OpenCL SpeedIT 2.4 Reference Manual Vratis Ltd.

www.vratis.com speed-it.vratis.com

September 2013


Changes from version 2.2
• Description of SpeedIT OpenCL library


Contents

1 INTRODUCTION
  1.1 Licensing
  1.2 Hardware requirements
  1.3 Supported Operating Systems
  1.4 Font conventions

2 GETTING STARTED
  2.1 Obtaining SpeedIT libraries
  2.2 System requirements
  2.3 Installation
  2.4 Using SpeedIT OpenCL libraries in user programs
  2.5 Uninstallation

3 USING THE LIBRARY
  3.1 Code Examples of equation solving using OpenCL
    3.1.1 Creating matrix example
    3.1.2 Sample equation solving without preconditioner
    3.1.3 Adding Jacobi preconditioner to equation solving
    3.1.4 Examples of using CG solver
    3.1.5 Examples of using BiCGStab solver
  3.2 Data Formats
    3.2.1 Vector storage format
    3.2.2 Sparse matrix storage format
  3.3 Preconditioners
    3.3.1 Preconditioner usage

4 Application Programming Interface
  4.1 Data storage
    4.1.1 IO interfaces
    4.1.2 Matrix format
    4.1.3 Vector format
  4.2 Linear Algebra
    4.2.1 Preconditioners
    4.2.2 Conjugate Gradient solver (CG)
    4.2.3 Stabilized Bi-Conjugate Gradient solver (BiCGStab)
  4.3 Error handling

1

INTRODUCTION

SpeedIT OpenCL is a library that provides a set of accelerated solvers for sparse systems of linear equations. Acceleration is achieved by exploiting the computational capabilities of modern Graphics Processing Units (GPUs) that support OpenCL. All computations are performed in single or double floating point precision. The library interface is written in C and is designed to be called from C/C++, Fortran and other high-level languages.

SpeedIT OpenCL is designed with a specific goal in mind: it should be easy to use for a person without knowledge of OpenCL technology. This approach spares beginners a steep learning curve while leaving room for deeper control and optimization by advanced OpenCL users. Memory management is handled inside the library classes, which lets the user focus on the main task and minimizes concerns about the details of memory access.

SpeedIT OpenCL is a low-level library, which helps it maximize performance, portability and compatibility with existing software. It nevertheless provides some helpers, e.g. matrix and vector loaders, that spare the user the steps needed to prepare input buffers with the exact content and size.

1.1

Licensing

SpeedIT OpenCL is available for academic, government and commercial institutions. For more information see our web page http://speed-it.vratis.com. SpeedIT OpenCL uses ViennaCL, which is distributed under the MIT license.

1.2

Hardware requirements

To fully utilize the processing power of the SpeedIT OpenCL library, a GPU with OpenCL drivers installed is required.

1.3

Supported Operating Systems

SpeedIT OpenCL libraries require a hardware platform with an OpenCL-enabled GPU. The SpeedIT OpenCL library was built and tested on the following systems:
• Ubuntu 10.0 Desktop 64-bit
• Ubuntu 12.04 Desktop 64-bit
• Debian 7.1 64-bit

1.4

Font conventions

In this document the following notations are used:
• Courier New - fragments of source code, variable names etc.
• A - matrices
• $\vec{X}$ - vectors


2

GETTING STARTED

2.1

Obtaining SpeedIT libraries

SpeedIT OpenCL libraries can be obtained from our sales team after registration at http://speed-it.vratis.com. Before downloading, please inform us about your system configuration so that we can choose the proper library version. If your operating system is not listed in section 1.3 you can still try SpeedIT OpenCL; however, the library was tested only on the systems listed there.

2.2

System requirements

The SpeedIT OpenCL library requires a hardware platform with an OpenCL-enabled GPU and an installed OpenCL driver. After installing OpenCL on Linux, please remember to update your PATH and LD_LIBRARY_PATH environment variables (and/or the ldconfig configuration) as described in the OpenCL documentation.

2.3

Installation

The SpeedIT OpenCL library is distributed as a dynamically linked library and header files. To install the binary you only need to copy the library file into a directory that is accessible by the operating system. On Linux systems this is usually one of the directories /lib, /usr/lib or /usr/local/lib, or a directory listed in the environment variable LD_LIBRARY_PATH. You may also manually add the path to the SpeedIT OpenCL library to the LD_LIBRARY_PATH environment variable.

2.4

Using SpeedIT OpenCL libraries in user programs

To use SpeedIT OpenCL libraries in a user program on Linux systems two steps are required. First, the path to the directory containing the library header files must be passed to the compiler, for example with the gcc option -I/path_to_SpeedIT_OpenCL_header_files. Second, the binary library file must be linked with your program by passing the options -L/path_to_SpeedIT_OpenCL_library -lSpeeditOpenCL to the linker (usually gcc). Remember that the SpeedIT OpenCL library also requires the mpi and mpi_cxx libraries. An example program using SpeedIT OpenCL, together with the Linux compilation command, is provided with the library.

2.5

Uninstallation

To uninstall SpeedIT OpenCL libraries it is enough to remove the binary library file and the header files from your system.



3

USING THE LIBRARY

3.1

Code Examples of equation solving using OpenCL

3.1.1

Creating matrix example

// Create sparse matrix in CSR format
//
// | 0 1 0 2 |
// | 0 0 3 0 |
// | 4 5 6 0 |
// | 0 0 7 8 |
double vals[] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0};
int col_idx[] = {1, 3, 2, 0, 1, 2, 2, 3};
int row_idx[] = {0, 2, 3, 6, 8};
int rows = 4;
int cols = 4;
int nnz = 8;
sicl::Crs_matrix<double> crs_matrix(vals, col_idx, row_idx, cols, rows, nnz);
crs_matrix.print();

3.1.2

Sample equation solving without preconditioner

#include "speedit_cl.h"
#include "sicl_matrix.h"
#include "sicl_vector.h"
#include "sicl_memory.h"
#include "sicl_preconditioners.h"
#include "sicl_cg.h"
#include "sicl_io.h"
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
#include <string>

int main(int argc, char** argv)
{
    // library initialization
    sicl::init(-1, sicl::VIENNA_CL, sicl::ERROR);

    double* vals = NULL;
    int* row_idx = NULL;
    int* col_idx = NULL;
    int nnz = 0;
    int cols = 0;
    int rows = 0;
    sicl::linalg::solvers::cg_conv_info tag;

    // matrix declaration, allocation and loading
    std::string matrix_file = "/path/to/matrix/file.mtx";
    sicl::io::load_matrix(matrix_file, nnz, rows, cols, &vals, &row_idx, &col_idx);
    sicl::Crs_matrix<double> crs_matrix(vals, col_idx, row_idx, cols, rows, nnz);
    crs_matrix.print();

    // vector declaration, allocation and loading
    double* vec_data = NULL;
    int vec_size = 0;
    std::string vector_file = "/path/to/vector/file.txt";
    sicl::io::load_vector_plain(vector_file, vec_size, &vec_data);
    sicl::Vector<double> rhs(vec_size, vec_data);
    std::cout << "RHS: "; rhs.print();
    sicl::Vector<double> x(vec_size);
    x.clear();
    std::cout << "X: "; x.print();

    // test with no preconditioner
    sicl::linalg::solvers::cg<double>::solve(x, crs_matrix, rhs, tag);
    std::cout << "X solved: "; x.print();
    x.clear();
    // end test with no preconditioner

    return 0;
}

The first line of the program initializes the library. After the equation is defined (matrix, right-hand side and solution vector), the solver has to be invoked.

3.1.3

Adding Jacobi preconditioner to equation solving

// test with preconditioner
// create jacobi preconditioner
sicl::linalg::preconditioners::jacobi<double> jacobi_preconditioner(crs_matrix);
// jacobi can be exchanged for another suitable
// preconditioner from sicl::linalg::preconditioners namespace
sicl::linalg::solvers::cg<double>::solve(x, crs_matrix, rhs, tag, jacobi_preconditioner);
std::cout << "X jacobi solved: "; x.print();
x.clear();

Only one line has to be added to define the preconditioner, which is then passed to the solver invocation.

3.1.4

Examples of using CG solver

// test with no preconditioner
sicl::linalg::solvers::cg<double>::solve(x, crs_matrix, rhs, tag);
// preconditioner from sicl::linalg::preconditioners namespace
sicl::linalg::solvers::cg<double>::solve(x, crs_matrix, rhs, tag, jacobi_preconditioner);

3.1.5

Examples of using BiCGStab solver

// test with no preconditioner
sicl::linalg::solvers::bicgstab<double>::solve(x, crs_matrix, rhs, tag);
// preconditioner from sicl::linalg::preconditioners namespace
sicl::linalg::solvers::bicgstab<double>::solve(x, crs_matrix, rhs, tag, jacobi_preconditioner);

3.2

Data Formats

SpeedIT OpenCL uses two important data types: dense vectors and sparse matrices. Dense vectors are expressed in the SpeedIT OpenCL Vector type (see Sec. 3.2.1). Sparse matrices are expressed in compressed sparse row (CSR) format (see Sec. 3.2.2). Dense vectors and sparse matrices may contain data of one of the following types:
• floating point single precision numbers - float
• floating point double precision numbers - double

3.2.1

Vector storage format

A vector can be supplied to SpeedIT OpenCL functions via a pointer to a Vector object, which can be created by loading an mtx file or a text file, or from one integer and one C array:
• size (integer) - the number of vector elements.
• scalar (pointer) - an array of elements in float or double format.

Example

Let vector A be defined as

$A = \begin{bmatrix} 0 & 1 & 0 & 2 \end{bmatrix}$

Then its data representation is:
size = 4
scalar = [0 1 0 2]


3.2.2

Sparse matrix storage format

A sparse matrix is represented in the well-known compressed sparse row (CSR) format. Matrices in CSR can be supplied to SpeedIT OpenCL functions via a pointer to a Crs_matrix object, which can be created by loading an mtx file or by defining three C arrays and three integers:
• vals (pointer) - an array holding all nonzero matrix values in row-major order. The number of elements in this array is equal to the number of nonzero matrix elements.
• col_idx (pointer) - an integer array of column indices of the corresponding elements in array vals.
• row_idx (pointer) - an integer array of indices into vals pointing to the first nonzero element of each consecutive matrix row (the j-th element of the array stores the index of the first nonzero element in the j-th matrix row for j = 0, ..., nrows − 1). The number of elements in row_idx is equal to the number of rows + 1. The value of the last element in the array is equal to the number of nonzero elements in the matrix. Equivalently, the difference row_idx[j+1] − row_idx[j] gives the number of nonzero elements in the j-th row.
• rows (integer) - the number of matrix rows.
• cols (integer) - the number of matrix columns.
• nnz (integer) - the number of nonzero values.
All indices are zero-based (the index of the first matrix row or column is 0).

Example

Let sparse matrix A be defined as

$$A = \begin{bmatrix} 0 & 1 & 0 & 2 \\ 0 & 0 & 3 & 0 \\ 4 & 5 & 6 & 0 \\ 0 & 0 & 7 & 8 \end{bmatrix}$$

Then its CSR representation is:
vals = [1 2 3 4 5 6 7 8]
col_idx = [1 3 2 0 1 2 2 3]
row_idx = [0 2 3 6 8]
rows = 4
cols = 4
nnz = 8



3.3

Preconditioners

SpeedIT OpenCL offers a set of preconditioners which improve the rate of convergence of iterative solvers. The following preconditioners are available:
• Jacobi - Jacobi (or diagonal) preconditioner
• Rowscaling - Row scaling preconditioner
• ILU0 - Incomplete LU factorization with zero fill-in

3.3.1

Preconditioner usage

For each type of preconditioner SpeedIT OpenCL offers a set of functions to create the preconditioner structure in memory, i.e.:

Listing 1: BiCGStab solver without and with preconditioner

// solver without preconditioner
sicl::linalg::solvers::bicgstab<double>::solve(x, crs_matrix, rhs, tag);
// solver with preconditioner
sicl::linalg::preconditioners::jacobi<double> jacobi_preconditioner(crs_matrix);
sicl::linalg::solvers::bicgstab<double>::solve(x, crs_matrix, rhs, tag, jacobi_preconditioner);



4

Application Programming Interface

4.1

Data storage

SpeedIT OpenCL has two types of data storage: matrix and vector. Data can be loaded from a file or defined in source code.

4.1.1

IO interfaces

Loading matrix from mtx file

sicl::io::load_matrix(filename, nnz, n_rows, n_cols, values, r_idx, c_idx);

Parameters
• filename - std::string
• nnz - integer. Returns number of nonzero values
• n_rows - integer. Returns number of rows
• n_cols - integer. Returns number of columns
• values - scalar*. Returns pointer to array of values
• r_idx - integer*. Returns pointer to array of row offsets
• c_idx - integer*. Returns pointer to array of column indices

Load vector from mtx file

sicl::io::load_vector(filename, n_element, data);

Parameters
• filename - std::string
• n_element - integer. Returns number of values
• data - scalar*. Returns pointer to array of values

Load vector from text file

sicl::io::load_vector_plain(filename, n_element, data);

Parameters
• filename - std::string
• n_element - integer. Returns number of values
• data - scalar*. Returns pointer to array of values

As in the example in Sec. 3.1.2, the array arguments are passed by address (e.g. &vals, &vec_data), so that the loader can return the arrays it allocates.


4.1.2

Matrix format

A matrix in SpeedIT OpenCL is contained in an object of the sicl::Crs_matrix class. It has two constructors:

Crs_matrix(scalar* values, int* col_idx, int* row_offset, int cols, int rows, int nnz);
Crs_matrix(const Crs_matrix & matrix);

and the following functions:

void copy(scalar* values, int* col_idx, int* row_offset, int cols, int rows, int nnz);
int getNumberOfColumns() const;
int getNumberOfRows() const;
int getNumberOfNz() const;
void setNumberOfColumns(int cols);
void setNumberOfRows(int rows);
void setNumberOfNz(int nnz);
bool isOwner();
void print();

Parameters are described in 3.2.2.

4.1.3

Vector format

A vector in SpeedIT OpenCL is contained in an object of the sicl::Vector class.

Constructor:

Vector(unsigned int size, scalar* data);

Functions:

void copy(unsigned int size, scalar* data);
scalar* getData();
void print(unsigned int elements = 10);

Parameters are described in 3.2.1.

4.2

Linear Algebra

4.2.1

Preconditioners

SpeedIT OpenCL provides the following classes containing preconditioners:
• Jacobi - sicl::linalg::preconditioners::jacobi
• Row scaling - sicl::linalg::preconditioners::row_scaling
• ILU0 - sicl::linalg::preconditioners::ilu0

All preconditioners have the same interface (with the minor exception of the row scaling preconditioner): they expect a pointer to a CRS matrix as the first argument, e.g.:

sicl::linalg::preconditioners::jacobi<double> jacobi_preconditioner(crs_matrix);

The row scaling preconditioner takes two arguments: the first is a pointer to the matrix, as in the previous example, and the second is the norm argument (0 for infinity), e.g.:

// row scaling, l^1 norm
sicl::linalg::preconditioners::row_scaling<double> rowscl1(crs_matrix, 1);
// row scaling, l^infinity norm
sicl::linalg::preconditioners::row_scaling<double> rowscl0(crs_matrix, 0);

4.2.2

Conjugate Gradient solver (CG)

The CG solver provides the following solving functions:

// non-preconditioned
sicl::linalg::solvers::cg<double>::solve(x, crsmatrix, rhs, tag);
// preconditioned
sicl::linalg::solvers::cg<double>::solve(x, crsmatrix, rhs, tag, precond);

Description

The functions solve a linear system of equations defined as

$A \vec{x} = \vec{b}$

where $\vec{x}$ and $\vec{b}$ are dense vectors and A is a sparse matrix represented in CSR format with zero-based indexing.

Function parameters
• x - pointer to sicl::Vector. Stores the solution vector. On input the vector must be allocated. On output the vector is filled with the calculated solution.
• crsmatrix - pointer to sicl::Crs_matrix (matrix A from the equation). On input the matrix must be allocated and contain data.
• rhs - pointer to sicl::Vector. Stores the data of vector b from the equation. On input the vector must be allocated and contain data.
• tag - pointer to a sicl::linalg::solvers::cg_conv_info object, which stores information about the total error, tolerances and number of iterations made while solving.
• precond - pointer to a preconditioner object. The following preconditioners are available for this solver:
  – Jacobi
  – Rowscaling
  – ILU0

Return value

On success the function returns SICL_SUCCESS. In case of errors the function may return another value.

4.2.3

Stabilized Bi-Conjugate Gradient solver (BiCGStab)

sicl::linalg::solvers::bicgstab<double>::solve(x, crsmatrix, rhs, tag);
sicl::linalg::solvers::bicgstab<double>::solve(x, crsmatrix, rhs, tag, precond);

Description

The functions solve a linear system of equations defined as

$A \vec{x} = \vec{b}$

where $\vec{x}$ and $\vec{b}$ are dense vectors and A is a sparse matrix represented in CSR format with zero-based indexing.

Function parameters
• x - pointer to sicl::Vector. Stores the solution vector. On input the vector must be allocated. On output the vector is filled with the calculated solution.
• crsmatrix - pointer to sicl::Crs_matrix (matrix A from the equation). On input the matrix must be allocated and contain data.
• rhs - pointer to sicl::Vector. Stores the data of vector b from the equation. On input the vector must be allocated and contain data.
• tag - pointer to a sicl::linalg::solvers::bicgstab_conv_info object, which stores information about the total error, tolerances and number of iterations made while solving.
• precond - pointer to a preconditioner object. The following preconditioners are available for this solver:
  – Jacobi
  – Rowscaling
  – ILU0

Return value

On success the function returns SICL_SUCCESS. In case of errors the function may return another value.


4.3

Error handling


Error handling in the SpeedIT OpenCL library is based on exceptions. The developer can use the following macro definition to raise an exception:

THROW_EXCEPTION(data)

The data argument should contain a string describing the nature of the problem which triggered the exception. Before raising the exception, this macro writes the following string to the standard output:

SICL: [data] in function [FUNC] at [FILE]: [LINE]

where FUNC, FILE and LINE represent the function name, file path and line number, respectively.
