[GRASS-dev] gmath BLAS/LAPACK wrapper

Hi Brad, Dear devs,
i am currently working on a ATLAS - BLAS/LAPACK implementation
within the gpde library.
The idea is to provide full access to ATLAS BLAS level 1-3
functions and have multi-threaded BLAS 1-3 functions implemented in GRASS
as backup. I have almost finished the BLAS level 1 vector routines and
working on
level 2 the vector - matrix and level 3 matrix - matrix implementation.

The grass BLAS level 1-3 implementation is of course not as fast as
the ATLAS library.
ATLAS is cache optimized and uses recursive functions.
Additionally, ATLAS uses pthreads to provide multi threaded level 3 functions.

But for each specialised ATLAS function, a more general grass functions
will exists. Eg: there are level 3 functions for quadratic, symmetric and hermitian
matrices in BLAS, grass will provide only one matrix-matrix function.
If the user compiles grass without ATLAS support, all the modules
which are using
BLAS functions will still work ... but not that fast. :wink:

I try to keep the interface as simple as i can. The vectors are 1d
float or double arrays
and matrices are 1d float or double arrays with additional row
pointers (using the G_alloc_vector and
G_alloc_matrix functions). I use the same low level implementation
as the ATLAS interface. The user must take care of correct allocation
and row, column counting.

Additionally i will try to create a wrapper to the LAPACK solver
provided in ATLAS.
IMHO the gpde lu decomposition with row pivoting
and the gpde bandwidth optimized cholesky solver are the backup routines
for most of the LAPACK sover.

The krylov space solver (cg and bicgstab) will use the ATLAS BLASS 1-2
functions.

So the gpde interface is much more low level than the gmath LAPACK/BLAS
wrapper. And i think we can use the gpde routines instead of the native
lapack routines in the gmath warpper. But the matrix functions will work
in row major order.
The blas functions are named like the (IMHO horrible)
netlib-blas naming convention, accept that i have put a N_ before the name.
Eg:
N_cblas_ddot, N_cblas_sdot ... .

If you use the gpde interface you will automatically use ATLAS. And if a user
don't compile grass with ATLAS support,
the modules which are using the gmath structures
will still work. :slight_smile:

What do you think? Unfortunately Glynn did not respond yet about this topic.
But i think renaming the gmath interface is a good idea. I had also in mind
to rename it in Gmath_ or G_math_ ... .

Here an ATLAS-gpde Interface example:

/*grass blas level 1 implementation of the dot product*/
double N_blas1_d_x_dot_y(double *x, double *y, int rows)
{
    int i;
    double s = 0.0;
#pragma omp parallel for schedule (static) reduction(+:s)
    for (i = rows - 1; i >= 0; i--) {
  s += x[i] * y[i];
    }
    return s;
}

/*The atlas wrapper with grass backup routine*/
double N_cblas_ddot(double *x, double *y, int rows)
{
#if defined(HAVE_ATLAS)
     return cblas_ddot(rows, x, 1, y, 1);
#else
     return N_blas1_d_x_dot_y(x, y, rows)
#endif
}

Best regards
Soeren

Btw.:
I have updated the allocation routines in the gmath library but did not
committed it to cvs. I think i will have a look on your changes
before i commit.

--
Der GMX SmartSurfer hilft bis zu 70% Ihrer Onlinekosten zu sparen!
Ideal für Modem und ISDN: http://www.gmx.net/de/go/smartsurfer

On Sat, 2007-08-25 at 20:45 +0200, "Sören Gebbert" wrote:

Hi Brad, Dear devs,
i am currently working on a ATLAS - BLAS/LAPACK implementation
within the gpde library.
The idea is to provide full access to ATLAS BLAS level 1-3
functions and have multi-threaded BLAS 1-3 functions implemented in GRASS
as backup. I have almost finished the BLAS level 1 vector routines and
working on
level 2 the vector - matrix and level 3 matrix - matrix implementation.

It took me a little bit to get back on this. I needed to ensure that
ATLAS supports all of the LAPACK functions we need (it only supports a
subset).

We also missed each other (probably by minutes) several times on
IRC. :slight_smile:

The grass BLAS level 1-3 implementation is of course not as fast as
the ATLAS library.
ATLAS is cache optimized and uses recursive functions.
Additionally, ATLAS uses pthreads to provide multi threaded level 3 functions.

Sounds good to me. I'll have to look into replacing autotools
BLAS/LAPACK detection with ATLAS. It shouldn't be terribly difficult
unless we want to support all derivatives of CBLAS/CLAPACK.

I could use a little direction there as to which way to go. Glynn?
Markus?

But for each specialised ATLAS function, a more general grass functions
will exists. Eg: there are level 3 functions for quadratic, symmetric and hermitian
matrices in BLAS, grass will provide only one matrix-matrix function.
If the user compiles grass without ATLAS support, all the modules
which are using
BLAS functions will still work ... but not that fast. :wink:

That works for me. This prevents the current caveat of "install
BLAS/LAPACK or these modules won't compile".

Do all of the lower level functions in lib/gmath, then lib/gpde calls
the gmath function, which automatically determins which version to run
(actually, it'll be determined at compile time).

I try to keep the interface as simple as i can. The vectors are 1d
float or double arrays
and matrices are 1d float or double arrays with additional row
pointers (using the G_alloc_vector and
G_alloc_matrix functions). I use the same low level implementation
as the ATLAS interface. The user must take care of correct allocation
and row, column counting.

Additionally i will try to create a wrapper to the LAPACK solver
provided in ATLAS.
IMHO the gpde lu decomposition with row pivoting
and the gpde bandwidth optimized cholesky solver are the backup routines
for most of the LAPACK sover.

The krylov space solver (cg and bicgstab) will use the ATLAS BLASS 1-2
functions.

So the gpde interface is much more low level than the gmath LAPACK/BLAS
wrapper. And i think we can use the gpde routines instead of the native
lapack routines in the gmath warpper. But the matrix functions will work
in row major order.

If you have lower level functions, then those should be moved to
lib/gmath, IMO.

The blas functions are named like the (IMHO horrible)
netlib-blas naming convention, accept that i have put a N_ before the name.
Eg:
N_cblas_ddot, N_cblas_sdot ... .

I'd rather use G_math_ddot (), G_math_sdot ()...

I think it'll be simpler to determine the function operation at compile
time rather than run time.

--
73, de Brad KB8UYR/6 <rez touchofmadness com>

Hi,

-------- Original-Nachricht --------

Datum: Thu, 30 Aug 2007 03:01:34 +0000
Von: Brad Douglas <rez@touchofmadness.com>
An: "Sören "Gebbert\\"" <soerengebbert@gmx.de>
CC: grass-dev@grass.itc.it, Glynn Clements <glynn@gclements.plus.com>, Markus Neteler <neteler@itc.it>
Betreff: Re: [GRASS-dev] gmath BLAS/LAPACK wrapper

On Sat, 2007-08-25 at 20:45 +0200, "Sören Gebbert" wrote:
> Hi Brad, Dear devs,
> i am currently working on a ATLAS - BLAS/LAPACK implementation
> within the gpde library.
> The idea is to provide full access to ATLAS BLAS level 1-3
> functions and have multi-threaded BLAS 1-3 functions implemented in
GRASS
> as backup. I have almost finished the BLAS level 1 vector routines and
> working on
> level 2 the vector - matrix and level 3 matrix - matrix implementation.

It took me a little bit to get back on this. I needed to ensure that
ATLAS supports all of the LAPACK functions we need (it only supports a
subset).

We also missed each other (probably by minutes) several times on
IRC. :slight_smile:

Ill keep on trying. :slight_smile:

> The grass BLAS level 1-3 implementation is of course not as fast as
> the ATLAS library.
> ATLAS is cache optimized and uses recursive functions.
> Additionally, ATLAS uses pthreads to provide multi threaded level 3
functions.

Sounds good to me. I'll have to look into replacing autotools
BLAS/LAPACK detection with ATLAS. It shouldn't be terribly difficult
unless we want to support all derivatives of CBLAS/CLAPACK.

I could use a little direction there as to which way to go. Glynn?
Markus?

> But for each specialised ATLAS function, a more general grass functions
> will exists. Eg: there are level 3 functions for quadratic, symmetric
and hermitian
> matrices in BLAS, grass will provide only one matrix-matrix function.
> If the user compiles grass without ATLAS support, all the modules
> which are using
> BLAS functions will still work ... but not that fast. :wink:

That works for me. This prevents the current caveat of "install
BLAS/LAPACK or these modules won't compile".

Do all of the lower level functions in lib/gmath, then lib/gpde calls
the gmath function, which automatically determins which version to run
(actually, it'll be determined at compile time).

Thats a good idea, i will place the lower level functions from the gpde library
into the gmath library. This will include the ATLAS wrapper, the grass blas level 1-3
implementation and all the linear equation solver i have implemented.

> I try to keep the interface as simple as i can. The vectors are 1d
> float or double arrays
> and matrices are 1d float or double arrays with additional row
> pointers (using the G_alloc_vector and
> G_alloc_matrix functions). I use the same low level implementation
> as the ATLAS interface. The user must take care of correct allocation
> and row, column counting.
>
> Additionally i will try to create a wrapper to the LAPACK solver
> provided in ATLAS.
> IMHO the gpde lu decomposition with row pivoting
> and the gpde bandwidth optimized cholesky solver are the backup routines
> for most of the LAPACK sover.
>
> The krylov space solver (cg and bicgstab) will use the ATLAS BLASS 1-2
> functions.
>
> So the gpde interface is much more low level than the gmath LAPACK/BLAS
> wrapper. And i think we can use the gpde routines instead of the native
> lapack routines in the gmath warpper. But the matrix functions will work
> in row major order.

If you have lower level functions, then those should be moved to
lib/gmath, IMO.

Agreed.

> The blas functions are named like the (IMHO horrible)
> netlib-blas naming convention, accept that i have put a N_ before the
name.
> Eg:
> N_cblas_ddot, N_cblas_sdot ... .

I'd rather use G_math_ddot (), G_math_sdot ()...

I think it'll be simpler to determine the function operation at compile
time rather than run time.

Ok, i will name the blas 1 - 3 ATLAS function as you suggested.
But how should i name the grass backup blas functions? Currently i dont use the
blas naming convention because its to cryptic.
Because the grass blas implementation is more general, i have named those
functions like: N_blas1_d_xdoty() or N_blas2_d_Ax_by() and so on.
So i will name it G_math_blas1_d_xdoty() ....

What parts of the LAPACK functions do we need in grass currently and
ion the future? Because ATLAS supports only a few of them, this is an important
question i think.

Best regards
soeren

--
73, de Brad KB8UYR/6 <rez touchofmadness com>

--
Ist Ihr Browser Vista-kompatibel? Jetzt die neuesten
Browser-Versionen downloaden: http://www.gmx.net/de/go/browser