[GRASS-dev] Licence problems in gmath LU decomposition

Dear developers,
i just noticed that the code of G_ludcmp() is an exact copy of
the numerical recipes algorithm. :frowning:
We need to replace it.

I have implemented a new simple parallel LU decomposition in the gpde library (since several months in CVS).
But this algorithm lacks the ability of pivoting (it would be very hard to implement this as a parallel version).
So it is not usable within the rst library in grass and maybe others? This is a big problem.

The rst library creates linear equation systems with a zero entry in the diagonal matrix. All the
algorithms implemented in gpde library to solve linear equation systems are not able to
handle this (LU, Gauss, jacobi, SOR) except the bicgstab algorithm.
The bicgstab is a state of the art iterative linear equation solver algorithm for non-symmetric matrices.
But it will fail on very bad conditioned matrices, so it is not the best choice if a direct solve can be used.

I have created a very simple patch for the rst library (regular spline with tension) to use the bicgstab
algorithm form the gpde lib instead of G_ludcmp(). Just for testing.
It seems to work and it is not slower than G_ludcmp(). The patch is attached.
It would be better if the rst lib would create matrices with a better condition and without
zero entries at the diagonal matrix. Then we could use other gpde algorithms than bicgstab.

The main question is, what to do with G_ludcmp()? There is a free implementation of the LU
decomposition algorithm with pivoting in the GSL library, which we can use. Or we use the bicgstab
algorithm in cases where the simple LU algorithm from the gpde library will fail?

Any suggestions are welcome
Best regards
Soeren

(attachments)

rst_patch_bicgstab_solver_01_may_07.diff (2.28 KB)

Soeren -

by default rst does not have 0 on the diagonal - it has the value of the smoothing parameter
which is 0.1 or 0.01 (I would have to check, I just found it is missing in the man page).
This keeps the solution stable.
It runs with zero smoothing too if you have sufficiently high tension,
but it gets unstable for small tension and zero smoothing. That is why the defaults and
overall internal adjustments of the parameters are set to avoid it and the program
gives warnings about the overshoots. There is more here (see e.g. fig. 2)
http://skagit.meas.ncsu.edu/~helena/gmslab/papers/IEEEGRSL2005.pdf

But the G_ludcmp() needs to be replaced - it was discussed long time ago and
I somehow thought that it was done and the function does not include the num. recipes
program.

Thanks for the patch, I planned to look at the solver because of the performance issues,
so this helps, although the strange performance behavior since somewhere GRASS5*
may be more related to recently reported stunning degradation of performance for
rasters between 4.* and 5.* than in lineq. solver. I always though it is slower due to the
floating point introduction and did not realize that it affects integer rasters too,

Helena

(attachments)

LINEQS.c (2.96 KB)

Helena,

Helena Mitasova schrieb:

Soeren -

by default rst does not have 0 on the diagonal - it has the value of the smoothing parameter
which is 0.1 or 0.01 (I would have to check, I just found it is missing in the man page).
This keeps the solution stable.

This is right, but while assembling the matrix a new row and column is added
to the linear equation system. Because i don't know how the rst method works,
i have no clue if this can be avoided.
(The matrix assembling code looks like good old Fortran code converted to C)
Take a look at this sample session with 10 points:

GRASS 6.3.cvs > v.surf.rst --o input=ranvect elev=test segmax=600 tension=100 smooth=10
Percent complete:
Reading lines from vector map ... Reading nodes from vector map ... 100%
  111%
WARNING: strip exists with insufficient data

WARNING: 10 points given for interpolation (after thinning) is less than
          given NPMIN=300

The number of points from vector map is 10
The number of points outside of 2D/3D region 0
The number of points being used is 10
Processing all selected output files
will require 40016 bytes of disk space for temp files
Percent complete:
0.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 * 0.00000 = 0.00000
1.00000 -10.00000 4.44122 5.22566 3.98839 5.46536 4.72084 3.82040 5.43142 4.38633 3.76136 * 0.00000 = 0.00000
1.00000 4.44122 -10.00000 4.48757 1.62831 4.11558 1.52070 2.10155 3.65654 2.48392 3.45935 * 0.00000 = 1.00000
1.00000 5.22566 4.48757 -10.00000 4.32808 3.35973 4.08760 4.81887 4.28695 3.66799 3.93563 * 0.00000 = 2.00000
1.00000 3.98839 1.62831 4.32808 -10.00000 4.27582 2.35813 1.77920 4.10082 1.82269 2.52327 * 0.00000 = 3.00000
1.00000 5.46536 4.11558 3.35973 4.27582 -10.00000 3.48756 4.69876 2.60567 3.71819 4.42648 * 0.00000 = 4.00000
1.00000 4.72084 1.52070 4.08760 2.35813 3.48756 -10.00000 3.14712 3.01723 1.98567 3.48508 * 0.00000 = 5.00000
1.00000 3.82040 2.10155 4.81887 1.77920 4.69876 3.14712 -10.00000 4.41198 3.17120 3.37596 * 0.00000 = 6.00000
1.00000 5.43142 3.65654 4.28695 4.10082 2.60567 3.01723 4.41198 -10.00000 3.78608 4.56449 * 0.00000 = 7.00000
1.00000 4.38633 2.48392 3.66799 1.82269 3.71819 1.98567 3.17120 3.78608 -10.00000 2.30508 * 0.00000 = 8.00000
1.00000 3.76136 3.45935 3.93563 2.52327 4.42648 3.48508 3.37596 4.56449 2.30508 -10.00000 * 0.00000 = 9.00000
Number of unknowns 11
BiCGStab -- iteration 0 error 285
BiCGStab -- iteration 1 error 778.166
BiCGStab -- iteration 2 error 2.90633
BiCGStab -- iteration 3 error 2.27454
BiCGStab -- iteration 4 error 0.369957
BiCGStab -- iteration 5 error 0.000523873
BiCGStab -- iteration 6 error 0.000268323
BiCGStab -- iteration 7 error 5.94182e-08
BiCGStab -- iteration 8 error 1.63458e-12
  100%
history initiated
v.surf.rst complete.

The first entry of the matrix is zero.
IMHO the first column and first row can not be removed. The diagonal entries [i][i] = -10
reflecting the smoothing parameter and will be zero if the smoothing parameter is zero
(exactly as you explained). Luckily the bicgstab method can handle this.

Here an example with smoothing = 0:

GRASS 6.3.cvs > v.surf.rst --o input=ranvect elev=test segmax=600 tension=100 smooth=0
Percent complete:
Reading lines from vector map ... Reading nodes from vector map ... 100%
  111%
WARNING: strip exists with insufficient data

WARNING: 10 points given for interpolation (after thinning) is less than
          given NPMIN=300

The number of points from vector map is 10
The number of points outside of 2D/3D region 0
The number of points being used is 10
Processing all selected output files
will require 40016 bytes of disk space for temp files
Percent complete:
0.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 * 0.00000 = 0.00000
1.00000 -0.00000 4.44122 5.22566 3.98839 5.46536 4.72084 3.82040 5.43142 4.38633 3.76136 * 0.00000 = 0.00000
1.00000 4.44122 -0.00000 4.48757 1.62831 4.11558 1.52070 2.10155 3.65654 2.48392 3.45935 * 0.00000 = 1.00000
1.00000 5.22566 4.48757 -0.00000 4.32808 3.35973 4.08760 4.81887 4.28695 3.66799 3.93563 * 0.00000 = 2.00000
1.00000 3.98839 1.62831 4.32808 -0.00000 4.27582 2.35813 1.77920 4.10082 1.82269 2.52327 * 0.00000 = 3.00000
1.00000 5.46536 4.11558 3.35973 4.27582 -0.00000 3.48756 4.69876 2.60567 3.71819 4.42648 * 0.00000 = 4.00000
1.00000 4.72084 1.52070 4.08760 2.35813 3.48756 -0.00000 3.14712 3.01723 1.98567 3.48508 * 0.00000 = 5.00000
1.00000 3.82040 2.10155 4.81887 1.77920 4.69876 3.14712 -0.00000 4.41198 3.17120 3.37596 * 0.00000 = 6.00000
1.00000 5.43142 3.65654 4.28695 4.10082 2.60567 3.01723 4.41198 -0.00000 3.78608 4.56449 * 0.00000 = 7.00000
1.00000 4.38633 2.48392 3.66799 1.82269 3.71819 1.98567 3.17120 3.78608 -0.00000 2.30508 * 0.00000 = 8.00000
1.00000 3.76136 3.45935 3.93563 2.52327 4.42648 3.48508 3.37596 4.56449 2.30508 -0.00000 * 0.00000 = 9.00000
Number of unknowns 11
BiCGStab -- iteration 0 error 285
BiCGStab -- iteration 1 error 165.856
BiCGStab -- iteration 2 error 6.9055
BiCGStab -- iteration 3 error 1.2445
BiCGStab -- iteration 4 error 0.39798
BiCGStab -- iteration 5 error 0.155552
BiCGStab -- iteration 6 error 0.0760317
BiCGStab -- iteration 7 error 0.00483319
BiCGStab -- iteration 8 error 0.0191541
BiCGStab -- iteration 9 error 1.27935e-05
BiCGStab -- iteration 10 error 3.69912e-08
BiCGStab -- iteration 11 error 1.60303e-10
  100%
history initiated
v.surf.rst complete.

It runs with zero smoothing too if you have sufficiently high tension,
but it gets unstable for small tension and zero smoothing. That is why the defaults and
overall internal adjustments of the parameters are set to avoid it and the program
gives warnings about the overshoots. There is more here (see e.g. fig. 2)
http://skagit.meas.ncsu.edu/~helena/gmslab/papers/IEEEGRSL2005.pdf

But the G_ludcmp() needs to be replaced - it was discussed long time ago and
I somehow thought that it was done and the function does not include the num. recipes
program.

>
> Thanks for the patch, I planned to look at the solver because of the
> performance issues,
> so this helps, although the strange performance behavior since somewhere
> GRASS5*
> may be more related to recently reported stunning degradation of
> performance for
> rasters between 4.* and 5.* than in lineq. solver. I always though it is
> slower due to the
> floating point introduction and did not realize that it affects integer
> rasters too,
>

The strength of the bicgstab algorithm is to solve sparse linear equation systems very fast.
And i guess we will always get fully occupied matrices in case of the rst algorithm (each point
influences each other point).
The speed of the bicgstab solver depends on the initial guess. The better the guess the faster the solution is.
So i'm not sure if the bicgstab is really faster then a LU decomposition, because it depends
on the initial guess and the constitution of the matrix, which is getting
better with higher smoothing factors.

In case of the examples above, the bicgstab will be faster than a LU decomposition if the number of iterations
is much lower than the number of unknowns. Internally two matrix multiplications
and several vector operations are performed for each iteration.
If the smoothing factor is zero, the number of iterations is larger
than the number of unknowns. In this case the LU decomposition would be faster.

A short benchmark with 500 points proved my guess. The G_ludcmp() solver is faster for matrices with bad
condition (smoothing 0 - 7), the bicgstab will be faster with smoothing factors greater than 9.

IMHO a LU decomposition with pivoting would be the best choice in case of bad conditioning.
If we can avoid zeros and really bad matrix condition
(smoothing factor should be always greater than zero), we can use the parallelized LU decomposition of the gpde
library. The bicgstab solver is parallelized too and can be used in case of good conditions.
Maybe we can implement booth algorithms and let the user choose?
The rst computation will definitely benefit from parallel solvers.

Best regards
Soeren

Helena

Helena Mitasova
Dept. of Marine, Earth and Atm. Sciences
1125 Jordan Hall, NCSU Box 8208,
Raleigh NC 27695
http://skagit.meas.ncsu.edu/~helena/

On May 1, 2007, at 3:44 PM, Sören Gebbert wrote:

Dear developers,
i just noticed that the code of is an exact copy of
the numerical recipes algorithm. :frowning:
We need to replace it.

I have implemented a new simple parallel LU decomposition in the gpde library (since several months in CVS).
But this algorithm lacks the ability of pivoting (it would be very hard to implement this as a parallel version).
So it is not usable within the rst library in grass and maybe others? This is a big problem.

The rst library creates linear equation systems with a zero entry in the diagonal matrix. All the
algorithms implemented in gpde library to solve linear equation systems are not able to
handle this (LU, Gauss, jacobi, SOR) except the bicgstab algorithm.
The bicgstab is a state of the art iterative linear equation solver algorithm for non-symmetric matrices.
But it will fail on very bad conditioned matrices, so it is not the best choice if a direct solve can be used.

I have created a very simple patch for the rst library (regular spline with tension) to use the bicgstab
algorithm form the gpde lib instead of G_ludcmp(). Just for testing.
It seems to work and it is not slower than G_ludcmp(). The patch is attached.
It would be better if the rst lib would create matrices with a better condition and without
zero entries at the diagonal matrix. Then we could use other gpde algorithms than bicgstab.

The main question is, what to do with G_ludcmp()? There is a free implementation of the LU
decomposition algorithm with pivoting in the GSL library, which we can use. Or we use the bicgstab
algorithm in cases where the simple LU algorithm from the gpde library will fail?

Any suggestions are welcome
Best regards
Soeren
Index: interp_float/matrix.c

RCS file: /home/grass/grassrepository/grass6/lib/rst/interp_float/matrix.c,v
retrieving revision 2.3
diff -u -u -r2.3 matrix.c
--- interp_float/matrix.c 29 Mar 2006 20:05:08 -0000 2.3
+++ interp_float/matrix.c 1 May 2007 19:41:01 -0000
@@ -154,12 +154,13 @@
         matrix[i][j] = A[m];
       }
     }
-
- if (G_ludcmp(matrix,n_points+1,indx,&d)<=0) { /* find the inverse of the mat
-rix */
+/*
+ if (G_ludcmp(matrix,n_points+1,indx,&d)<=0) {
+
       fprintf(stderr,"G_ludcmp() failed! n=%d\n",n_points);
       return -1;
     }
+ */
/*
     G_free_vector(A);
*/
Index: interp_float/segmen2d.c

RCS file: /home/grass/grassrepository/grass6/lib/rst/interp_float/segmen2d.c,v
retrieving revision 2.5
diff -u -u -r2.5 segmen2d.c
--- interp_float/segmen2d.c 9 Feb 2006 03:08:58 -0000 2.5
+++ interp_float/segmen2d.c 1 May 2007 19:41:02 -0000
@@ -8,6 +8,7 @@
#include <math.h>
#include <grass/gis.h>
#include <grass/glocale.h>
+#include <grass/N_pde.h>

#include <grass/interpf.h>

@@ -263,7 +264,30 @@
         for (i = 0; i < data->n_points; i++)
           b[i+1] = data->points[i].z;
         b[0] = 0.;
- G_lubksb(matrix,data->n_points+1,indx,b);
+
+ /*create the gpde lineare equation system and transfer the data*/
+ N_les * les = N_alloc_les(data->n_points + 1, N_NORMAL_LES);
+ for(i = 0; i < data->n_points + 1; i++) {
+ les->b[i] = b[i];
+ for(j = 0; j < data->n_points + 1; j++) {
+ les->A[i][j] = matrix[i][j];
+ }
+ }
+ /*solve it with the bicgstab algorithm*/
+ N_solver_bicgstab(les, 100000, 0.000000001);
+ double d;
+
+ /* for testing */
+ //G_ludcmp(matrix,data->n_points + 1,indx,&d);
+ //G_lubksb(matrix,data->n_points + 1,indx,b);
+
+ /*transfer the result*/
+ for(i = 0; i < data->n_points + 1; i++)
+ b[i] = les->x[i];
+
+ /*release memory*/
+ N_free_les(les);
+
         params->check_points(params,data,b,ertot,zmin,dnorm,skip_point);
         } else if (segtest == 1)
         {
@@ -273,7 +297,6 @@
                   G_lubksb(matrix,data->n_points,indx,b);
                   params->check_points(params,data,b,ertot,zmin,dnorm,skip_point);
         }
-
         } /*cv loop */

     if(!params->cv)
_______________________________________________
grass-dev mailing list
grass-dev@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass-dev

------------------------------------------------------------------------

_______________________________________________
grass-dev mailing list
grass-dev@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass-dev

I forgot an important information.
Because the rst matrix is symmetric, the N_solver_bicgstab and the N_solver_cg
can be used to solve the linear equation system. The cg (conjugate gradient)
solver is optimized for symmetric matrices and therefore much faster than the bicgstab algorithm (only
one matrix*vector operation per iteration) and parallelized as well.

Best regards
Soeren

Sören Gebbert schrieb:

Helena,

Helena Mitasova schrieb:

Soeren -

by default rst does not have 0 on the diagonal - it has the value of the smoothing parameter
which is 0.1 or 0.01 (I would have to check, I just found it is missing in the man page).
This keeps the solution stable.

This is right, but while assembling the matrix a new row and column is added
to the linear equation system. Because i don't know how the rst method works,
i have no clue if this can be avoided.
(The matrix assembling code looks like good old Fortran code converted to C)
Take a look at this sample session with 10 points:

GRASS 6.3.cvs > v.surf.rst --o input=ranvect elev=test segmax=600 tension=100 smooth=10
Percent complete:
Reading lines from vector map ... Reading nodes from vector map ... 100%
111%
WARNING: strip exists with insufficient data

WARNING: 10 points given for interpolation (after thinning) is less than
         given NPMIN=300

The number of points from vector map is 10
The number of points outside of 2D/3D region 0
The number of points being used is 10
Processing all selected output files
will require 40016 bytes of disk space for temp files
Percent complete:
0.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 * 0.00000 = 0.00000
1.00000 -10.00000 4.44122 5.22566 3.98839 5.46536 4.72084 3.82040 5.43142 4.38633 3.76136 * 0.00000 = 0.00000
1.00000 4.44122 -10.00000 4.48757 1.62831 4.11558 1.52070 2.10155 3.65654 2.48392 3.45935 * 0.00000 = 1.00000
1.00000 5.22566 4.48757 -10.00000 4.32808 3.35973 4.08760 4.81887 4.28695 3.66799 3.93563 * 0.00000 = 2.00000
1.00000 3.98839 1.62831 4.32808 -10.00000 4.27582 2.35813 1.77920 4.10082 1.82269 2.52327 * 0.00000 = 3.00000
1.00000 5.46536 4.11558 3.35973 4.27582 -10.00000 3.48756 4.69876 2.60567 3.71819 4.42648 * 0.00000 = 4.00000
1.00000 4.72084 1.52070 4.08760 2.35813 3.48756 -10.00000 3.14712 3.01723 1.98567 3.48508 * 0.00000 = 5.00000
1.00000 3.82040 2.10155 4.81887 1.77920 4.69876 3.14712 -10.00000 4.41198 3.17120 3.37596 * 0.00000 = 6.00000
1.00000 5.43142 3.65654 4.28695 4.10082 2.60567 3.01723 4.41198 -10.00000 3.78608 4.56449 * 0.00000 = 7.00000
1.00000 4.38633 2.48392 3.66799 1.82269 3.71819 1.98567 3.17120 3.78608 -10.00000 2.30508 * 0.00000 = 8.00000
1.00000 3.76136 3.45935 3.93563 2.52327 4.42648 3.48508 3.37596 4.56449 2.30508 -10.00000 * 0.00000 = 9.00000
Number of unknowns 11
BiCGStab -- iteration 0 error 285
BiCGStab -- iteration 1 error 778.166
BiCGStab -- iteration 2 error 2.90633
BiCGStab -- iteration 3 error 2.27454
BiCGStab -- iteration 4 error 0.369957
BiCGStab -- iteration 5 error 0.000523873
BiCGStab -- iteration 6 error 0.000268323
BiCGStab -- iteration 7 error 5.94182e-08
BiCGStab -- iteration 8 error 1.63458e-12
100%
history initiated
v.surf.rst complete.

The first entry of the matrix is zero.
IMHO the first column and first row can not be removed. The diagonal entries [i][i] = -10
reflecting the smoothing parameter and will be zero if the smoothing parameter is zero
(exactly as you explained). Luckily the bicgstab method can handle this.

Here an example with smoothing = 0:

GRASS 6.3.cvs > v.surf.rst --o input=ranvect elev=test segmax=600 tension=100 smooth=0
Percent complete:
Reading lines from vector map ... Reading nodes from vector map ... 100%
111%
WARNING: strip exists with insufficient data

WARNING: 10 points given for interpolation (after thinning) is less than
         given NPMIN=300

The number of points from vector map is 10
The number of points outside of 2D/3D region 0
The number of points being used is 10
Processing all selected output files
will require 40016 bytes of disk space for temp files
Percent complete:
0.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 * 0.00000 = 0.00000
1.00000 -0.00000 4.44122 5.22566 3.98839 5.46536 4.72084 3.82040 5.43142 4.38633 3.76136 * 0.00000 = 0.00000
1.00000 4.44122 -0.00000 4.48757 1.62831 4.11558 1.52070 2.10155 3.65654 2.48392 3.45935 * 0.00000 = 1.00000
1.00000 5.22566 4.48757 -0.00000 4.32808 3.35973 4.08760 4.81887 4.28695 3.66799 3.93563 * 0.00000 = 2.00000
1.00000 3.98839 1.62831 4.32808 -0.00000 4.27582 2.35813 1.77920 4.10082 1.82269 2.52327 * 0.00000 = 3.00000
1.00000 5.46536 4.11558 3.35973 4.27582 -0.00000 3.48756 4.69876 2.60567 3.71819 4.42648 * 0.00000 = 4.00000
1.00000 4.72084 1.52070 4.08760 2.35813 3.48756 -0.00000 3.14712 3.01723 1.98567 3.48508 * 0.00000 = 5.00000
1.00000 3.82040 2.10155 4.81887 1.77920 4.69876 3.14712 -0.00000 4.41198 3.17120 3.37596 * 0.00000 = 6.00000
1.00000 5.43142 3.65654 4.28695 4.10082 2.60567 3.01723 4.41198 -0.00000 3.78608 4.56449 * 0.00000 = 7.00000
1.00000 4.38633 2.48392 3.66799 1.82269 3.71819 1.98567 3.17120 3.78608 -0.00000 2.30508 * 0.00000 = 8.00000
1.00000 3.76136 3.45935 3.93563 2.52327 4.42648 3.48508 3.37596 4.56449 2.30508 -0.00000 * 0.00000 = 9.00000
Number of unknowns 11
BiCGStab -- iteration 0 error 285
BiCGStab -- iteration 1 error 165.856
BiCGStab -- iteration 2 error 6.9055
BiCGStab -- iteration 3 error 1.2445
BiCGStab -- iteration 4 error 0.39798
BiCGStab -- iteration 5 error 0.155552
BiCGStab -- iteration 6 error 0.0760317
BiCGStab -- iteration 7 error 0.00483319
BiCGStab -- iteration 8 error 0.0191541
BiCGStab -- iteration 9 error 1.27935e-05
BiCGStab -- iteration 10 error 3.69912e-08
BiCGStab -- iteration 11 error 1.60303e-10
100%
history initiated
v.surf.rst complete.

It runs with zero smoothing too if you have sufficiently high tension,
but it gets unstable for small tension and zero smoothing. That is why the defaults and
overall internal adjustments of the parameters are set to avoid it and the program
gives warnings about the overshoots. There is more here (see e.g. fig. 2)
http://skagit.meas.ncsu.edu/~helena/gmslab/papers/IEEEGRSL2005.pdf

But the G_ludcmp() needs to be replaced - it was discussed long time ago and
I somehow thought that it was done and the function does not include the num. recipes
program.

>
> Thanks for the patch, I planned to look at the solver because of the
> performance issues,
> so this helps, although the strange performance behavior since somewhere
> GRASS5*
> may be more related to recently reported stunning degradation of
> performance for
> rasters between 4.* and 5.* than in lineq. solver. I always though it is
> slower due to the
> floating point introduction and did not realize that it affects integer
> rasters too,
>

The strength of the bicgstab algorithm is to solve sparse linear equation systems very fast.
And i guess we will always get fully occupied matrices in case of the rst algorithm (each point
influences each other point).
The speed of the bicgstab solver depends on the initial guess. The better the guess the faster the solution is.
So i'm not sure if the bicgstab is really faster then a LU decomposition, because it depends
on the initial guess and the constitution of the matrix, which is getting
better with higher smoothing factors.

In case of the examples above, the bicgstab will be faster than a LU decomposition if the number of iterations
is much lower than the number of unknowns. Internally two matrix multiplications
and several vector operations are performed for each iteration.
If the smoothing factor is zero, the number of iterations is larger
than the number of unknowns. In this case the LU decomposition would be faster.

A short benchmark with 500 points proved my guess. The G_ludcmp() solver is faster for matrices with bad
condition (smoothing 0 - 7), the bicgstab will be faster with smoothing factors greater than 9.

IMHO a LU decomposition with pivoting would be the best choice in case of bad conditioning.
If we can avoid zeros and really bad matrix condition
(smoothing factor should be always greater than zero), we can use the parallelized LU decomposition of the gpde
library. The bicgstab solver is parallelized too and can be used in case of good conditions.
Maybe we can implement booth algorithms and let the user choose?
The rst computation will definitely benefit from parallel solvers.

Best regards
Soeren

Helena

Helena Mitasova
Dept. of Marine, Earth and Atm. Sciences
1125 Jordan Hall, NCSU Box 8208,
Raleigh NC 27695
http://skagit.meas.ncsu.edu/~helena/

On May 1, 2007, at 3:44 PM, Sören Gebbert wrote:

Dear developers,
i just noticed that the code of is an exact copy of
the numerical recipes algorithm. :frowning:
We need to replace it.

I have implemented a new simple parallel LU decomposition in the gpde library (since several months in CVS).
But this algorithm lacks the ability of pivoting (it would be very hard to implement this as a parallel version).
So it is not usable within the rst library in grass and maybe others? This is a big problem.

The rst library creates linear equation systems with a zero entry in the diagonal matrix. All the
algorithms implemented in gpde library to solve linear equation systems are not able to
handle this (LU, Gauss, jacobi, SOR) except the bicgstab algorithm.
The bicgstab is a state of the art iterative linear equation solver algorithm for non-symmetric matrices.
But it will fail on very bad conditioned matrices, so it is not the best choice if a direct solve can be used.

I have created a very simple patch for the rst library (regular spline with tension) to use the bicgstab
algorithm form the gpde lib instead of G_ludcmp(). Just for testing.
It seems to work and it is not slower than G_ludcmp(). The patch is attached.
It would be better if the rst lib would create matrices with a better condition and without
zero entries at the diagonal matrix. Then we could use other gpde algorithms than bicgstab.

The main question is, what to do with G_ludcmp()? There is a free implementation of the LU
decomposition algorithm with pivoting in the GSL library, which we can use. Or we use the bicgstab
algorithm in cases where the simple LU algorithm from the gpde library will fail?

Any suggestions are welcome
Best regards
Soeren
Index: interp_float/matrix.c

RCS file: /home/grass/grassrepository/grass6/lib/rst/interp_float/matrix.c,v
retrieving revision 2.3
diff -u -u -r2.3 matrix.c
--- interp_float/matrix.c 29 Mar 2006 20:05:08 -0000 2.3
+++ interp_float/matrix.c 1 May 2007 19:41:01 -0000
@@ -154,12 +154,13 @@
         matrix[i][j] = A[m];
       }
     }
-
- if (G_ludcmp(matrix,n_points+1,indx,&d)<=0) { /* find the inverse of the mat
-rix */
+/*
+ if (G_ludcmp(matrix,n_points+1,indx,&d)<=0) {
+
       fprintf(stderr,"G_ludcmp() failed! n=%d\n",n_points);
       return -1;
     }
+ */
/*
     G_free_vector(A);
*/
Index: interp_float/segmen2d.c

RCS file: /home/grass/grassrepository/grass6/lib/rst/interp_float/segmen2d.c,v
retrieving revision 2.5
diff -u -u -r2.5 segmen2d.c
--- interp_float/segmen2d.c 9 Feb 2006 03:08:58 -0000 2.5
+++ interp_float/segmen2d.c 1 May 2007 19:41:02 -0000
@@ -8,6 +8,7 @@
#include <math.h>
#include <grass/gis.h>
#include <grass/glocale.h>
+#include <grass/N_pde.h>

#include <grass/interpf.h>

@@ -263,7 +264,30 @@
         for (i = 0; i < data->n_points; i++)
           b[i+1] = data->points[i].z;
         b[0] = 0.;
- G_lubksb(matrix,data->n_points+1,indx,b);
+
+ /*create the gpde lineare equation system and transfer the data*/
+ N_les * les = N_alloc_les(data->n_points + 1, N_NORMAL_LES);
+ for(i = 0; i < data->n_points + 1; i++) {
+ les->b[i] = b[i];
+ for(j = 0; j < data->n_points + 1; j++) {
+ les->A[i][j] = matrix[i][j];
+ }
+ }
+ /*solve it with the bicgstab algorithm*/
+ N_solver_bicgstab(les, 100000, 0.000000001);
+ double d;
+
+ /* for testing */
+ //G_ludcmp(matrix,data->n_points + 1,indx,&d);
+ //G_lubksb(matrix,data->n_points + 1,indx,b);
+
+ /*transfer the result*/
+ for(i = 0; i < data->n_points + 1; i++)
+ b[i] = les->x[i];
+
+ /*release memory*/
+ N_free_les(les);
+
         params->check_points(params,data,b,ertot,zmin,dnorm,skip_point);
         } else if (segtest == 1)
         {
@@ -273,7 +297,6 @@
                   G_lubksb(matrix,data->n_points,indx,b);
                   params->check_points(params,data,b,ertot,zmin,dnorm,skip_point);
         }
-
         } /*cv loop */

     if(!params->cv)
_______________________________________________
grass-dev mailing list
grass-dev@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass-dev

------------------------------------------------------------------------

_______________________________________________
grass-dev mailing list
grass-dev@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass-dev

_______________________________________________
grass-dev mailing list
grass-dev@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass-dev

I have created a new patch for the rst library to test the gpde solvers
with the rst matrices. I have added a new function to rst/interp_float/segmen2d.c (solve_matrix).
If you want to test the different gpde solvers (only cg and bicgstab will work),
just remove or add a comment within the new function.
The created linear eqution system can be printed to stdout,
just remove the comment before N_print_les(les).

This is a quick and dirty hack and only for testing purpose. :slight_smile:

You have to patch the rst library and v.surf.rst to get it work.

This is the v.surf.rst patch:

Index: Makefile

RCS file: /home/grass/grassrepository/grass6/vector/v.surf.rst/Makefile,v
retrieving revision 1.7
diff -u -r1.7 Makefile
--- Makefile 20 Feb 2004 17:43:33 -0000 1.7
+++ Makefile 2 May 2007 01:16:20 -0000
@@ -5,7 +5,7 @@
  EXTRA_CLEAN_DIRS=doxygenhtml

  LIBES = $(INTERPDATALIB) $(QTREELIB) $(INTERPFLLIB) $(BITMAPLIB) $(LINKMLIB) \
- $(VECTLIB) $(VECTLIB_REAL) $(DBMILIB) $(SITESLIB) $(GISLIB) $(DATETIMELIB) $(GMATHLIB)
+ $(VECTLIB) $(VECTLIB_REAL) $(DBMILIB) $(SITESLIB) $(GISLIB) $(DATETIMELIB) $(GMATHLIB) $(GPDELIB)
  DEPENDENCIES = $(INTERPDATADEP) $(QTREEDEP) $(INTERPFLDEP) $(BITMAPDEP) $(LINKMDEP) \
              $(VECTDEP) $(DBMIDEP) $(GISDEP) $(DATETIMEDEP) $(GMATHDEP)
  EXTRA_INC = $(VECT_INC)

The rst library patch is attached.

Best regards
Soeren

(attachments)

rst_patch_bicgstab_solver_02_may_07.diff (4.03 KB)