[GRASS5] v.extract problems

Glynn,

I think the change you made is a good one, but I found another problem
that might not be related. I ran the updated code against a vector file
with more than 10,000 categories and tried to extract subsets ranging
from 11 to more than 8,000 categories and consistently got seg faults.
The problem always happens at line 272 in main.c.

The problem might be related to skipped values in the category list, but
I can't confirm that. It's hard enough putting together a file with
several thousand categories. Doing it with several thousand categories
perfectly in sequence might be a bit much to expect :-)

Roger

rgrmill@rt66.com wrote:

> I think the change you made is a good one, but I found another problem
> that might not be related. I ran the updated code against a vector file
> with more than 10,000 categories and tried to extract subsets ranging
> from 11 to more than 8,000 categories and consistently got seg faults.
> The problem always happens at line 272 in main.c.

Well, I changed the loop logic from:

     cat_count = 0;
     while(cat_array[cat_count])
       {
         G_set_cat( cat_array[cat_count], G_get_cat(cat_array[cat_count],
                &cats), &temp_cats );
         cat_count++;
       }

to:

     for (i = 0; i < cat_count; i++)
         G_set_cat(cat_array[i], G_get_cat(cat_array[i], &cats), &temp_cats);

AFAICT, every element of cat_array between 0 and cat_count-1 is
guaranteed to have been assigned by a prior call to add_cat(), so if
any of the values are bogus, the problem is in the initialisation.

I can't see where, in the original code, an element would be
explicitly set to zero. The array is initialised to zeroes by virtue
of being static data, so the first unused element would be zero. So
looping (i = 0; i < cat_count; ...) should have the same effect unless
an explicit zero was added with add_cat(). That would terminate the
list prematurely, which doesn't make much sense to me.
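
To illustrate the difference between the two loops, here is a minimal
sketch in C. It is not the v.extract source: count_until_zero() is a
hypothetical helper that mimics the original zero-terminated while loop,
with a capacity bound added for safety (the original had none).

```c
#include <stddef.h>

/* Hypothetical helper: how many elements the zero-terminated loop
 * would visit. It stops at the first 0, or at capacity (the bound is
 * an addition for this sketch; the original loop had no bound). */
size_t count_until_zero(const int *cat_array, size_t capacity)
{
    size_t n = 0;

    while (n < capacity && cat_array[n] != 0)
        n++;
    return n;
}
```

With {1,2,3,0,0} and cat_count == 3, both loops visit three elements.
With {1,2,0,4,0} and cat_count == 4, the counted loop visits four, but
the zero-terminated loop stops after two.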

Also, I've just noticed what appears to be another problem with the
original code. In the fourth case (file of category numbers), each
supplied range overwrites the beginning of the array, rather than
being appended to it:

         while (1)
           {
           if (!fgets (buffr, 39, in)) break;
           sscanf (buffr, "%[a-zA-Z., -_/$%@!#0-9]", text);
       /*sscanf (buffr, "%s", text); */
    /*scan %s stops at whitespace?*/
           scan_cats (text, &x, &y);
--> cat_index = 0;
           while (x <= y)
              {
              cat_array[cat_index] = x++; cat_index++;
              }
           }
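
One way to append rather than overwrite is to keep a single running
count across all ranges instead of resetting the index for each line.
The following is only a sketch under that assumption; append_range()
and its bounds check are hypothetical and not taken from main.c.

```c
#include <stddef.h>

/* Hypothetical helper: append the inclusive range [lo, hi] to cats[],
 * starting at index *count. Assumes *count <= capacity on entry.
 * Returns 0 on success, or -1 if the range would not fit, in which
 * case nothing is written. */
int append_range(int *cats, size_t capacity, size_t *count, int lo, int hi)
{
    size_t needed;

    if (lo > hi)
        return 0;                       /* empty range: nothing to add */
    needed = (size_t)hi - (size_t)lo + 1;
    if (needed > capacity - *count)
        return -1;
    for (size_t i = 0; i < needed; i++)
        cats[(*count)++] = lo + (int)i; /* index keeps advancing */
    return 0;
}
```

Calling it for 1-3 and then 5-7 with the same count variable leaves six
entries in the array; resetting the count to 0 between the calls would
leave only the second range.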

> The problem might be related to skipped values in the category list, but
> I can't confirm that.

Note that there is a double indirection here. The elements of
cat_array are the categories; the indices into cat_array *aren't*
themselves categories. E.g. if you used:

  v.extract list=1-3,5-7 ...

then cat_array would be {1,2,3,5,6,7} and cat_count would be 6.
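
As a rough illustration of that expansion, the sketch below turns a
list string into an array of categories. expand_cat_list() is a
hypothetical function written for this example; it is not how v.extract
parses its list= option.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: expand a list such as "1-3,5-7" into out[].
 * Returns the number of categories written, or -1 on a syntax error,
 * a reversed range, or if out[] is too small. */
long expand_cat_list(const char *list, int *out, size_t capacity)
{
    size_t count = 0;
    const char *p = list;

    while (*p != '\0') {
        char *end;
        long lo = strtol(p, &end, 10);
        long hi = lo;

        if (end == p)
            return -1;              /* no number where one was expected */
        p = end;
        if (*p == '-') {            /* "lo-hi" form */
            p++;
            hi = strtol(p, &end, 10);
            if (end == p)
                return -1;
            p = end;
        }
        if (hi < lo || lo < INT_MIN || hi > INT_MAX)
            return -1;
        for (long x = lo; ; x++) {
            if (count == capacity)
                return -1;
            out[count++] = (int)x;  /* the value is the category */
            if (x == hi)
                break;
        }
        if (*p == ',')
            p++;
        else if (*p != '\0')
            return -1;
    }
    return (long)count;
}
```

For "1-3,5-7" this writes six values, and the element at index 3 is
category 5, so an index must never be treated as a category number.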

In any case, I'll have another look at it in the morning.

--
Glynn Clements <glynn.clements@virgin.net>

rgrmill@rt66.com wrote:

> I think the change you made is a good one, but I found another problem
> that might not be related. I ran the updated code against a vector file
> with more than 10,000 categories and tried to extract subsets ranging
> from 11 to more than 8,000 categories and consistently got seg faults.
> The problem always happens at line 272 in main.c.

I can't immediately see anything which would cause this.

Which options were used to specify the categories? I.e. were either of
"-n" and/or "file=" used?

--
Glynn Clements <glynn.clements@virgin.net>

[Roger: mail to rgrmill@rt66.com is bouncing.]

Roger Miller wrote:

> > Which options were used to specify the categories? I.e. were either of
> > "-n" and/or "file=" used?
>
> As far as I got in gdb, the fault at line 272 was caused because "&cats" in
> the arguments for G_get_cat was zero.

That can't happen; "cats" is a local variable, so "&cats" will always
be an address within the stack. I can only assume that gdb is "lying";
try compiling without optimisation (e.g. "CFLAGS= ./configure ...", or
edit the head.<platform> file).

> I didn't use -n. I got the same result using both file= and list=.

OK, that narrows down the possibilities.

> In my most recent test using the current code from CVS the full command line
> was:
>
> v.extract i=HidalgoCoTx o=test new=0 type=line list=39-50
>
> Category 39 is the lowest category. Category 42 is missing. If I use
> list=39-41 v.extract runs without errors. If I use list=39-43 v.extract gets
> a seg fault.

AFAICT, the most likely reason is that G_get_cat() is returning NULL,
and G_set_cat() is calling G_store(NULL), which segfaults.

I'll add a check for NULL labels.
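
The kind of guard meant here might look roughly like the sketch below.
The names are hypothetical stand-ins, not the GRASS functions:
lookup_label() plays the part of a lookup that returns NULL for a
missing category, and copy_label() plays the part of a string copy that
would otherwise dereference NULL. Substituting an empty label is an
assumption of this sketch; skipping the category would be another
option.

```c
#include <stdlib.h>
#include <string.h>

struct label {
    int cat;
    const char *text;
};

/* Hypothetical lookup: returns the label for cat, or NULL when the
 * category has no entry (e.g. a gap such as 42 in 39-43). */
const char *lookup_label(const struct label *table, size_t n, int cat)
{
    for (size_t i = 0; i < n; i++)
        if (table[i].cat == cat)
            return table[i].text;
    return NULL;
}

/* Hypothetical copy: treats NULL as "" so strlen() never sees NULL.
 * Returns NULL only if malloc() fails; the caller frees the result. */
char *copy_label(const char *text)
{
    const char *src = text ? text : "";
    char *dup = malloc(strlen(src) + 1);

    if (dup)
        strcpy(dup, src);
    return dup;
}
```

Without the NULL test in copy_label(), passing the result of a failed
lookup straight through would call strlen(NULL), which is undefined
behaviour and typically a segfault.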

> When it does run, I get a correct-looking dig_cats file, an empty dig_att
> file and a dig_ascii file with nothing but a header. The lack of results may
> be a completely different problem.

The "dig_att" file and the actual "dig" file are written by
xtract_{line,area}, which haven't been changed. AFAICT, v.extract
won't write a "dig_ascii" file.

--
Glynn Clements <glynn.clements@virgin.net>