[GRASS5] Raster directory structure, comments

Hi

I believe there is an argument against the directory structure
pondered in the "features needed for 6.2" thread.

<mapset>/raster/<mapname>/filetypes

VS

<mapset>/raster/<filetypes>/<mapname>

There does appear to be a limit on the number of subdirectories
a directory supports, whereas there appears to be no realistic limit
on the number of files a directory can contain.

No, I haven't dug into the code to see where it is, but currently GRASS
is limited (by the operating system) to 32000 raster maps per mapset
due to some hard limit on my systems which precludes making a
subdirectory in cell_misc for map 32001. At the same time I can add
(touch) additional files in fcell, cellhd, cell, cats ...

I realize 32000 is not a significant limit for most uses of GRASS
today. As we work with more 3D, volume, animation and modeling, it will
probably become more of an issue.

Anyone willing to test this on their platform? I will gladly compare
versions and such. ls -l of the mapset directory shows the number of
links (32000) in cell_misc where it fails here, and yes, here it is
an NFS mount.
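
For anyone who wants to compare numbers, here is a minimal sketch of the
same check that ls -l does by eye. It is plain Python, not part of GRASS,
and the function name is made up for illustration. It reports a directory's
link count next to the number of subdirectories it actually holds. On
traditional Unix filesystems the link count is the subdirectory count plus
two, but some filesystems report directory link counts differently, so
treat the first number as a hint only.

```python
import os
import tempfile


def subdir_link_count(directory):
    """Return (st_nlink, number of immediate subdirectories) for a directory.

    Hypothetical helper for illustration; not a GRASS function.
    """
    nlink = os.stat(directory).st_nlink
    with os.scandir(directory) as entries:
        # Count real subdirectories only; symlinks to directories add no link.
        subdirs = sum(1 for e in entries if e.is_dir(follow_symlinks=False))
    return nlink, subdirs


if __name__ == "__main__":
    # Demo on a scratch directory; point it at <mapset>/cell_misc instead.
    with tempfile.TemporaryDirectory() as scratch:
        for name in ("map1", "map2", "map3"):
            os.mkdir(os.path.join(scratch, name))
        nlink, subdirs = subdir_link_count(scratch)
        print("links:", nlink, "subdirectories:", subdirs)
```

Running it against cell_misc in a mapset that has hit the limit should show
the subdirectory count stuck just below the link limit.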

So, for what I'm worth :), I'd prefer to sidestep the limit and go
with

<mapset>/raster/<filetypes>/<mapname>

(including the break up of cell_misc)

allowing a pack/unpack tool to add the creature comforts.

Thanx
  Ray

Raymond Burns wrote:

I believe there is an argument against the directory structure
pondered in the "features needed for 6.2" thread.

<mapset>/raster/<mapname>/filetypes

VS

<mapset>/raster/<filetypes>/<mapname>

There does appear to be a limitation on the number of subdirectories
a directory supports, whereas, there appears to be no realistic limit
on the number of files a directory can contain.

No I haven't dug into the code to see where it is but, currently grass
is limited (by the operating system) to 32000 raster maps per mapset
due to some hard limit on my systems

Unix OSes impose a limit on the number of hard links to an inode. Each
subdirectory has a ".." entry which adds an extra hard link to the
parent.

You can determine the limit for a given filesystem using:

  pathconf(path, _PC_LINK_MAX)

where "path" is any directory within that filesystem.
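
As a rough sketch, the same query can be made without writing any C, since
Python exposes the call as os.pathconf() on Unix systems. The wrapper name
max_links is an assumption for illustration only. A result of -1 means the
system reports no definite limit, and None here means the query failed or
is not supported for that path.

```python
import os


def max_links(path):
    """Return pathconf(path, _PC_LINK_MAX), or None if it cannot be queried.

    Hypothetical wrapper for illustration; Unix only.
    """
    try:
        return os.pathconf(path, "PC_LINK_MAX")
    except (OSError, ValueError):
        # OSError: path missing or query failed; ValueError: name unknown here.
        return None


if __name__ == "__main__":
    # Any directory on the filesystem of interest will do, e.g. the mapset.
    print("maximum links for '.':", max_links("."))
```

Since each subdirectory's ".." entry counts as one link to its parent, and
the parent already has two links of its own, the usable number of
subdirectories is roughly the reported value minus two.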

which precludes making a
subdirectory in cell_misc for map 32001. At the same time I can add
(touch) additional files in fcell, cellhd, cell, cats ...

I realize 32000 is not a significant limit for most uses of grass
today. As we work with more 3D, volume, animation and modeling it will
probably become more of an issue.

I don't consider that we need to support tens of thousands of maps in
a single mapset.

Apart from the hard limit on the directory's link count, having that
many entries in a directory will adversely affect performance on
filesystems where directories are stored as unsorted lists (requiring
a linear search to look up a filename). It will have an even worse
effect upon the performance of anything which wants a sorted list of
maps.

Consequently, most people dealing with that many maps will probably
want to split the data into multiple mapsets.

--
Glynn Clements <glynn@gclements.plus.com>

I realize 32000 is not a significant limit for most uses of grass
today. As we work with more 3D, volume, animation and modeling it will
probably become more of an issue.

Between g.mapset's ability to change mapset midstream, g.mapsets, and
map@other_mapset working with every module, is there really any reason
not to break your work up into multiple mapsets by the time you get to
32k maps? Personally, as far as problems on the horizon go, I'm much more
worried about getting 64-bitness right.

Hamish