Jody,
Note that the current javadoc of org.geoserver.platform.resource.Files is highly problematic.
From the original javadoc we get:
"Utility class for handling Resource paths in a consistent fashion.
This utility class is primarily aimed at implementations of ResourceStore …"
This utility class was indeed intended for resource paths and the resource store, not file paths and the file system.
"The base location is represented with “”, * relative paths are not supported.
The sentence ‘relative paths are not supported’ was meant with respect to the resource store. The meaning here is that all paths in the resource store are absolute, with respect to the REsource Store root, not the file system root!
And then comes the new sentence:
" Absolute paths are supported, with Linux systems using a leading {@code /}, and windows using * {@code L:}."
This sentence contradicts everything before it. If this is a resourcestore utility class,
why are we suddenly talking about absolute paths in the operating system? The resource store has nothing to do with this.
There never should have been anything inside this class
testing on operating system. There should never have been the method called “isAbsolute”. This should have been put inside another class that is used for file paths.
This will incredibly confuse new developers. They will not at all understand the purpose of the
resource store as an abstraction that should not be assumed to be part of the file system.
Instead of clarifying things for developers, make sure they understand how to use it and not abuse
it, it seems we are giving up on its intended purpose completely and just making the confusion
worse by writing nonsensical javadoc about it. Unless this is rolled back, we may as well
delete the ‘jdbcstore’ module and give up entirely on the resource API.
If that is the chose direction, fine, it is not up to me, but then be clear about it.
Kind Regards
Niels
···
On 03/08/2023 08:46, Niels Charlier via Geoserver-devel wrote:
Jody,
Perhaps I can offer a suggestion that may be considered a reasonable compromise.
The “resource:” URLs are not used because they were never documented in user docs. (The javadoc however clearly states that it always was the intention to be used for access to the ‘data directory’ via URLs.) I can see that one of the issues you have been working to resolve is the usage of URLs without a prefix, which are just naked paths and where there is ambiguity in interpretation.
The resourcestore is low level API only accessible to developers, not users. Users will always access it via some layer in between, like rest or something that calls Resources.fromURL. When a developer call resourcestore.get he knows he is not necessarily using the file system, and the paths should not be considered file system paths. With Resources.fromURL this is not so clear.
My suggestion is that we bring back / leave the resource store API to how it was, being lenient on the leading slash because there is no possible confusion there, and it avoids a huge amount of work. We may however agree (and document this) that the Resources.fromURL, as well as REST or other services accessible to users use the following guidelines (and we document this for both developers as well as users):
- file: URLS will use the file system
- resource: URLs will use the resource store
- naked path without URL prefix: absolute paths are interpreted as file: URLs, relative paths are interpreted as resource: URLs.
This way the issue is addressed at the level where it actually occurs, and OS/filesystem stuff does not creep into the resource store API.
Kind Regards
Niels
On 03/08/2023 01:49, Jody Garnett wrote:
Niels:
Comments inline
–
Jody Garnett
On Aug 2, 2023 at 12:20:43 PM, Niels Charlier <niels@anonymised.com> wrote:
Hi Jody,
I do support checking everything is handled correctly.
Thanks Niels, sorry I am a bit frustrated these days so am probably not explaining clearly.
I just don’t seem the point in breaking huge amounts of code for nothing. And that is what will happen if we start throwing exceptions in the resource store if the path starts with a “/”. Which is actually normal, because the resource store has a root.
ResourceStore no longer uses /
as a root.
The work done earlier this year removed the use leading /
from the codebase (including the REST API Controller).
Please: define “absolute path”. Is this OS dependent? Should the behaviour of the resource store be OS dependent? What /exactly/
I put the definitions into the document I shared and the javadocs (not sure why I cannot find the javadocs online?).
When I say “path” I mean the strings used by the class Paths.isAbsolute(String)
This is covered in the developers guide here: https://docs.geoserver.org/latest/en/developer/programming-guide/config/resource.html#paths
Paths are broken down into a sequence of names, as listed by Paths.names(path)
:
-
Path.names("data/tasmania/roads.shp")
is represented as a list of data
, tasmania
, roads.shp
.
-
On linux Path.names("/src/gis/cadaster/district.geopkg")
starts with a marker to indicate an absolute path, resulting in /
, src
, gis
, cadaster
, district.geopkg
.
-
On windows Path.names("D:/gis/cadaster/district.geopkg")
starts with a marker to indicate an absolute path, resulting in D:/
, gis
, cadaster
, district.geopkg
.
are you suggesting should be allowed or not allowed to access a resource from the resource store?
I am suggesting paths that return true for this method be not allowed to access resources from the resource store.
Yes, “paths” are used across the application. But “path” is a vague term, ‘xpaths’ are paths too, they have no relationship with file paths. Resource paths are no different, they live in their own world and are not part of the file system. The notion of a path being ‘absolute’ or ‘relative’ has no meaning when you are talking about paths in general, this is file system terminology.
Earlier in this email thread I shared the developer guide documentation which was written to define what Paths mean for GeoServer.
When someone gets a resource directly from the resource store, they are using the resource store as a black box, and the “path” is always relative to the root of the resource store.
Correct, relative paths are with respect to the resource store location. Actually when I did he QA I found that there was at one point a search path that could be scanned to allow “data directory” to be spread across several folders (terrifying).
Resource.fromURL is a different story, these are not hard-coded URL’s, they are user provided, and then we allow both the file system as the resource store to be accessed, where file: is for the file system and resource: is for the resource store.
I have not seen users use “resource:”, to my knowledge that was only used internally in the logic between in geoserver and geotools ( applied by a style visitor to help geotools resolve icons in the data directory.)
See the examples Resource.fromURL(base, url):
-
Resources.fromURL( null, "/src/gis/cadaster/district.geopgk")
- absolute file path (linux)
-
Resources.fromURL( baseDirectory, "D:\\gis\\cadaster\\district.geopkg")
- absolute file path (windows)
-
Resources.fromURL( baseDirectory, "file:///D:/gis/cadaster/district.geopkg")
- absolute file url (windows)
-
Resources.fromURL( baseDirectory, "ftp://veftp.gsfc.nasa.gov/bluemarble/")
- null (external reference)
See the examples Paths.convert(base, filename)
-
Paths.convert(base,file)
- uses URI relativize method to determine relative path (between file and base)
-
Paths.convert(base,folder, fileLocation)
- can resolve relative location, limited to content within the base directory
-
Paths.convert(base, filename)
See examples Files.fromURL( base, url)
-
Files.fromURL( null, "resource:styles/logo.svg")
- internal url format restricted to data directory content
-
Files.fromURL( null, "/src/gis/cadaster/district.geopgk")
- absolute file path (linux)
-
Files.fromURL( baseDirectory, "D:\\gis\\cadaster\\district.geopkg")
- absolute file path (windows)
-
Files.fromURL( baseDirectory, "file:///D:/gis/cadaster/district.geopkg")
- absolute file url (windows)
-
Files.fromURL( baseDirectory, "ftp://veftp.gsfc.nasa.gov/bluemarble/")
- null (external reference ignored as we cannot determine a file)
-
Files.fromURL( baseDirectory, "sde://user:pass@anonymised.com:port")
- null (custom strings are ignored as we cannot determine a file)
Folks use Paths (in the sense defined above) here also; the is the motivation for fixing absolute paths on windows (as the definition had changed in Java 11) and where the inconsistency of use was found.
I just haven’t seen any valid argument why the original design is flawed.
I do not believe the origional design was flawed, it was not consistently applied in the codebase. Where a path is used
Kind Regards
Niels
Does it help that path and absolute path is defined strictly above?
Thanks,
Jody
On 02/08/2023 16:40, Jody Garnett wrote:
Niels:
We may be on the same page, you indicate ResoueceStore should never use absolute paths for the file system.
That is the decision I wish to make, and then we should go do the work to enforce it. How much work? I am not sure, hence the work party.
There is something subtle here though. You are focused on the ResourceStore API, and I am asking you to consider the application as a whole.
We do have paths in the application including absolute paths. The QA activity to check that these are handled correctly is the work.
Paths are not only for ResourceStore.
Jody
On Tue, Aug 1, 2023 at 11:55 PM Niels Charlier <niels@anonymised.com> wrote:
Hello Jody,
Unfortunately I am not really on board with your plan. I really don’t think we need to make a new branch and do a lot of work. This seems like a lot of unnecessary work. I think the original design is fine, and does not require breaking the code base. We just need to go back to how things were and perhaps fix a few cases where people have gone wrong since then.
I disagree. The definition of ‘path’ for the resource store is not in any way related to the OS, because the resource store lives in a separate world from the file system. The resource store and the file system have no joint concept of ‘path’.
- Early resource store work was not strict about use of paths, and when to use Resource and when to use File
I disagree. It was always very clear about this. There should obviously never be hard coded absolute paths for the file system. In general the code should basically always use resources. Only where users can use URLs they have access to both the file system and the resource store, this is always how it was intended to work, as can be read from the javadoc.
I disagree. I think nearly all of the code base is fine with how it was designed to work.
Regards
Niels
On 01/08/2023 19:31, Jody Garnett wrote:
Niel, Gabe:
Our email thread on this topic has explored the topic by now, I think action rather than more communication is needed next.
Can we setup a 1/2 day breakout work-party to get this activity underway (prior to the Bolsena code sprint and the next release cycle). Even a couple hours to get the topic started on a branch would be great.
I am proposing two times below, and am willing to wake up early to see this topic move along:
- August 8th 15 CEST / 11 BRT / 6 PDT
- August 15th: 15 CEST / 11 BRT / 6 PDT
Is this a good idea
–
Jody Garnett
–
–
Jody Garnett
_______________________________________________
Geoserver-devel mailing list
[Geoserver-devel@lists.sourceforge.net](mailto:Geoserver-devel@anonymised.comsourceforge.net)
[https://lists.sourceforge.net/lists/listinfo/geoserver-devel](https://lists.sourceforge.net/lists/listinfo/geoserver-devel)