Unix pathnames are the fundamental way to reference files in the file system. Because many things in the system are represented as files, paths are used for a great many purposes. While they work very well, they can't address things that are not represented as files in the local file system, such as files on other servers.
Pathname2 addresses both local and remote files accessible over the internet. Some tools such as [:@SSHLA] have already begun to use network paths, but without a written specification. This document attempts to specify these new pathnames so that they can be used more widely.
[[user@]hostname[^port]:][pathname]
Any valid pathname is also a valid pathname2, unless it contains a colon, in which case the colon must be escaped with a backslash. The presence of a ":" preceded by a valid hostname is what distinguishes a pathname2 from a plain pathname.
A port is expected to be entirely numeric, and a hostname cannot contain a colon. If the user name contains a colon, it can be escaped with a backslash. For example, this is a valid pathname2:
my\:user@server1:abc.txt
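The grammar above can be sketched as a small parser. This is only an illustration under stated assumptions, not a reference implementation: the names Pathname2 and parse_pathname2 are hypothetical, a "valid hostname" is assumed to consist of ASCII letters, digits, dots, and hyphens, and anything that fails those checks is treated as a plain pathname.

```python
import re
from typing import NamedTuple, Optional


class Pathname2(NamedTuple):
    """Hypothetical container for the parts of a pathname2."""
    user: Optional[str]
    host: Optional[str]
    port: Optional[int]
    path: str


# Assumption: a valid hostname is ASCII letters, digits, dots and hyphens.
_HOST_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9.-]*")
_PORT_RE = re.compile(r"[0-9]+")


def _find_unescaped(text: str, char: str) -> int:
    """Return the index of the first `char` not escaped by a backslash, or -1."""
    i = 0
    while i < len(text):
        if text[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if text[i] == char:
            return i
        i += 1
    return -1


def _unescape_colons(text: str) -> str:
    return text.replace("\\:", ":")


def parse_pathname2(text: str) -> Pathname2:
    # Fallback result: the whole text is a plain (local) pathname.
    plain = Pathname2(None, None, None, _unescape_colons(text))
    colon = _find_unescaped(text, ":")
    if colon == -1:
        return plain
    prefix, path = text[:colon], text[colon + 1:]
    user = None
    at = _find_unescaped(prefix, "@")
    if at != -1:
        user, prefix = _unescape_colons(prefix[:at]), prefix[at + 1:]
    host, caret, port_text = prefix.partition("^")
    if not _HOST_RE.fullmatch(host):
        return plain  # the colon is not preceded by a valid hostname
    if caret and not _PORT_RE.fullmatch(port_text):
        return plain  # a port must be entirely numeric
    port = int(port_text) if caret else None
    return Pathname2(user, host, port, _unescape_colons(path))


print(parse_pathname2(r"my\:user@server1:abc.txt"))
```

Returning the colons unescaped in the parsed fields is a design choice of this sketch; a tool that passes the path on to another pathname2-aware command might prefer to keep the escapes.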
Since paths can be combined with relative paths and have other operations performed on them, it is important to cover what happens in the common cases. Consider the following directory path, which is joined with a simple filename path to address a file on a remote server.
server1:specs <join> electrical-code.txt = server1:specs/electrical-code.txt
Similarly, a relative path that includes a directory can be joined to a remote directory.
server1:specs <join> electrical/wi-code.txt = server1:specs/electrical/wi-code.txt
A pathname2 without a pathname can also be joined with a pathname (relative to the remote working directory).
server1: <join> poetry/mypoem.txt = server1:poetry/mypoem.txt
Other types of relative paths work virtually the same way as with Unix pathnames, including relative paths with "." and "..". It's just that the hostname portion remains the same throughout the operation.
If a pathname2 is joined with another pathname2 that has a hostname portion, the result is the second pathname2. There is no way to modify parts of the hostname portion, or to perform any sort of relative operation on it.
server1:foo.txt <join> poetry-server:limerick55.txt = poetry-server:limerick55.txt
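The join rules above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical helper named join_pathname2, a hostname made of ASCII letters, digits, dots, and hyphens, and Python's posixpath.join for the pathname portion (so an absolute second path replaces the first path while the host is kept, which the text above does not specify).

```python
import posixpath
import re

# Assumption: optional "user@", a hostname of ASCII letters, digits, dots
# and hyphens, an optional "^port", then the separating colon.
# Backslash-escaped characters are allowed inside the user name.
_PREFIX_RE = re.compile(
    r"((?:(?:\\.|[^\\@:])*@)?[A-Za-z0-9][A-Za-z0-9.-]*(?:\^[0-9]+)?):"
)


def join_pathname2(base: str, other: str) -> str:
    """Join two pathname2 strings following the rules described above."""
    if _PREFIX_RE.match(other):
        # The second operand names its own host, so it wins outright.
        return other
    match = _PREFIX_RE.match(base)
    if match is None:
        # Both operands are plain pathnames: ordinary Unix join.
        return posixpath.join(base, other)
    prefix, base_path = match.group(1), base[match.end():]
    # The host portion is carried over unchanged; only the path is joined.
    return f"{prefix}:{posixpath.join(base_path, other)}"


print(join_pathname2("server1:specs", "electrical-code.txt"))
print(join_pathname2("server1:", "poetry/mypoem.txt"))
print(join_pathname2("server1:foo.txt", "poetry-server:limerick55.txt"))
```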
Both the original pathnames and this specification distinguish between relative and absolute paths. In their original form, relative paths are resolved in the local session against the current working directory. Remote relative paths, which lie outside of the session, are resolved against the default working directory that the remote host assigns when a connection is made. Typically, this is the home directory of the remote account, which makes remote relative paths relative to the user's own account. For account-independent access, using a shared "git" or "nobody" account, the relative paths can have a shared service-level significance, making them more sharable, like a hyperlink.
user@somesite.com:mygitrepo # This can be a different repo depending on the user
Remote absolute paths, like the local ones, have a more infrastructure-level significance, referring to more internal aspects of a server. Server and service administrators make use of these kinds of paths since they are more likely to be independent of the user account.
user1@myserver:/etc/server.conf # These two paths refer to the same file on the server
user2@myserver:/etc/server.conf
Since pathname2 inherits from pathname, it also inherits its character encoding independence, except for certain reserved characters: /, ., :, @, ^. The recommendation remains that all encoding should be in UTF-8 (strongly preferred) or another ASCII-compatible encoding. This is so that path comprehension code can scan bytes directly for these characters. It also frees pathnames, and therefore file and directory names, to use any Unicode content while retaining readability. Here are some interesting path examples.
my trip to Ayutthaya ✈️.txt
cats.com:cats playing in fountains ⛲.s.txt
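The byte-scanning point can be illustrated with a short sketch. It relies on a known property of UTF-8: every byte of a multi-byte sequence has its high bit set, so such bytes can never be mistaken for the ASCII reserved characters. The function name reserved_offsets is hypothetical.

```python
# The reserved characters listed above, as raw ASCII bytes.
RESERVED = b"/.:@^"


def reserved_offsets(raw: bytes) -> list:
    """Return (byte offset, character) pairs for each reserved byte in `raw`."""
    return [(i, chr(b)) for i, b in enumerate(raw) if b in RESERVED]


# U+26F2 (fountain) encodes to three bytes in UTF-8, none of them below 0x80.
raw = "cats.com:cats playing in fountains \u26f2.s.txt".encode("utf-8")
for offset, char in reserved_offsets(raw):
    print(offset, char)
```

Because the scan never decodes the text, it would behave the same way for any ASCII-compatible encoding in which the reserved bytes appear only as themselves.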
The pathnames themselves can contain characters that have special meaning in certain shells. Unix shell reserved characters are commonly escaped with a backslash "\", or the path can be surrounded by quotes, which is preferred since it is more readable.
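When a command line is assembled by a program rather than typed by hand, the quoting can be delegated to a library. The sketch below uses shlex.quote from the Python standard library, which wraps a string in single quotes for POSIX shells whenever it contains characters outside a small safe set. The "cats" command is taken from the examples in this document, and the path is illustrative.

```python
import shlex

path = "cats.com:cats playing in fountains \u26f2.s.txt"

# shlex.quote adds single quotes because the path contains spaces.
quoted = shlex.quote(path)
command = f"cats {quoted}"
print(command)

# Round trip: a POSIX shell would split the command back into two words.
print(shlex.split(command))
```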
Pathname2 should support all of the most commonly used capabilities of URLs, such as the server specifier, username, and port. Noticeably missing is the protocol portion, which is effectively replaced by the [:@SSHLA] system, where the command determines the protocol and everything runs over SSH.
The pathname portion of a pathname2 is equivalent to a URL path, except that it provides the distinction between absolute (infrastructure) and relative (service-oriented) paths. This means that the same path system can be used both by the teams managing the service and by the end users of the service.
http://some-server.com/path/to/resource.txt # All paths are relative, never absolute
some-server.com:path/to/resource.txt # Relative path
some-server.com:/path/to/resource.txt # Absolute path, accessible to some, usually addresses the same file
Query parameters are intentionally out of scope for this specification, since in this space they are more commonly specified by the command, which has a much more flexible way of determining its own query syntax. Here's an abstract comparison with the "srch" command.
http://some-server.com/query/my/service?q=this+is+my+query
srch -C some-server.com: <<EOF
this is my query
EOF
You will notice that the search command doesn't have to encode the spaces since it makes use of the "heredoc" capability of the Unix shell, reading the query in plain text from standard input.
URL fragments are also out of scope for this specification, since fragments are interpreted by the viewer or shell, not processed by the command or service. When multiple commands are run through a pipe, or otherwise joined together, the fragment applies to the combined output and not to an individual path. There is a common convention for encoding fragment-link position specifiers using specially formatted arguments after a shell comment, like this:
find . -name "foo.txt" # [:1] Shell/viewer might highlight the first line of the output here if it supports the convention
cats "space bar.s.txt" # [:@list_of_drinks] This might bring you directly to this named anchor in the document
curl http://myspace.com # "[:/blink tags .* harmful/]" Bring me to the first occurrence of text that matches this regular expression
HAVE SOME FEEDBACK ON THIS DOCUMENT?
You can provide a conventional comment on this document.
ssh nobody@supertxt.net ccmnt specs/pathname2.s.txt <<EOF
suggestion: Here's my actionable suggestion.
EOF