After writing the content, it's locked away on the computer where it was written. If you log back into that computer, you can access it using its path (e.g. Documents/myfile.s.txt). While you could read or edit this file using a number of tools, a very basic way to read it is using cat.
cat Documents/myfile.s.txt
Still, you need to log into this computer (e.g. jens-machine) to run this command. How can you read the same file from a different computer? Let's assume for a moment that we have some infrastructure in place on the local network, including installing some enhanced tools. Reading the latest contents of that file could be something like this:
cats jens-machine:Documents/myfile.s.txt
This command 'cats' is an enhanced version of 'cat' that understands paths with computer host names in them. It also works without the host name, in which case it behaves much like the regular cat. Underneath, it uses Secure SHell (SSH), a very common system for secure, authenticated remote access to another machine. It's used by many system administrators and is relatively easy to set up on a variety of systems.
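To make the idea concrete, here is a minimal sketch of how a cats-like tool could be approximated in plain shell. It is not the actual cats implementation: the function names is_remote_path and cats_sketch are hypothetical, and the sketch assumes remote paths take the form [user@]host:path, following the scp convention that a colon only marks a host when no slash comes before it.

```shell
# is_remote_path ARG
# Succeeds when ARG looks like [user@]host:path (an assumption of this sketch).
is_remote_path() {
    case "$1" in
        -*|/*|./*|../*) return 1 ;;   # options and explicit local paths
        *:*) ;;                       # has a colon, inspect it further
        *) return 1 ;;                # no colon, so it is a local path
    esac
    # The part before the first colon must be non-empty and contain no slash.
    case "${1%%:*}" in
        ''|*/*) return 1 ;;
    esac
    return 0
}

# cats_sketch PATH...
# Prints each file, fetching remote ones by running cat over SSH.
cats_sketch() {
    rc=0
    for target in "$@"; do
        if is_remote_path "$target"; then
            rhost=${target%%:*}   # everything before the first colon
            rpath=${target#*:}    # everything after the first colon
            # ssh joins its arguments and hands them to the remote shell, so
            # paths with spaces or shell characters would need extra quoting.
            ssh "$rhost" cat "$rpath" || rc=1
        else
            cat "$target" || rc=1
        fi
    done
    return $rc
}
```

With these functions loaded, cats_sketch jens-machine:Documents/myfile.s.txt would run cat on the remote machine over SSH, while cats_sketch Documents/myfile.s.txt simply reads the local file.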
You can install cats using [:@homebrew] with a [:@tap].
brew tap supertxt/tap ssh://nobody@supertxt.net/git/st-brew
brew install supertxt/tap/cats
Otherwise, if you have Go installed then you can install it with the go command.
go install supertxt.net/git/cats/cmd/cats@latest
It turns out that there's an entire class of programs that can use SSH to access remote paths like this. For example, there's an enhanced version of the 'cp' (copy) command that can copy that remote file to the local computer. Let's bring the latest version of the file over here.
scp jens-machine:Documents/myfile.s.txt Documents/myfile.s.txt
There is a substantial number of SSHLA (Secure Shell Layer Application) programs. They can take advantage of the features that SSH provides.
The authentication mechanism (password, key-based, etc.) for a host is shared between programs. Whether you use cats, scp, git, or even VS Code, the process of connecting to the same host is the same. There's no need to independently configure each tool to add that host connection. As you've seen above, the same remote path (with the host name) is used in both tools. The same applies to a number of other tools, such as rsync, git, and borg (a backup tool). Moving from one program to another is relatively easy.
We've assumed that the local username (jen) is the same for all the machines on the local network. Sometimes this can be different from computer to computer. Luckily, remote paths have a special syntax for specifying the remote username on the remote host if it's different from the local one.
cats jen@jens-machine:Documents/myfile.s.txt
To recap, the mechanisms described here are sufficient for hosting and accessing content on a local network securely. Various SSHLA applications can work with that content. It wasn't mentioned above, but many other sorts of files (e.g. .jpg, .png, .pdf) are managed in a very similar way. With the standard paths and authentication, this is relatively easy to set up on a local area network even in an ad-hoc manner. The cats tool is not yet very common, and can be very easily mocked using a simple shell script. How does one host content publicly on the internet? That's the topic of the next section.
The popular option for internet hosting is HTTP, with a web browser to access it. Since SuperTXT is just simple text, it can be hosted in this way. However, the principles of SuperTXT, such as not hiding details from the reader, the multi-tool paradigm, and the Unix philosophy, point strongly in a different direction. You've already seen above how to access the content locally and over a local network. Why not host and access the content in a similar manner over the internet? It turns out that it is possible.
Before, we saw that Jen was accessing content on a machine on their local network using commands such as scp, and cats. What would the command look like to access hosted content on an internet server? First, let's make sure that we've generated our own SSH keys so that we don't need to use username/passwords anymore. This will allow us to navigate the internet of SSH using the key alone.
ssh-keygen
SSH keys give internet servers an opportunity to provide access without your username. Here's how easy it can be to read a SuperTXT file from the internet.
cats nobody@example.com:Documents/myfile.s.txt
This time, the username is set as "nobody". This is a conventional username in Unix/Linux for a user that is allowed limited, anonymous access to a server's resources. It also prevents the local username from being sent to the server, where it could be used for data mining and/or tracking. If you want to limit tracking even further, you could re-generate your SSH key pair with the key generator from time to time. You can go further still with per-host SSH configuration, so that you use a distinct SSH key pair for each site that you visit. There are some powerful privacy capabilities there.
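As a sketch of the per-host idea, the shell function below creates a dedicated key pair for one site and appends a matching entry to the SSH client configuration. The function name add_site_identity and the key file naming scheme are hypothetical choices for illustration. The ssh-keygen flags and the Host, User, IdentityFile, and IdentitiesOnly options are standard OpenSSH, but treat the whole thing as a starting point under those assumptions rather than a recommended setup.

```shell
# add_site_identity HOST [SSH_DIR]
# Creates a key pair used only for HOST and a matching ssh_config entry.
# SSH_DIR defaults to ~/.ssh; pass another directory to experiment safely.
add_site_identity() {
    site=$1
    ssh_dir=${2:-$HOME/.ssh}
    key=$ssh_dir/id_ed25519_$site      # hypothetical naming scheme
    config=$ssh_dir/config

    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1

    # The empty passphrase (-N '') and empty comment (-C '') are choices made
    # for this sketch; the comment would otherwise default to user@hostname.
    [ -f "$key" ] || ssh-keygen -q -t ed25519 -N '' -C '' -f "$key" || return 1

    # Skip the config entry if one for this host already exists.
    grep -qxF "Host $site" "$config" 2>/dev/null && return 0

    # IdentitiesOnly asks ssh to offer only the configured key to this host.
    cat >> "$config" <<EOF

Host $site
    User nobody
    IdentityFile $key
    IdentitiesOnly yes
EOF
}
```

After running add_site_identity example.com, a plain example.com:Documents/myfile.s.txt path would connect as nobody and offer only the key created for that site.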
Internet servers need to be careful about the services they provide, since there are performance, and service considerations. Suppose that the service supports copying that file locally, then you could use scp in the usual way.
scp nobody@example.com:Documents/myfile.s.txt Documents/myfile.s.txt
Perhaps the server offers git capabilities to track version history of the files. You might be able to clone and/or recommend fixes to the content. The first step is to clone the repo.
git clone nobody@example.com:git/myrepo
Discovering the capabilities of a server, and relevant paths is currently out of scope here, but you might try checking a site's message of the day (MOTD) first:
ssh nobody@example.com
Some sites might provide the typical "man" command to look up commands and their usages.
ssh nobody@example.com man # Show me what I can run
ssh nobody@example.com man somecmd # Show me how I can use this command
Being able to use the same tools in similar ways, whether the content is local, on the local network, or on the internet, is a nice feature of this approach. You are in full control of your tools and identity while browsing the internet. This is all without passwords!
If you've ever hosted your own OpenSSH server you might have noticed that there are some issues with the approach above. It's true that this SSH server implementation will not permit any public key to authenticate as the nobody user. It's likely that tailored SSH server implementations will be needed to address this concern among others to provide public access to users.
At the moment, most mainstream SSH server implementations are designed to provide trusted access to a server using a predetermined, named account on that server. Once you've authenticated, you have access to a broad set of capabilities through a shell. From there you can run many programs that consume system resources, including CPU and memory. You can fill up the disk with files. These are the kinds of capabilities that are entrusted to you even as a regular user of the server.
A publicly available server will need to be more careful about how resources are shared with anonymous "nobody" users. The commands in the previous sections do not require an interactive shell. They are one-off commands that serve a particular purpose and exit immediately. SSH permits this kind of direct command execution without a shell. For public servers this is probably going to be the only mode of operation. Anonymous users are probably more interested in specific content, and services instead of the low-level infrastructure provided by the server itself.
Even without a shell, using only one-off direct command execution, an anonymous user can still over-use the resources provided by a server. They would have access to virtually any command that's accessible from the namespace. This is why there should be only a very limited set of commands available through an allowed commands list. To support the "cats" commands above, the "cat" command should be on that list, and only with a single local file path within the content directory.
cat <content_path>
Notice that this doesn't permit scp, git, or other SSHLA commands. The allowed commands list would be chosen on a site-by-site basis. If scp were permitted, the allowed command might look something like the following, which is what scp runs on the server through SSH when it sets up a connection to retrieve a file in its legacy SCP protocol mode (OpenSSH 9.0 and later default to SFTP unless scp is given -O). This is not the same command used to send files, so those operations would continue to be blocked.
scp -f <content_path>
Git has a similar division between the commands it invokes on the remote: 'git-upload-pack' serves retrieval (clone and fetch), while 'git-receive-pack' accepts sent data (push). Allowed command lists can distinguish between them and allow or deny each operation. The pattern repeats with other SSHLA commands.
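The allowed commands list described above could be sketched as a small shell function that classifies one requested command line. This is an illustration only, not how any particular server implements it: the name classify_request and the exact patterns are hypothetical, and the path rules (relative, no "..", a conservative character set) are assumptions that a real site would tune. It allows the read-only forms discussed here (cat, scp -f, and git-upload-pack) and denies everything else, including git-receive-pack.

```shell
# classify_request "COMMAND LINE"
# Prints "allow" or "deny" for a single requested command.
classify_request() {
    # Split the request into words with globbing disabled.
    set -f; set -- $1; set +f

    # Match on word count, command name, and (for scp) the mode flag.
    case "$#:$1:$2" in
        2:cat:*)             target=$2 ;;
        3:scp:-f)            target=$3 ;;
        2:git-upload-pack:*)
            # git wraps the repository path in single quotes; strip them.
            target=${2#"'"}; target=${target%"'"} ;;
        *)
            echo deny; return 0 ;;
    esac

    # Only relative paths made of ordinary characters, without "..".
    case "$target" in
        ''|/*|-*|*..*|*[!A-Za-z0-9._/-]*) echo deny ;;
        *)                                echo allow ;;
    esac
}
```

For example, classify_request "cat Documents/myfile.s.txt" prints allow, while classify_request "cat /etc/passwd" and classify_request "git-receive-pack 'git/myrepo'" both print deny. Matching whole request shapes, rather than filtering out known-bad input, keeps the list an allow list.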
Public servers should also employ rate limiting to help avoid resource exhaustion. Connection timeouts due to inactivity can be used for the same purpose, since there's no expectation of interactive shells where users might take long pauses. An SSH server that's designed for this might provide some reasonable defaults to help get started.
It turns out that implementing SSH servers that follow the protocol, but permit anonymous access, is relatively approachable since there are libraries available. A widely used C library is libssh. There's also a Go library that makes handling SSH requests similar to handling HTTP requests.
git clone git@github.com:gliderlabs/ssh.git
There's also the possibility of handling SSH requests without relying on local commands, or the filesystem, at all. A 'cat' of a file could easily be served from some other persistence store, such as a database, with the paths arranged to match keys in that store. Certain SSHLA commands describe their internal protocols, or even offer libraries that implement them, which might permit an SSH server to customize the persistence and retrieval of data. The libgit2 library comes to mind.
We can stop thinking of SSH as tied to one particular implementation that's focused on providing low-level, trusted infrastructure access to a server. Instead, we can consider it as a protocol that can support a wide variety of anonymous, and pseudo-anonymous services on the public Internet. Finally, we can unlock SuperTXT, and the entire SSHLA family of technologies with all of their great attributes, such as offering both remote and local modes of operation.
Conserv is an alternative SSH server implementation that's designed to support anonymous and pseudo-anonymous services without direct access to the server infrastructure. Instead of giving users access to all resources of the server it limits them to only one specific content directory and only allowed command patterns. The server also limits requests to essentially non-interactive ones by shutting down any idle sessions longer than a few seconds.
You can install conserv to give it a try using [:@homebrew] and a [:@tap].
brew tap supertxt/tap ssh://nobody@supertxt.net/git/st-brew
Once it's tapped you can install conserv like this.
brew install supertxt/tap/conserv
If you have Go installed you can install it with the go tool instead.
go install supertxt.net/git/st-int/cmd/conserv@latest
Conserv runs in the directory for its site. There's a special directory layout that it will generate for you the first time it runs. The only required input from you is the location of the host encryption key, generated if it doesn't already exist.
conserv secure_location/hostkey
The stdout logs will look something like this.
time:2023-04-23T00:00:17Z facility:conserv severity:info msg:generating host-key: secure_location/hostkey
time:2023-04-23T00:00:18Z facility:conserv severity:info msg:generating server contents
time:2023-04-23T00:00:18Z facility:conserv severity:info msg:server started on address :2222
You can now ssh to your server to give it a try. SSH will likely ask you if you want to add this new host key to your list of known hosts, which is part of its TOFU (Trust On First Use) authentication mechanism. If the host key ever changes then it will show you a "nasty" error message.
ssh -p 2222 localhost
The directory layout for the site resembles this.
content/ # This is where the static files go (SuperTXT, images, etc.)
command.s.txt # This is the configuration file that controls the allowed file patterns (careful, this is live updated)
bin/ # All allowed commands for the site are executed from here
You may start to see certain log files showing up at the top too, such as cat.error.log. These logs contain the types of messages that the commands normally output to stderr, which sometimes include low-level details of the server, such as file paths. One lesson that we learned from web servers is that this sort of information should not be given to anonymous internet users because it can lead to exploits. This is one reason that "bin/cat" is a shell script, so that it can redirect the error logs. Let's see how this works.
#!/bin/sh
# Here you can control the location and version of the cat
# command that runs. Also, the environment can be tuned and
# further checks on the arguments. For some commands the
# standard error can be redirected so that the user never
# sees the specifics, which is a good practice for servers
# with untrusted users.
PATH=/bin:/usr/bin cat "$1" 2>>cat.error.log
This is the kind of pattern that you can use for other commands on your site. It specifies the precise version of the command that you want, redirects any internal stderr messages, and constrains the arguments to the command to a known set. In this case only the first argument, "$1", is passed to cat, and any additional parameters are ignored. It's this kind of layered security approach with allow lists (instead of deny lists) that helps internet sites to be more successful.
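The same pattern can be factored into a reusable helper. The sketch below is hypothetical and is not part of conserv: the name run_logged, its argument order, and the refusal rules (no absolute paths, no leading dash, no "..") are all assumptions made for illustration. It runs a named binary with exactly one argument and appends that binary's stderr to a log file rather than showing it to the remote user.

```shell
# run_logged NAME BINARY ARG
# Runs BINARY with exactly one relative-path argument and appends its
# stderr to NAME.error.log in the current directory.
run_logged() {
    name=$1; binary=$2; arg=$3
    case "$arg" in
        ''|/*|-*|*..*)
            # Generic refusal that reveals nothing about the server.
            echo "request refused" >&2
            return 1 ;;
    esac
    # Any further arguments given to run_logged are deliberately dropped.
    "$binary" "$arg" 2>>"$name.error.log"
}
```

A wrapper script built on it could then reduce to a single line such as run_logged cat /bin/cat "$1", assuming cat lives at /bin/cat on that server.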
In terms of additional layering, you might also consider the following:
These safeguards, and others, can lead to a more thoughtful security design, which is super helpful in case of a security incident.
This hosting guide has broadly covered how SuperTXT and related content is hosted locally, on a local network, and on the internet at large. Along the way we've seen how SSH and SSHLA tools empower all three types of hosting in a uniform way. Hosting content on the internet for anonymous users does require additional considerations and even tools. Conserv is a tool designed to help, both as a proof of concept and to encourage good practices.
HAVE SOME FEEDBACK ON THIS DOCUMENT?
You can provide a conventional comment on this document.
ssh nobody@supertxt.net ccmnt hosting.s.txt <<EOF
suggestion: Here's my actionable suggestion.
EOF