The Symlink-approach to Grav Multisite

An alternative way of setting up multiple, connected installations

A few days ago I was asked what the best way to manage 30-something Grav sites would be: a “symlink approach” or a “subtree/submodule approach”. Here’s a quick run-through of what this sort of setup entails.

Grav has had native support for multisite - to “create and manage a network of multiple websites, all running on a single installation” - since version 1.0. The Admin-plugin has not, however, and so the benefit of multisite is really sharing resources and assets. And that can be achieved without a complex multisite-setup through setup.php and extensive use of PHP’s streams with Grav.

The difference between symlinking - creating symbolic links on the filesystem between folders and files - and subtrees/submodules in Git is not all that big: Git will of course version-control every aspect of the environment, and symlinking will massively reduce the actual number of files for a flat-file CMS like Grav - as well as the space required for many sites. Otherwise they are largely two ways to achieve the same goal that multisite itself does.

Convoluted setups in Git tend to make maintenance of the related and interconnected repositories difficult, and each repository and submodule will include the whole Git history unless it’s a shallow copy. If it is, the only difference between a regular and a symlinked folder is the reference to the repository. This should be known regardless, and so the benefit of reducing size and the number of actual files outweighs the reference-points to version-control.

A further improvement would be a base-installation built from shallow clones via Git, which all other installations can symlink to.

Given the number of sites - around 30 in this case - we can assume that they have many things in common. That also makes the setup worth automating, because we don’t want to retype the symlink-commands every time we change something.

So, for local development, staging for testing, and production - on any of the 30 sites - a notebook of what was linked where and why wouldn’t go amiss. Or at least a simple manifest of what was cloned and where. Any task-runner could set up your environments, but something with access to native functions - minimally to run the symlink-commands mklink on Windows and ln on Unix - would be best. I once wrote a piece in the Grav Docs on doing this with PHP: Scripted Upgrades.
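As a rough sketch of what such a manifest could look like, here is one way to record each link together with the reason for it, and to derive the native command from it. This assumes Python as the scripting language rather than PHP, and the Link class, the link_command function, and the /web/... paths are all made up for illustration.

```python
from dataclasses import dataclass
from pathlib import PureWindowsPath


@dataclass(frozen=True)
class Link:
    """One manifest entry: where the link lives, what it points to, and why."""
    link: str
    target: str
    note: str = ""


def link_command(entry: Link, windows: bool) -> list[str]:
    """Return the native command that would create the link for this entry."""
    if windows:
        # mklink is a cmd.exe built-in, so it is run through "cmd /c".
        # /J creates a directory junction; the link comes first, then the target.
        return ["cmd", "/c", "mklink", "/J",
                str(PureWindowsPath(entry.link)),
                str(PureWindowsPath(entry.target))]
    # ln -s takes the target first, then the name of the link.
    return ["ln", "-s", entry.target, entry.link]


# Hypothetical manifest: the paths are placeholders, not a required layout.
MANIFEST = [
    Link("/web/site1/system", "/web/grav/system", "Grav core, shared by all sites"),
    Link("/web/site1/user/plugins/error", "/web/grav/user/plugins/error", "common plugin"),
]

for entry in MANIFEST:
    # Print instead of executing, so the manifest can be reviewed first.
    print(" ".join(link_command(entry, windows=False)), "#", entry.note)
```

Keeping the "why" next to each entry means the manifest doubles as the notebook, and the same list can drive both the Windows and the Unix variant of the command.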

In this regard, there are two things to consider: Do we want to ship the environment as a whole, or do we want to include Git for updating certain parts? As you’re probably aware, transferring thousands of files over SFTP is a slow process from any Operating System. Because of this, a task-runner can make it much quicker to provision the different environments. For example, it can git clone a specific version of Grav as well as common extensions, and create virtual copies - symlinks - to your various sites.
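To make the cloning step concrete, here is a minimal sketch of how a script might fetch one pinned version as a shallow clone. The function names are hypothetical, and the tag and destination in the example are placeholders for whatever you actually pin and wherever your base installation lives.

```python
import subprocess


def clone_command(repo_url: str, ref: str, dest: str) -> list[str]:
    """Build a shallow clone of a single tag or branch.

    --depth 1 skips the history, --branch accepts a tag as well as a branch.
    """
    return ["git", "clone", "--depth", "1", "--branch", ref, repo_url, dest]


def clone(repo_url: str, ref: str, dest: str) -> None:
    """Run the clone; raises CalledProcessError if git exits non-zero."""
    subprocess.run(clone_command(repo_url, ref, dest), check=True)


# Example values only: pick the tag you want to pin and your own base folder.
print(" ".join(clone_command("https://github.com/getgrav/grav.git", "1.7.0", "/web/grav")))
```

Separating the command from its execution makes it easy to log or dry-run what the task-runner is about to do on each environment.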

The “symlink-approach” is quite simple: Some sites share the same base, including files, extensions, and even assets. Grav even has helper-commands[1] for this. So say we have this structure somewhere across our environments:

/grav
/site1
/site2
/siteN

/grav is our base installation of Grav, and it holds everything that is always constant across each site. Things like the Error, Problems, and Email plugins, and the Quark theme. Probably our own extensions also. /site1 is our first site, and in this folder we symbolically link every common thing from the base installation. Let’s start with the basics: The bin, system, vendor, and webserver-configs folders, as well as any files at the root of /grav. Just navigate to the folder from the command line, e.g. cd C:/web/grav, and run php bin/grav sandbox -s "C:/web/grav2". Here the new site’s folder is called grav2; it plays the role of /site1 in the structure above. The -s parameter tells Grav to create a symlinked environment, if possible.

Inside /grav2, everything is like a normal Grav-installation, except the above folders are virtual copies linked to what’s in /grav. So if I upgrade the installation in /grav, it’s also updated in /grav2. This is because /grav2 expects /grav to have these folders and files. We could repeat this for every site, and only customize what’s in /gravN/user.

But that’s not enough. Having created a virtual copy of what’s minimally required to run Grav, we want common extensions to follow along as well. Even without Grav’s helper-commands, this is as simple as mklink /J "C:/web/grav2/user/plugins/error" "C:/web/grav/user/plugins/error" on Windows. Of course, this is something to automate as much as possible.
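Automating this can be as small as a loop over the folders you want to share. The following is only a sketch in Python: link_shared is a made-up helper, and it uses os.symlink instead of mklink /J, which on Windows creates a symbolic link rather than a junction and typically needs Developer Mode or an elevated prompt.

```python
import os
from pathlib import Path


def link_shared(base: Path, site: Path, relative_paths: list[str]) -> list[Path]:
    """Symlink each shared folder from the base installation into a site.

    Returns the links that were actually created.
    """
    created = []
    for rel in relative_paths:
        target = base / rel
        link = site / rel
        if not target.is_dir():
            raise FileNotFoundError(f"missing in base installation: {target}")
        # Make sure e.g. site/user/plugins exists before linking into it.
        link.parent.mkdir(parents=True, exist_ok=True)
        if link.is_symlink() or link.exists():
            continue  # leave anything that is already there untouched
        os.symlink(target, link, target_is_directory=True)
        created.append(link)
    return created


# Hypothetical usage, with the folders from the example above:
# link_shared(Path("C:/web/grav"), Path("C:/web/grav2"),
#             ["user/plugins/error", "user/plugins/problems", "user/themes/quark"])
```

Skipping paths that already exist keeps a site’s own customized plugin or theme from being overwritten by a link.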

For each installation that naturally shares a variety of resources with the base installation, or each other, create a virtual folder that refers to the resource. You could of course also have non-Grav resources, like an /assets-folder full of images that you symlink to your theme or /siteN/user/pages/images, or a /git-folder where you store things that you clone from a Git-provider.

The idea is to create virtual copies of everything possible, to reduce the number of files and the storage space needed on the production-server. The pure Git-approach is not apt for this, as Git creates a lot of overhead, as explained above. I also find managing subtrees and submodules a fairly painful thing to do, because there’s a lot of work involved in very integrated Git-workflows.

A benefit of the symlink-approach is that it’s quite straightforward to write the task-runner, so it can be pointed at any server and any folder to set up there. It will download or clone via Git any needed resources, and create all virtual copies that you define. If anything goes wrong at a later point, it’s supremely easy and fast to recreate the environment, or a specific site. A virtual copy of something, in any OS, takes milliseconds to create. That is not the case with copying many files or even cloning something from Git if the history has a lot of catching up to do.
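Recreating an environment is easiest when the script can be run again and again without harm. Here is a sketch of that idea, written with a POSIX system in mind (Windows may need adjustments); ensure_link is a hypothetical helper, not part of Grav or of any task-runner.

```python
import os
from pathlib import Path


def ensure_link(link: Path, target: Path) -> bool:
    """Make `link` a symlink to `target`; return True if anything changed."""
    if link.is_symlink():
        if Path(os.readlink(link)) == target:
            return False  # already correct, nothing to do
        link.unlink()  # stale link: removes the link only, never the target
    elif link.exists():
        # A real file or folder is in the way; do not delete it silently.
        raise FileExistsError(f"refusing to replace real file or folder: {link}")
    link.parent.mkdir(parents=True, exist_ok=True)
    os.symlink(target, link, target_is_directory=target.is_dir())
    return True
```

Run over a whole manifest, this repairs broken or outdated links and leaves correct ones alone, so rebuilding a single site is the same operation as setting it up the first time.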

Task-runners

For a task-runner in PHP - which will be easy to run across Operating Systems - https://deployer.org/ is an excellent option, and what is recommended in the aforementioned section of the Docs. You basically define hosts - target servers - to run commands on. A set of commands is defined as a task, and you can run them grouped, one by one, in parallel, chained, basically however you’d like.

Against remote servers it’s all run via SSH, which is what I would recommend for pretty much all connectivity like this. The benefit of public and private key pair files is invaluable. Deployer also handles permissions well.

Any task-runner that can execute commands on different environments will do, but it’s best to choose one with a uniform and clean API, not something that requires you to retype a lot of commands or approximates batch-files with a lot of customizations.


  1. https://learn.getgrav.org/16/cli-console/grav-cli