Introduction
This post is a proposal and request for comments on adopting the XDG Filesystem Hierarchy as an option for managing all the non-code data that makes up a Catalyst application.
The Problem
Right now, when you create a new Catalyst application, the non-code data goes by default to either {home}/root (for templates and static files) or {home} (for configuration files), where {home} is the root directory of the application. So you get a directory structure like:
MyApp
    myapp.conf
    /t
    /root
        /static
    /lib
        MyApp.pm
        /MyApp
            /Model
            /View
            /Controller
Now, this {home} directory is something of a hack, since we use Catalyst::Utils::home() to try to figure it out based on certain expectations. Perl doesn't have this idea of {home} built into it. If your application is 'installed' (via cpan or make install), we guess the location based on the physical location of the application module (whichever class inherits from Catalyst). If it's not installed (the common case when you are developing and just running the development server or tests), it walks the directory structure looking for a Makefile.PL or a Build.PL and then decides that's good enough to call {home}.
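To make the "not installed" case concrete, here is a minimal sketch of that kind of upward walk. It is not the actual Catalyst::Utils::home() code; guess_home is a hypothetical helper I made up for illustration, and it only covers the Makefile.PL / Build.PL search.

```perl
use strict;
use warnings;
use File::Spec;
use File::Basename qw(dirname);

# Hypothetical helper (not part of Catalyst): walk upward from $start until a
# directory containing Makefile.PL or Build.PL is found.
# Returns that directory, or undef if the filesystem root is reached first.
sub guess_home {
    my ($start) = @_;
    my $dir = File::Spec->rel2abs($start);
    while (1) {
        for my $marker (qw(Makefile.PL Build.PL)) {
            return $dir if -e File::Spec->catfile($dir, $marker);
        }
        my $parent = dirname($dir);
        return undef if $parent eq $dir;    # nowhere left to climb
        $dir = $parent;
    }
}
```

The weakness is easy to see: any stray Makefile.PL further up the tree would be taken as {home}.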
Since this method can be a bit flaky, a lot of people are recommending that you use File::ShareDir (see here for a good overview). This module integrates well with Module::Install and leverages the fact that a Perl module can have a share directory associated with it. Using this, you might create a directory structure like:
MyApp-Web
    /etc
        myapp-web.conf
    /t
    /share
        /static
    /lib
        MyApp.pm
        /MyApp
            Web.pm
            /Web
                /Model
                /View
                /Controller
I also modified the directory hierarchy a bit to reflect the growing consensus that your Catalyst application should ideally live one level further down from your application root. In this case I chose 'MyApp/Web.pm', which seems to be the most popular choice and one that is semantically meaningful. This represents the idea that your MVC layer should be the thinnest possible layer over your true domain and interface logic, which sits in the MyApp directory. I also moved the configuration files to {home}/etc, since that makes sense to people used to finding configuration in /etc.
Although this is an improvement, it still suffers from several issues. First of all, File::ShareDir can only find the share directory for installed applications. For the common case where you are actively developing or running tests, you still need some code like Catalyst::Utils::home() to guess the directory for you. In this respect it's not much better than what Catalyst::Utils::home() provides out of the box.
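In practice this means wrapping the File::ShareDir lookup in a fallback. The following is only a sketch: share_dir and its $dev_home argument are hypothetical names, and the only File::ShareDir call used is dist_dir(), which throws an exception when the distribution is not installed.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: prefer the installed share directory for $dist and fall
# back to "$dev_home/share" when the distribution is not installed (or when
# File::ShareDir itself is unavailable).
sub share_dir {
    my ($dist, $dev_home) = @_;
    my $dir = eval {
        require File::ShareDir;
        File::ShareDir::dist_dir($dist);    # dies if $dist is not installed
    };
    return $dir if defined $dir;
    return File::Spec->catdir($dev_home, 'share');
}
```

Note that $dev_home still has to come from somewhere, which is exactly the guessing problem we started with.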
Also, when your share data is installed into the Perl library path, your application server (or the user running Apache mod_perl or FastCGI) needs the correct level of access to that path. This complicates configuration. This setup is also not what most Unix administrators will expect. There are reasonably well-defined norms for where your configuration should go (/etc or ~/.config), as well as where the logs go and so on.
Although you can override the {home} directory with environment variables, this is not ideal if our goal is to minimize installation hassle and make everything work well out of the box. It complicates installation for users as well as configuration of the web servers that will run the code.
It also complicates customization. For example, let's say I am using the MojoMojo wiki and want to run three instances of it. Each instance will have its own configuration, and I want to slightly modify the theme files for each. Right now, the only way I can do this is by overriding the environment variable for home for each running instance. Although this works, it is a 'roll my own' approach that is likely to vary from administrator to administrator, making it more difficult to onboard new admins due to the uniqueness of each setup. I strongly feel that we should have clear standards for the most common deployment issues, since this reduces errors, speeds deployment, and counters the argument I often hear that Perl is hard to maintain. A standard will also help grow a set of best practices surrounding deployment issues which we can document and promote.
Proposal
This is a case where Perl is not leveraging existing norms well, which really goes against the grain for us, considering that CPAN, with its "reuse, recycle" mantra, is one of our primary claims to fame. My recommendation is that we adopt an existing standard and make this available as a plugin or set of roles for Catalyst. The most relevant standard is the XDG Filesystem Hierarchy (formally the XDG Base Directory Specification), which exists specifically as a standard for where installed applications put configuration and data files, both local files that users can override and global files that only admins should touch.
Although this standard is aimed at Linux, it's fairly straightforward, and similar methods are employed by Windows Server and Mac OS X Server, so it should be possible to create a pluggable support mechanism that is broadly applicable.
The XDG Filesystem Hierarchy defines some environment variables and defaults for the most common types of non-code data, and offers a system for separating user configuration from global configuration.
I recommend you review the standard, since it's very short, but here's a summary. The standard defines four environment variables useful to us:
XDG_DATA_HOME
This is the location of data-oriented files that a user running the application should be able to customize (or that will be customized during installation or use of the application). By default these go into "~/.local/share".
XDG_CONFIG_HOME
Similar to XDG_DATA_HOME but specifically for configuration files. Defaults to "~/.config".
XDG_DATA_DIRS
Takes a string of paths (delimited by ":") in which to look for system-wide data. These could be things like templates or static assets that shouldn't be changed by users and that would be shared by all instances of the application. The default is "/usr/local/share/:/usr/share/".
XDG_CONFIG_DIRS
Like XDG_DATA_DIRS but for configuration. Defaults to "/etc/xdg".
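As a rough sketch of how a plugin might resolve these four variables with the defaults listed above, consider the following. The function name xdg_dirs and the shape of its return value are my own assumptions, not an existing Catalyst or CPAN API. An unset or empty variable falls back to its default.

```perl
use strict;
use warnings;

# Hypothetical helper: resolve the four XDG variables from an environment
# hash (defaults to %ENV), falling back to the documented defaults when a
# variable is unset or empty.
sub xdg_dirs {
    my ($env) = @_;
    $env ||= \%ENV;
    my $home = defined $env->{HOME} ? $env->{HOME} : '';

    my $get = sub {
        my ($name, $default) = @_;
        my $value = $env->{$name};
        return (defined $value && length $value) ? $value : $default;
    };

    my $data_dirs   = $get->('XDG_DATA_DIRS',   '/usr/local/share/:/usr/share/');
    my $config_dirs = $get->('XDG_CONFIG_DIRS', '/etc/xdg');

    return {
        data_home   => $get->('XDG_DATA_HOME',   "$home/.local/share"),
        config_home => $get->('XDG_CONFIG_HOME', "$home/.config"),
        data_dirs   => [ grep { length } split /:/, $data_dirs ],
        config_dirs => [ grep { length } split /:/, $config_dirs ],
    };
}
```

Passing the environment in as a hash reference, rather than always reading %ENV, is just a convenience that makes the function easy to exercise in tests.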
The way I'd see this working is that if the application is being run in development mode, we'd first look for files local to the application file path, and then fall back to looking at the XDG-defined directories. Additionally, we'd probably need some boilerplate install scripts that authors can use to prompt for the desired path information (with rational defaults). So our application distribution would possibly look like:
MyApp-Web
    /t
    /etc
        myapp-web.conf
    /share
        /local
    /lib
        /MyApp
            Web.pm
            /Web
                /Model
                /View
                /Controller
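Given a layout like this, the development-mode lookup I described might be sketched as follows. The function locate and its named arguments are hypothetical. Checking the per-user XDG directory before the system-wide ones is my assumption, based on the idea that user files should win over global ones.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the first existing path for $args{relative}.
# In development mode the application's own directory is checked first, then
# the per-user XDG directory, then each system-wide XDG directory in order.
# Returns undef when the file is not found anywhere.
sub locate {
    my (%args) = @_;
    my @candidates = (
        ($args{dev_mode} ? ($args{local_dir}) : ()),
        $args{xdg_home},
        @{ $args{xdg_dirs} || [] },
    );
    for my $base (@candidates) {
        my $path = File::Spec->catfile($base, $args{relative});
        return $path if -e $path;
    }
    return undef;
}
```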
And during installation we'd copy "MyApp-Web/share/local" to "$XDG_DATA_HOME/myapp-web" and "MyApp-Web/share/" to "$XDG_DATA_DIRS/myapp-web" (we'd either just copy to the first one in the path or prompt at install time). Handling configuration would be a bit trickier. My thought here is that we'd copy "MyApp-Web/etc/*" to "$XDG_CONFIG_DIRS/myapp-web", but when running, the application would look in both XDG_CONFIG_DIRS and XDG_CONFIG_HOME, merging both to allow local overriding of the configuration.
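The merge step could look something like the sketch below, which assumes both configuration files have already been parsed into hash references. merge_config is a hypothetical name; Catalyst::Utils::merge_hashes offers a similar right-hand-precedence merge and might well be the better choice in a real plugin.

```perl
use strict;
use warnings;

# Hypothetical helper: recursively merge $override into $base and return a
# new hash reference. Nested hashes are merged key by key; any other value in
# $override replaces the value in $base. Neither input is modified.
sub merge_config {
    my ($base, $override) = @_;
    my %merged = %$base;
    for my $key (keys %$override) {
        if (ref($merged{$key}) eq 'HASH' && ref($override->{$key}) eq 'HASH') {
            $merged{$key} = merge_config($merged{$key}, $override->{$key});
        }
        else {
            $merged{$key} = $override->{$key};
        }
    }
    return \%merged;
}
```

Here the configuration found under XDG_CONFIG_DIRS would be passed as $base and the one under XDG_CONFIG_HOME as $override, so a local file only needs to contain the keys it wants to change.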
Overall, I believe this will give us a smoother and more professional installation experience, make it easier to administer Catalyst applications, and help start a dialog about best practices.
Thoughts, criticism, abuse welcome :)