RFC: Catalyst::TraitFor::Model::DBIC::Schema::AutoDeploy?
One of my personal goals for this year is to reduce some of the boilerplate code I have to write in order to get working with my applications. This effort to ease Perl Programming is inspired by great projects like Moose, which really allows me to get focused on the job at hand and not worry about framework and setup. One area that I'd like to see improved relates to the effort involved in setting up development and testing databases. My recent cpan release of Test::DBIx::Class is my swing at the latter; this blog is my outline and RFC for the former.
One of my goals for Test::DBIx::Class is to make it trivial to deploy, seed and cleanup testing databases. Currently we have the ability to automatically deploy DBIC based Schemas to SQLite, MySQL and Postgresql. This ability is built on top of DBICs deployment code as well as some other bits from CPAN. This makes the job of deploying, setting up and testing as easy as:
use Test::More;
use Test::DBIx::Class qw(People);
fixtures_ok 'basic',
'basic fixtures installed';
is_fields 'first_name', People, [qw/john vanessa vincent/],
'Got expected first names';
done_testing();
Which will automatically create a database (using SQLite by default, but you can override and deploy to MySql or Postgresql as well), install the 'basic' fixtures (from a configuration file, but you can 'inline' create statements easily as well, see the docs for more) and test the schema resultset called 'People' to see if that set contains the asked for first_names. This reduces a lot of boilerplate code for deploying the database, checking it, etc (and we've all written 10 half baked version of that, right :) ) and offers some helpers for actually testing the data. For example, the 'is_fields' will ignore sorting order by default, comparing just the actual set. The above test assertion would probably have to be rewritten as:
is_deeply [sort map { $_->first_name } $schema->resultset('People')->all],
[qw/john vanessa vincent/],
'Got expected first names';
If you wanted to duplicate most of it's functionality, but there might be edge cases missed, like date fields and columns that inflate, weird differences in null handling, etc. And that's for a very simple case where you are testing a single field. If your test is more involved the syntax becomes increasingly verbose.
One of the things I really wanted for this module is to make it very easy to create a test database not only for SQLite, but for at least Postgresql and Mysql. Both these goals are helped along by the cpan distributions Test::mysqld and Test::postgresql which autodetect the presence of Mysql or Postgresql (they need to be installed, but don't need to be running) and creates a temporary database installation in /tmp or the directory of your choice. These databases live until the final tests complete and are then automatically cleaned up, unless you want to keep them, using a configuration setting or an %ENV variable.
This grants you a lot of benefits when you are trying to speed development along. You don't need to login and create the testing database, worry about setting up all the correct permissions, etc. If you are working in a multi developer company (like most of us) you probably have to create testing databases for everyone, unless you are willing to deal with the chaos of everyone's tests banging into each other. This method basically just creates a database for the programmer on the fly, reducing that effort considerable. Finally, since each temporary database is given it's own area to live, this allows you to run prove in parallel using the --jobs (or -j) option. Usually you can't run database tests in parallel, since each test will essentially be contending for a single testing database instance, using classic methods. Running parallel test jobs can greatly speed completing your tests, even with the overhead needed to instantiate a test database.
Now that I feel pretty comfortable with the code functionality, I am looking to factor out some core bits and use it to build a similar tool for development databases. This would be a trait loadable by newer versions of Catalyst::Model::DBIC::Schema that would autodetect your target database, instantiate and deploy one as needed. This way it would make it much easier to just get going with development, particularly if you are following a methodology were each developer has a personal development database.
I'd like your feedback on the idea, and request comments on the features you'd like to see most. For example, I am currently working on supporting Replication in Test::DBIx::Class, since if you are using replication in production then your test cases really should run against a replicated setup. This may not be needed in development. On the other hand we might want our development databases to support the versioning system built into DBIC. Since in test I generally build the database from scratch each time, I don't care about versioning so much.
My imagination tells me we'd want something called “Catalyst::TraitFor::Model::DBIC::Schema::AutoDeploy” which would add a configuration option to your Catalyst::Model::DBIC::Schema called 'target_driver' or similar so you could specify SQLite, MySQL or Postgresql, as well as whether you wanted to preserve the database between running the application server and so forth. I also figure by default to built the database files in $home/share/data, or similar, (again looking for suggestions on the best default).
This would be intended primarily for development, but would probably also be useful for people deploying catalyst applications to shared hosting setups. This would allow one to basically deploy everything for an application to a single directory root.
Thoughts?
Comments
I am also thinking about publishing some example DBIC schemas with data that could be used for testing CPAN modules. Of course this could never be a comprehensive set - but perhaps something to make a quick start of the testing would be useful?
The docs for Test::DBIx::Class definitely need a lot of work, don't feel bad to criticize. This first go mostly pulls together docs from several other sources with the minimum added to tie it together. I plan a Cookbook and tutorial style POD for later, after I get the remaining TODO stuff solved. Right now I just have replication support, auto seeding of default fixtures and possibly support for bulk loading of fixtures using native database methods, like the mysql loader, since if you have a huge fixture set, loading them via DBIC might be time prohibitive. Anyway, that's last on the list for now.
I'd love to work together on adding more and better stuff to Test::DBIx::Class::Example::Schema. My goal there is both instructional and practical. I'd like something I can leverage when building DBIC components and I'd like a repository of things that newcomers can look at to get the idea. Again, patches welcomed. In fact, you can just clone the github and hack away: "git://github.com/jjn1056/Test-DBIx-Class.git"
If you do use it, please let me know of any pain points. I figure this dist can replace a lot of repeated effort and let us pull together our precious development time.
As far as the RFC I will try hard to make docs a higher time priority from the start. Since the Catalyst trait is going to be mostly from scratch it should be easier to get te docs first. With Test::DBIx::Class I spent most of the time gluing together ugly bits and writing test cases. I'll refactor the underlying code once the feature set and test cases get stable.
Thanks for the feedback!