schema

Schema.module Considered Mandatory

In the last couple days, two unrelated patches were committed to Drupal 6 with a common flaw: they caused inconsistencies between the database schema as defined by every module's hook_schema() function and the actual layout of the database. They are not the first two; I've personally found and fixed at least a dozen such mistakes, usually via a follow-up patch after the bug has already been committed to CVS. What makes these bugs so irritating is that they are so absolutely trivial to detect and usually very easy to fix before the initial patch goes in. Fixing them afterward takes a lot more overhead (and, more importantly from my narrow point of view, my overhead). So I want to put an end to it.

Therefore, with no authority whatsoever, I hereby declare the following law:

Every developer that submits or reviews any patch that modifies a Drupal 6 (or later) .install file must install schema.module and, using its Compare report, assert upon pain of public humiliation that the patch introduces no database schema inconsistencies into a fresh install or an upgrade from the previous core version.

Drupal 6's self-documenting database schema

After many previous attempts and false starts, Drupal 6's entire core database schema is now documented and cross referenced. You can view the documentation using the contributed Schema module; see the attached screenshot for a demonstration (click to enlarge). Thanks to webchick, add1sun, pwolanin, JirkaRybka, Moshe, chx, yched, and others for contributing their knowledge of the dusty corners of the schema.

If you are a module author, when you port a module to Drupal 6 and create your hook_schema() function, you should also document the schema. Doing so is easy: just add 'description' fields to each table and field in the schema structure. For example:

<?php
function mymodule_schema() {
 
$schema['mytable'] = array(
   
'description' => t('Stores whiz-bang cool data for nodes provided by the MyModule module.'),
   
'fields' => array(
     
'nid' => array(
       
'type' => 'int',
       
'not null' => TRUE,
       
'description' => t('The {node}.nid that this row is for.'),
      ),
 
// etc.
 
return $schema;
}
?>

Placing a table reference within a description inside of curly-braces causes the reference to be displayed as a hyperlink to that table's documentation on the Schema Describe page.

Alive, Kicking, and Speaking in Barcelona

Some of you may have noticed (who am I kidding? there is no "you" reading this site regularly...) that I've been gone awhile; my previous post here was in May. Well, I just got married and so have been, ah, rather busy. :-) I am finally back from the land of endless planning and greatly looking forward to DrupalCon 2007 in Barcelona. Besides all the Drupal excitement, this will also be my first trip with my wife (I keep emphasizing that word assuming it will eventually seem real) since our wedding; the official honeymoon isn't until later this year.

More relevantly, I'll be giving a variety of presentations at DrupalCon 2007:

Schema API in Drupal 6

The Schema Project has achieved a major milestone: The Schema API 1 has been committed for inclusion in Drupal 6 core. The Schema API allows modules to declare their database tables in a structured array (similar to the Form API) and provides API functions for creating, dropping, and changing tables, columns, keys, and indexes. Thanks to Frando for co-authoring the patch with me and the numerous people who helped review and test it.

Documentation for the Schema API is still being written; the work in progress is available at http://drupal.org/node/146843.

Some of the benefits of the Schema API are:

  • No more separate CREATE TABLE or ALTER TABLE statements for each database. Module developers only need to create a schema structure and/or use the Schema API functions, and Drupal takes care of the rest. This makes writing install and update functions much simpler.
  • Since the API makes it equally easy for module developers to support all databases, Drupal's support for PostgresQL will improve substantially. Also, as new database engines such as Oracle, SQLite, or Microsoft SQL Server are supported, modules using the Schema API will automatically work with them.
  • Several advanced capabilities, such as incremental database updates, a simple and consistent CRUD API, form scaffolding, simpler CCK and Views, schema and data validation, become much easier to implement in future enhancements.

Schema Project status report

The Drupal Schema project has seen substantial progress since my previous post. The Schema module is nearly ready for experimental use by module developers wanting to eliminate CREATE TABLE statements and for the integration of additional database back-ends. We are hoping at least some of its capabilities can make it into Drupal 6 core but, even if not, it will soon provide significant benefits to the community. The code is currently available in the contributions repository.

The primary accomplishments so far are:

  • Creation of the database table definition array structure.
  • Support for converting table definitions into CREATE TABLE statements.
  • The ability to inspect the live database in order to:
    • Automatically generate table definition array structures for existing tables, making conversion to Schema simple, and
    • Compare the live database against the union of all declared tables, identifying mismatches that probably indicate bugs.
  • Support for all existing core and CCK tables.
  • Equal support for MySQL and PostgresSQL and a clear path to supporting additional databases.

The Schema Project: Database abstraction, reflection, and migration

Almost a year after first discussing the idea on the Drupal developer’s mailing list I am finally ready to announce the Drupal Schema project. The goal of the project is to provide a database schema abstraction layer for Drupal. The benefits of the project will include more reliable compatibility with a greater range of database