Schema Project status report

The Drupal Schema project has seen substantial progress since my previous post. The Schema module is nearly ready for experimental use by module developers wanting to eliminate CREATE TABLE statements and for the integration of additional database back-ends. We are hoping at least some of its capabilities can make it into Drupal 6 core but, even if not, it will soon provide significant benefits to the community. The code is currently available in the contributions repository.

The primary accomplishments so far are:

  • Creation of the database table definition array structure.
  • Support for converting table definitions into CREATE TABLE statements.
  • The ability to inspect the live database in order to:
    • Automatically generate table definition array structures for existing tables, making conversion to Schema simple, and
    • Compare the live database against the union of all declared tables, identifying mismatches that probably indicate bugs.
  • Support for all existing core and CCK tables.
  • Equal support for MySQL and PostgresSQL and a clear path to supporting additional databases.

This work so far represents the combined efforts of myself and Frando@drupal.org. Thanks, Frando!

Table definition array structure

Here is an example of the table definition array structure; it is an excerpt from the (automatically generated) structure for Drupal 5’s system tables:

function system_schema() {
  $schema['access'] = array(
    'cols' => array(
      'aid' => array('type' => 'serial', 'not null' => 1),
      'mask' => array('type' => 'varchar', 'length' => '255', 'not null' => 1, 'default' => ''),
      'type' => array('type' => 'varchar', 'length' => '255', 'not null' => 1, 'default' => ''),
      'status' => array('type' => 'int', 'size' => 'small', 'not null' => 1, 'default' => 0)),
    'primary' => array('aid'),
    );

  // more tables go here

  return $schema;
}

This declares the table “access” to have four columns and a primary key. The table array can have the following top-level keys:

  • cols: Associative array mapping column name to column specification.
  • primary: Array of columns forming the table’s primary key.
  • unique: Associative array mapping unique key name to the columns forming the key.
  • index: Associative array mapping index name to the columns forming the index.

Note that you do not have to create this structure for any database table that already exists; read about Inspection, below.

Columns

Each column can have the following standard attributes (others can and will be added):

  • type: The basic data type of the column. Can be serial, int, float, varchar, text, blob, or datetime. serial indicates an auto-incrementing value.
  • size: The size hint for the column. Can be small, medium, big, or normal, with normal as the default. Not all types can accept all sizes (but probably they should). This is a hint in the same way that C integer types (char, short, int, long) are hints: you only know that a smaller one will not have a larger range than a larger one.
  • length: The maximum length of the column; only used for varchar.
  • not null: Whether the column can contain NULLs. Can be 0 or 1.
  • default: The default value for the column. Can be a string or integer which are treated distinctly; 0 and ‘0’ are different.
  • unsigned: Whether the column is an unsigned value; only used for type int.
  • disp-width: The display width of non-varchar columns for the mysql command-line client. Only used by MySQL.

Primary key

The primary key attribute specifies an array of columns forming the table’s primary key. ‘nuff said.

Unique and Index

These define unique and non-unique indexes. Each one is an associative array mapping the index name to an array of columns forming the index.

For MySQL, the ‘unique’ array generates UNIQUE KEY statements and the ‘index’ array generates (non-unique) KEY statements, both part of CREATE TABLE. For PostgreSQL, the ‘unique’ array generates UNIQUE statements as part of CREATE TABLE and the ‘index’ array generates separate CREATE INDEX statements.

Management of table creation and destruction

The Schema module provides API entry points schema_create_table() and schema_drop_table() to executing the appropriate CREATE TABLE or DROP TABLE query for a single table structure. It also provides schema_create_schema() and schema_drop_schema() for operating on an array of table structures as is returned from a module’s hook_schema() function.

We have not yet finalized how this will be integrated into a typical module’s installation process. The basic idea is that a module’s .install file will look something like this:

function mymodule_schema() {
  $schema['mytable'] = array(...);
  return $schema;
}

function mymodule_install() {
  schema_create_schema(mymodule_schema());
}

Notice the lack of CREATE TABLE.

For updates, Schema will eventually provide functions for adding, altering, and removing columns, keys, and indexes, but these are not available yet (the code already exists in various places in Drupal and just needs to be collected into the database engines).

We are not currently planning to implement schema changes in a data-driven way. When your module’s schema changes, you’ll update your mymodule_schema() function and add mymodule_update_n() function just like you do now.

In short, update functionality is not currently affected. Also, you do not have to use Schema; you can keep executing CREATE TABLE statements directly if you want (it’s not like we could prevent it anyway).

Database inspection

Schema uses database inspection (the same thing that phpMyAdmin uses) to discover the existing structure of your database.

When you visit Administer >> Site building >> Schema >> Inspect, Schema displays the table definition structure for every table currently in the database. If your module is enabled, just scroll down to find your tables and copy the definition into your hook_schema() function.

When you visit Administer >> Site building >> Schema >> Compare, Schema compares your current database to the union of all table definition structures returned by all modules. All matching, mismatching, missing, and extra tables are reported; for mismatching tables, the exact differences are explained. This is very useful for debugging and is an excellent early-warning tool to detect unexpected database changes. It will eventually be integrated into the “Status Report” section of the Administer pages.

For additional laughs, you can visit Administer >> Site building >> Schema >> SQL which shows you the CREATE TABLE statements Schema will generate for each table definition; it can show you the SQL for MySQL or PostgresQL no matter which database you are using. Finally, Administer >> Site building >> Schema >> Show displays the internal schema structure and is only useful for Schema developers.

Core and CCK support

Schema currently exports table definition structures for all core and CCK-generated tables.

The core table structures are generated by inspection, of course, and hopefully mean that CREATE TABLE statements will be gone from Drupal 6.

The CCK table structures are generated dynamically from CCK’s data structures and thus always reflect your current types and fields. This will not really do anyone much good until support for entity-loading is finished, but at least you can use Schema’ Compare page to verify that CCK sets up its tables correctly.

Back-end database support

Schema currently supports MySQL and PostgresSQL. Supporting an additional database only requires writing a function to convert a table structure into a CREATE TABLE statement. The inspection capabilities are helpful but not required. Look in Schema’s engines/ subdirectory for examples.

How to help

To provide feedback, suggestions, or assistance, reply here or even better join and participate in the Drupal Database Schema API group.