Translating Entity-Relationship modeling to Drupal

Originally published at OpenCraft

Using the CCK + Views platform, we can replicate the traditional ER (Entity-Relationship) framework into a rapid application development environment that minimizes procedural and SQL programming by replacing most of it with "Visual ER programming": a way to specify data model and behaviour in terms of SQL-like abstractions directly inside the runtime Drupal environment - and reaping the benefits of incremental, always functional output.

ER Modeling

ER modeling deals with defining entities and relations among them. In Drupal, CCK content types correspond to entities, while relations can be specified through a number of ways, including CCK node references and Relativity module's parent-child relationship system. Neither are very comprehensive compared to the expressive power of ER relationships but it's a good start. Being able to map ER models to Drupal is a useful exercise that allows one to think in abstract ER terms and to implement the model 1-to-1 in software. Here are a few clues to perform that mapping:

  • In most cases, an entity type is mapped to a content type. Each instance of any content type is actually a Node, which can be thought of as the base class for all entities.
  • The entity's attributes are mapped to CCK fields. Most basic SQL types: integer, float, boolean, string, date, are all readily available as CCK fields, and the open CCK architecture allows to add many more besides programmatically. In that sense, the Drupal modeling system extends SQL with arbitrary field types.
  • In ER modeling, some entities need to be classified, such as movie genres or news topics. Typically, these are implemented in traditional database programming by creating lookup tables (for classification enums) and linking their PK to the entity being classified. Drupal provides this ability, in greatly enhanced form, through the Taxonomy module that allows the creation of arbitrary classifications that can be linear, tree-like, or even graph-like. Classification terms can be attached to any node. This is a huge time-saver and nullifies the problem of distinguishing between entity tables and lookup tables.
  • Simple relations can be defined using the CCK nodereference field type. This is a field that can hold a reference to another node in the system, in effect creating a relation between the referer and the referee. The name of the field can specify the relation (i.e., FK) name. The CCK nodereference can hold one or multiple references, enabling the cardinalities 1-to-1 or 1-to-many. Still, the CCK nodereference type lacks the ability to explicitly set and enforce specific cardinalities, and the relations just "feel" one-way, asymmetric, because they are defined as an attribute of one of the types, instead of link between two types. Relativity is another module that enables to explicitly set relations among types, and cardinalities are given more support. Its problem is that those relations are always implied to be of the parent-child type.

In conclusion, it is apparent that we use CCK to achieve both ER modeling and SQL CREATE TABLE statements in one go. That's a clear benefit of the system.

Now that we've defined the static ER model and mapped it to Drupal, we still need to be able to do two things: populating the database and querying on it. Let's take the second task first.

SQL Views

Querying entities in traditional SQL thinking involves constructing SQL views that correspond to desired reports. This is what the Drupal Views module accomplishes: it enables to construct named queries through a Web interface or API that return rows of nodes (or any other Drupal entity, like users, comments, taxonomy terms, etc.) according to specific criteria, just like a normal SQL SELECT statement would. Let's see how the Views mechanism corresponds to a SELECT statement:

  • A Drupal view (henceforth a view) has a unique name and a unique URL to be accessed from the UI.
  • The "SELECT tablename.fieldname AS label" idiom is achieved by specifying the fields of the main entity (or related entities) that should be displayed, and how each should be labelled.
  • The "FROM table1 JOIN table2 ..." idiom is achieved in two ways:
    • Implicitly because the Views system knows the table to which each selected field belongs. It then creates an internal JOIN between the master table and the other entity's table, assuming that the master PK is present as a foreign key in the second entity.
    • In addition, Views supports explicit programmer-supplied relationships that define how additional tables relate to the master table.
  • The "WHERE" clause is controlled through Filters where each filter specifies a condition, acting on a single field, that will filter the result set. Again, those fields used for filters affect the JOIN clause of the SQL statement.
  • Arguments are sometimes passed to SQL views. These are also provided by Drupal Views, through the Arguments section that expects arguments to be passed on the URL of the view. These arguments are added as conditions to the WHERE clause.
  • Finally, the displayed view can be shown on the browser with active filters that allow the user to further filter the query at runtime, through the Exposed Filters section. It is worth noting the the Views architecture is flexible enough to allow overriding most of these features with custom code. However, the default UI is usually enough to define most views without recourse to coding.

Populating and editing the database

Populating a traditional database involves executing SQL INSERT and UPDATE statements. Drupal provides more than one way to capture data:

  • Through the usual "Create content" and node editing mechanisms, whereby CCK displays data entry forms for the content types that are present in the system. In this sense, CRUD operations are automatically functional as soon as the CCK content type is created.
  • Through the Node Import module that allows CSV data to be imported into new instances of specified content types. It is worth noting that Drupal lacks a flexible ETL (extract, transform, load) module that generalizes this idea to arbitrary data sources and intermediate data transformations.
  • Through the Views Bulk Operations module that allows programmable actions to be executed on rows of nodes, as returned by Views queries.

Comments

Relativity and Views

I am new to Drupal and I have had the hardest time trying to create relationships using Drupal. Hopefully it is because I do not fully understand Drupal. I've been using CakePHP for the last 2 years and love how easy it is to create associations. When you query a "model", the associated (child) data is returned. I'm working on a simple project but can't seem to display these simple associations in Drupal. I have used the Relativity module to create the associations but can't get views to display the data the correct way. Honestly, I can't really get it to display it at all.

So I have 3 content types: meetings, time slots, and key points. Time slots and key points are both children of meetings and key points are children of key points and time slots. You may have noticed that key point is both a child and a parent. Like I said before, I believe I have set up the associations using the Relativity module.

Now the fun part, displaying the data. If the url is "/meetings/1" I would like to display meeting with id of 1 and all of its child data and the child data of its child data, each nested under its parents. Note that I did create a page view for meetings with nid as an argument. I've tried to use the Node Relativity as a relationship with no luck. I was able to get 1 level of child data but not deeper levels like child data of child data. Below is an example of what I am trying to display.

November Meeting (Meeting) Opening Comments (Time Slot) - Introductions (Key Point) Teach Me How to Drupal (Time Slot) - Installation (Key Point) - Modules (Key Point) - CCK (Key Point) - Views (Key Point) Closing Comments (Time Slot)

Any clues on how to accomplish this would be greatly appreciated. I want to love Drupal and I do for certain applications but this issue is almost a deal breaker for me and I have to believe that this is possible and easier than I am making it out to be.

As I mentioned in the

As I mentioned in the article, Drupal has weak modules (last time I looked) for node relationships.

You can start by making sure that the relationships are right by looking at the database values in the Relativity table(s).

To display them as you need, I would try the following:

  • Search the Relativity issue queue for similar requests. If not found, create a new support or feature request.
  • Check Relativity's integration with Views to see whether it supports hierarchies, like the Taxonomy module.
  • Hand-code a theme function to display the Relativity hierarchy as you need, and embed this function in the node display via a CCK Computed Field (for example).

Hope this helps!

Update for D7?

Excellent article. I was having this very discussion with my sister over the weekend and was glad to see that someone has written something up. This information applies to D6 though. I'm trying to translate to D7, which changes the game slightly as CCK is now in core and "everything as node" becomes "everything as entity". Perhaps you could consider updating this article for D7?

Thanks for your comment! This

Thanks for your comment! This discussion does need to be updated.

However, I'm not familiar with the D7 entity architecture yet. All my current projects still use D6 and I have barely started porting my modules to D7. When this happens, I'll make sure to update the article. Of related interest is a Views architecture presentation I gave whose slides are online.

Thanks for the writeup!

Great thoughts - I agree with previous commenter that if Drupal could get a better grasp of ER Modeling, referential integrity, etc it would really take it to another realm of possibility altogether.

I really appreciated your thoughts on using Taxonomy for classification instead of defining another table of classifications and setting relationships to them.

Thanks!

Object reference model lacking

I think one of the main thing lacking in Drupal is the possibility to mimick a relational database on the level of contenttypes en node relations. The node reference not being reciprocal and having no attributes of its own which can be used to query only certan kinds of relationships, makes the migration of custom build data repositories into drupal problematic.

Thanks for the useful

Thanks for the useful article. I'm looking for a solution which ensures 1:M cardinality with Node relationships module. But I don't think it's exist as a module for now. I tried Relativity but as you mentioned in the article despite its some useful features using node reference provides more functions.

10 months after the original

10 months after the original post the Relation module came out for D7 but I´m sure you all know that by know.