Monday, January 28, 2008

The Forgotten Cross-Join

Sometimes I forget the simplicity and power of cross joins. When first learning sql, I thought cross joins were the most useless of all joins. Later on, after accidentally doing a partial cross join on two very large tables, I quickly decided they just might be the satan of all joins ;)

The truth is they are neither. Cross joins are certainly used far less frequently than other joins. But a well thought out cross join can be quite useful. On the flip side, an accidental or poorly designed cross join can be dangerous! It could probably bring down a database in short order. Not to mention bringing your boss and/or DBA, to your doorstep, wanting to know what the hell you did to crash the database! Now, I do not say that to discourage people from using cross joins. Merely to emphasize that it is important to understand how they work and how to use them properly.

If you are not familiar with cross joins, they produce what is called a cartesian product. The rows from each table are joined, but unlike inner joins, the result is the total number of rows in the first table multiplied by the number of rows in the second table. So a cross join on two tables that contain a mere one thousand (1,000) records each, will produce a resultset of 1 million records! Why would you ever want to do this? Usually you would not. Yet sometimes a cross join is exactly what is needed.

Let us say you have three tables: Users, Groups and UserGroups. The table UserGroups stores the ID of each user, and the groups to which they are assigned. Note, I am using the plural here because user and group are likely reserved words.

You also have a simple form that allows you to select multiple users and assign them all to a selected set of groups.

Now when you process the form, you could loop through the list of userID's and perform a separate query to assign each one to the selected groupID's. But using a cross join provides a simpler and more elegant way to the insert all of the records in a single query.

<cfquery name="addAssignments" datasource="#dsn#">
INSERT INTO UserGroups ( UserID, GroupID )
SELECT u.UserID, g.GroupID
FROM Users AS u CROSS JOIN Groups AS g
<cfqueryparam value="#form.userIDList#" cfsqltype="cf_sql_integer" list="true">
AND g.GroupID IN
<cfqueryparam value="#form.groupIDList#" cfsqltype="cf_sql_integer" list="true">

As you can see from the results, the cross join inserts one record for each of the selected groups, for each user.

Now, there are also other uses for cross joins. In a pinch, you could even use them to simulate a crosstab report, with a small dataset. Though most databases provide a more efficient means for producing that type of output. But all in all, cross joins are not useless or evil after all ;)


  © Blogger templates The Professional Template by 2008

Header image adapted from atomicjeep