| |||||
|
Understand the ambiguities of compound JOINs in SQL By: Arthur Fuller
This article begins with a test. Look at the SQL below and determine, without using your computer, what the result set will look like.
SELECT dbo.Customers.CustomerID, dbo.Orders.OrderID FROM dbo.Customers LEFT OUTER JOIN dbo.Orders ON dbo.Customers.CustomerID = dbo.Orders.CustomerID AND dbo.Customers.CustomerID = N'WOLZA' How many rows would you expect this code to return? 7? 91? If you guessed either of these, you're incorrect. The query returns 97 rows. Now try this one: SELECT dbo.Customers.CustomerID, dbo.Orders.OrderID FROM dbo.Customers RIGHT OUTER JOIN dbo.Orders ON dbo.Customers.CustomerID = dbo.Orders.CustomerID AND dbo.Customers.CustomerID = N'WOLZA' This query returns 830 rows, one for each row in the Orders table. All but seven contain a NULL CustomerID. To put it another way, the AND clause has no effect. Looking at the SQL above, you might surmise that the AND clause is testing the wrong table's CustomerID column. If you run the following query, you'll get exactly the same result—830 rows, seven of which contain the CustomerID 'WOLZA'. Thus, it makes no difference which table you specify in the AND clause. SELECT dbo.Customers.CustomerID, dbo.Orders.OrderID FROM dbo.Customers RIGHT OUTER JOIN dbo.Orders ON dbo.Customers.CustomerID = dbo.Orders.CustomerID AND dbo.Orderss.CustomerID = N'WOLZA' To get rid of the 823 rows containing a NULL, you could add a WHERE clause, like this: SELECT dbo.Customers.CompanyName, dbo.Orders.OrderID, dbo.Orders.CustomerID FROM dbo.Customers RIGHT OUTER JOIN dbo.Orders ON dbo.Customers.CustomerID = dbo.Orders.CustomerID AND dbo.Orders.CustomerID = N'WOLZA' WHERE (dbo.Customers.CustomerID IS NOT NULL) We finally get exactly seven rows. In fact, it turns out that there's another way to achieve the same thing: by adding a WHERE clause to test the CustomerID column: SELECT dbo.Customers.CompanyName, dbo.Orders.OrderID, dbo.Orders.CustomerID FROM dbo.Customers RIGHT OUTER JOIN dbo.Orders ON dbo.Customers.CustomerID = dbo.Orders.CustomerID AND dbo.Orders.CustomerID = N'WOLZA' WHERE (dbo.Orders.CustomerID = N'WOLZA') And yet, if all we were after in the first place was the seven orders corresponding to 'WOLZA', we could have obtained them with a simple INNER JOIN. SELECT dbo.Customers.CompanyName, dbo.Customers.CustomerID, dbo.Orders.OrderID FROM dbo.Customers INNER JOIN dbo.Orders ON dbo.Customers.CustomerID = dbo.Orders.CustomerID WHERE (dbo.Customers.CustomerID = N'WOLZA') The point isn't that you should use INNER JOINs to get the list of seven WOLZA orders—or that OUTER JOINs are pointless (we've all seen situations in which they deliver exactly what we require). The point is that compound JOINs become nonsensical when one of the clauses references only one of the tables. Think about that as a general principle.
Other Related Articles ... to read more DBA articles, visit http://dba.fyicenter.com/article/ |
||||