SQL Antipatterns, Volume 1, B4: nudge readers to DISTINCT (page 172)

I’m enjoying the book and I’m happy to see that the discussion of GROUP BY in chapter 15 also includes a callout box titled “GROUP BY and DISTINCT”.

I’d like to suggest that the callout box nudge readers in the direction of DISTINCT over GROUP BY for the situation it describes.

The callout box compares two different queries:

SELECT DISTINCT date_reported, reported_by
FROM Bugs;
SELECT date_reported, reported_by
FROM Bugs
GROUP BY date_reported, reported_by;

The callout box concludes with:

Both queries produce the same result and should be optimized and executed similarly, so the difference in this example is only a matter of preference.

Comparing the two queries is indeed valuable, but I suggest adding a bit more language, like this:

Both queries produce the same result and should be optimized and executed similarly, so the difference in this example is only a matter of preference. With all else being equal, using DISTINCT has the advantage of communicating the intention more clearly.

My experience might be skewed, but I immediately know what I’m looking at when I see SELECT DISTINCT.

On the other hand, encountering a GROUP BY without any aggregation would make me wonder if I just found a bug.

This small way of increasing code clarity seems like another opportunity to avoid pitfalls.

Thanks Ted! I like the suggestion. I’ll add some phrasing.

1 Like

Thanks Bill!

I appreciate your consideration, and all your efforts in writing the second edition.

Thanks again, Bill. The change in B5 looks great to me.