This algorithm comes up a lot in the system I’m building. Get the most recent whatever for each user, along with all their information. Also known as the greatest-n-per-group problem. Here’s the most efficient SQL algorithm I’ve come across:
SELECT C.ClientId, C.LastName, L.LocationName AS CurrentLocation FROM Client C LEFT JOIN (SELECT t1.* FROM (SELECT * FROM Location WHERE UpdateDate < @Date) AS t1 LEFT JOIN (SELECT * FROM Location WHERE UpdateDate < @Date) AS t2 ON t1.ClientId = t2.ClientId AND t1.UpdateDate < t2.UpdateDate WHERE t2.ClientId IS NULL) AS L ON C.ClientId = L.ClientId
This query finds the latest location for each client as of the supplied
@Date. The main part of the algorithm reads as:
Return the row t1 for which no other row t2 exists with the same ClientId and a greater UpdateDate
The outer left join allows us to return users with no entries in the location table.