I was recently tasked with listing all the rows that had the second-highest value for a column in a table. For example, imagine a SQL table that stores player's scores for a game. It's fairly simple to list all players tied for first place:

SELECT name
  FROM game_scores
 WHERE score = (SELECT MAX(score) FROM game_scores)

But how would you list all players tied for second place? Turns out, there's a few ways to do this!

(Note: all the below solutions assume an index on the "score" column in order to be efficient and avoid full table scans.)

Solution 1: Where clause

This solution is simple, reusing the same subquery technique used above to find all players tied for first. It just needs a different subquery that retrieves the second-highest score instead. This can be done by finding the max score after explicitly excluding the actual highest score with a WHERE clause:

SELECT MAX(score)
  FROM game_scores
 WHERE score NOT IN (SELECT MAX(score) FROM game_scores)

The full query to list all players tied for second place then looks like this:

SELECT name
  FROM game_scores
 WHERE score = (
        SELECT MAX(score)
          FROM game_scores
         WHERE score NOT IN (SELECT MAX(score) FROM game_scores)
       )

The simplicity is convenient, but it's difficult to generalize this query to find all players with the third- or fourth- or fifth-highest scores.

Solution 2: Limit offset clause

Similar to solution 1, this solution relies on a subquery. This time, the subquery to retrieve the second-highest score is implemented with a LIMIT OFFSET clause:

  SELECT DISTINCT(score)
    FROM game_scores
ORDER BY 1 DESC
   LIMIT 1 OFFSET 1;

The DISTINCT keyword in the first line produces a list of unique scores found in the table, the ORDER BY clause sorts those unique scores from highest to lowest, and the LIMIT OFFSET clause returns 1 row from that sorted list after skipping 1 row (i.e. the row with the highest score is skipped).

The full query to list all players tied for second place then looks like this:

SELECT name
  FROM game_scores
 WHERE score = (
          SELECT DISTINCT(score)
            FROM game_scores
        ORDER BY 1 DESC
           LIMIT 1 OFFSET 1
       )

Though slightly more complex than solution 1, this solution allows us to more easily generalize which ranks we want to query. Players with the n-th highest scores can be found by replacing N with the appropriate value:

SELECT name
  FROM game_scores
 WHERE score = (
          SELECT DISTINCT(score)
            FROM game_scores
        ORDER BY 1 DESC
           LIMIT 1 OFFSET N-1
       )

Beware of excessively large offsets. An offset of 100,000 will require the database to find the highest 100,001 rows and throw away the first 100,000 of them. But who's looking for all the players tied for 100,001st place anyway?

Solution 3: Window function

Instead of reusing the subquery strategy from the first two solutions, this solution uses a JOIN. A common table expression (CTE) that calculates rankings is joined with the original table to simplify the WHERE clause filter:

  WITH rankings AS (---FILL ME IN---)
SELECT name
  FROM game_scores
  JOIN rankings ON (game_scores.score = rankings.score)
 WHERE rankings.rank = 2;

The rankings can be calculated with a window function:

SELECT DISTINCT(score) AS score,
       DENSE_RANK() OVER (ORDER BY score DESC) AS rank
  FROM game_scores

The full query looks like this:

  WITH rankings AS (
         SELECT DISTINCT(score) AS score,
                DENSE_RANK() OVER (ORDER BY score DESC) AS rank
           FROM game_scores
       )
SELECT name
  FROM game_scores
  JOIN rankings ON (game_scores.score = rankings.score)
 WHERE rank = 2;

Personally, I find this solution the most readable of the three. However, it does have some downsides. First, window functions are not available in MySQL until version 8.0. Second, some databases execute the JOIN clause before the WHERE clause, and since CTEs cannot have an index, this can result in a full table scan. I suspect this is a problem in SQL Server, but it didn't seem to be a problem when testing with MySQL or PostgreSQL. Of course, I only tested CTEs with a small number of rows (i.e. a large number of players but a small number of unique scores), so I'd have to do more research to understand the performance issues when more data is involved.

Overall, I thought this was an interesting exercise. I'm sure there are plenty of other ways to retrieve the same set of results. What solutions can you come up with?