SQL Wordle Series - Part Two: Playing
SQL Wordle Series π︎
- Part One: Cheating
- Part Two: Playing
Foreword π︎
Continuing the series - you can’t cheat at a game you cannot play. This time I’ve focused on the scoring algorithm. I have a hunch scoring will be useful for finding the optimal strategy.
When I was done, I thought it would be fun to add some input and scoring validation to simulate the original game.
⚠️ Disclaimer: This solution builds on the scripts created in the first part of this series.
Strategy π︎
From the start, I want to use a set-based solution. I want to split the words table and have each letter in a separate row. That way I can JOIN
or INTERSECT
them and generally have more options.
DROP TABLE IF EXISTS dbo.WordLetter
CREATE TABLE dbo.WordLetter
(
WordId smallint NOT NULL
, Word char(5) NOT NULL /* so I don't have to join */
, Letter char(1) NOT NULL
, Position tinyint NOT NULL
, CONSTRAINT PK_WordLetter PRIMARY KEY CLUSTERED (Letter, Position, WordId)
, INDEX IX_WordLetter_Position NONCLUSTERED (Position)
)
/* Unpivot the table */
INSERT INTO dbo.WordLetter WITH (TABLOCKX)
(WordId, Word, Letter, Position)
SELECT
aw.Id
, aw.Word
, SUBSTRING(aw.Word, cj.Position, 1)
, cj.Position
FROM dbo.AllWords AS aw
CROSS JOIN (VALUES (1), (2), (3), (4), (5)) AS cj(Position)
Duplicate letters π︎
One problem I ran into in my earlier solutions was duplicate letters. I’ve tested and debugged on this example. The solution is REBUS and my guess isβ¦ GUESS.
Only one S is consumed and that’s the one in 5th position because it’s a perfect match. No other S remains in the solution and that’s why it’s blacked out.
Since I’m only interested in the best match. I can join the two words on their shared letters and calculate the pairings distance like so.
SELECT
g.Word
, g.Letter
, g.Position
, s.Word
, s.Letter
, s.Position
, ABS(CAST(g.Position AS smallint) - s.Position) AS distance
FROM dbo.WordLetter AS g /* guess */
JOIN dbo.WordLetter AS s /* solution */
ON g.Letter = s.Letter
WHERE
g.Word = 'guess'
AND s.Word = 'rebus'
ORDER BY g.Position
Everything with distance = 0 is a perfect match. Distance > 1 is a partial match and everything else (not in the result) is not a match. I can partition by letter, order by distance and grab only the first row.
The final scoring algorithm wrapped in a function
CREATE OR ALTER FUNCTION dbo.ScoreWordle
(
@guessId smallint
, @solutionId smallint
)
RETURNS TABLE
AS
RETURN
WITH score AS
(
SELECT
g.WordId AS guessWordId
, s.WordId AS SolutionWordId
, g.Letter
, g.Position AS guessPosition
, s.Position AS solutionPosition
, ca.dist
, ROW_NUMBER() OVER (PARTITION BY s.WordId, s.Letter, s.Position ORDER BY ca.dist) AS rn
FROM
dbo.WordLetter AS g
JOIN dbo.WordLetter AS s
ON g.Letter = s.Letter
CROSS APPLY
(VALUES (ABS(CAST(g.Position AS smallint) - s.Position))) AS ca(dist)
WHERE
g.WordId = @guessId
AND s.WordId = @solutionId
)
SELECT
CONCAT
(
MAX(CASE WHEN s.guessPosition = 1 THEN ca.score ELSE 'B' END)
, MAX(CASE WHEN s.guessPosition = 2 THEN ca.score ELSE 'B' END)
, MAX(CASE WHEN s.guessPosition = 3 THEN ca.score ELSE 'B' END)
, MAX(CASE WHEN s.guessPosition = 4 THEN ca.score ELSE 'B' END)
, MAX(CASE WHEN s.guessPosition = 5 THEN ca.score ELSE 'B' END)
) AS score
FROM dbo.WordLetter AS wl
LEFT JOIN score AS s
ON wl.WordId = s.guessWordId
AND s.rn = 1
AND s.guessPosition = wl.Position
CROSS APPLY
(
VALUES
(
CASE
WHEN s.dist = 0 THEN 'G' /* Green = match letter and position */
WHEN s.dist > 0 THEN 'O' /* Orange = correct letter, wrong position */
ELSE 'B' /* Black = does not contain the letter */
END
)
) ca (score)
WHERE wl.WordId = @guessId
GROUP BY wl.WordId
The complete game π︎
Again, the whole code is also in my GitHub repo. I’m creating several helper objects:
- GameHistory - table to hold the attempts and scoring
- PlayWordle - a stored procedure that validates the input and provides the scoring for our guesses
- ResetGame - another stored procedure to reset a single or all games
Input validation
- The guessed word must be on the list of available words
- Either a date or a Wordle number must be provided to match the relevant solution
- When neither is provided - today’s date is picked
- The correct solution hasn’t been guessed yet
- Max number of attempts hasn’t been reached yet
DROP TABLE IF EXISTS dbo.GameHistory
CREATE TABLE dbo.GameHistory
(
SolutionId smallint NOT NULL
, GuessAttempt tinyint NOT NULL
, GuessId smallint NOT NULL
, Score char(5) NOT NULL
, CONSTRAINT PK_GameHistory PRIMARY KEY CLUSTERED (SolutionId, GuessAttempt)
)
CREATE OR ALTER PROCEDURE dbo.PlayWordle
(
@guess char(5)
, @date date = NULL
, @wordleNum smallint = NULL /* has precedence over @date */
)
AS
BEGIN
SET XACT_ABORT ON
BEGIN TRY
/* Input handling */
DECLARE
@guessId smallint
, @solutionId smallint
, @lastAttemptNum tinyint
, @score char(5)
/* Guessed word is valid */
SET @guessId = (SELECT Id FROM dbo.AllWords WHERE Word = @guess)
IF (@guessId IS NULL)
RAISERROR('Guessed word "%s" is invalid', 11, 1, @guess)
/* Solution Id is provided either directly or through a date */
SET @solutionId = @wordleNum
IF @solutionId IS NULL
BEGIN
SET @date = ISNULL(@date, GETDATE())
IF (@date < '2021-06-19 00:00:00.000' OR @date > '2027-10-21 00:00:00.000')
RAISERROR('Date must be between 2021-06-19 and 2027-10-21', 11, 1)
SET @solutionId = (SELECT s.Id FROM dbo.Solution AS s WHERE s.SolveDate = @date)
END
/* check if game is already finished */
IF EXISTS
(
SELECT 1/0 FROM dbo.GameHistory
WHERE SolutionId = @solutionId
AND GameHistory.Score = 'GGGGG'
)
BEGIN
SELECT
'Please reset using "EXEC dbo.ResetGame @wordleNum = ' + CAST(@solutionId AS varchar(5)) + '".' AS [You already won]
, gh.GuessAttempt
, aw.Word
, gh.Score
FROM dbo.GameHistory AS gh
JOIN dbo.AllWords AS aw
ON gh.GuessId = aw.Id
WHERE gh.SolutionId = @solutionId
ORDER BY gh.GuessAttempt
RETURN
END
/* Max number of attempts has not been reached yet */
SET @lastAttemptNum = (SELECT MAX(GuessAttempt) FROM dbo.GameHistory WHERE SolutionId = @solutionId)
IF (@lastAttemptNum >= 6)
RAISERROR('Solution #%i already has 6 attempts. Please reset using "EXEC dbo.ResetGame @wordleNum = %i".',11, 1, @solutionId, @solutionId)
/* Calculate current guess score */
SELECT @score = sw.score FROM dbo.ScoreWordle(@guessId, @solutionId) AS sw
/* Persist the guess attempt */
INSERT INTO dbo.GameHistory
(SolutionId, GuessAttempt, GuessId, Score)
SELECT
@solutionId, ISNULL(@lastAttemptNum, 0) + 1, @guessId, @score
SELECT
gh.GuessAttempt
, aw.Word
, gh.Score
FROM dbo.GameHistory AS gh
JOIN dbo.AllWords AS aw
ON gh.GuessId = aw.Id
WHERE gh.SolutionId = @solutionId
ORDER BY gh.GuessAttempt
/* Check for solution or end game condition */
IF @score = 'GGGGG'
SELECT CASE @lastAttemptNum + 1
WHEN 1 THEN 'Genius'
WHEN 2 THEN 'Magnificent'
WHEN 3 THEN 'Impressive'
WHEN 4 THEN 'Splendid'
WHEN 5 THEN 'Great'
WHEN 6 THEN 'Phew'
ELSE
'You won'
END AS Victory
IF (@score <> 'GGGGG' AND @lastAttemptNum = 5)
SELECT s.Word AS Solution
FROM dbo.Solution AS s
WHERE s.Id = @solutionId
END TRY
BEGIN CATCH
;throw
END CATCH
END
CREATE OR ALTER PROCEDURE dbo.ResetGame
(
@wordleNum smallint
, @deleteAll bit = 0
)
AS
BEGIN
IF @deleteAll = 1
TRUNCATE TABLE dbo.GameHistory
ELSE
DELETE FROM dbo.GameHistory
WHERE SolutionId = @wordleNum
END
And you can play like this:
EXEC dbo.ResetGame @wordleNum = 196
EXEC dbo.PlayWordle
@guess = 'guess'
, @wordleNum = 196
Enjoy the game!
Looking forward π︎
I think there will be a longer break before the next article in the series. Partly due to preparations for sqlbits conference, and due to the problem’s complexity.
I’m not sure how to calculate the optimal guesses yet, and I’m open to suggestions. Thank you for reading.