Project 2: Words Game

Introduction

In this project, you will develop a game similar to Wordle (Wikipedia) that we will just call Words.

In this game, the player will be able to select from a list of different word collections gathered from books or other sources. Choosing a collection will limit the game words to only words that appear in that collection. However, the game will be simplified from standard Wordle, because words with duplicate letters will not be included.

The player will be able to select a word length, and then they will play a Wordle-style game using a word of that length chosen at random from the selected word collection. The player will see the responses to their guesses in color (as in Wordle). They will get a limited number of guesses, each of which must be an actual word. They have the option to see a hint, or to quit at any time and see the word.

Word collection files

The code you write must read from one or more files that contain collections of words (maybe entire book texts or just word lists) for use in the game. You will process each file by:

reading the file and checking its format
splitting the text into words and removing punctuation
removing duplicate words
removing words that have duplicate letters
removing words that contain non-letters
removing words that are proper nouns or otherwise not recognized by the Natural Language Toolkit (nltk) for Python
converting the remaining words to uppercase
creating a dictionary of valid words, where the keys are integers representing word lengths and the values are lists of words with that length
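
The steps above can be sketched roughly as follows. This is only one possible reading of the list, not the required implementation: the function names follow the words_utils.py section of this document, the helper has_duplicate_letters is hypothetical, the sorting of the cleaned list is an assumption, and the nltk check for unrecognized words and proper nouns is left out entirely.

```python
import string


def collect_unique_words(text):
    """Split text on whitespace, strip surrounding punctuation, drop duplicates."""
    words = set()
    for token in text.split():
        word = token.strip(string.punctuation)
        if word:
            words.add(word)
    return words


def has_duplicate_letters(word):
    """Hypothetical helper: True if any letter repeats (ignoring case)."""
    return len(set(word.lower())) != len(word)


def clean_word_set(word_set):
    """Keep letter-only words with no repeated letters, uppercased.

    Assumption: the nltk validity check is omitted here, and the result
    is sorted only to make the output predictable.
    """
    kept = {
        word.upper()
        for word in word_set
        if word.isascii() and word.isalpha() and not has_duplicate_letters(word)
    }
    return sorted(kept)


def categorize_words(word_list):
    """Group words into a dictionary keyed by word length."""
    by_length = {}
    for word in word_list:
        by_length.setdefault(len(word), []).append(word)
    return by_length


sample = "The cat, the CAT! hello world's dog-house 42"
print(categorize_words(clean_word_set(collect_unique_words(sample))))
```

In this sample, "hello" is dropped for its repeated letter, while "world's", "dog-house" and "42" are dropped because they contain non-letters.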

Required file format

Each word collection file must have the following format:

WORDS collection title
line of text
line of text
...

The text WORDS must appear at the beginning of the first line, followed by a space. The remainder of the text on the first line should be interpreted as the title of the collection. The remainder of the file (line 2 onwards) should be interpreted as the text from which game words will be extracted.
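
As a minimal sketch of the format check described above, the hypothetical helper parse_collection below works on text that has already been read from a file. The name, the choice to strip whitespace from the title, and the handling of an empty title are all assumptions rather than part of the assignment.

```python
def parse_collection(text):
    """Return (title, body) if text follows the WORDS format, else None.

    Hypothetical helper: the first line must start with 'WORDS ' and the
    rest of that line is treated as the collection title.
    """
    prefix = "WORDS "
    first_line, _, body = text.partition("\n")
    if not first_line.startswith(prefix):
        return None
    title = first_line[len(prefix):].strip()
    return title, body


print(parse_collection("WORDS Sample Collection\nline of text\nline of text"))
print(parse_collection("Not a collection file\nline of text"))
```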

Utility functions

You will also write several utility functions for the game, including:

checking which letters of the player's guess are in the word, or in the right place in the word
creating a color-coded string for displaying the correctness of the guess

Provided code and data

Start with the following source files:

words_game_utils.py – UNFINISHED
words_file_utils.py – UNFINISHED
words_main.py – UNFINISHED
words_utils.py – UNFINISHED
words_game.py – Plays the game: fully implemented
brotherskaramazov.txt - a large sample input file (the entire text of Dostoyevsky's famous novel)
sample.txt - a small text file with just a few words

Install the following packages:

In order to print in color, you should use the colorama module. You can add it to Thonny using the Tools menu – Manage Packages. Search for colorama and install it.
In order to verify words, you should use the nltk module. You can add it to Thonny using the Tools menu – Manage Packages. Search for nltk and install it.

words_main.py

This module should consist only of import statements and a main block. Some initial code and comments are provided that explain the exact structure of this module. In the main block, you must read one or more files named as command-line arguments, use the os module to test that each file exists, and then call several utility functions to process the text in each file. The result of processing the files must be a list of tuples. Each tuple will contain the title of a collection and the dictionary of valid words in that collection. This list of tuples can then be used as input to the game.

If there is an incorrect number of command-line arguments, print a usage statement of the following format: Usage: python word_main.py file1 [file2 ...]
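
A rough sketch of the argument handling described above might look like the following. The function name existing_files and the message printed for a missing file are hypothetical; the document does not say how a missing file should be reported, and the provided starter code may structure this differently.

```python
import os
import sys

USAGE = "Usage: python word_main.py file1 [file2 ...]"


def existing_files(argv):
    """Return the file arguments that exist, or None if none were given.

    argv is expected to look like sys.argv: program name first,
    followed by one or more file names.
    """
    if len(argv) < 2:
        print(USAGE)
        return None
    found = []
    for name in argv[1:]:
        if os.path.isfile(name):
            found.append(name)
        else:
            # Assumption: missing files are reported and skipped.
            print(f"Skipping missing file: {name}")
    return found


if __name__ == "__main__":
    print(existing_files(sys.argv))
```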

Note

You can use the words_main module (even if unfinished) with the words_game module to experiment with the game by passing a hard-coded list of tuples to the play_game function.
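
For experimenting, a hard-coded list of tuples could have the shape below. The titles and words are made up for illustration, and summarize is a hypothetical helper that only shows how the structure can be walked; the call to play_game itself is not shown because its exact signature is defined in the provided words_game.py.

```python
# Hypothetical hard-coded collections in the (title, words_by_length) shape.
word_collections = [
    ("Sample Words", {3: ["CAT", "DOG"], 4: ["GAME", "WORD"]}),
    ("Tiny List", {5: ["PLANT"]}),
]


def summarize(collections):
    """Return one (title, sorted word lengths) pair per collection."""
    return [
        (title, sorted(words_by_length))
        for title, words_by_length in collections
    ]


for title, lengths in summarize(word_collections):
    print(f"{title}: word lengths {lengths}")
```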

words_utils.py

This module contains several functions to be written:

check_letters(word) – Returns True if a word contains only uppercase letters
collect_unique_words(text) – Returns a set of words
clean_word_set(word_set) – Creates a list of valid words
categorize_words(word_list) – Puts the valid words into a dictionary organized by length

words_file_utils.py

This module contains a single function to be written:

process_word_file(filename) – Returns a tuple of the collection title and its text, OR None if the file is not correctly formatted

words_game_utils.py

This module contains several functions that power the Words game. It defines three constants, BULL, COW and WRONG, so named because games like this one are based on an older game known as Bulls and Cows.

BULL means a letter in the player's guess that is in the correct position in the secret word (shown as green in Wordle and Words).
COW means a letter that is in the secret word but in the wrong position in the player's guess (shown as yellow in Wordle and Words).
WRONG means a letter in the player's guess that is not in the secret word at all, which will be shown in red in the Words game.

`check_guess(secret, guess)`

This function checks a guess against the secret word and returns the bulls and cows information in a list. The first parameter, secret, is a string containing the secret word. The second parameter, guess, is a string containing the guess entered by the player. The function should return a list that is the same length as the secret. If the letter at a position in the guess matches the letter at the same position in the secret, the value at that position in the returned list should be BULL (2). If the letter at a position in the guess appears in the secret, but not at that position, the value should be COW (1). The value should be WRONG (0) at each position where the guessed letter is not in the secret at all. A guess whose length differs from the secret's should return None.

Example

check_guess("BADE", "DEAL") returns [1, 1, 1, 0]
check_guess("DEAR", "DEAL") returns [2, 2, 2, 0]
check_guess("BADE", "BAD") returns None

Note

Your code must use the defined constants BULL, COW, and WRONG, not the literal integer values 2, 1 and 0.
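
One way the rules above could be expressed is sketched below. The constants are redefined here only so the example runs on its own; in the project they come from the starter file. The simple membership test relies on the secret having no repeated letters, which this game guarantees.

```python
# Redefined here so the sketch is self-contained.
BULL = 2
COW = 1
WRONG = 0


def check_guess(secret, guess):
    """Return a list of BULL/COW/WRONG values, or None on a length mismatch."""
    if len(guess) != len(secret):
        return None
    result = []
    for position, letter in enumerate(guess):
        if letter == secret[position]:
            result.append(BULL)
        elif letter in secret:
            result.append(COW)
        else:
            result.append(WRONG)
    return result


print(check_guess("BADE", "DEAL"))
print(check_guess("DEAR", "DEAL"))
print(check_guess("BADE", "BAD"))
```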

`color_string(result, guess)`

This function takes two parameters. The first parameter is a list returned by check_guess, consisting of only 2's, 1's and 0's (BULL, COW, or WRONG). The second parameter is a string, which is the user's guess. The function returns a string of the user's guess colored with green for BULL, yellow for COW, and red for WRONG. Since some users cannot perceive the differences between certain colors, you must add another visual indicator for each letter's status:

Put square brackets around green letters: [A]
Put parentheses around yellow letters: (B)
Put nothing around red letters: E

The only color setting from the colorama module needed for this function is the foreground color. To color a string "ABC" so that A is green, B is red, and C is yellow, you create the string:

colorstring = Fore.GREEN + "A" + Fore.RED + "B" + Fore.YELLOW + "C"

A color indicator in a string applies to all the letters that follow it, until the next color indicator. Use the imports and settings for colorama that are already written in the starting files. The autoreset setting in words_main ensures that colors are reset to normal after each printed line.
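
A minimal sketch of this coloring scheme is shown below, under a few assumptions: the brackets and parentheses are colored along with their letter, no reset code is appended (relying on the autoreset setting), and the constants are redefined locally. The fallback Fore class is a stand-in so the sketch still runs where colorama is not installed; it uses the standard ANSI foreground codes.

```python
try:
    from colorama import Fore
except ImportError:
    class Fore:
        """Stand-in with standard ANSI foreground codes (assumption)."""
        GREEN = "\033[32m"
        YELLOW = "\033[33m"
        RED = "\033[31m"

# Redefined here so the sketch is self-contained.
BULL, COW, WRONG = 2, 1, 0


def color_string(result, guess):
    """Build the colored, bracketed display string for a guess."""
    pieces = []
    for status, letter in zip(result, guess):
        if status == BULL:
            pieces.append(Fore.GREEN + "[" + letter + "]")
        elif status == COW:
            pieces.append(Fore.YELLOW + "(" + letter + ")")
        else:
            pieces.append(Fore.RED + letter)
    return "".join(pieces)


print(color_string([BULL, COW, WRONG], "ABE"))
```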

`collection_menu(word_dicts)`

This function takes a single parameter, which is a list of tuples of the form: [(str, dict), (str, dict), ...] Each tuple contains the title of a word collection followed by its dictionary of valid words, organized by length. Using this list, the function should return a single string of the format:

COLLECTIONS
0    collection_name_0
1    collection_name_1
...
n    collection_name_n

The word "COLLECTIONS" appears alone on the first line. On each following line, the collections are numbered starting at 0, with each collection title separated from its number by exactly 4 spaces.
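
The menu format can be sketched as follows. Whether the returned string should end with a newline is not stated, so this sketch assumes it does not.

```python
def collection_menu(word_dicts):
    """Return the menu text: a header line, then one numbered title per line."""
    lines = ["COLLECTIONS"]
    for index, (title, _words_by_length) in enumerate(word_dicts):
        # Exactly 4 spaces between the number and the title.
        lines.append(f"{index}    {title}")
    return "\n".join(lines)


# Hypothetical collections used only for illustration.
print(collection_menu([("Sample Words", {3: ["CAT"]}), ("Tiny List", {5: ["PLANT"]})]))
```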

`is_valid_word(word)` and `get_hint(word)`

These 2 functions are required to be completed in Part C of this project. In order to check the validity of a word and to get a hint (a definition or synonym for a word), you can use the wordnet module (WordNet Interface) of the Natural Language Toolkit nltk.

A valid word for the purposes of the is_valid_word function is one that can be successfully found by looking it up in wordnet. For the get_hint function, you should use the definition of the first item (position 0) in a word's list of synsets.

In order to use wordnet, you must call nltk.download('wordnet'). This call is in the provided words_main.py file, but if you test without that file, then you must make the call in your own code.
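
The sketch below shows one way these two functions might use wordnet, assuming nltk is installed and nltk.download('wordnet') has already been called. The helper lookup_synsets and the extra lookup parameter are hypothetical additions that let the logic be tried without the corpus, and returning an empty string when no definition is found is also an assumption.

```python
def lookup_synsets(word):
    """Hypothetical helper: return the WordNet synsets for a word.

    The import is done here so the functions below can be exercised
    with a substitute lookup when nltk is unavailable.
    """
    from nltk.corpus import wordnet
    return wordnet.synsets(word.lower())


def is_valid_word(word, lookup=lookup_synsets):
    """Return True if the lookup finds at least one synset for the word."""
    return len(lookup(word)) > 0


def get_hint(word, lookup=lookup_synsets):
    """Return the definition of the first synset, or '' if there is none."""
    synsets = lookup(word)
    if not synsets:
        return ""
    return synsets[0].definition()
```

With the corpus available, calls such as is_valid_word("PLANT") and get_hint("PLANT") would use the default lookup.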

test_game_utils.py and test_file_utils.py

You will need to create these two files. Write at least 5 different unit tests for each of the 3 functions you write in words_game_utils.py, and at least 5 for the single function in words_file_utils.py.

Part A – Readiness Quiz (10 pts)

Before the deadline for Part A you should read this document carefully, then look at the starter code provided above. Once you have a clear understanding of the assignment's expectations, complete the readiness quiz in Canvas. The grading for this quiz will be all or nothing: your score on the quiz will be 0 if you miss any questions. You have unlimited attempts.

Warning

If you do not successfully complete the readiness quiz on time, you cannot receive any credit for the entire project (100 points total).

Part B – words_game_utils.py, words_file_utils.py, and tests (30 pts)

1) Complete the words_game_utils.py module (10 pts) and write a test_game_utils.py module (5 pts) and submit them to Gradescope by the deadline for your section. Two functions (get_hint and is_valid_word) in words_game_utils.py are written to return default values and are not required to be completed until Part C, and one function (get_random_word) is already completed and requires no changes.

Be sure to test your words_game_utils.py module carefully for correctness AND style before submitting to Gradescope.

2) Complete the words_file_utils.py module (10 pts) and write a test_file_utils.py module (5 pts) and submit them to Gradescope by the deadline for your section.

Be sure to test your words_file_utils.py module carefully for correctness AND style before submitting to Gradescope.

You will be limited to 10 submissions for Part B code.

Part C – words_main.py, words_utils.py, and words_game_utils.py (60 pts)

In Part C you will write:

the function process_args in words_main.py to process the command-line arguments.
all the functions in words_utils.py.
the 2 functions (get_hint and is_valid_word) in words_game_utils.py that were not completed in Part B.

Five of the 60 points in Part C will be allocated to the style requirements.

Upload only your words_main.py, words_utils.py, and words_game_utils.py files to Gradescope. (Do not include words_file_utils.py in your submission.)

Before uploading your submission to Gradescope, be sure to complete the following steps:

Test your solution carefully.
Review the course style guide to ensure that your code meets the style requirements.
Run flake8 and eliminate all warnings.
Review and update docstrings and comments as needed.

You will be limited to 10 submissions for Part C code.

PA2 Attribution

Provide a short statement describing any assistance that you received on this assignment, either from another student, an online source, an AI-enabled tool, or any other sources.

Acknowledgments

The game Wordle was created by Josh Wardle. The image included at the top is from an NPR article about Wordle.