I managed to solve CS50’s “readability” problem some weeks ago, but I couldn’t find the time to explain how I worked it out until today. Anyway, a happy outcome is worth waiting for 🙂
This time, we were challenged to develop an algorithm that computes the grade level needed to comprehend some text (that must be entered by users) with the Coleman-Liau index. Higher reading levels correlate with longer sentences and words.
The index reads as follows:
index = 0.0588 * average number of letters per 100 words - 0.296 * average number of sentences per 100 words - 15.8
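As a minimal sketch, the formula could be wrapped in a small C function. The name `coleman_liau` and its signature are illustrative assumptions, not part of the CS50 specification, and the sketch assumes the word count is greater than zero.

```c
// Hypothetical helper: returns the raw (unrounded) Coleman-Liau index
// computed from total counts. Assumes words > 0.
double coleman_liau(int letters, int words, int sentences)
{
    // Average number of letters per 100 words
    double letters_per_100 = (double) letters * 100.0 / words;

    // Average number of sentences per 100 words
    double sentences_per_100 = (double) sentences * 100.0 / words;

    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8;
}
```

For example, `coleman_liau(65, 14, 4)` evaluates to roughly 3.04, which rounds to grade 3.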
You can find an example of a similar algorithm (maybe this one uses another index) in the Hemingway app: http://www.hemingwayapp.com
As always, I first wrote some pseudocode to define the steps needed:
- Prompt users for the text.
- Count the number of letters. Letters can be any uppercase or lowercase alphabetic characters, but shouldn’t include any punctuation, digits, or other symbols.
- Print out the letter number count.
- Count the number of words in a sentence. Assume that a sentence will not start or end with a space, and will not have multiple spaces in a row.
- Print out the word number count.
- Count the number of sentences. A sentence is any sequence of characters that ends with “.”, “!” or “?”.
- Print out the sentence number count.
- Modify the program to output the grade level according to the Coleman-Liau index. Round the result to the nearest whole number. If the resulting index number is 16 or higher, print out "Grade 16+". If the index number is less than 1, print out "Before Grade 1".
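The counting steps in this list can also be sketched as a single pass over the text. This is only an illustration under the assumptions stated in the list (no leading or trailing spaces, no repeated spaces). The struct `text_counts` and the function `count_text` are hypothetical names, and the sketch takes a plain `const char *` instead of CS50's `string`.

```c
#include <ctype.h>

// Hypothetical container for the three counts.
typedef struct
{
    int letters;
    int words;
    int sentences;
} text_counts;

// Counts letters, words and sentences in one pass.
// Assumes no leading/trailing spaces and no repeated spaces,
// so the number of words is the number of spaces plus one.
text_counts count_text(const char *text)
{
    text_counts c = {0, 0, 0};
    if (text[0] == '\0')
    {
        return c; // empty text: nothing to count
    }

    c.words = 1; // the first word is not preceded by a space
    for (int i = 0; text[i] != '\0'; i++)
    {
        // ctype functions expect a value representable as unsigned char
        unsigned char ch = (unsigned char) text[i];
        if (isalpha(ch))
        {
            c.letters++;
        }
        else if (ch == ' ')
        {
            c.words++;
        }
        else if (ch == '.' || ch == '!' || ch == '?')
        {
            c.sentences++;
        }
    }
    return c;
}
```

Calling `count_text("Hello, world! How are you?")` yields 19 letters, 5 words and 2 sentences.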
Then, I started to write the code by adding the required libraries:
#include <stdio.h>
#include <cs50.h>
#include <string.h>
#include <ctype.h>
#include <math.h>
Next I prompted the user for the text and counted the number of letters, words and sentences as follows. I know I always add a lot of comments to the code, but it helps me organise the information and quickly identify the steps. Maybe it is not the “cleanest” approach, but it works for me.
I have used three new functions to solve this:
- strlen (string.h library) -> returns the length of a string, i.e. the number of characters of any kind (letters, spaces, punctuation), so it gives the upper bound for the loops rather than the letter count itself.
- isspace (ctype.h library) -> checks whether a character is a whitespace character or not.
- isalpha (ctype.h library) -> checks whether a character is alphabetic (a-z, both upper and lowercase) or not.
int main(void)
{
    // Prompt user for a string of text
    string userinput = get_string("Text: ");

    // Length of the text in characters (strlen counts every character, not only letters)
    int textlength = strlen(userinput);

    // Count letters: alphabetic characters only
    int lettercount = 0;
    for (int i = 0; i < textlength; i++)
    {
        // Cast to unsigned char, as the ctype.h functions expect
        if (isalpha((unsigned char) userinput[i]))
        {
            lettercount++;
        }
    }
    //printf("Number of letters: %i\n", lettercount);

    // Count words. Attention! wordcount must start with 1, not with 0,
    // because the first word is not preceded by a space.
    int wordcount = 1;
    for (int i = 0; i < textlength; i++)
    {
        // Every space starts a new word (no leading, trailing or repeated spaces assumed).
        // Checking only for the space also counts words that begin with a digit or a quote.
        if (isspace((unsigned char) userinput[i]))
        {
            wordcount++;
        }
    }
    //printf("Number of words: %i\n", wordcount);

    // Count sentences: each '.', '!' or '?' ends a sentence (':' and ';' do not)
    int sentencecount = 0;
    for (int i = 0; i < textlength; i++)
    {
        if (userinput[i] == '.' || userinput[i] == '?' || userinput[i] == '!')
        {
            sentencecount++;
        }
    }
    //printf("Number of sentences: %i\n", sentencecount);
Next I calculated the average numbers of letters and sentences per 100 words. I cast the counts to float for that, because dividing two integers in C would discard the decimal part:
    // Calculate average number of letters per 100 words. Float needed.
    float averagenumberoflettersper100words = ((float) lettercount * 100) / (float) wordcount;
    //printf("Average number of letters per 100 words: %i\n", (int) averagenumberoflettersper100words);

    // Calculate average number of sentences per 100 words
    float averagenumberofsentencesper100words = ((float) sentencecount * 100) / (float) wordcount;
    //printf("Average number of sentences per 100 words: %i\n", (int) averagenumberofsentencesper100words);
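As a side note on why the cast matters, here is a small sketch contrasting integer and floating-point division. Both function names are hypothetical and exist only for this comparison.

```c
// Integer arithmetic: the fractional part of the quotient is discarded.
int per_100_words_truncated(int count, int words)
{
    return count * 100 / words;
}

// Casting one operand to float makes the whole expression floating-point,
// so the fractional part is kept.
float per_100_words(int count, int words)
{
    return (float) count * 100 / words;
}
```

With 65 letters and 14 words, the first function returns 464 while the second returns about 464.29. The difference looks small, but it feeds into the index and can shift the final grade near a rounding boundary.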
And finally the grade:
    // Calculate grade
    float grade = 0.0588 * averagenumberoflettersper100words - 0.296 * averagenumberofsentencesper100words - 15.8;

    // Round first, then compare, so that e.g. 15.6 prints "Grade 16+" and 0.6 prints "Grade 1"
    int finalgrade = (int) round(grade);

    if (finalgrade >= 16)
    {
        printf("Grade 16+\n");
    }
    else if (finalgrade < 1)
    {
        printf("Before Grade 1\n");
    }
    else
    {
        printf("Grade %i\n", finalgrade);
    }
}
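The boundary cases are the easiest place to slip, since the pseudocode asks to round the index and only then compare it with 1 and 16. A minimal sketch for checking them in isolation follows. The names `round_to_grade` and `print_grade` are hypothetical, and the rounding is written out by hand (halves away from zero, like `round` from math.h for ordinary values) so the sketch does not need the math library.

```c
#include <stdio.h>

// Hypothetical helper: rounds a raw index to the nearest whole number,
// with halves rounded away from zero.
int round_to_grade(double index)
{
    return (int) (index >= 0 ? index + 0.5 : index - 0.5);
}

// Hypothetical helper: prints the label for an already rounded grade.
void print_grade(int grade)
{
    if (grade >= 16)
    {
        printf("Grade 16+\n");
    }
    else if (grade < 1)
    {
        printf("Before Grade 1\n");
    }
    else
    {
        printf("Grade %i\n", grade);
    }
}
```

With these helpers, an index of 15.6 rounds to 16 and is labelled "Grade 16+", while 0.6 rounds to 1 and is labelled "Grade 1" rather than "Before Grade 1".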