PA3
Programming Assignment Three: PA3 (anagrams)

Milestone: Wednesday night, May 13 @ 11:59pm
Final: Tuesday night, May 19 @ 11:59pm

Overview:

The purpose of this programming assignment is to gain more experience with C programming, the Standard C Library routines, system calls, dynamic memory allocation on the Heap using malloc() and realloc(), sorting using qsort() and associated compare functions, hash table creation and insertion using hcreate() and hsearch(), reading data from files with fgets(), and more string manipulation using strlen(), strchr(), strncpy(), and strncmp().

In this program you will find various anagrams of a word read from stdin. You are writing an interactive program that prompts for input, takes input from stdin, directs the output to stdout, and prompts for another input. The program will not exit until the user types the control sequence to indicate no more input (EOF). On Unix the sequence is Control-D (^D); on DOS-based environments it is Control-Z (^Z).

Grading:

README: 10 points
Compiling: 5 points
    Using our Makefile; no warnings.
Style: 10 points
Correctness: 75 points
    -10 points for each unimplemented module or module written in the wrong language (C vs Assembly and vice versa). Includes both abnormal and normal output, both to the correct destination (stderr vs stdout). Make sure you have all files tracked in Git - we will be checking for multiple commits of each file and that they were meaningful commits.
Wrong Language: -10 points
    -10 for each module in the wrong language, C vs. Assembly or vice versa.
Extra Credit: 5 points

NOTE: If what you turn in does not compile with the given Makefile, you will receive 0 points for this assignment.

Setup and Git:

You are required to use Git with this and all future programming assignments. Look at the PA0 writeup for additional information on Git.
Setting up a local repository

Create your pa3 directory and initialize a local git repository:

    [cs30xyz@ieng9]:$ mkdir ~/pa3
    [cs30xyz@ieng9]:$ cd ~/pa3
    [cs30xyz@ieng9]:pa3$ git init

Starter Files

    [cs30xyz@ieng9]:pa3$ cp ~/../public/pa3.h ~/pa3
    [cs30xyz@ieng9]:pa3$ cp ~/../public/pa3_strings.h ~/pa3
    [cs30xyz@ieng9]:pa3$ cp ~/../public/pa3_globals.c ~/pa3
    [cs30xyz@ieng9]:pa3$ cp ~/../public/Makefile-PA3 ~/pa3/Makefile
    [cs30xyz@ieng9]:pa3$ cp ~/../public/test.h ~/pa3
    [cs30xyz@ieng9]:pa3$ cp ~/../public/testcharCompare.c ~/pa3
    [cs30xyz@ieng9]:pa3$ cp ~/../public/words ~/pa3
    [cs30xyz@ieng9]:pa3$ cp ~/../public/anagram_phrases ~/pa3

Example Output

A sample stripped executable for you to try, and compare your output against, is available at:

    ~/../public/pa3test

When there is a discrepancy between the sample output in this document and the pa3test output, follow the pa3test output.

Below are some example outputs of this program.

1. Invalid Output

1.1. Wrong number of arguments.

    [cs30xyz@ieng9]:pa3:527$ ./anagrams words1 words2
    Usage: ./anagrams dictionary_file
    dictionary_file - containing a list of words

1.2. Run out of memory.

    [cs30x2@ieng9]:pa3.sp15.anagrams-hash:553$ ulimit -d 8
    [cs30x2@ieng9]:pa3.sp15.anagrams-hash:555$ ./anagrams words
    Error creating hash table: Not enough space

1.3. Invalid dictionary file.

    [cs30xyz@ieng9]:pa3:544$ ./anagrams words1
    Error opening dictionary file: No such file or directory

2. Valid Output

2.1. Word with several anagrams.

    [cs30xyz@ieng9]:pa3:545$ ./anagrams words
    Enter a word to search for anagrams [^D to exit]: stop
    Anagram(s) are: opts, OSTP, Post, post, POTS, pots, SPOT, spot, TOPS, tops
    Enter a word to search for anagrams [^D to exit]:

2.2. Word with no anagrams.

    [cs30xyz@ieng9]:pa3:546$ ./anagrams words
    Enter a word to search for anagrams [^D to exit]: thisisnotaword
    No anagrams found.
    Enter a word to search for anagrams [^D to exit]:

2.3.
Word with only itself as an anagram.

    [cs30xyz@ieng9]:pa3:547$ ./anagrams words
    Enter a word to search for anagrams [^D to exit]: Zygote
    Anagram(s) are: zygote
    Enter a word to search for anagrams [^D to exit]: zygote
    No anagrams found.
    Enter a word to search for anagrams [^D to exit]:

2.4. Exiting successfully.

    [cs30xyz@ieng9]:pa3:548$ ./anagrams words
    Enter a word to search for anagrams [^D to exit]: ^D

Overview

The function prototypes for the various C and Assembly functions are as follows.

C routines:

    int appendAnagram(struct anagramHeader *head, struct anagram *anagram)
    struct anagramHeader* createAnagramHeader(const char *key)
    void createHashKey(char *key, const char *src, int (*compare)(const void *, const void *))
    int initTable(FILE *file)
    void printAnagrams(struct anagramHeader *head, char *word)
    int main(int argc, char* argv[])
    void searchForAnagrams()

Assembly routines:

    int charCompare(const void *lhs, const void *rhs)
    void printUsage(const char *programName)
    void stripNewLine(char *str)

Memory Overview

Seen below is a diagram to help demonstrate the layout of the program.

We will be using the C stdlib hash table functions: hcreate(), hsearch(), and hdestroy(). Each element of the table is of type ENTRY, a struct containing two char pointers, key and data. The key is used to find elements within the table, so each ENTRY must have a unique key. In our case the key points to a character array of lower-case sorted letters. All words that are anagrams of each other share this key. For example, “post” and “pots” are anagrams of one another, and both become “opst” when lower-cased and sorted.
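The key scheme described above can be sketched as follows. This is only an illustration under stated assumptions, not the required implementation: KEY_SIZE, compareChars(), makeKey(), and sameEntryDemo() are hypothetical names that do not come from pa3.h, and a plain string stands in for the data that the real program stores in each ENTRY. The sketch lower-cases and sorts a word to build its key, inserts one word with hsearch(..., ENTER), and then finds the same entry using an anagram of that word.

```c
#include <assert.h>
#include <ctype.h>
#include <search.h>
#include <stdlib.h>
#include <string.h>

#define KEY_SIZE 32  /* hypothetical buffer size, not the value in pa3.h */

/* Hypothetical compare function: orders two chars for qsort(). */
static int compareChars(const void *lhs, const void *rhs) {
  char a = *(const char *)lhs;
  char b = *(const char *)rhs;

  return (a > b) - (a < b);
}

/* Hypothetical helper: write the lower-case, sorted letters of src to key. */
static void makeKey(char *key, const char *src) {
  size_t len;
  size_t i;

  assert(key != NULL && src != NULL);

  len = strlen(src);
  if (len >= KEY_SIZE) {
    len = KEY_SIZE - 1;  /* truncate so the key always fits */
  }
  for (i = 0; i < len; i++) {
    key[i] = (char)tolower((unsigned char)src[i]);
  }
  key[len] = '\0';
  qsort(key, len, sizeof(char), compareChars);
}

/*
 * Insert "post" under its key, then look up the key built from "Pots".
 * Returns 1 if the lookup finds the entry stored for "post", else 0.
 */
static int sameEntryDemo(void) {
  static char storedKey[KEY_SIZE];  /* must outlive the table entry */
  static char storedWord[] = "post";
  char lookupKey[KEY_SIZE];
  ENTRY item;
  ENTRY *found;
  int result;

  if (hcreate(16) == 0) {
    return 0;
  }

  makeKey(storedKey, "post");  /* key is "opst" */
  item.key = storedKey;
  item.data = storedWord;      /* stand-in for the real per-key data */
  if (hsearch(item, ENTER) == NULL) {
    hdestroy();
    return 0;
  }

  makeKey(lookupKey, "Pots");  /* also "opst" */
  item.key = lookupKey;
  item.data = NULL;
  found = hsearch(item, FIND);

  result = (found != NULL && strcmp((char *)found->data, "post") == 0);
  hdestroy();
  return result;
}
```

Calling sameEntryDemo() from a small main() should return 1, since both words reduce to the same key. Note that hsearch() stores the key pointer rather than a copy of the string, which is why the stored key lives in a static buffer here.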
The data will point to a struct anagramHeader (note that data is defined as a char*, so we will need to use casting). The struct anagramHeader contains:

- an array of characters (for the key in the entry to point to),
- a pointer to a dynamic array of struct anagram (each struct anagram has a char array for the word it represents, all of which share the same lower-case sorted key), and
- a count of the number of elements currently in the array.

This struct anagramHeader is dynamically allocated using malloc(). The array of struct anagram that it holds a pointer to is also dynamically allocated, but using realloc() so that it can be resized as more words are appended to it.

C Modules

1. appendAnagram

    int appendAnagram(struct anagramHeader *head, struct anagram *anagram)

This function uses realloc() to allocate additional memory at the end of the struct anagram array pointed to by the anagramHeader that is passed in. It then uses memcpy() to copy the struct anagram that is passed in to the newly allocated memory at the end of the array.

Return Value
Returns -1 if the realloc() fails, otherwise returns the number of elements in the array after the new addition.

2. createAnagramHeader

    struct anagramHeader* createAnagramHeader(const char *key)

This function allocates memory for one struct anagramHeader using malloc() and initializes its values. The key is set by using strncpy() to copy the key value passed in, the anagrams pointer is set to NULL, and numElements is set to zero.

Return Value
Returns NULL if the malloc() fails, otherwise returns a pointer to the struct.

3. createHashKey

    void createHashKey(char *key, const char *src, int (*compare)(const void *, const void *))

This function copies each character from src to key, making each one lower case using tolower(). It then sorts the characters using qsort() and the passed-in comparison function compare.

Return Value
None.

4.
initTable

    int initTable(FILE *file)

This function reads one line of the file at a time using fgets() and inserts the word into the hash table using hsearch(). If an ENTRY with the same key is already in the table, the word must be appended to the anagrams array associated with that entry using appendAnagram(). If there is not already an ENTRY, then a struct anagramHeader must be allocated using createAnagramHeader() and a corresponding ENTRY inserted into the table before appending the new word.

Return Value
If createAnagramHeader(), appendAnagram() or hsearch() fail for any reason, return -1, otherwise return the number of words inserted into the table.

5. printAnagrams

    void printAnagrams(struct anagramHeader *head, char *word)

This function must print out each word in the struct anagram array pointed to within head. It must not print any words that match the word passed in (because a word isn’t an anagram of itself).

Return Value
None.

6. main

    int main(int argc, char* argv[])

This function drives the program by checking for the expected number of arguments, reading the file passed in on the command line, creating the hash table using hcreate(), filling the table by calling initTable(), interacting with the user using searchForAnagrams(), then destroying the table using hdestroy(). Make sure to check for errors with any of the mentioned function calls and print the correct string found in pa3_strings.h. For the calls to hcreate() and fopen(), make sure to use perror() for error message printing.

Return Value
If any function main() calls encounters an error, call exit(EXIT_FAILURE), otherwise return 0.

7. searchForAnagrams

    void searchForAnagrams()

This function reads in a word from stdin using fgets(), then looks for an ENTRY in the table that has a matching key. If a match is found (that isn’t only the word itself), printAnagrams() is called. If there is no match, print that no anagrams were found.
Keep re-prompting the user for a new word until they type in ^D. Note that when you find an ENTRY and try to access its data pointer (which must be cast to a pointer to struct anagramHeader), Lint will give you an error about “improper alignment”. Add the comment “/* LINTED */” immediately above the cast to suppress the Lint error.

Return Value
None.

Assembly Modules

1. charCompare

    int charCompare(const void *lhs, const void *rhs)

This function takes two pointers to characters (the prototype uses two void pointers, but it can be assumed that they are char pointers) and compares the characters they point to. This function must be a leaf subroutine.

Return Value
Return -1 if the first char is smaller, +1 if the first char is larger, and 0 if they are the same.

2. printUsage

    void printUsage(const char *programName)

This function must print the usage message to stderr.

Return Value
None.

3. stripNewLine

    void stripNewLine(char *str)

This function must use strchr() to look for a newline character in str. If one is found, it should be changed to a null character.

Return Value
None.

Unit Testing

Provided in the Makefile for PA3 are rules for compiling and running tests on individual functions that are part of this assignment. You are given the source for testcharCompare.c, used to test your charCompare() function. Use testcharCompare.c as a template to write the test modules for your other functions.

Unit tests you need to write:

    testappendAnagram.c
    testcreateAnagramHeader.c
    testcreateHashKey.c
    teststripNewLine.c

Think of how to test each of these functions -- boundary cases, special cases, general cases, extreme limits, error cases, etc. as appropriate for each function.

As part of the grading, we will run all the required unit tests using the targets in the Makefile and manually grade your required unit test programs.
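As a rough illustration of the kinds of cases worth covering, the sketch below checks the behavior specified above for charCompare() and stripNewLine(). It rests on several assumptions: charCompareRef() and stripNewLineRef() are hypothetical C stand-ins written only so the example is self-contained (the assignment requires the real routines in assembly), runChecks() is a made-up driver name, and a plain failure counter is used because the contents of the provided test.h are not reproduced here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical C stand-in for charCompare(), for illustration only. */
static int charCompareRef(const void *lhs, const void *rhs) {
  char a = *(const char *)lhs;
  char b = *(const char *)rhs;

  if (a < b) {
    return -1;
  }
  if (a > b) {
    return 1;
  }
  return 0;
}

/* Hypothetical C stand-in for stripNewLine(), for illustration only. */
static void stripNewLineRef(char *str) {
  char *newline;

  assert(str != NULL);

  newline = strchr(str, '\n');
  if (newline != NULL) {
    *newline = '\0';  /* replace the first newline with a null character */
  }
}

/* Runs general and boundary cases; returns the number of failed checks. */
static int runChecks(void) {
  int failures = 0;
  char smaller = 'a';
  char larger = 'b';
  char withNewline[] = "stop\n";
  char withoutNewline[] = "stop";
  char onlyNewline[] = "\n";
  char empty[] = "";

  /* charCompare: smaller, larger, and equal characters */
  failures += (charCompareRef(&smaller, &larger) != -1);
  failures += (charCompareRef(&larger, &smaller) != 1);
  failures += (charCompareRef(&smaller, &smaller) != 0);

  /* stripNewLine: general case */
  stripNewLineRef(withNewline);
  failures += (strcmp(withNewline, "stop") != 0);

  /* stripNewLine: no newline present, string must be unchanged */
  stripNewLineRef(withoutNewline);
  failures += (strcmp(withoutNewline, "stop") != 0);

  /* stripNewLine: boundary cases, a lone newline and the empty string */
  stripNewLineRef(onlyNewline);
  failures += (strlen(onlyNewline) != 0);
  stripNewLineRef(empty);
  failures += (strlen(empty) != 0);

  return failures;
}
```

In an actual test module the stand-ins would be replaced by calls to the assembly routines, and the checks would follow whatever conventions testcharCompare.c and test.h establish.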
README File

Along with your source code, you will be turning in a README file with every assignment (use all caps and no file extension; for example, the file is README and not README.txt). Use vi/vim to edit this file!

Your README file for this and all assignments should contain:

- Header with your name, cs30x login
- High level description of what your program does
- How to compile it (usually just typing "make")
- How to run it (give an example)
- An example of normal output and where that normal output goes (stdout or a file or ???)
- An example of abnormal/error output and where that error output goes (stderr usually)
- How you tested your program
- Anything else that you would want/need to communicate with someone who has not read the writeup

Extra Credit

There are 5 points total for extra credit on this assignment.

[2 Points] Early turnin, 48 hours before regular due date and time. (1 point if you get it in 24 hours early.)

[3 Points] Modify your program to accept phrases with whitespace and punctuation. Strip the non-alphanumeric characters before saving the lowercase sorted encoding of the line. Examples can be found in the anagram_phrases file. If you choose to do the extra credit, you will need to increase the MAX_WORD_LENGTH constant defined in pa3.h to support the phrases in anagram_phrases.

Milestone and Turnin Instructions

Milestone Check - due Wednesday night, May 13 @ 11:59 pm
[16 points of Correctness Section]

Before the final and complete turnin of your assignment, you are required to turn in several modules for the Milestone check.

Files required for Milestone:

    appendAnagram.c
    charCompare.s
    createAnagramHeader.c
    createHashKey.c
    stripNewLine.s
    pa3.h
    pa3_strings.h
    Makefile

Each module is worth 3 points for a total of 15 points. Each module must pass all of our unit tests in order to receive full credit.
The function charCompare() must be written as a leaf subroutine or it will not receive points on the milestone and may have additional points taken off on the final assignment.

A working Makefile with all the appropriate targets and any required header files must be turned in as well. All five Makefile test cases must compile successfully via the commands make test*** for each of the five modules required for the Milestone.

In order for your files to be graded for the Milestone Check, you must use the milestone-specific turnin script.

    cd ~/pa3
    cse30_pa3milestone_turnin

Complete Turnin - due Tuesday night, May 19 @ 11:59 pm

Once you have checked your output, compiled, executed your code, and finished your README file (see above), you are ready to turn it in. Before you turn in your assignment, you should do make clean in order to remove all the object files, lint files, core dumps, and executables.

How to Turn in an Assignment

First, you need to have all the relevant files in a subdirectory of your home directory. The subdirectory should be named pa#, where # is the number of the homework assignment. Besides your source/header files, you may also have one or more of the following files. Note the capitalization and case of each letter of each file.

Makefile: To compile your program with make -- usually provided, or you will be instructed to modify an existing Makefile.
README: Information regarding your program.

Again, we emphasize the importance of using the above names *exactly*; otherwise our Makefiles won't find your files.

When you are ready to submit your pa3, type:

    cd ~/pa3
    cse30turnin pa3

Additionally, you can type the following to verify that everything was submitted properly.

    cse30verify pa3

Failure to follow the procedures outlined here will result in your assignment not being collected properly and will result in a loss of points. Late assignments WILL NOT be accepted.
If, at a later point, you wish to make another submittal BEFORE the deadline:

    cd
    cse30turnin pa3

Or whatever the current pa# is; the new archive will replace/overwrite the old one. To verify the time on your submission file:

    cse30verify pa3

It will show you the time and date of your most recent submission. The governing time will be the one which appears on that file (the system time). The system time may be obtained by typing "date".

Your files must be located in a subdirectory of your home directory, named paX (where X is the assignment number, without capitalizations). If the files aren't located there, they cannot be properly collected. Remember to cd to your home directory first before running turnin.

If there is anything in these procedures which needs clarifying, please feel free to ask any tutor, the instructor, or post on the Piazza Discussion Board.

Style Requirements

You will be graded for the style of programming on all the assignments. A few suggestions/requirements for style are given below. Read carefully, and if any of them need clarification do not hesitate to ask.

- Use reasonable comments to make your code clear and readable.
- Use file headers and function header blocks to describe the purpose of your programs and functions. Sample file/function headers are provided with PA0.
- Explicitly comment all the various registers that you use in your assembly code.
- In the assembly routines, you will have to give high level comments for the synthetic instructions, specifying what the instruction does.
- You should test your program to take care of invalid inputs like non-integers, strings, no inputs, etc. This is very important. Points will be taken off if your code doesn't handle exceptional cases of inputs.
- Use reasonable variable names.
- Error output goes to stderr. Normal output goes to stdout.
- Use #defines and assembly constants to make your code as general as possible.
- Use a local header file to hold common #defines, function prototypes, type definitions, etc., but not variable definitions.
- Judicious use of blank spaces around logical chunks of code makes your code much easier to read and debug.
- Keep all lines less than 80 characters; split long lines if necessary.
- Use 2-4 spaces for each level of indenting in your C source code (do not use tab). Be consistent. Make sure all levels of indenting line up with the other lines at that level of indenting.
- Do use tabs in your Assembly source code.
- Always recompile and execute your program right before turning it in just in case you commented out some code by mistake.
- Before running turnin please do a make clean in your project directory.
- Do #include only the header files that you need and nothing more.
- Always macro guard your header files (#ifndef … #endif).
- Never have hardcoded magic numbers. This means we shouldn't see magic constants sitting in your code. If you need a constant, use a #define instead.
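Two of the requirements above, macro guards and named constants, can be illustrated with a short sketch. Everything in it is hypothetical: example.h, EXAMPLE_H, EXAMPLE_BUF_SIZE, and fitsInBuffer() are made-up names for illustration and are not part of the PA3 starter files. The header contents and the source code are shown in one block only to keep the example self-contained; in a real project they would live in separate files.

```c
#include <assert.h>
#include <string.h>

/* ---- contents of a hypothetical header file, example.h ---- */
#ifndef EXAMPLE_H
#define EXAMPLE_H

#define EXAMPLE_BUF_SIZE 16  /* hypothetical size, defined in one place */

int fitsInBuffer(const char *str);

#endif /* EXAMPLE_H */
/* ---- end of example.h ---- */

/*
 * Returns 1 if str and its terminating null character fit in a buffer of
 * EXAMPLE_BUF_SIZE bytes, 0 otherwise. The limit is referenced by name
 * rather than written as a bare number.
 */
int fitsInBuffer(const char *str) {
  assert(str != NULL);

  return strlen(str) < EXAMPLE_BUF_SIZE;
}
```

Because the size is named once in the header, changing it later requires editing a single line, and the guard keeps the header safe to include from several source files.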