CSC B09 Assignment 3, Winter 2015: Write a trivial shell in C

Due by the end of Friday March 20, 2015; no late assignments without written explanation.

This assignment involves implementing basic command execution such as is performed by any unix shell. The parsing of the command line is supplied; your task is to implement the fork/exec/wait, i/o redirection, and pipes, as follows.

The supplied parsing code

In /cmshome/ajr/b09/a3 there is skeleton source code for a feeble little unix shell which I call "tsh". The code there does some simple parsing of a command line, resulting in the following structure:

    struct cmdline {
        char *inputfile, *outputfile;  /* i/o redirection with '<' and '>' */
        char **argv;  /* ends with NULL; if argv[0] is NULL, it's a blank line */
        struct cmdline *pipedinto;
    };

"inputfile" and "outputfile" are the file names which the command is redirected from or to with '<' and/or '>', respectively. If a redirection has not been performed, they will be NULL.

"argv" is an array of the words in the command, where argv[0] is the program to be executed (after looking up its location using the PATH variable). However, unlike the arguments to main(), there is no argc value but rather, the array is terminated with a NULL pointer value. This makes it suitable for passing to execve() directly.

"pipedinto" is NULL for a simple command, or the command to the right of this one in the pipeline. In the struct object pointed to by pipedinto, inputfile and outputfile will always be NULL. For the purpose of this assignment you only need to handle pipelines of up to two components, e.g. "foo | bar" and not "foo | bar | baz".

An example of calling parse() is in the supplied skeleton tsh.c in main(). In fact you do not need to look inside parse.c for most of the assignment, and you do not need to change the supplied main().

Suggested sequence of implementation

1. First, compile and run the distributed tsh.c and parse.c and type some commands to it.
Type zero or more argv lists separated by vertical bars, possibly with an input or output redirection for the entire command.

2. Make execute() execute a simple command which uses an absolute path name, by using execve(). (Remove the existing dummy execute() contents.) Thus, so long as p->argv[0] is not a null pointer, you can use p->argv[0] as the first parameter to execve(), and p->argv itself as the second parameter. The third parameter to execve() will be the global variable "environ". You can declare it with "extern char **environ;". Note that this means that you can’t type "cat file" but must instead type "/bin/cat file". We’ll fix that in the next step.

3. If p->argv[0] does not contain a slash, construct a string consisting of a directory name from the searchlist array concatenated with p->argv[0] (you may impose a length limit of, say, 1000 chars, so long as you check that this is not exceeded no matter what wacky things the user either types in or supplies as a PATH variable). Go through the searchlist array in order, calling stat() on each one to determine whether it exists, and stopping when you find a file which is executable. Pass that file path name as the first parameter to execve() instead of argv[0].

Note that parsePATH(), previously called from main(), initializes the searchlist array (which doesn’t subsequently change). As distributed, parsePATH() puts a hard-coded path in searchlist, but you will change this later to parse the PATH environment variable.

So for example, if the elements of searchlist are "/bin" and "/usr/bin" (thus searchlistsize is 2), you will try to stat /bin/cat and then /usr/bin/cat, except that the stat of /bin/cat would succeed so you would stop there.

If the command is not found in any of the directories in the search list, print the usual error message: "%s: Command not found\n".

Note that the above routine with concatenating strings only applies if argv[0] does not contain a slash.
Test with strchr(p->argv[0], '/'). For example, the user can still type /bin/cat, and this doesn’t mean /bin/bin/cat, or /usr/bin/bin/cat; it just means /bin/cat as typed. Also, "./cat" means to run cat in the current directory, even though "/bin/./cat" would be a valid name for /bin/cat. (That is to say, "./cat" is not the same as "cat"!) To summarize this paragraph in other words, if a slash appears anywhere in the argv[0] string, it is a complete file pathname (absolute or relative), not to have the search directories prepended.

After a failed execve(), call perror(). The parameter to perror() should be the first parameter to execve() including the prepended directory name.

4. Implement i/o redirection. You have to open the appropriate files after the fork(), in the child only. Test your implementation with commands such as "ls >file" and "tr e f <file".

5. Implement pipelines of length two. That is, if p->pipedinto is non-null, do a pipe() call in the child process, then fork again, then rearrange file descriptors as appropriate in the two youngest processes, and exec. Make sure that simple commands still work! Now is also a good time to make sure that you just get another prompt if you just press return (which yields a "pipeline" of length zero), with no extra lingering processes. (Getting two prompts would be one sign of the lingering processes problem.)

Pipelines of length greater than two are trickier and you don’t have to do them for this assignment, but they will be implemented in my sample solution.

6. Implement parsePATH(). As shown in the distributed code, begin with getenv("PATH") to get the PATH variable from the environment. If the variable PATH is not set in the environment, getenv returns NULL; this is extremely unusual and simply exiting with an appropriate message (as already implemented) is an adequate reaction. The PATH variable contains directory names separated by colons.
You want to store each directory name in a separate element of the searchlist array, and leave the appropriate value in searchlistsize. If the number of entries exceeds MAXSEARCHLIST, you can just abort with an error. You will have to malloc() the appropriate strings, but you may want to use the "estrsavelen" function in parse.c, which is exported for this possible purpose. For the purposes of this assignment, you don’t have to worry about the possibility of empty strings in the PATH (e.g. /bin:/usr/bin: (note the trailing colon)).

Note: parsePATH() is worth only about ten percent of the value of the assignment.

To think about: freeparse()

Free()ing the data structure created by parse() is something else which needs to be written as part of a complete implementation of tsh, although it is not part of this assignment. For the purposes of this assignment, you can leave the dummy freeparse() in parse.c. How would you write freeparse()? You may find it instructive to produce a draft version, although that is not to be submitted. My sample solution will contain a correct freeparse() implementation.

Epilogue: Memory leak

This part is worth about five to ten percent of the value of the assignment.

The supplied parse.c contains a bug of the kind called a "memory leak": over time, the tsh program will use more and more memory; not everything would be freed properly even after the hypothetical freeparse() is called. The reason for this is that there is a place where a pointer variable (data area) is assigned to be the return value from malloc() but that variable might already contain the only copy of another return value from malloc(), so the previous malloc() pointer is lost and cannot be freed. Find this in parse.c and fix it.

Submit your revised parse.c under the name "parse-fixed.c". Be sure to diff your parse-fixed.c with the original. The change you make should be minimal.
If you make changes throughout the file you will get zero for this part of the assignment; full marks requires changing only the portion of the code which has this memory leak problem.

Note: Your tsh.c will be compiled with the original parse.c (and parse.h); your parse-fixed.c will be graded separately.

Other notes

You will want to begin by making a subdirectory to hold the .c files. You’ll want to copy in the starter files from /cmshome/ajr/b09/a3, i.e. "cp /cmshome/ajr/b09/a3/* .". You can type "make" to use the supplied Makefile to build your program, or you can simply type "gcc -Wall tsh.c parse.c".

Your C programs must be in standard C. They must compile on the UTSC linux machines with "gcc -Wall" with no errors or warning messages, and may not use linux-specific or GNU-specific features.

Your revised tsh.c file will be compiled with the original versions of all of the other files for automated testing. If you have edited the other files (e.g. to fix parse.c’s memory leak, or e.g. to put in debugging printfs somewhere), I strongly recommend copying over all other modified files anew from /cmshome/ajr/b09/a3 (perhaps in a new directory, then also copying in your tsh.c) and doing "make clean" and then "make" to produce a tsh for your final testing.

Once you are satisfied with your files, you can submit them for grading with the command

    submit -c cscb09w15 -a a3 tsh.c parse-fixed.c

and the other "submit" commands are also as before.

Please see the assignment Q&A web page at http://mathlab.utsc.utoronto.ca/courses/cscb09w15/a3/qna.html for other reminders, and answers to common questions.

Remember: This assignment is due at the end of Friday, March 20, by midnight. Late assignments are not ordinarily accepted and always require a written explanation. If you are not finished your assignment by the submission deadline, you should just submit what you have, for partial marks.
Despite the above, I’d like to be clear that if there is a legitimate reason for lateness, please do submit your assignment late and send me that written explanation.

And above all: please be careful not to commit an academic offence in your work on this (or any) assignment, even if you’re under pressure. Just submit what you can do yourself; do not look at other students’ assignments, and do not show your assignment (complete or partial) to other students. Even a zero out of 10% is far better than cheating and suffering an academic penalty. Students also receive academic offence penalties for giving their assignment to other students, since they are helping that other student to commit an academic offence. Your friend might promise in all sincerity not to hand in your work as their own, but if they can’t do the assignment themselves, a copy of your solution is not going to help them enough and when the deadline approaches, they might hand in some of your work. Don’t tempt them.
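
Illustration: PATH splitting and command lookup

As an illustration of steps 3 and 6 above, here is a minimal sketch of splitting a PATH-style string on colons and then searching the resulting directories with stat(). The names split_path() and find_command(), and their signatures, are assumptions made for this example only; they are not part of the supplied tsh.c or parse.c, and the sketch does not use the real searchlist, searchlistsize, MAXSEARCHLIST, or estrsavelen. Treating "executable" as "a regular file with at least one execute bit set" is likewise an assumption, not something the handout specifies.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not in the supplied code): split a PATH-style string
 * on colons into dirs[].  Returns the number of directories stored, or -1 if
 * there are more than maxdirs entries or malloc() fails (a real caller would
 * then abort, so earlier allocations are not cleaned up here).
 * Empty components are skipped, which is a simplification.
 */
int split_path(const char *path, char **dirs, int maxdirs)
{
    int n = 0;
    const char *start = path;

    for (;;) {
        const char *colon = strchr(start, ':');
        size_t len = colon ? (size_t)(colon - start) : strlen(start);

        if (len > 0) {
            if (n >= maxdirs)
                return -1;
            dirs[n] = malloc(len + 1);
            if (dirs[n] == NULL)
                return -1;
            memcpy(dirs[n], start, len);
            dirs[n][len] = '\0';
            n++;
        }
        if (colon == NULL)
            break;
        start = colon + 1;
    }
    return n;
}

/*
 * Hypothetical helper: work out the path to hand to execve().  Returns 0
 * with the path in buf, or -1 if no candidate was found or none fit in buf.
 */
int find_command(const char *name, char **dirs, int ndirs,
                 char *buf, size_t bufsize)
{
    struct stat st;
    int i;

    assert(name != NULL && buf != NULL);

    if (strchr(name, '/') != NULL) {
        /* Contains a slash: a complete pathname, used exactly as typed. */
        if (strlen(name) + 1 > bufsize)
            return -1;
        strcpy(buf, name);
        return 0;
    }
    for (i = 0; i < ndirs; i++) {
        /* dir + '/' + name + '\0' must fit; skip candidates that do not. */
        if (strlen(dirs[i]) + strlen(name) + 2 > bufsize)
            continue;
        sprintf(buf, "%s/%s", dirs[i], name);
        /* Assumption: "executable" means a regular file with an x bit. */
        if (stat(buf, &st) == 0 && S_ISREG(st.st_mode)
            && (st.st_mode & (S_IXUSR | S_IXGRP | S_IXOTH)))
            return 0;
    }
    return -1;
}
```

With directories "/bin" and "/usr/bin", find_command("cat", ...) checks /bin/cat first and stops there if it qualifies, whereas find_command("./cat", ...) copies "./cat" through unchanged. When it returns -1 for a name without a slash, the caller would print the "Command not found" message.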
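
Illustration: fork, redirect in the child, exec, wait

As an illustration of steps 2 and 4 above, the following sketch forks, opens any redirection files in the child only, calls execve() with the global "environ", and waits in the parent. The function name run_simple(), its signature, the exit codes 1 and 127, and the file mode 0666 are all assumptions made for this example; the supplied execute() may be shaped quite differently, and this sketch takes plain arguments rather than a struct cmdline. It does not handle pipelines or the PATH search.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/*
 * Hypothetical helper: run one command whose argv[0] is already a complete
 * pathname, with optional '<' and '>' redirection (NULL means none).
 * Returns the wait status, or -1 if fork() or waitpid() fails.
 */
int run_simple(char **argv, const char *inputfile, const char *outputfile)
{
    int status;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child only, so the shell's own stdin and stdout are untouched. */
        if (inputfile != NULL) {
            int fd = open(inputfile, O_RDONLY);
            if (fd < 0) {
                perror(inputfile);
                _exit(1);
            }
            if (fd != 0) {          /* move the file onto descriptor 0 */
                dup2(fd, 0);
                close(fd);
            }
        }
        if (outputfile != NULL) {
            int fd = open(outputfile, O_WRONLY | O_CREAT | O_TRUNC, 0666);
            if (fd < 0) {
                perror(outputfile);
                _exit(1);
            }
            if (fd != 1) {          /* move the file onto descriptor 1 */
                dup2(fd, 1);
                close(fd);
            }
        }
        execve(argv[0], argv, environ);
        perror(argv[0]);            /* reached only if execve() failed */
        _exit(127);
    }
    /* Parent: wait for this child so no process lingers. */
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return status;
}
```

The child leaves with _exit() rather than exit() after a failed execve(), so that it does not flush stdio buffers inherited from the parent. A two-command pipeline would add a pipe() and a second fork() in the child, as step 5 describes; that is not shown here.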