Assignment 9 File: asst9.txt Author: CS 51 (Bob Walton) Version: 1 This is the third and LAST of three assignments in which you program an interpreter called lisp51 for a subset of the COMMONLISP language. In this assignment you will implement the garbage collector and a symbol hash table. You do NOT need to copy any new files into your asst7-9 directory to do this assignment. The following are defined for this assignment: make Makes the lisp51 program. make test-with-gc.out Test gc. Runs previous tests with GC-LIMIT set to a small value. make test-without-gc.out Same test with gc turned mostly off. make test-gc.diffs Diff of previous two .out files after strip- ping GC printouts from test-with-gc.out. Only differences should be inconsequential. make asst9 Makes asst9 outputs, i.e. all tests. make print9 Prints asst9 files. make submit9 Submits for assignment 9. make sprint9 Prints asst9 for submission. A printout of the file: answers9.txt containing a written question is included with this assignment. Also included are the printouts of the files: test-without-gc.in test-with-gc.in that are input to the tests of `make asst9'. The lisp51 interpreter is describe in class handouts that are separate from the assignments. You will need to make reference to a number of the files in these handouts, but in particular to gc.h and gc.cc. The specific things you MUST do for assignment 9 are: (1) Do the written exercise in answers9.txt. This is not related to the rest of the assignment, and can be done any time. (2) Read lisp51.pd, and particularly the section on the garbage collector. (3) Using lisp51.pd as a reference, write the pseudo- code for gc() into gc.cc. This is the design part of this assignment. (4) Write the code for for the garbage collector into the files: gc.cc gc.h cons.cc fixnum.cc function.cc symbol.cc It is best to do this a little bit at a time and test frequently. First implement a counter to be compared against the gc_limit. Code to increment this counter and call gc() needs to be placed in a central function (declared in gc.h, where it may be put as an inline) that is called from particular places in cons.cc, fixnum.cc, function.cc, and symbol.cc. This is the only modification needed to fixnum.cc and function.cc. The counter should be part of the garbage_collector class in gc.h. Be sure to add a statement to the end of gc() to zero this counter. Then implement the GC starting and ending statistics printouts. To print statistics loop through all the object_types calling the gc_print functions (see initialize_all_object_types for such a loop). Now make test-with-gc.out and test-gc.diffs to test. Your GC will be 'working' except nothing will be being moved from the used list to the free list. Be sure lisp51 does NOT do a gc while reading in lisp51.lsp. This will be because GC-LIMIT is set to a large value at the beginning of lisp51.lsp. Then write object_type::clear_marked (in obtype.cc) and the loop that calls it in gc() (compare with the loop in initialize_all_object_types). Test this to be sure lisp51 does not blow up and produces the same results as before. Next write the gc_mark routine, which must be declared in gc.h. Be sure you check the marked flag first, set it second, and call the object scavenge virtual function third. This function can be an inline function in gc.h. Now add a loop to gc() to go though the preserve stack and call gc_mark on all the objects pointed to by that. Test to be sure nothing blows up. Note that some symbols like NIL have been placed in the preserve stack by preserve_all_existing_symbols() called from main in lisp.cc. Look at NIL to see if it has been marked. Try gdb lisp51 run (gc) control-C p {symbol}NIL Next write cons::scavenge and symbol::scavenge in cons.cc and symbol.cc. Be careful not to gc_mark the NULL pointer that might be in the symbol value or function members. Put an assert (ob != NULL) in gc_mark. Retest. Try using the debugger to check that after lisp51 starts both the symbol QUOTE and the primitive function pointed at by its function member are marked. Try gdb lisp51 run (gc) control-C p {symbol}QUOTE p {function}((symbol *)QUOTE)->function Next write the gc() loop to call all the object_type::mark functions, and write symbol_type::mark in symbol.cc. Its hard to test this so be very careful. You might want to put tracing code in various places for debugging. For example you could have symbol::scavenge print out the name of each symbol scavenged and then type lisp51 'x (setq y 99) (gc) Then Y should be scavenged but X should not be. A lot of other symbols from lisp51.lsp will be scavenged. Next call gc_mark (environment) to mark the environment. DO NOT FORGET THIS. Now write the loop in gc() that calls all the object_type::sweep functions, and write the default object_type::sweep function in obtype.cc. Be very careful with the list pointers. Test to see that nothing blows up if you do: lisp51 (gc) HOWEVER, if you continue on from this point things might blow up because the C++ code has not been doing preserve/release to keep the preserve stack up to date. If you make test-with-gc.out the lisp51 program should make errors and possibly crash because the preserve stack has not been maintained. If it does not crash make test-gc.diffs which should show errors. Now you need to go through all your .cc files looking for places that preserve and release are needed. Read preserve.h carefully first. Start with eval.cc and parser.cc and then review the other files. You should be cautious and use preserve and release everywhere it might be needed, don't skimp. But if you are totally mindless about this, preserving everything in sight, then points will be deducted. Read the rules in preserve.h and follow them, adding a few preserves even if you are not sure they are needed. For every preserve you must do a corresponding release, or the toplevel function will catch you with an assert statement. You are done when test-gc.diffs looks right. Its very hard to get completely done: only a very few points will be deducted if only a few preserves are missed. (5) Now go on to the hash table part of the assignment. The first step is to write into symbol.cc a small function: int hash (char * name) that takes a symbol name and returns an integer from 0 through 100 that is a function of name but a function that appears to produce random changes in output for regular changes of input. Try hash ("...C") ===> (2 * hash ( "..." ) + 'C') % 101 where C is any ASCII character. Then add a member: object * hash_table [101]; to the symbol_type class in symbol.cc. The hash_table replaces the single used_list with 101 used lists: hash_table[0] through hash_table[100]. Which used list a symbol appears on is calculated by applying the hash function above to the symbol's name. Because there are 101 lists, each is shorter than the original list by a factor of 101 on average. It is important for the symbols in a particular program to be randomly distributed among the hash_table used lists, even if the symbol names do not appear to have been chosen randomly by the programmer. Now you replace every virtual function of object_type that uses object_type::used_list with a corresponding virtual function of symbol_type that uses symbol_type::hash_table, so object_type:: used_list is NO LONGER USED FOR SYMBOLS. You must change, for example, symbol_type::mark. And you must add new functions symbol_type::clear_marked, symbol_type::put_used, symbol_type::sweep, and symbol_type::gc_print. Next test everything. You might want to ci -l test-with-gc.out before testing and then compare the results with the new hash table using rcsdiff test-with-gc.out to see if they have changed (but do not do anything like change lisp51.lsp during testing or the original ci'ed test-with-gc.out file will change). The answers9.txt part of this assignment is worth 5 points, the design part 10 points, the garbage collector code part 40 points, and the symbol hash table code part 5 points. There is NO separate theory part to this assignment: the only thing you need to hand in is the make sprint output. The following are the SLOC counts for a well written solution for the garbage collector. In C++ a SLOC is defined as a non-comment line containing a semi-colon. gc.cc (mostly gc()) 29 object_type::clear_marked 3 object_type::sweep 7 gc.h 6 preserves/releases ?? These SLOC counts will be the number of points that will be deducted if you do NOT do a function (they do NOT sum to 40). The SLOC count for the symbol hash table is much higher than its point total would indicate. Good Luck!