Assignment 4: Pooling strings

Due: 5:00pm, Friday, September 19. Value: 40 pts. Submit your solution (consisting of pool.c only) to Moodle.

Often programs end up allocating lots of memory for strings. As we will see in a later assignment, all these calls to malloc waste quite a bit of memory: Each time you ask malloc to reserve n bytes of memory, in fact it will round n up to a multiple of four, add another four, and then allocate that many bytes rather than n itself. This can add up to seven extra bytes for each memory allocation. If you have lots of short allocations, all that can add up. For example, allocating memory for each string in an English dictionary ends up using over 50% more memory than strictly needed by the actual dictionary words.

In this assignment you'll develop a string pool. The idea is that you'll allocate a large chunk of memory using malloc, and then you can allocate strings more efficiently out of that chunk. One deficiency relative to malloc is that your pool won't allow individual strings to be deallocated: One can only deallocate the entire pool all at once. (Allowing allocations to be deallocated individually is why malloc uses memory inefficiently.)

The starting code found in main.c exists simply to test your implementation of a string pool: This implementation loads a full dictionary file into memory using a string pool, and then it repeatedly asks the user for a word.

main.c defines the main function for testing your program. which reads the dictionary file and prompts the user repeatedly for query words.
pool.c provides three functions for managing a string pool. This is the file you will modify for this assignment.
pool.h declares the function prototypes for pool.c.

You'll modify pool.c only, to provide the three functions whose prototypes are in pool.h and which are all used by main.c.

struct string_poolpool_create();

Allocates and returns a pool in which strings can be allocated.

charpool_alloc(struct string_pool *poolchar *str);

Allocates a pointer to a newly allocated space in the pool holding a copy of the characters found in the parameter string.

void pool_free(struct string_pool *pool);

Deallocates all memory occupied by the pool.

The idea behind the algorithm you implement in pool.c is illustrated roughly by the below diagram.

pool = pool_create()
a = pool_alloc(pool"Martin")
b = pool_alloc(pool"Galloway")
c = pool_alloc(pool"Hardin")

In calling pool_create, we allocate a large fixed-length character array. Then with each subsequent call to pool_alloc, we copy the string just past the last-allocated string in this array, returning a pointer to the first. Note that the amount of memory taken by this new string is simply the string's length plus 1 for the string termination character. (We'll see that there is slightly more memory taken for tracking the status of the character array. But if you choose a reasonably large character array (I used 1000 bytes), this extra memory will be a tiny fraction of the overall memory used by the pool.)

However, you should keep your character array to be of a reasonable length, since allocating a huge array is itself a vast waste if only a tiny fraction of it is used. I suggest 1000 bytes. However, eventually more strings will be allocated than could fit into a single character array. When this happens, you should simply allocate another large character array and start using that. However, you also need to be ready for a call to pool_free, which must free all the arrays allocated for the pool. To be able to find these arrays, I recommend keeping a linked list of them. In particular, I recommend using a struct which contains your large fixed-length character array, a counter tracking how many entries of that array have been allocated so far, and a pointer to the next struct.

You are responsible for memory leaks and other potential bugs; I strongly recommend valgrind to check for these. [instructions]