Project 1: Exhaustive Search

Value and due date to be announced by Dr. Collins.

This assignment introduces the notion of positioning a sequence/chain of objects on a lattice (points in the plane or n-dimensional space with integer coordinates) in such a way as to maximize some metric or property based on the axis (horizontal, vertical) adjacency of the entries in the chain. The context we will use is a simplified model of the process whereby proteins arrange themselves in a water-based solution. To do so we will categorize the amino acid components as either hydrophobic or polar. Hydrophobic amino acid residues despise being next to polar amino acids or water molecules. Hence when the protein is placed in the water-based solution it attempts to position itself in the lattice/grid so that the maximum number of hydrophobic components are axis-adjacent.

For this assignment we will restrict our attention to just two dimensions (i.e. a planar grid). The protein chain will be placed in the grid so that consecutive amino acids are either vertically or horizontally adjacent to each other, and no two amino acid residues can reside in the same location. The score for such a placement will be the number of hydrophobic residues that are vertically or horizontally adjacent. We want a positioning so that the resulting score is maximal.

The finding of a positioning that maximizes the score is known to be NP-complete. But because of the importance of such problems researchers continue to search for faster algorithms to solve these problems.

There are two files that you must download from the web site to complete this assignment. The class file that contains the 'main' is called Protein.java. This file defines the class Protein that defines a class to represent and display a folding as well as computing the score for the folding. YOU SHOULD NOT CHANGE ANY CODE IN THIS CLASS! The second file is called ProteinSearch.java and implements the base, simple depth-first brute force exhaustive search.

Your assignment is to modify the search algorithm so that it is more efficient. You must still use exhaustive search, but incorporate some “pruning” and efficient computation. For example the protein chain HPPHPPPHPPH took 1650 ms to complete using the given program, but some simple to moderate improvements dropped the running time to 20 ms.

Here is the list of requirements for this assignment.

  • Create a public method search2 in class ProteinSearch that is your modification and adjust the main method so that it computes the solution by both the brute force method and your modification, showing the results of each.

  • Write up a report of your solution, describing your modification and the expected improvements, the test cases that you considered (why you chose them), the correctness and speed up of your modifications, and a final analysis of your solution.

  • Submit your code as well as your report via Moodle.