Search Background

Next: Waterfall Up: Theory Previous: Theory

Search Background

There are three well known tree search algorithms, namely breadth-first, depth-first, and best-first.

In a breadth-first algorithm the search is concentrated at the high level and not until a solution is found at this level does the algorithm go deeper into the lower levels. This algorithm is queue based and almost the entire search space needs to be search before an answer is found. This algorithm is O(n) in complexity where n is the number of nodes in the tree. On average, best case and worst case performance are about the same.

In a depth-first algorithm the search is concentrated at the lower levels. This algorithm is stack based and a potential solution may be found early in the search. The worst case performance is no better than a breadth-first algorithm but on average a depth-first algorithm will find a solution quicker in most cases. The depth-first algorithm has a problem around sections of the tree that represent near solutions. The algorithm will get stuck on a local optimal and not find the real solution until much later in the search. This problem is called hill climbing. The algorithm climbs the nearest hill and since all paths are down from the top of the hill, the algorithm does not see the mountain.

In a best-first algorithm the search is concentrated on the next best move. All next moves are prioritized by looking one move ahead and only the next best move is taken. After each move, additional moves may be possible and are added to the list of candidates. The process continues until a solution is found. The search space is searched in a jumping fashion as the algorithm hops between different areas of higher interest. This algorithm is based on a priority queue usually based on a partial order tree.

The best known of the best-first algorithms is called A*. The priority functions are split into two components. One represents the known cost to get to a point in the search space while the other represents the estimated cost of continuing towards the goal. The estimated cost must be positive and an underestimate of the actual cost. It can be shown that A* is the best of all best-first search algorithms. The A* algorithm is more complex because it requires the definition of the priority function.

There is an interesting tradeoff between the cost of visiting a node in the search space and the cost of calculating the priority function. If the search space is small, cheap to walk, and the priority function is expensive, then the depth-first and breadth-first algorithms may have better total performance over the best-first algorithm. The cost of the priority function can be controlled and varied. If the search space is complex and large, then the cost of calculating a precise priority function is negligible. On the other hand, some situations call for a cheap priority function. In the limiting case, the priority function could be simply that all next steps have the same priority and the algorithm would become a breadth-first algorithm. Alternatively, the priority function could reflect the depth of the search space and the best-first algorithm would behave like a depth-first algorithm.

These algorithms need a search space. A search space consists of a collection of nodes or states. There are two special states called initial and goal. There is a function that walks the search space using the primitive next step. The states are labeled using an alphabet. The path from initial state to the goal state is called the solution. Between each state is a cost. Only the best-first algorithm uses this information for other than summation reasons. In general, the algorithms need a directed acyclic graph.

Next: Waterfall Up: Theory Previous: Theory

Ronald LeRoi Burback
Wed Jul 30 10:49:53 PDT 1997