
Search Background

There are three well-known search algorithms: breadth-first, depth-first, and best-first. The basic background of these three algorithms is presented here, along with the search space on which they operate.

In a breadth-first algorithm, the search is concentrated at the higher levels of the tree: only after every node at the current level has been examined without finding a solution does the algorithm go deeper into the lower levels. This algorithm is queue-based, and almost the entire search space needs to be searched before an answer is found. On average, this algorithm is O(N) in complexity, where N is the number of nodes in the tree representing the search space. Best-case and worst-case performance for a breadth-first algorithm are both O(N).
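The queue-based, level-by-level behavior can be illustrated with a minimal sketch. The tree, the node names, and the function name breadth_first_search are hypothetical, chosen only for illustration; the search space is assumed to be a tree stored as a dictionary from each node to its children.

```python
from collections import deque

def breadth_first_search(tree, root, is_goal):
    """Return (goal_node, nodes_visited), or (None, nodes_visited) if no goal exists."""
    queue = deque([root])      # FIFO queue: shallow nodes leave before deeper ones
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        if is_goal(node):
            return node, visited
        # Children join the back of the queue, behind the rest of the current level.
        queue.extend(tree.get(node, []))
    return None, visited

# Hypothetical example tree: each key maps to the list of its children.
TREE = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}

# The goal "G" is the last leaf, so all 7 nodes are visited first.
print(breadth_first_search(TREE, "A", lambda n: n == "G"))  # ('G', 7)
```

In this sketch a goal that sits at the bottom level is reached only after every node above it has been dequeued, which is the behavior the paragraph above describes.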

In a depth-first algorithm, the search is concentrated at the lower levels. This algorithm is stack-based, and a potential solution may be found early in the search. The worst-case performance is no better than that of a breadth-first algorithm, but on average a depth-first algorithm will find a feasible solution more quickly. On average, this algorithm is O(N) in complexity, where N is the number of nodes in the tree representing the search space. Best-case performance for a depth-first algorithm is O(1), while worst-case performance is O(N).
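A minimal sketch of the stack-based search follows. The tree, node names, and the function name depth_first_search are hypothetical and serve only as an illustration; the search space is again assumed to be a tree stored as a dictionary from each node to its children.

```python
def depth_first_search(tree, root, is_goal):
    """Return (goal_node, nodes_visited), or (None, nodes_visited) if no goal exists."""
    stack = [root]             # LIFO stack: the most recently found node is tried next
    visited = 0
    while stack:
        node = stack.pop()
        visited += 1
        if is_goal(node):
            return node, visited
        # Push children in reverse so the leftmost child is explored first.
        stack.extend(reversed(tree.get(node, [])))
    return None, visited

# Hypothetical example tree: each key maps to the list of its children.
TREE = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}

print(depth_first_search(TREE, "A", lambda n: n == "D"))  # ('D', 3): found early
print(depth_first_search(TREE, "A", lambda n: n == "G"))  # ('G', 7): worst case
```

The two calls show the spread between a leaf that happens to lie on the first path explored and one that lies on the last.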

The depth-first algorithm has a problem around sections of the tree that represent near solutions. The algorithm can get stuck on a local optimum and not find the best solution until much later in the search. This is the local-optimum problem familiar from hill climbing.

In a best-first algorithm, the search is concentrated on the next best move. All next moves are prioritized by looking one move ahead, and only the best of them is taken. After each move, additional moves may become possible and are added to the list of candidates. The process continues until an optimum solution is found. The search space is searched in a jumping fashion as the algorithm hops between different areas of higher interest. This algorithm is based on a priority queue, which is usually implemented as a partially ordered tree (a heap).
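The jumping behavior can be seen in a small sketch built on Python's heapq module, which keeps a list in heap order. The graph, the scores, and the function name best_first_search are hypothetical; lower scores are assumed to mean more promising nodes, and the sketch simply stops at the first goal it removes from the queue.

```python
import heapq

def best_first_search(neighbors, start, is_goal, priority):
    """Return the order in which nodes are taken from the priority queue."""
    frontier = [(priority(start), start)]   # heap of (priority, node); lowest first
    seen = {start}
    order = []
    while frontier:
        _, node = heapq.heappop(frontier)   # take only the single best candidate
        order.append(node)
        if is_goal(node):
            return order
        for nxt in neighbors.get(node, []):
            if nxt not in seen:             # newly possible moves join the candidates
                seen.add(nxt)
                heapq.heappush(frontier, (priority(nxt), nxt))
    return order

# Hypothetical search space and scores (lower score = more promising).
GRAPH = {"S": ["A", "B"], "A": ["C"], "B": ["D"], "D": ["G"]}
SCORE = {"S": 5, "A": 2, "B": 3, "C": 4, "D": 1, "G": 0}

print(best_first_search(GRAPH, "S", lambda n: n == "G", SCORE.get))
# ['S', 'A', 'B', 'D', 'G']
```

After expanding A, the only new candidate (C) scores worse than the waiting node B, so the search hops from A's branch over to B's branch.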

The best known of the best-first algorithms is called A*. Its priority function is split into two components. One represents the known cost to get to a node in the search space, while the other represents the estimated cost of continuing towards the goal. The estimated cost must be non-negative and must never overestimate the actual cost. It can be shown that, given such an estimate, A* finds a least-cost solution, and in this sense it is the best of all best-first search algorithms. The A* algorithm is more complex because it requires the definition of the priority function. On average, the complexity of this algorithm is an order of magnitude less than N, where N is the number of nodes in the tree. Best-case performance for a best-first algorithm is O(1), while worst-case performance is O(N).
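The two-component priority can be sketched as follows. This is an illustrative sketch, not a reference implementation: the graph, the estimate table, and the name a_star are hypothetical, and the estimate is assumed to be zero at the goal and never larger than the true remaining cost.

```python
import heapq

def a_star(graph, start, goal, estimate):
    """Return (total_cost, path) for a least-cost path, or None if no path exists."""
    # Each entry: (known cost + estimated remaining cost, known cost, node, path so far)
    frontier = [(estimate(start), 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue                        # stale entry: a cheaper route was found later
        for nxt, step in graph.get(node, {}).items():
            new_cost = cost + step          # first component: known cost to reach nxt
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                priority = new_cost + estimate(nxt)   # second component: estimate
                heapq.heappush(frontier, (priority, new_cost, nxt, path + [nxt]))
    return None

# Hypothetical weighted graph and an estimate that never overestimates.
GRAPH = {"S": {"A": 1, "B": 4}, "A": {"B": 2, "G": 6}, "B": {"G": 2}}
ESTIMATE = {"S": 4, "A": 3, "B": 2, "G": 0}

print(a_star(GRAPH, "S", "G", ESTIMATE.get))  # (5, ['S', 'A', 'B', 'G'])
```

In this example the direct step from A to G costs 6, but the priority queue surfaces the cheaper route through B first, so the returned path costs 5.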

There is an interesting tradeoff between the cost of visiting a node in the search space and the cost of calculating the priority function. If the search space is small and inexpensive to traverse, and the priority function is expensive to calculate, then the depth-first and breadth-first algorithms may have better total performance than the best-first algorithm. The cost of calculating the priority function can be controlled by varying the quality of the answers it returns. If the search space is complex and large, then the cost of calculating a precise priority function is negligible compared with the cost of the searching it avoids. On the other hand, some situations call for a cheap priority function. In the limiting case, the priority function could simply give all next steps the same priority; with ties broken in order of arrival, the algorithm becomes a breadth-first algorithm. Alternatively, the priority function could favor nodes deeper in the search space, and the best-first algorithm would behave like a depth-first algorithm.
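The two limiting cases can be checked with a small sketch. The tree and the name best_first_order are hypothetical, the priority function is assumed to depend only on a node's depth, and ties are assumed to be broken in order of arrival by an insertion counter.

```python
import heapq
from itertools import count

def best_first_order(tree, root, priority):
    """Visit every node of a tree in best-first order; priority takes a node's depth."""
    tie = count()   # insertion counter: equal priorities leave in order of arrival
    frontier = [(priority(0), next(tie), root, 0)]
    order = []
    while frontier:
        _, _, node, depth = heapq.heappop(frontier)
        order.append(node)
        for child in tree.get(node, []):
            heapq.heappush(frontier, (priority(depth + 1), next(tie), child, depth + 1))
    return order

# Hypothetical example tree: each key maps to the list of its children.
TREE = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}

# Same priority everywhere: the visit order is breadth-first.
print(best_first_order(TREE, "A", lambda depth: 0))
# ['A', 'B', 'C', 'D', 'E', 'F', 'G']

# Deeper nodes preferred: the visit order is depth-first.
print(best_first_order(TREE, "A", lambda depth: -depth))
# ['A', 'B', 'D', 'E', 'C', 'F', 'G']
```

Both priority functions here cost almost nothing to evaluate, which is the cheap end of the tradeoff described above.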

These search algorithms operate on a search space. A search space consists of a collection of nodes, or states. There are two special states, called initial and goal. There is a function that walks the search space using the primitive next step. Optionally, the states may be labeled for later reference. The path from the initial state to the goal state is called the solution. Each transition from a state to one of its reachable next states has an associated cost. Only the best-first algorithm uses this cost information for anything other than summation or report generation. In general, the algorithms produce a directed acyclic graph as a result of the search.
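One possible way to represent such a search space is sketched below. The class name SearchSpace, the function solution_cost, and the example states are all hypothetical; the sketch assumes the next-step primitive returns pairs of next state and step cost, and it uses the costs only for summation along a solution.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, List, Tuple

State = Hashable  # states only need to be usable as labels

@dataclass
class SearchSpace:
    initial: State
    goal: State
    # The "next step" primitive: maps a state to (next state, step cost) pairs.
    next_steps: Callable[[State], List[Tuple[State, float]]]

def solution_cost(space, path):
    """Sum the step costs along a path from the initial state to the goal state."""
    if path[0] != space.initial or path[-1] != space.goal:
        raise ValueError("a solution must run from the initial state to the goal state")
    total = 0
    for state, nxt in zip(path, path[1:]):
        costs = dict(space.next_steps(state))
        if nxt not in costs:
            raise ValueError(f"{nxt!r} is not a next step of {state!r}")
        total += costs[nxt]
    return total

# Hypothetical four-state search space with two routes to the goal.
EDGES = {"start": [("mid", 2), ("alt", 5)], "mid": [("end", 3)], "alt": [("end", 1)]}
space = SearchSpace("start", "end", lambda s: EDGES.get(s, []))

print(solution_cost(space, ["start", "mid", "end"]))  # 5
print(solution_cost(space, ["start", "alt", "end"]))  # 6
```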


Ronald LeRoi Burback
1998-12-14