Encoder Front Page
SRS Home | Front Page | Monthly Issue | Index
Google
Search WWW Search seattlerobotics.org

 

Developing Mobile Robot Wall-Following Algorithms

Using Genetic Programming

Robert A. Dain, HTR Labs

Abstract

This paper demonstrates the use of genetic programming (GP) for the development of mobile robot wall-following behaviors. Algorithms are developed for a simulated mobile robot that uses an array of range finders for navigation. Navigation algorithms are tested in a variety of differently shaped environments to encourage the development of robust solutions, and reduce the possibility of solutions based on memorization of a fixed set of movements. A brief introduction to GP is presented. A typical wall-following robot evolutionary cycle is analyzed, and results are presented. GP is shown to be capable of producing robust wall-following navigation algorithms that perform well in each of the test environments used.

Introduction

This work focuses on using genetic programming (GP) to develop mobile robot navigation strategies that will perform well in the real world. A simulated robot is used for learning, but the simulation is designed to closely approximate a real robot. The algorithms developed with GP can then be tested in the real world, on a real robot. Wall following was selected for the initial problem domain because it is a fairly simple problem to set up and evaluate. It also lays the groundwork for more complex problem domains, such as maze traversal, mapping, and full coverage navigation (i.e., vacuuming and lawn mowing). The development process for these behaviors is described, and the results of the experiments are presented. (1)

Genetic programming is attractive for this work for several reasons. First, it is a machine learning technology, so control strategies are emergent rather than predetermined. The GP environment is set up for a particular problem domain and it is left to "the unyielding weight of evolution," as Dr. John Koza puts it [1], to drive out solutions. Also, GP is a black-box system in that a-priori knowledge of the system under analysis is not required. Some method of analyzing the results obtained from a set of input conditions is all that is necessary. Finally, the results of a GP learning session can be arbitrarily complex. Complexity is bounded by imposing limits on the maximum size of the computer programs that GP can create. GP will explore solutions that range up to the largest allowable size, regardless of the magnitude of that limit.

An Overview of Genetic Programming

Genetic programming is a methodology for automatically generating computer programs. GP uses the concepts of natural selection and genetic recombination in a manner very similar to that used with genetic algorithms [2]. Rather than evolving fixed-length bit streams, however, GP evolves computer programs of dynamic size and shape. GP is founded on the premise that computer programs can be represented as tree structures, as shown in figure 1a. The functions (f) operate on terminals (T) to produce a result. Functions are operations that take one or more arguments. They can be sequential (i.e., +, AND), conditional (i.e., If, When), or iterative (i.e., DoUntil, WhileDo). Terminals are operations that take no arguments but return a value (i.e., variables and constants). Figure 1b shows a simple arithmetic function. If terminals T1, T2 and T3 evaluate to 5, 7 and 2, respectively, then the program will return the value 6 when it is executed. (5 is added to 7 for an intermediate value of 12, which is then divided by 2 for a final result of 6.) These diagrams represent simple Lisp-like s-expressions that result in a single output at the top. In this simplest form the functions are both sequential and deterministic, but functions and terminals can be defined to perform iteration, recursion, side effecting, memory evaluation, and other complex operations.


Figure 1a - Tree-like representation
Figure 1b - (T1 + T2) / T3 of a computer program

The genetic programmer defines a set of functions and terminals appropriate for the problem at hand. These may include highly generic functions, such as those mentioned above, or they may be specific to the problem domain. The set of functions and terminals defined must satisfy two constraints: sufficiency and closure. Sufficiency requires that a solution to the given problem must exist within the space of all possible computer programs that can be created using the specified set of functions and terminals. Obviously, if a solution does not exist GP will not be able to find it. Closure requires that every function and terminal return a value of compatible type, and that every function accepts arguments of this same type. This constraint guarantees that any combination of functions and terminals that comprise a complete tree structure (i.e., a terminal is placed at every leaf node) will be a syntactically valid computer program that can be evaluated.

The genetic programmer also defines a fitness test that quantifies the performance of a computer program. The performance of a computer program in GP can be a function of the final result that it produces, or it can be based on an analysis of some side effect that the computer program has produced. In the work presented herein, the function and terminal sets contain operators that affect the contents of a 2-dimensional memory array. The final state of this array is evaluated to determine the computer program's fitness.

The GP process initially creates a number of computer programs by randomly combining functions and terminals into complete tree structures. This collection of programs, or individuals, is called a population. The GP engine evaluates and assigns a fitness value to each of the individuals in the population. Not surprisingly, the performance of the individuals in this initial, randomly created population can be quite poor. Nevertheless, there is variance in the performance levels: some individuals perform less poorly than others. The population is ranked according to fitness. A new population is generated by selecting individuals from the previous generation to participate in creative operations. Individuals are selected for inclusion in the creative process based on their fitness - the higher the fitness, the higher the likelihood of inclusion. Three basic creative operations are used in GP: reproduction, crossover and mutation (2). Reproduction is simply the copying of an individual from the previous generation into the next generation without any modification of its structure.


Figure 2 - The reproduction operation.

Mutation is performed by randomly selecting a node in an individual tree structure, and removing that node along with any sub-tree that may exist below it. A new sub-tree is then generated randomly and "grafted in" at the position where the original node was removed.


Figure 3 - The mutation operation.

Crossover involves selecting two individuals and selecting a node at random in each of them. The selected nodes, along with any sub-trees that exist below them, are exchanged between the two individuals.


Figure 4 - The crossover operation

As successive populations are created by applying these genetic operators, the average performance of the population tends to improve. The performance of the best individual in the population can improve to optimal or near-optimal levels. There is no guarantee, of course, that GP will find an optimal solution (in less than infinite time), but a well thought out set of functions and terminals with a reasonable fitness test will usually produce good results.

The Wall Following Robot

The goal of a wall-following robot is to navigate through open spaces until it encounters a wall, and then navigate along the wall at some fairly constant distance. The successful wall-follower should traverse the entire circumference of its environment at least once without straying either too close, or too far from the walls.


Figure 5 - The four room configurations used to develop the wall-following behavior.

A two-dimensional simulated environment is used to test the navigation algorithms developed by GP. The environment consists of four completely enclosed rooms with wall contours of varying complexity. Each individual algorithm from the GP population is tested in each of the four rooms, and scored on how well it performs the wall-following behavior. By testing each individual in a number of different environments GP is encouraged to develop general-purpose solutions rather than simply memorize a set of movements, or capitalize on insignificant coincidental properties of the simulator [3].

A screen display of the simulation environment is shown in figure 5. The dark shaded areas represent the walls of the environment. The lightly shaded areas represent the desired path of the robot. The desired path is for display only - the robot has no visibility of these areas. All sensory input to the robot comes through the use of simulated range finders. The robot has eight range finders positioned at 45 increments about its center. The solid square in the center of each room designates the starting point of the robot in that room.

Each room in the simulated environment consists of a 40 by 40 array of grid locations that are each 10 pixels square. The pixel size is used as the basic unit of movement (3). The navigation algorithms are scored on how many of the lightly shaded "corridor" grid locations they can pass through - the more the better. The robot is allowed to wander between 10 and 20 units from the wall (the width of a corridor grid) without any performance demerits. GP calculates fitness based on the number of corridor grids that are not visited during the course of a run, and attempts to minimize the following fitness equation:

1000 * NumberOfCorridorGridsNotVisited +
100 * DistanceOfEndingPointFromCenterOfCorridor +
TotalDistanceTravelled

NumberOfCorridorGridsNotVisited is the most significant aspect of fitness. A weighting factor of 1000 ensures that GP will remain motivated to navigate through corridor grids, even when there are only a few unvisited grid points remaining.

DistanceOfEndingPointFromCenterOfCorridor is the distance from the final resting point of the robot to the center of the nearest corridor grid (i.e., 15 units from the nearest wall). This term is used to "kick-start" the GP process. Without it, a robot that does not move at all (travels zero distance) will score higher than a robot that moves towards a wall but does not quite reach the corridor grids, and GP will often suffer premature convergence on these solutions. In other words, this term encourages initial exploration by rewarding algorithms that move towards walls, even if they do not reach those walls. This provides the necessary scoring gradient in the early generations to encourage GP to develop individuals that leave their starting location and move towards the walls. A weighting factor of 100 is applied to this term so that it will eclipse TotalDistanceTravelled any time the robot does not reach the corridor grids.(4)

TotalDistanceTravelled is the least significant term in the fitness equation. It is included to discourage the navigation algorithms from traversing the corridor grids by spiraling haphazardly through the open spaces of the environment. A robot that navigates directly through the center of the corridor grids will score higher than one that meanders back and forth across the grids.

The Genetic Programming Environment

As described above, genetic programming solves problems by combining and organizing a predefined set of functions and terminals. This set of functions and terminals provides the fundamental domain knowledge that allows GP to discover solutions. The functions and terminals used for the wall-following experiments presented herein are defined below.

Do2 (Arga, Argb) This function provides for sequential operations. It evaluates argument Arga, then it evaluates argument Argb. The return value is the result of argument Argb. Multiple sequential steps can be accomplished by nested Do2's.

WhileInCoridorRange (Arga) This function checks each of the distance values, selects the minimum value (i.e., the closest wall) and determines if it is within the allowable wall-following range. If the robot is out of range of a wall, this function returns the value 0.0. Otherwise, it evaluates argument Arga continuously (so long as the robot is within acceptable range of the nearest wall) and returns the value obtained from the last evaluation of argument Arga. The range of values that satisfies the "InCorridorRange" condition is defined as 10 to 20 units, inclusive.

WhileTooCloseToWall (Arga) This function is identical to WhileInCoridorRange except that it evaluates the argument as long as the robot is too close to a wall. The range of values that satisfy this function is less than 12 units. 12 is used rather than 10 to provide a hysteresis band between the in and out of range conditions.

WhileTooFarFromWall (Arga) This function is also identical to WhileInCoridorRange except that it evaluates the argument as long as the robot is too far away from the closest wall. The range of values that satisfy this function is greater than 18 units.

IfConvexCorner (Arga, Argb) This function tests for the condition of the range sensor at angle 90 reading too far to the nearest wall, and the range sensor at angle 135 reading an acceptable distance. This condition occurs when the robot is following a wall that drops away in a convex corner. If this condition is satisfied, the function evaluates argument Arga and returns its value. Otherwise it evaluates argument Argb and returns its result.

MoveForward() This terminal causes the robot to move forward five units. The return value is 1.0 if the robot completes its move without colliding with a wall, and 0.0 if a collision occurs. If the robot collides with a wall, movement stops at that point (i.e., the robot is not allowed to move into or through the walls). These return values are arbitrary and were selected to coincide with the C/C++ integral values used to represent Boolean true and false.

TurnRight(), TurnLeft() These terminals cause the robot to rotate 45 to the right or left, respectively. The return value from these terminals is the angle that the robot ends up facing after the rotation.

TurnTowardsClosestWall() This terminal checks each of the current sensor values and selects the lowest value (i.e., the closest wall). It then rotates the robot such that the robot is pointed in the direction of the sensor that had the lowest reading. Note that the robot turns towards the perceived closest wall, and not necessarily towards the actual closest wall. The return value is the new direction of the robot.

TurnAwayFromClosestWall() This terminal is similar to TurnTowardsClosestWall, except that the robot rotates to face the opposite direction of the perceived closest wall.

TurnParallelToClosestWall() This terminal is also similar to TurnTowardsClosestWall, except that the robot rotates such that its 90 sensor is pointed in the direction of the perceived closest wall.

Population size was set to 500 individuals, and experiments typically required 200 to 300 generations to converge on a solution. This population size was selected by trial and error to provide acceptable results in a reasonable amount of time on the hardware available. All experiments were performed on Pentium-100 PC's, and ran for six to eight hours. The probability values for reproduction, crossover and mutation were set to 39%, 60% and 1%, respectively, for all experiments. These values were selected somewhat arbitrarily, based on earlier work [4]. The reproduction and crossover values set the balance between exploitation and exploration, and mutation allows for recovery of any lost genetic materials. All experiments invoked an operation to force the best performing individual to reproduce at least once into the next generation. This prevents outstanding performers from being lost in a genetic roll of the dice (which happened surprisingly often without this operator in place).

The Learning Cycle

The genetic programming engine was, indeed, able to organize the functions and terminals described above into fairly proficient wall-following algorithms. This section describes a typical evolutionary cycle. The initial generations of each run, as expected, performed quite poorly. The robot paths in figure 6 (shown as dotted lines) are typical of early generations. The robot manages to maneuver into the wall-following corridor and claim a few grid points, but has not yet figured out how to turn and follow the corridor. The target corridor shading has been removed for clarity. The black dot at the end of the robot path designates the stopping point of the robot.


Figure 6 - Typical result of early generations in a GP run.

The GP-generated code that produced the behavior displayed in figure 6 is shown below. The line numbers are for reference only, and the indentation represents the calling hierarchy. This particular solution capitalizes on the hysteresis band in the WhileTooFarFromWall function to skim the edge of a wall following corridor in the upper right room.

It begins with a WhileTooFarFromWall loop, with a Do2 function as the body. The robot starts in a too-far-from-wall position, so the Do2 is evaluated. The first argument of the Do2 is a TurnAwayFromClosestWall terminal which sets the direction of the robot to the left in all cases. The second argument of the Do2 is another WhileTooFarFromWall loop with a MoveForward terminal as its body. This loop repeats until the too-far-from-wall test fails (i.e., the robot ends up within allowable wall-following range). The inner WhileTooFarFromWall loop terminates at that point. The Do2 is complete, so the outer WhileTooFarFromWall loop evaluates the position of the robot. Finding that the robot is not too-far-from-wall, this loop terminates and the program ends with a total of 13 hits out of the possible 933.

0 WhileTooFarFromWall
1    Do2
2             TurnAwayFromClosestWall
3            WhileTooFarFromWall
4                     MoveForward

GP soon begins to develop solutions that exhibit rudimentary wall-following behavior, as shown in figure 7. These solutions typically perform reasonably well in the square room, and claim a few corridor grids in the more complex rooms by ricocheting around the environment.

The code for the individual displayed in figure 7 is shown below. It begins with a WhileTooFarFromWall loop with a Do2 as its body. The arguments of the Do2 are a MoveForward and a Do2 with two WhileInCoridorRange loops as arguments. While the robot is out of corridor range the WhileInCoridorRange tests will fail. The result is that the robot will move forward until it finds itself within corridor range of a wall. When the MoveForward on line 2 moves the robot into a wall-following corridor, the WhileInCoridorRange functions are able to execute. The first (on line 4) turns the robot away from the closest wall repeatedly until the maximum loop counter is exceeded.(5) The second (on line 6) employs a Do2 to turn the robot parallel to the closest wall and move forward. These two steps are repeated until the robot moves out of range, or the loop times out.(6) When either of those conditions occurs, another iteration of WhileTooFarFromWall (on line 0) is made. If the robot is out of range and the loop counter is not maximized, the process repeats. This algorithm passes through a total of 275 corridor grids of the possible 933.

0 WhileTooFarFromWall
1     Do2
2     MoveForward
3     Do2
4          WhileInCoridorRange
5              TurnAwayFromClosestWall
6          WhileInCoridorRange
7              Do2
8                  TurnParallelToClosestWall
9                  MoveForward


Figure 7 - Typical result of a GP run as it begins to learn the wall-following behavior.

In almost every run GP is able to generate algorithms that perform reasonably well. The individual shown in figure 8 makes a couple of errors, but for the most part it is exhibiting the desired behavior. It passes through 894 of the 933 corridor grids.(7)


Figure 8 - Typical end result of a GP run.

The code for the individual displayed in figure 8 is shown above. Analysis of this code (which is typical of the complexity of GP-generated algorithms) is left as an exercise for the reader. (8)

0 WhileTooFarFromWall
1     WhileTooFarFromWall
2 Do2
3     Do2
4 WhileTooCloseToWall
5     WhileInCoridorRange
6 TurnParallelToClosestWall
7 Do2
8                  WhileInCoridorRange
9                      Do2
10                      WhileTooCloseToWall
11                          WhileTooFarFromWall
12                              MoveForward
13                      Do2
14                          Do2
15                              WhileTooCloseToWall
16                                  TurnParallelToClosestWall
17                              Do2
18                                  Do2
19                                      WhileTooCloseToWall
20                                          WhileInCoridorRange
21                                              TurnParallelToClosestWall
22                                      Do2
23                                          WhileInCoridorRange
24                                              Do2
25                                                  WhileTooCloseToWall
26                                                      WhileTooFarFromWall
27                                                          MoveForward
28                                                  Do2
29                                                      Do2
30                                                          WhileTooCloseToWall
31                                                              TurnParallelToClosestWall
32                                                          Do2
33                                                              MoveForward
34                                                              IfConvexCorner
35                                                                  TurnTowardsClosestWall
36                                                      Do2
37                                                          MoveForward
38                                                          WhileInCoridorRange
39                                                              Do2
40                                                                  TurnTowardsClosestWall
41                                                                  Do2
42                                                                      TurnParallelToClosestWall
43                                                                          MoveForward
44                                          MoveForward
45                                      WhileInCoridorRange
46                                          TurnAwayFromClosestWall
47                              Do2
48                                  MoveForward
49                                  WhileInCoridorRange
50                                  Do2
51                                      TurnTowardsClosestWall
52                                      Do2
53                                          TurnParallelToClosestWall
54                                          MoveForward
55                  MoveForward
56              WhileInCoridorRange
             57                  TurnAwayFromClosestWall

 

Conclusions

These experiments show that GP is capable of generating wall-following algorithms using the function set provided. Although a 100% solution (passing through all possible corridor grids) was never attained, most learning cycles produced algorithms that exhibited the desired behavior. The convex corners proved to be particularly troublesome and warrant further analysis. This is probably a sufficiency issue and resolution should enhance performance, possibly leading to 100% solutions.

The algorithms generated are robust in that they perform well across the four environments used in these experiments. It seems reasonable to expect that the algorithms will also perform well in additional environments of similar complexity, but this hypothesis remains untested. The original intent was to test the algorithms on a physical robot equipped with ultrasonic range finders, but such a robot was unavailable during the required time interval. Follow-on work will focus on employing GP-developed algorithms on physical robots.

This project demonstrates the feasibility of using genetic programming to develop mobile robot navigation algorithms. It lays the foundation for planned follow-on projects of maze traversal, map generation and full-coverage area traversal. The results of this project show sufficient promise to warrant further research into these more complex tasks.

Footnotes

  1. All experiments are conducted using GP Experimenter, a genetic programming environment and tool set developed by the author.

  2. The reproductive operations presented in this section represent the fundamental GP operations. Other reproductive operations have been developed, as well as variations of each of the operations discussed here.

  3. Robot position is displayed at the nearest pixel, but the robot moves through real number space and is not constrained to pixel centers.

  4. This term can also eclipse (or at least compete with) TotalDistanceTravelled if a robot moves away from the walls after passing through one or more corridor grids. While that is not the intent of this term, the effect does not seem to hinder performance and may even be beneficial.

  5. Actually these loops terminate immediately if the robot does not move during the course of evaluating the function body. This feature was added to improve the performance of the simulator.

  6. The instruction pair, TurnParallelToClosestWall and MoveForward, are capable of navigating a concave turn because the robot gets closer to the wall it is approaching than the wall it is following before it goes out of corridor range. The instruction pair is not capable, however, of negotiating a convex corner – the robot simply moves out of range. This is why the robot follows the walls briefly and then loses them.

  7. All but 6 of the 39 corridor grids not visited lie at the point of a convex corner. Stated another way, GP hit only 9 of the 42 convex corner points – a failure rate of 79%. Such poor performance on one aspect of an otherwise good performer suggests a lack of sufficiency, either in the robot sensor configuration or the function set, to solve the convex corner problem. This will be explored further in follow-on work.Although genetic programming is advertised as creating human-readable solutions, it is understood in the GP community that this is only partly true. While the code in figure 8 could be analyzed, the task of doing so is only slightly less daunting than probing the hidden layers of a neural network to understand its internal functionality.

  8. Although genetic programming is advertised as creating human-readable solutions, it is understood in the GP community that this is only partly true. While the code in figure 8 could be analyzed, the task of doing so is only slightly less daunting than probing the hidden layers of a neural network to understand its internal functionality.

References

J.R. Koza, Genetic programming: On the programming of computers by means of natural selection, MIT Press, Cambridge, MA, 1992.

J.H. Holland, Adaptation In Natural And Artificial Systems, University of Michigan Press, Ann Arbor, MI, 1975.

C.W. Reynolds, Evolution of Obstacle Avoidance Behavior: Using Noise to Promote Robust Solutions, Advances in Genetic Programming, MIT Press, Cambridge, MA, pp. 221-241, 1994.

R.A. Dain, An Overview of Genetic Programming with Machine Vision Examples, Northwest Artificial Intelligence Forum Journal, Volume 4, pp. 21-30, 1995.