An Investigation of Supervised Learning in Genetic Programming

by

Gathercole, C.

Info: PhD thesis, University of Edinburgh, 1998, 207 pages
Keywords: genetic algorithms, genetic programming
Abstract:
This thesis is an investigation into Supervised Learning (SL) in Genetic Programming (GP). With its flexible tree-structured representation, GP is a type of Genetic Algorithm, using the Darwinian idea of natural selection and genetic recombination to evolve populations of solutions over many generations to solve problems. SL is a common approach in Machine Learning where the problem is presented as a set of examples. A good or fit solution is one which can successfully deal with all of the examples.

In common with most Machine Learning approaches, GP has been used to solve many trivial problems. When applied to larger and more complex problems, however, several difficulties become apparent. Focusing on the basic features of GP, this thesis highlights the immense size of the GP search space and describes an approach to measuring this space. A stupendously flexible but frustratingly useless representation, Anarchically Automatically Defined Functions, is described. Some difficulties associated with the normal use of the GP operator Crossover (perhaps the most common method of combining GP trees to produce new trees) are demonstrated on the simple MAX problem: Crossover can lead to irreversibly sub-optimal GP performance when used in combination with a restriction on tree size. There is a brief study of tournament selection, a common method of selecting fit individuals from a GP population to act as parents in the construction of the next generation.

The main contributions of this thesis, however, are two approaches to avoiding the fitness evaluation bottleneck that results from the use of SL in GP. To establish the capability of a GP individual under SL, it must be tested or evaluated against each example in the set of training examples. Given that there can be a large set of training examples, a large population of individuals, and a large number of generations before good solutions emerge, a very large number of evaluations must be carried out, often many tens of millions. This is by far the most time-consuming stage of the GP algorithm. Limited Error Fitness (LEF) and Dynamic Subset Selection (DSS) both reduce the number of evaluations GP needs to produce good solutions, adaptively using the capabilities of the current generation of individuals to guide the evaluation of the next generation. LEF curtails the fitness evaluation of an individual once it exceeds an error limit, whereas DSS picks out a subset of examples from the training set for each generation.

Whilst LEF allows GP to solve the comparatively small but difficult Boolean Even N parity problem for large N without recourse to a more powerful representation such as Automatically Defined Functions, DSS in particular has been successful in improving the performance of GP on two large classification problems, allowing the use of smaller population sizes and many fewer, faster evaluations, and more reliably producing solutions as good as or better than those of GP on its own.

The thesis ends with an assertion that smaller populations evolving over many generations can perform more consistently and produce better results than the 'established' approach of using large populations over few generations.
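The two evaluation-time mechanisms named in the abstract are easy to sketch. Below is a minimal Python sketch of tournament selection and of the LEF idea; it assumes a population of callable individuals, a fitness function where higher is better, and a training set of (inputs, target) pairs, and it illustrates the general techniques rather than Gathercole's actual implementation. (A companion sketch of DSS follows the BibTeX record below.)

import random

def tournament_selection(population, fitness, k=4):
    # Standard tournament selection, as described in the abstract:
    # draw k individuals at random and return the fittest as a parent.
    # k = 4 and "higher fitness is better" are illustrative assumptions.
    contenders = random.sample(population, k)
    return max(contenders, key=fitness)

def limited_error_fitness(individual, examples, error_limit):
    # Limited Error Fitness (LEF): curtail the evaluation of an
    # individual once its error count exceeds the error limit, so that
    # poor individuals consume only a few evaluations. Scoring by
    # "correct answers seen before the cutoff" is an assumption; the
    # thesis's exact scoring scheme may differ.
    errors = 0
    correct = 0
    for inputs, target in examples:
        if individual(inputs) == target:
            correct += 1
        else:
            errors += 1
            if errors > error_limit:
                break  # stop early: error limit exceeded
    return correct

Because weak individuals are cut off after a handful of errors, most of the population is scored against only a fraction of the training set, which is where the claimed saving in evaluations comes from.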
URL(s): ftp://ftp.dai.ed.ac.uk/pub/daidb/papers/pt9810.ps.gz ((G)zipped PostScript)

BibTeX:
@PhdThesis{gathercole:thesis,
  author =       "Chris Gathercole",
  title =        "An Investigation of Supervised Learning in Genetic
                 Programming",
  school =       "University of Edinburgh",
  year =         "1998",
  keywords =     "genetic algorithms, genetic programming",
  URL =          "ftp://ftp.dai.ed.ac.uk/pub/daidb/papers/pt9810.ps.gz",
  size =         "207 pages",
  abstract =     "This thesis is an investigation into Supervised
                 Learning (SL) in Genetic Programming (GP). With its
                 flexible tree-structured representation, GP is a type
                 of Genetic Algorithm, using the Darwinian idea of
                 natural selection and genetic recombination, evolving
                 populations of solutions over many generations to solve
                 problems. SL is a common approach in Machine Learning
                 where the problem is presented as a set of examples. A
                 good or fit solution is one which can successfully deal
                 with all of the examples.

                 In common with most Machine Learning approaches, GP has
                 been used to solve many trivial problems. When applied
                 to larger and more complex problems, however, several
                 difficulties become apparent. When focusing on the
                 basic features of GP, this thesis highlights the
                 immense size of the GP search space, and describes an
                 approach to measure this space. A stupendously flexible
                 but frustratingly useless representation, Anarchically
                 Automatically Defined Functions, is described. Some
                 difficulties associated with the normal use of the GP
                 operator Crossover (perhaps the most common method of
                 combining GP trees to produce new trees) are
                 demonstrated in the simple MAX problem. Crossover can
                 lead to irreversible sub-optimal GP performance when
                 used in combination with a restriction on tree size.
                 There is a brief study of tournament selection which is
                 a common method of selecting fit individuals from a GP
                 population to act as parents in the construction of the
                 next generation.

                 The main contributions of this thesis however are two
                 approaches for avoiding the fitness evaluation
                 bottleneck resulting from the use of SL in GP. To
                 establish the capability of a GP individual using SL,
                 it must be tested or evaluated against each example in
                 the set of training examples. Given that there can be a
                 large set of training examples, a large population of
                 individuals, and a large number of generations, before
                 good solutions emerge, a very large number of
                 evaluations must be carried out, often many tens of
                 millions. This is by far the most time-consuming stage
                 of the GP algorithm. Limited Error Fitness (LEF) and
                 Dynamic Subset Selection (DSS) both reduce the number
                 of evaluations needed by GP to successfully produce
                 good solutions, adaptively using the capabilities of
                 the current generation of individuals to guide the
                 evaluation of the next generation. LEF curtails the
                 fitness evaluation of an individual after it exceeds an
                 error limit, whereas DSS picks out a subset of examples
                 from the training set for each generation.

                 Whilst LEF allows GP to solve the comparatively small
                 but difficult Boolean Even N parity problem for large N
                 without the use of a more powerful representation such
                 as Automatically Defined Functions, DSS in particular
                 has been successful in improving the performance of GP
                 across two large classification problems, allowing the
                 use of smaller population sizes, many fewer and faster
                 evaluations, and has more reliably produced as good or
                 better solutions than GP on its own.

                 The thesis ends with an assertion that smaller
                 populations evolving over many generations can perform
                 more consistently and produce better results than the
                 `established' approach of using large populations over
                 few generations.

                 ",
}
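
For completeness, here is a rough Python sketch of the Dynamic Subset Selection idea described in the abstract: each generation, training examples are drawn with probability proportional to a weight combining their difficulty (how often the population has recently misclassified them) and their age (generations since they were last picked). The weight formula, its exponents, and the bookkeeping shown are assumptions illustrating the general idea, not a transcription of the thesis's scheme.

import random

def dss_subset(examples, difficulty, age, subset_size,
               d_exp=1.0, a_exp=1.0):
    # Dynamic Subset Selection (DSS), roughly sketched. difficulty[i]
    # counts recent misclassifications of example i by the population;
    # age[i] counts generations since example i was last selected.
    # The additive weight and its exponents are assumed for illustration.
    n = len(examples)
    weights = [max(difficulty[i] ** d_exp + age[i] ** a_exp, 1e-6)
               for i in range(n)]  # small floor keeps every example reachable
    drawn = random.choices(range(n), weights=weights, k=subset_size)
    chosen = set(drawn)  # de-duplicate repeated draws
    for i in range(n):
        age[i] = 0 if i in chosen else age[i] + 1  # reset age of picked examples
    return [examples[i] for i in sorted(chosen)]

Each generation the GP would evaluate the whole population only on the returned subset, updating the difficulty counts from the misclassifications it observes; the abstract credits this with allowing smaller populations and many fewer, faster evaluations on the two large classification problems.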