Learning Classifier Systems

CIS Technical Report 070514A, May 2007

Complex Intelligent Systems Laboratory, Centre for Information Technology Research, Faculty of Information and Communication Technologies, Swinburne University of Technology

Melbourne, Australia jbrownlee@ict.swin.edu.au

JASON BROWNLEE

Abstract-Learning Classifier Systems are a machine learning technique that may be categorised in between symbolic production systems and sub-symbolic connectionist systems. Classifier systems are a cognitive paradigm for adaptation that learn in environments of perpetual novelty with minimal and delayed reward. They employ two principal processes: (1) reinforcement learning, known as 'trial-and-error', and (2) genetic evolution, known as 'survival-of-the-fittest'. This work provides a brief review of classifier systems with a focus on the principles of the learning paradigm.

Keywords-Learning Classifier System, LCS, Reinforcement Learning

I. INTRODUCTION

Learning Classifier Systems (LCS) are a machine learning approach that employs reinforcement learning and a genetic algorithm to evolve a set of binary encoded rules. They are traditionally applied to fields including autonomous robot navigation, supervised classification, and data mining. LCS were proposed in the late 1970s. The complexity of the approach developed during the 1980s led to a loss of interest during the early 1990s, although interest resurged, likely due to the proposal of the simplified and effective XCS variation. This work provides a brief review of LCS, with a focus on the principles of the paradigm and its inception by John Holland. For relevant and recent books on the topic, see [15,16,24] and a seminal reference [26].

II. HOLLAND'S CLASSIFIER SYSTEMS

Classifier systems are message-passing rule-based systems that learn using credit assignment (the bucket brigade algorithm) and rule discovery (the genetic algorithm) [14]. They are suited for problems with the following characteristics:

1. Perpetually novel events with large amounts of noise

2. Continual, real-time requirements for action

3. Implicitly or inexactly defined goals

4. Sparse payoff or reinforcement obtainable only through long sequences of tasks

Classifier systems were proposed by John Holland [4,9], and later standardised [5]. In the 1992 edition of his seminal work on Adaptation [12], Holland suggested that genetic-based classifier learning systems were proposed to investigate learning in problem domains that were


characteristic of most learning situations for animals and humans.

How does a system improve its performance in a perpetually novel environment where overt ratings of performance are only rarely available? ([12] pg. 172)

In [9] Holland presents his classifier system as a (computationally complete) cognitive system with four elements: (1) a set of interacting elementary units called classifiers; (2) a performance algorithm that directs the action of the system in the environment; (3) a simple learning algorithm that keeps track of each classifier's success in receiving rewards; and (4) a more complex learning algorithm that generates new classifiers, such that good classifiers persist and new variants of good classifiers are proposed. The result is that the system generates an experience-based cognitive map that lets the system lookahead and assign credit during non-reward intervals.

The actors of the system include (1) detectors, (2) messages, (3) effectors, (4) feedback, and (5) classifiers.

Detectors: Used by the system to perceive the state of the environment.

Messages: Information passed from the detectors into the system as discrete information packets. The system performs information processing on messages, and messages may directly result in actions in the environment.

Effectors: Control the system's actions on and within the environment.

Feedback: In addition to the system actively perceiving via its detectors, it may also receive directed feedback from the environment (payoff).

Classifier: A condition-action rule that provides a filter for messages. If a message satisfies the conditional part of the classifier, the action of the classifier triggers. Rules act as message processors.

Figure 1 - Summary of the principle components of Holland's classifier system

Messages are defined as bit strings of length k using a binary alphabet. A classifier is defined using a trinary alphabet of {1, 0, #}, where the # represents 'do not care'. A classifier's string has at least two parts: one or more conditional parts, and an action part.

1##01#10/001011 [condition] / [action]

Messages enter the system and are placed onto a message list, then a matching process occurs in which classifiers seek activation by matching their strings against all messages in the list. Those that are activated compete, and the winners post their messages onto the message list. Thus the message list provides a blackboard for communication both internally (between classifiers) and externally (detectors and effectors).

The # symbol of a classifier defines the generality or specificity of the conditional part of the classifier. In addition, the condition of a classifier may be negated (NOT), thus messages may be assigned a sign (+/-). After competition, more than one winning-activated rule may act concurrently. This feature facilitates emergent concepts of the environment innate in the collective of rules, and a transfer of experience to new situations.
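The ternary matching rule can be sketched as follows (an illustrative Python sketch, not code from the report; the function name `matches` is my own):

```python
# Sketch: matching a ternary classifier condition against a binary message,
# where '#' is a "don't care" position that matches either bit.

def matches(condition: str, message: str) -> bool:
    """True if every non-'#' bit of the condition equals the message bit."""
    return all(c == '#' or c == m for c, m in zip(condition, message))

# A classifier is a condition/action pair, as in the 1##01#10/001011 example.
condition, action = "1##01#10/001011".split("/")

print(matches(condition, "10001010"))  # True: '#' positions are ignored
print(matches(condition, "00001010"))  # False: first bit must be '1'
```

A condition of all '#' symbols is maximally general (it matches every message), while a condition with no '#' symbols matches exactly one message.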

The classifier system executes in discrete cycles, as follows:

1) Messages from the environment are placed on the message list

2) The conditions of each classifier are checked to see if they are satisfied by at least one message in the message list

3) All classifiers that are satisfied participate in a competition, those that win post their action to the message list

4) All messages directed to the effectors are executed (causing actions in the environment)

5) All messages on the message list from the previous cycle are deleted (messages persist for a single cycle)

Figure 2 - Classifier system execution cycle (from [12] pg.175)
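The five-step cycle above can be sketched in Python (a minimal sketch under simplifying assumptions: the `Classifier` class is hypothetical, and the competition of step 3 is reduced to letting every satisfied classifier fire):

```python
from dataclasses import dataclass

@dataclass
class Classifier:
    condition: str  # ternary string over {1, 0, #}
    action: str     # binary message to post when triggered

def matches(condition, message):
    return all(c == '#' or c == m for c, m in zip(condition, message))

def cycle(classifiers, env_messages):
    """One execution cycle: post environment messages, match, fire,
    and return the new message list (old messages persist one cycle only)."""
    message_list = list(env_messages)                      # step 1
    satisfied = [cl for cl in classifiers                  # step 2
                 if any(matches(cl.condition, m) for m in message_list)]
    # step 3: competition omitted here -- every satisfied classifier fires
    new_messages = [cl.action for cl in satisfied]
    # steps 4-5: effector messages would be executed; the old list is discarded
    return new_messages

rules = [Classifier("1#", "00"), Classifier("0#", "11")]
print(cycle(rules, ["10"]))   # ['00'] -- only the first rule matches
```

Feeding the returned list back in as the next cycle's messages yields the chained, forward-firing behaviour described in the text.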

Competition is used to determine which of the activated classifiers are triggered for a given execution cycle of the system. Each classifier is assigned a strength that summarises that classifier's usefulness in past cycles. Thus, a classifier may be considered an IF-THEN hypothesis where the strength describes the validity of the hypothesis as defined by the system's experience. Competition occurs in a bidding process (bidding function) that includes not only the strength of the classifier, but also its specificity. Winners are selected probabilistically proportional to their bid, such that stronger rules are more likely to win the competition. An important problem is that of rating rules, referred to as credit assignment. This problem is complicated because feedback from the environment regarding the system's (holistic) performance is provided intermittently. Further, this problem is of particular concern early in the system's genesis, before any feedback has been received from the environment. Credit assignment is addressed using the bucket brigade credit assignment algorithm, which provides a way of distributing payoff if and when it is received from the environment.
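A plausible sketch of the bidding computation (the bid coefficient `k`, the specificity measure, and the function names are illustrative assumptions, not taken from the report):

```python
import random

# Sketch of the bidding competition: each active classifier bids in
# proportion to its strength and its specificity (the fraction of
# non-'#' positions in its condition).

def specificity(condition: str) -> float:
    return sum(c != '#' for c in condition) / len(condition)

def bid(strength: float, condition: str, k: float = 0.1) -> float:
    return k * strength * specificity(condition)

def select_winner(active, rng=random):
    """Probabilistic selection proportional to bid; `active` maps a
    classifier name -> (strength, condition)."""
    names = list(active)
    bids = [bid(s, c) for s, c in active.values()]
    return rng.choices(names, weights=bids, k=1)[0]

active = {"general": (100.0, "1###"), "specific": (100.0, "1010")}
winner = select_winner(active)
# With equal strengths, the fully specific rule bids 4x the 1/4-specific rule,
# so it wins the competition more often, but not always.
```

The probabilistic (rather than deterministic) selection is what lets weaker and more general rules occasionally fire, preserving exploration.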

In this algorithm, a classifier may be a consumer and/or a supplier of payoff. A classifier may consume the message output from another classifier from the previous cycle. When a classifier consumes a message and wins a competition, the supplier of that message receives payment, which comes at a personal cost to the winning classifier. Thus, the winning bid made by a classifier during a competition is subtracted from the winner and paid to the supplier of the message. When the system is given feedback from the environment, it is distributed to all currently active rules, increasing their strength. In this way, the environment reinforces the current classifiers' strengths, ensuring that only those classifiers in the chain that received reinforcement are made stronger. The system requires multiple plays of the game (many cycles with environment reinforcement) to stabilise a coherent classifier set. Interestingly, this process requires no overt memory; rather, memory is implicit in the system.
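The payment flow can be sketched as follows (a hypothetical `Rule` class; the report does not prescribe this exact code):

```python
# Sketch of one bucket brigade step: a winner pays its bid to the supplier
# of the message it consumed; environmental reward, when present, goes to
# the currently active rule.

class Rule:
    def __init__(self, name, strength):
        self.name, self.strength = name, strength

def bucket_brigade_step(winner, supplier, bid, reward=0.0):
    winner.strength -= bid        # the winning bid is a personal cost
    winner.strength += reward     # the environment pays active rules, if any
    if supplier is not None:
        supplier.strength += bid  # the supplier is paid for its message

early, late = Rule("early", 50.0), Rule("late", 50.0)
bucket_brigade_step(late, early, bid=5.0, reward=10.0)
print(early.strength, late.strength)   # 55.0 55.0
```

Repeated over many cycles, reward received at the end of a chain is passed backwards, one supplier at a time, to the early rules that set the chain up.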

A final major concern is that of getting new rules into the system, referred to as rule discovery. The system needs an effective way of proposing plausible classifiers to replace low-strength classifiers. In fact, the system's performance as an induction system is dependent on its ability to propose plausible replacement rules. The principle employed is that the system's experience biases the generation of replacement rules. A genetic algorithm is employed that probabilistically selects and recombines rules from the current set to propose replacement rules. The fitness of a classifier is defined by its usefulness (strength), which, as has been discussed, is an experience-dependent guideline or estimation with inherent errors. Thus, careful consideration is required regarding the genetic algorithm's selection strength, and the rules for selecting the classifiers in the system to replace.
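A sketch of such genetic rule discovery over the {1, 0, #} alphabet (strength-proportionate selection, one-point crossover, and point mutation are standard operator choices; the specific rates and function names here are illustrative, not the report's):

```python
import random

# Sketch of genetic rule discovery: select two parents in proportion to
# strength, recombine their ternary condition strings with one-point
# crossover, and apply point mutation over the {1, 0, #} alphabet.

def crossover(a: str, b: str, rng=random) -> str:
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(rule: str, rate: float = 0.05, rng=random) -> str:
    alphabet = "10#"
    return "".join(rng.choice(alphabet) if rng.random() < rate else c
                   for c in rule)

def discover(population, rng=random):
    """population maps rule string -> strength; returns one offspring rule."""
    rules, strengths = zip(*population.items())
    a, b = rng.choices(rules, weights=strengths, k=2)
    return mutate(crossover(a, b, rng), rng=rng)

pop = {"1##0": 40.0, "0#1#": 10.0}
child = discover(pop)   # offspring over the same alphabet and length
```

In a full system the offspring would replace a low-strength classifier, keeping the population size constant.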

Holistically, the system provides an incremental way of modelling an environment, where the system perpetually gains experience and tries new rules. Thus, the system is designed to continually adapt to the environment, attempting to balance exploration (acquisition of new information and capabilities) and exploitation (the efficient use of information and capabilities already established) [12].

Figure 3 - Overview of Holland's classifier system ([6], pg. 191)

In their seminal work on induction and inductive systems [10], Holland, et al. propose three important properties of classifier systems: parallelism, message passing, and the system's lack of reliance on interpreters, as follows:

Parallelism: Large numbers of classifiers can be active at the same time. There is no need to schedule rules because classifiers may only post messages to the message list. Rules are used as building blocks, where the activation of multiple rules encapsulates concepts, which may be acted upon in the domain.

Message Passing: All communication in and out of the system is performed in messages; this includes input messages from the environment, and action messages from triggered rules.

Lack of Interpreters: Interaction is based on messages, and message triggering is based on matching, thus there is no need for high-level interpreters. The system is modular and graceful, and it is possible to add new candidate classifiers to the system without global disruption.

Figure 4 - Summary of the important properties of classifier systems (from [10] pg. 103-104)

Holland proceeds to suggest that the general classifier system may be augmented in various ways, such as with specialized algorithms for planning, learning, and inference, such that it may be applied to varied domains. He provides a number of handcrafted examples, as follows: simple stimulus-response classifiers (if-then rules), rules for encoding relations (a compound object identified by the triggering of multiple rules), simple memory (an internal alert status for a specific event), building blocks (combining a number of active rules to handle complex situations), and networks of tagged classifiers (networks as hierarchies of classifiers, where classifiers are coupled using tags). This relationship is discussed by Booker, et al. [14], and by Holland in his proposal of tags for augmenting the lookahead and anticipatory properties of classifier systems [6]. Holland proposes that a lookahead internal model of the world is required to handle the constant flow of performance-unrelated information from the environment, and that classifier systems construct lookahead models from sets of rules, where the anticipatory feature is emergent.

Both symbolic and sub-symbolic systems suffer from the same problem in categorizing signals from the environment: they can only categorise signals that can be distinguished. In order to lookahead, Holland proposes, a system requires internal stimulus as well as external.

David Goldberg provided perhaps the seminal application of Holland's classifier systems (bucket brigade and genetic algorithm) in his Ph.D. dissertation [3]. A classifier system was designed to control a cart-and-pole balancing problem, and to learn to regulate a simulated gas-pipeline system during normal summer and winter periods, and to detect gas leaks. The system's performance was achieved through effective learning from a random starting point, with increasing effectiveness, and without implanted expert knowledge.

Other classical example applications include Forrest's application of a version of the classifier system to the classification of knowledge in KL-ONE semantic networks [31], and Smith's development of the LS-1 system investigated in the maze problem and poker [35]. Booker's work investigated the cognitive connections between classifier systems and the adaptive behaviour of an artificial creature in a two-dimensional environment [17]. Finally, there is Wilson's work with artificial animals called 'Animats' [36,37].

Belew and Forrest explored the relationship between classifier systems and symbolic learning systems in work that proposes hybrids between classifier systems and symbolic systems [28]. Production systems and expert systems are classic examples of symbolic-based learning systems, in which knowledge is explicitly encoded, stored, and employed for reasoning. Such systems have good lookahead capabilities using means-ends analyses, although they are poor at the autonomous construction of an experience-based model. These symbolic systems may be contrasted with sub-symbolic approaches, of which connectionist models (neural networks) are a typical example. Such systems construct internal models from provided example data, and thus are good at the autonomous construction of experience-based knowledge models, although they are not good at organizing models that guide the system by means-ends analysis, lookahead, and anticipation.

Classifier systems are representative of a middle ground between symbolic and sub-symbolic learning approaches. They are rule-based and use lookahead-oriented means-ends analysis, although they use a sub-symbolic representation and autonomous inductive learning processes, exploiting the experience-based learning and parallel processing properties of connectionist approaches, and the anticipatory properties of expert systems.

For the system to be effective at looking ahead, the system needs to be working on the problem in the absence of inputs, via so-called re-entrant connections, or re-circulation of input pulses. This is a feature possessed by connectionist approaches, and by classifier systems in their ability to continue to process information and assign payoff (via the bucket brigade algorithm) in the presence of infrequent environmental feedback.

III. LEARNING CLASSIFIER SYSTEMS

Learning Classifier Systems (LCS) refers to a whole class of adaptive, learning systems based on the principles of Holland's original proposal: a cognitive paradigm (a pattern of thinking), so-called Genetics-Based Machine Learning (GBML) [2]. See [25] and [14] for reviews of classical classifier systems, and [39] for an effective review of the field in the 10 years after its inception. In summary, learning classifier systems are defined with regard to two key principles: 'survival of the fittest' (the fields of evolutionary computation, artificial life, and adaptive behaviour), in which interactions with an unknown environment trigger adaptation in a system, and 'trial and error', in which the system learns by seeking to maximise rewards (reinforcement learning). An excellent summary of what learning classifier systems are is provided in [11] by the past and present leaders in the field. Among the many comments regarding the history, application, and theory of classifier systems is Riolo's insightful summation of the principle characteristics of an LCS in the context of modelling complex systems, as follows:

Rules: A rule-based representation of knowledge.

Message Board: A place to add and remove messages.

Competition: A competition for rules to become active based on inputs, past performance, and predictions of future expected outcomes.

Parallel Firing of Rules: Consistency and coordination emerge from the dynamics of the bidding process. Explicit conflict resolution occurs only with the effectors.

Credit Assignment: Credit is assigned using temporal-difference (TD) learning methods such as bucket brigade, profit sharing, Q-learning, and mixtures of such schemes operating at the same time.

Rule Discovery: New rules are discovered using heuristics appropriate to the specific application, with genetic algorithms as the traditional choice.

Figure 5 - Summary of the principle characteristics of LCS (Riolo [11])

From the same work, Smith provides a definition of a classifier system that is independent of syntax or implementation details, as a system defined by a population of entities that act individually, responding to and taking actions on an external environment, while evolving as population members under the action of evolutionary computation.

A review of the field is provided in [8]; in that review, three arguments are provided in support of classifier systems. They are as follows:

three arguments are provided in support of Adaptabilityrapidly over different time scales, such on data collected over weeks and months, on changing : Classifier environments. systems They are adaptive, have demonstrated capable of online their adaptiveness learning in data collected over days, and in real-time scenarios.

Generalizationgeneralise, which includes (1) the ability to represent what it has learned in a : An important feature of learning systems is their ability to compact form, and (2) apply what it has learned to unseen situations.

Scalabilitythe does size of the system increase with the increase in problem complexity? The relationship : An important feature of a learning system is an understanding of between system size and problem complexity; how rapidly question system of scalability is difficult to increases as a although, there is evidence to address suggest given that that the complexity for XCS, the of size the than with the size of the complexity of the problem.

low-order polynomial of the complexity of the problem rather Figure 6 - Summary of the supportive arguments for LCS (from [8])

A. Michigan and Pittsburgh

The number of variations and types of classifiers is enormous, although given this enormity, two main streams of classifier research have emerged, as demarcated by De Jong [13]. They are the classical (Holland) Michigan-style classifier system and the Pittsburgh learning classifier system of Smith [32,35].

Michigan Approach: (classical) The traditional approach, based on the principle of using a genetic algorithm to develop new classifiers, and relying on the bucket brigade algorithm to encourage systems of these classifiers to work together. Suitable for online learning in environments where radical changes in behaviour cannot be tolerated.

Pittsburgh Approach: ('Pitt') An approach in which a population of classifiers are concatenated together to form an individual to be operated upon by the genetic algorithm. Suitable for offline learning in environments where larger changes in behaviour may be tolerated.

Figure 7 - Summary of the two main learning classifier systems

Wilson proposed two variations of the Michigan-style learning classifier which have become ubiquitously employed in the field: the ZCS [33], which is a simpler variation of the system for investigation, and the XCS [34,38], which is an effective classifier for general application.

ZCS: (zeroth-level classifier system) A simpler variation of the Michigan-style classifier system in which there is no message list, and a Q-learning-like reinforcement learning algorithm called QBB is used.

XCS: (accuracy-based) An archetypal LCS used in application, with an adaptation of Q-learning (different to the bucket brigade hybrid algorithm used in ZCS) in which maximally general yet accurate classifiers are favoured. In addition, credit is assigned based upon the accuracy (usefulness) of classifiers (as opposed to the predicted reward), and the genetic algorithm is applied to subsets of classifiers (called environmental niches) that apply to the same situations (rather than the entire population).

Figure 8 - The two variations of the Michigan learning classifier system

In [11], Wilson summarises the two major tensions (problems) with classical learning classifiers: (1) the trade-off between the cooperation needed to form chains of classifiers and the competition between classifiers, and (2) the trade-off between the generality of classifiers and the performance gains of specific classifiers. His proposed solution (in the form of the XCS learning classifier) is to employ a niching approach for matching classifiers, and to use fitness based on accuracy, such that niches do not over-generalize. He claims that the combination of these two features overcomes the limitations of the first 20 years of learning classifier research.

of the first 20 years of learning classifier Problem

Solution

Cooperative vs. Competitive GA amongst similar classifiers Performance vs. Generality

Fitness based on accuracy

Table 1 - Summary of the problems and the their solutions in XCS

B. Reinforcement Learning

Reinforcement learning (RL) is an area of study in the field of machine learning and is characterised by a system (or agent) learning to perform a task through 'trial-and-error' in an environment that provides minimal feedback in the form of delayed rewards. Kaelbling, et al. [18] provide a seminal treatment of reinforcement learning. Sutton and Barto [29] provide a seminal treatment of Temporal Difference learning (TD) that includes techniques such as adaptive heuristic critic, Q-learning, and TD(lambda).
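For reference, the standard tabular Q-learning update mentioned above can be written as a short sketch (generic textbook form, not specific to any classifier system; state and action names are illustrative):

```python
from collections import defaultdict

# Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
def q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    target = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

Q = defaultdict(float)  # unseen state-action pairs default to 0.0
q_update(Q, s=0, a="left", r=1.0, s_next=1, actions=["left", "right"])
print(Q[(0, "left")])   # 0.5 * (1.0 + 0.9 * 0 - 0) = 0.5
```

XCS adapts this style of update to the strengths of its classifiers, in place of the chained payments of the bucket brigade.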

Moriarty, et al. [1] review the role of evolutionary algorithms in reinforcement learning, referring to the field of study as Evolutionary Algorithms for Reinforcement Learning (EARL). They suggest two main thrusts of RL research: (1) value function methods (such as TD), and (2) evolutionary methods that search policy space. Searching policy space involves using search operators to modify explicit representations of policies. In searching value function space, no explicit policies are represented; rather, the process involves learning a suitable value function that maximizes rewards. The two approaches are complementary, with differing objectives and applications. Generally, the EARL approach pays less attention to individual decisions than TD learning, and provides a robust path to designing good policies in the face of noisy and incomplete information. From a reinforcement learning perspective, learning classifiers (as a holistic system) are a policy-searching approach in which a policy is decomposed into a number of distributed subtasks. The search process operates upon the subtasks (rather than a monolithic policy), providing finer granularity of the search over the representation. In addition, the distributed representation facilitates the integration of fine-grained domain knowledge regarding the sub-tasks.

In addition, see Kovacs [40] on the two perspectives of learning classifier systems: evolutionary algorithms and reinforcement learning.

C. Addendum

Learning classifier systems evolve an interconnected set of rules that allow an agent to take actions in a domain through a process of forward chaining a parallel rule set. This permits the system to execute sequences of actions in chains.

An important feature of LCS is their ability to develop overlapping sets of rules called default hierarchies [10] (also see [30]). Default hierarchies increase rule set parsimony, enlarge the solution set, and lend themselves to graceful refinement by the genetic algorithm [27]. They allow default (general) rules that are mostly correct to be enhanced through exception (specific) rules. The LCS organizes default hierarchies by favouring the more specific exception rules over the more general default rules in the bucket brigade credit assignment scheme, and during competition.

A default hierarchy is an abstract concept to describe the form in which a domain is described: from a high-level general concept down to low-level, specific cases. From a cognitive perspective, a default hierarchy is a quasi-homomorphic (overlapping rule-complexes) model of the world, which is typically more compact (fewer rules) than a homomorphic (non-overlapping structure-preserving map) model [7,20].
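The preference for exception rules over default rules can be sketched as follows (an illustrative conflict-resolution sketch; a real LCS implements this bias through bidding and credit assignment rather than a direct specificity comparison):

```python
# Sketch of default-hierarchy resolution: when both a general default rule
# and a specific exception rule match a message, the more specific rule
# (fewer '#' positions) is preferred.

def matches(condition, message):
    return all(c == '#' or c == m for c, m in zip(condition, message))

def specificity(condition):
    return sum(c != '#' for c in condition)

def resolve(rules, message):
    """rules: list of (condition, action); returns the action of the most
    specific matching rule, so the exception wins over the default."""
    matching = [r for r in rules if matches(r[0], message)]
    return max(matching, key=lambda r: specificity(r[0]))[1]

rules = [("####", "default_action"), ("11##", "exception_action")]
print(resolve(rules, "1100"))   # exception_action: the exception overrides
print(resolve(rules, "0011"))   # default_action: only the default applies
```

The two rules overlap: the default covers all messages, while the exception carves out the cases where the default would be wrong, which is what keeps the rule set parsimonious.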

Finally, ALECSYS is a parallel learning classifier system in which a problem is explicitly decomposed into a number of tasks, and a LCS is assigned to each of the tasks in a hierarchy [23]. The architecture facilitates low-level parallelism at the level of a single LCS, and high-level parallelism in that multiple LCS work together on independent tasks to address a large problem.

Figure 9 - Depiction of a distributed LCS (taken from [1] pg. 253)

A switch architecture is used, such that messages and responsibility are discriminated and allocated to sub-LCS. Behaviour shaping is employed to train the switch architecture, where two distinct learning processes may be used: (1) a holistic approach that allows responsibility to emerge, and (2) a modular approach in which the individual systems and the discriminator are trained separately, with the remainder of the system frozen whilst each is trained. The architecture was employed for autonomous robot tasks, both in simulation and with real robots [21]. From this work, Dorigo and Colombetti propose a methodology for Behaviour Analysis and Training (BAT) in autonomous robots, with ALECSYS as a representative implementation [19,22].

ACKNOWLEDGMENTS

Tim Hendtlass for his patience and for providing useful feedback on drafts of this paper.

REFERENCES

[1] D. E. Moriarty, A. C. Schultz, and J. J. Grefenstette, "Evolutionary Algorithms for Reinforcement Learning," Journal of Artificial Intelligence Research, vol. 11, pp. 199-229, 1999.

[2] David E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning, USA, Canada: Addison Wesley Publishing Company, Inc., 1989.

[3] David Edward Goldberg, Computer-Aided Gas Pipeline Operation Using Genetic Algorithms and Rule Learning, University of Michigan, 1983.

[4] J. H. Holland. Adaptation. In: Progress in Theoretical Biology IV, eds. R. Rosen and F. Snell. Academic Press, 1976, pp. 263-293.

[5] J. H. Holland, "Adaptive algorithms for discovering and using general patterns in growing knowledge-bases," International Journal of Policy Analysis and Information Systems, vol. 4, pp. 217-240, 1980.

[6] J. H. Holland, "Concerning the emergence of tag-mediated lookahead in classifier systems," Proceedings of the ninth annual international conference of the Center for Nonlinear Studies on Self-organizing, Collective, and Cooperative Phenomena in Natural and Artificial Computing Networks on Emergent computation, Los Alamos, New Mexico, United States, pp. 188-201, 1990.

[7] J. H. Holland, K. J. Holyoak, R. E. Nisbett, and P. R. Thagard. Classifier systems, Q-morphisms, and induction. In: Genetic Algorithms and Simulated Annealing, ed. L. D. Davis. London: Pittman, 1987, pp. 116-128.

[8] J. H. Holmes, P. L. Lanzi, W. Stolzmann, and S. W. Wilson, "Learning classifier systems: New models, successful applications," Information Processing Letters, vol. 82, pp. 23-30, Apr, 2002.

[9] John H. Holland and Judith S. Reitman, "Cognitive systems based on adaptive algorithms," ACM SIGART Bulletin, vol. 63, pp. 49-49, Jun, 1977.

[10] John H. Holland, Keith J. Holyoak, Richard E. Nisbett, and Paul R. Thagard. Induction: processes of inference, learning, and discovery, Cambridge, Mass, USA: MIT Press, 1986.

[11] John H. Holland, Lashon B. Booker, Marco Colombetti, Marco Dorigo, David E. Goldberg, Stephanie Forrest, Rick L. Riolo, Robert E. Smith, Pier Luca Lanzi, Wolfgang Stolzmann, and Stewart W. Wilson. What is a Learning Classifier System? In: Learning Classifier Systems: From Foundations to Applications, Berlin / Heidelberg: Springer, 2000, pp. 3.

[12] John Henry Holland. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, USA: MIT Press, 1992.

[13] Kenneth de Jong, "Learning with Genetic Algorithms: An Overview," Machine Learning, vol. 3, pp. 121-138, Oct, 1988.

[14] L. B. Booker, D. E. Goldberg, and J. H. Holland, "Classifier systems and genetic algorithms," Artificial Intelligence, vol. 40, pp. 235-282, Sep, 1989.

[15] Larry Bull. Applications Of Learning Classifier Systems, Springer, 2004.

[16] Larry Bull and Tim Kovacs. Foundations of Learning Classifier Systems (Studies in Fuzziness and Soft Computing), Springer, 2005.

[17] Lashon Bernard Booker, Intelligent behavior as an adaptation to the task environment, University of Michigan, 1982.

[18] Leslie Pack Kaelbling, Michael L. Littman, and Andrew W. Moore, "Reinforcement learning: a survey," Journal of Artificial Intelligence Research, vol. 4, pp. 237-285, 1996.

[19] M. Colombetti, M. Dorigo, and G. Borghi, "Behavior analysis and training - a methodology for behavior engineering," IEEE Transactions on Systems, Man and Cybernetics, Part B, vol. 26, pp. 365-380, Jun, 1996.

[20] Marco Dorigo, "New Perspectives about Default Hierarchies Formation in Learning Classifier Systems," Proceedings of the 2nd Congress of the Italian Association for Artificial Intelligence on Trends in Artificial Intelligence, pp. 218-227, 1991.

[21] Marco Dorigo, "Alecsys and the AutonoMouse: Learning to control a real robot by distributed classifier systems," Machine Learning, vol. 19, pp. 209-240, Jun, 1995.

[22] Marco Dorigo and Marco Colombetti. Robot Shaping: an experiment in behavior engineering, USA: MIT Press, 1998.

[23] Marco Dorigo, E. S., "Alecsys: A Parallel Laboratory for Learning Classifier Systems," Proceedings of the 4th International Conference on Genetic Algorithms, San Diego, CA, USA, pp. 296-302, 1991.

[24] Martin V. Butz. Anticipatory Learning Classifier Systems (Genetic Algorithms and Evolutionary Computation), Springer, 2002.

[25] Pier Luca Lanzi and Rick L. Riolo. A Roadmap to the Last Decade of Learning Classifier System Research. In: Learning Classifier Systems: From Foundations to Applications, Berlin / Heidelberg: Springer, 2000, pp. 33.

[26] Pier Luca Lanzi, Wolfgang Stolzmann, and Stewart W. Wilson. Learning Classifier Systems: From Foundations to Applications, Berlin / Heidelberg: Springer, 2000.

[27] R. E. Smith and D. E. Goldberg, "Variable Default Hierarchy Separation in a Classifier System," Foundations of Genetic Algorithms, Bloomington campus, Indiana University, pp. 148, 1991.

[28] Richard K. Belew and Stephanie Forrest, "Learning and programming in classifier systems," Machine Learning, vol. 3, pp. 193-223, Oct, 1988.

[29] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction, Cambridge, Massachusetts; London, England: The MIT Press, 1998.

[30] Rick L. Riolo, "The emergence of default hierarchies in learning classifier systems," Proceedings of the third international conference on Genetic algorithms, George Mason University, United States, pp. 322-327, 1989.

[31] S. Forrest, Study of parallelism in the Classifier system and its application to classification in KL-ONE semantic networks, University of Michigan, 1985.

[32] S. Smith, "Flexible Learning of Problem Solving Heuristics Through Adaptive Search," Proceedings 8th International Joint Conference on Artificial Intelligence, West Germany, pp. 422-425, 1983.

[33] S. W. Wilson, "ZCS: A Zeroth Level Classifier System," Evolutionary Computation, vol. 2, pp. 1-18, Spring, 1994.

[34] S. W. Wilson, "Classifier Fitness Based on Accuracy," Evolutionary Computation, vol. 3, pp. 149-175, 1995.

[35] Stephen Frederick Smith, A learning system based on genetic adaptive algorithms, Department of Computer Science, University of Pittsburgh, 1980.

[36] Stewart W. Wilson, "Knowledge Growth in an Artificial Animal," Proceedings of the 1st International Conference on Genetic Algorithms, pp. 16-23, 1985.

[37] Stewart W. Wilson, "Classifier systems and the animat problem," Machine Learning, vol. 2, pp. 199-228, Nov, 1987.

[38] Stewart W. Wilson. State of XCS Classifier System Research. In: Learning Classifier Systems: From Foundations to Applications, Berlin / Heidelberg: Springer, 2000, pp. 63.

[39] Stewart W. Wilson and David E. Goldberg, "A critical review of classifier systems," Proceedings of the third international conference on Genetic algorithms, George Mason University, United States, pp. 244-255, 1989.

[40] Tim Kovacs, "Two Views of Classifier Systems," Advances in Learning Classifier Systems: 4th International Workshop, IWLCS 2001 (Revised Papers), San Francisco, CA, USA, pp. 74, 2002.
