logo资料库

NSGA3:AnEvolutionaryMany-ObjectiveOptimization.pdf

第1页 / 共23页
第2页 / 共23页
第3页 / 共23页
第4页 / 共23页
第5页 / 共23页
第6页 / 共23页
第7页 / 共23页
第8页 / 共23页
资料共23页,剩余部分请下载后查看
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. 1 An Evolutionary Many-Objective Optimization Algorithm Using Reference-point Based Non-dominated Sorting Approach, Part I: Solving Problems with Box Constraints Kalyanmoy Deb, Fellow, IEEE and Himanshu Jain Abstract—Having developed multi-objective optimization al- gorithms using evolutionary optimization methods and demon- strated their niche on various practical problems involving mostly two and three objectives, there is now a growing need for developing evolutionary multi-objective optimization (EMO) algorithms for handling many-objective (having four or more objectives) optimization problems. In this paper, we recognize a few recent efforts and discuss a number of viable directions for developing a potential EMO algorithm for solving many-objective optimization problems. Thereafter, we suggest a reference-point based many-objective NSGA-II (we call it NSGA-III) that em- phasizes population members which are non-dominated yet close to a set of supplied reference points. The proposed NSGA-III is applied to a number of many-objective test problems having two to 15 objectives and compared with two versions of a recently suggested EMO algorithm (MOEA/D). While each of the two MOEA/D methods works well on different classes of problems, the proposed NSGA-III is found to produce satisfactory results on all problems considered in this study. This paper presents results on unconstrained problems and the sequel paper considers constrained and other specialties in handling many-objective optimization problems. Index Terms—Many-objective optimization, evolutionary com- large dimension, NSGA-III, non-dominated sorting, putation, multi-criterion optimization. I. INTRODUCTION Evolutionary multi-objective optimization (EMO) method- ologies have amply shown their niche in finding a set of well-converged and well-diversified non-dominated solutions in different two and three-objective optimization problems since the beginning of nineties. However, in most real-world problems involving multiple stake-holders and functionalities, there often exists many optimization problems that involve four or more objectives, sometimes demanding to have 10 to 15 objectives [1], [2]. Thus, it is not surprising that handling a large number of objectives had been one of the main research activities in EMO for the past few years. Many-objective K. Deb is with Department of Electrical and Computer Engineering. Michigan State University, 428 S. Shaw Lane, 2120 EB, East Lansing, MI 48824, USA, e-mail: kdeb@egr.msu.edu (see http://www.egr.msu.edu/˜kdeb). Prof. Deb is also a visiting professor at Aalto University School of Business, Finland and University of Sk¨ovde, Sweden. H. Jain is with Eaton Corporation, Pune 411014, India, email: himan- shu.j689@gmail.com. Copyright (c) 2012 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org. problems pose a number of challenges to any optimization algorithm, including an EMO. First and foremost, the propor- tion of non-dominated solutions in a randomly chosen set of objective vectors becomes exponentially large with an increase of number of objectives. Since the non-dominated solutions occupy most of the population slots, any elite-preserving EMO faces a difficulty in accommodating adequate number of new solutions in the population. This slows down the search process considerably [3], [4]. Second, implementation of a diversity-preservation operator (such as the crowding distance operator [5] or clustering operator [6]) becomes a computationally expensive operation. Third, visualization of a large-dimensional front becomes a difficult task, thereby causing difficulties in subsequent decision-making tasks and in evaluating the performance of an algorithm. For this purpose, the performance metrics (such as hyper-volume measure [7] or other metrics [3], [8]) are either computationally too expensive or may not be meaningful. An important question to ask is then ‘Are EMOs useful for many-objective optimization problems?’. Although the third difficulty related to visualization and performance measures mentioned above cannot be avoided, some algorithmic changes to the existing EMO algorithms may be possible to address the first two concerns. In this paper, we review some of the past efforts [9], [10], [11], [12], [13], [14] in devising many-objective EMOs and outline some viable directions for designing efficient many-objective EMO methodologies. Thereafter, we propose a new method that uses the framework of NSGA-II procedure [5] but works with a set of supplied or predefined reference points and demonstrate its efficacy in solving two to 15-objective optimization problems. In this paper, we introduce the framework and restrict to solving unconstrained problems of various kinds, such as having normalized, scaled, convex, concave, disjointed, and focusing on a part of the Pareto-optimal front. Practice may offer a number of such properties to exist in a problem. Therefore, an adequate test of an algorithm for these eventualities remains as an important task. We compare the performance of the proposed NSGA-III with two versions of an existing many- objective EMO (MOEA/D [10]), as the method is somewhat similar to the proposed method. Interesting insights about working of both versions of MOEA/D and NSGA-III are revealed. The proposed NSGA-III is also evaluated for its use in a few other interesting multi-objective optimization and Copyright (c) 2013 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. decision-making tasks. In the sequel of this paper, we suggest an extension of the proposed NSGA-III for handling many- objective constrained optimization problems and a few other special and challenging many-objective problems. In the remainder of this paper, we first discuss the diffi- culties in solving many-objective optimization problems and then attempt to answer the question posed above about the usefulness of EMO algorithms in handling many objectives. Thereafter, in Section III, we present a review of some of the past studies on many-objective optimization including a recently proposed method MOEA/D [10]. Then, in Section IV, we outline our proposed NSGA-III procedure in detail. Results on normalized DTLZ test problems up to 15 objectives using NSGA-III and two versions of MOEA/D are presented next in Section V-A. Results on scaled version of DTLZ problems suggested here are shown next. Thereafter in subsequent sections, NSGA-III procedure is tested on different types of many-objective optimization problems. Finally, NSGA-III is applied to two practical problems involving three and nine objective problems in Section VII. Conclusions of this extensive study are drawn in Section VIII. II. MANY-OBJECTIVE PROBLEMS Loosely, many-objective problems are defined as problems having four or more objectives. Two and three-objective prob- lems fall into a different class as the resulting Pareto-optimal front, in most cases, in its totality can be comprehensively visualized using graphical means. Although a strict upper bound on the number of objectives for a many-objective optimization problem is not so clear, except a few occasions [15], most practitioners are interested in a maximum of 10 to 15 objectives. In this section, first, we discuss difficulties that an existing EMO algorithm may face in handling many- objective problems and investigate if EMO algorithms are useful at all in handling a large number of objectives. 2 may cause an unacceptable distribution of solutions at the end. 3) Recombination operation may be inefficient: In a many- objective problem, if only a handful of solutions are to be found in a large-dimensional space, solutions are likely to be widely distant from each other. In such a population, the effect of recombination operator (which is considered as a key search operator in an EMO) becomes questionable. Two distant parent solutions are likely to produce offspring solutions that are also distant from parents. Thus, special recombination operators (mating restriction or other schemes) may be necessary for handling many-objective problems efficiently. 4) Representation of trade-off surface is difficult: It is intuitive to realize that to represent a higher dimensional trade-off surface, exponentially more points are needed. Thus, a large population size is needed to represent the resulting Pareto-optimal front. This causes difficulty for a decision-maker to comprehend and make an adequate decision to choose a preferred solution. 5) Performance metrics are computationally expensive to compute: Since higher-dimensional sets of points are to be compared against each other to establish the performance of one algorithm against another, a larger computational effort is needed. For example, computing hyper-volume metric requires exponentially more com- putations with the number of objectives [18], [19]. 6) Visualization is difficult: Finally, although it is not a matter related to optimization directly, eventually visu- alization of a higher-dimensional trade-off front may be difficult for many-objective problems. The first three difficulties can only be alleviated by certain modifications to existing EMO methodologies. The fourth, fifth, and sixth difficulties are common to all many-objective optimization problems and we do not address them adequately here. A. Difficulties in Handling Many Objectives It has been discussed elsewhere [4], [16] that the current state-of-the-art EMO algorithms that work under the principle of domination [17] may face the following difficulties: 1) A large fraction of population is non-dominated: It is well-known [3], [16] that with an increase in number of objectives, an increasingly larger fraction of a ran- domly generated population becomes non-dominated. Since most EMO algorithms emphasize non-dominated solutions in a population, in handling many-objective problems, there is not much room for creating new solutions in a generation. This slows down the search process and therefore the overall EMO algorithm be- comes inefficient. 2) Evaluation of diversity measure becomes computation- ally expensive: To determine the extent of crowding of solutions in a population, the identification of neigh- bors becomes computationally expensive in a large- dimensional space. Any compromise or approximation in diversity estimate to make the computations faster B. EMO Methodologies for Handling Many-Objective Prob- lems Before we discuss the possible remedies for three difficulties mentioned above, here we highlight two different many- objective problem classes for which existing EMO method- ologies can still be used. First, existing EMO algorithms may still be useful in finding a preferred subset of solutions (a partial set) from the complete Pareto-optimal set. Although the preferred subset will still be many-dimensional, since the targeted solutions are focused in a small region on the Pareto-optimal front, most of the above difficulties will be alleviated by this principle. A number of MCDM-based EMO methodologies are already devised for this purpose and results on as large as 10-objective problems have shown to perform well [20], [21], [22], [23]. Second, many problems in practice, albeit having many objectives, often degenerate to result in a low-dimensional Pareto-optimal front [4], [11], [24], [25]. In such problems, identification of redundant objectives can be integrated with an EMO to find the Pareto-optimal front that is low-dimensional. Copyright (c) 2013 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Since the ultimate front is as low as two or three-dimensional, existing EMO methodologies should work well in handling such problems. A previous study of NSGA-II with a principal component analysis (PCA) based procedure [4] was able to solve as large as 50-objective problems having a two-objective Pareto-optimal front. C. Two Ideas for a Many-Objective EMO Keeping in mind the first three difficulties associated with a domination-based EMO procedure, two different strategies can be considered to alleviate the difficulties: 1) Use of a special domination principle: The first difficulty mentioned above can be alleviated by using a special domination principle that will adaptively discretize the Pareto-optimal front and find a well-distributed set of points. For example, the use of -domination principle [26], [27] will make all points within  distance from a set of Pareto-optimal points -dominated and hence the process will generate a finite number of Pareto-optimal points as target. Such a consideration will also alleviate the second difficulty of diversity preservation. The third difficulty can be taken care of by the use of a mating restriction scheme or a special recombination scheme in which near-parent solutions are emphasized (such as SBX with a large distribution index [28]). Other special domination principles [29], [30] can also be used for this purpose. Aguirre and Tanaka [31] and Sato et al. [32] suggested the use of a subset of objectives for dominance check and using a different combination of objectives in every generation. The use of fixed cone-domination [33], [34] or variable cone-domination [35] principles can also be tried. These studies were made in the context of low- dimensional problems and their success in solving many- objective optimization is yet to be established. 2) Use of a predefined multiple targeted search: It has been increasingly clear that it is too much to expect from a single population-based optimization algorithm to have convergence of its population near the Pareto- optimal front and simultaneously its distributed uni- formly around the entire front in a large-dimensional problem. One way to handle such many-objective opti- mization problems would be to aid the diversity mainte- nance issue by some external means. This principle can directly address the second difficulty mentioned above. Instead of searching the entire search space for Pareto- optimal solutions, multiple predefined targeted searches can be set by an algorithm. Since optimal points are found corresponding to each of the targeted search task, the first difficulty of dealing with a large non-dominated set is also alleviated. The recombination issue can be addressed by using a mating restriction scheme in which two solutions from neighboring targets are participated in the recombination operation. Our proposed algorithm (NSGA-III) is based on this principle and thus we discuss this aspect in somewhat more detail. We suggest two different ways to implement the predefined multiple targeted search principle: 3 a) A set of predefined search directions spanning the entire Pareto-optimal front can be specified beforehand and multiple searches can be performed along each direction. Since the search directions are widely distributed, the obtained optimal points are also likely to be widely distributed on the Pareto-optimal front in most problems. Recently proposed MOEA/D procedure [10] uses this con- cept. b) Instead of multiple search directions, multiple pre- defined reference points can be specified for this purpose. Thereafter, points corresponding to each reference point can be emphasized to find set of widely distributed set of Pareto-optimal points. A few such implementations were proposed recently [36], [37], [38], [14], and this paper suggests another approach extending and perfecting the al- gorithm proposed in the first reference [36]. III. EXISTING MANY-OBJECTIVE OPTIMIZATION ALGORITHMS Garza-Fabre, Pulido and Coello [16] suggested three single- objective measures by using differences in individual objective values between two competing parents and showed that in 5 to 50-objective DTLZ1, DTLZ3 and DTLZ6 problems the con- vergence property can get enhanced compared to a number of existing methods including the usual Pareto-dominance based EMO approaches. Purshouse and Fleming [39] clearly showed that diversity preservation and achieving convergence near the Pareto-optimal front are two contradictory goals and usual genetic operators are not adequate to attain both goals simul- taneously, particularly for many-objective problems. Another study [40] extends NSGA-II by adding diversity-controlling operators to solve six to 20-objective DTLZ2 problems. K¨oppen and Yoshida [41] claimed that NSGA-II procedure in its originality is not suitable for many-objective optimiza- tion problems and suggested a number of metrics that can potentially replace NSGA-II’s crowding distance operator for better performance. Based on simulation studies on two to 15-objective DTLZ2, DTLZ3 and DTLZ6 problems, they suggested to use a substitute assignment distance measure as the best strategy. Hadka and Reed [42] suggested an ensemble- based EMO procedure which uses a suitable recombination operator adaptively chosen from a set of eight to 10 different pre-defined operators based on their generation-wise success rate in a problem. It also uses -dominance concept and an adaptive population sizing approach that is reported to solve up to eight-objective test problems successfully. Bader and Zitzler [43] have suggested a fast procedure for computing sample-based hyper-volume and devised an algorithm to find a set of trade-off solutions for maximizing the hyper-volume. A growing literature on approximate hyper-volume computation [44], [18], [45] may make such an approach practical for solving many-objective problems. The above studies analyze and extend previously-suggested evolutionary multi-objective optimization algorithms for their suitability to solving many-objective problems. In most cases, Copyright (c) 2013 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. the results are promising and the suggested algorithms must be tested on other more challenging problems than the usual normalized test problems such as DTLZ problems. They must also be tried on real-world problems. In the following paragraphs, we describe a recently proposed algorithm which fits well with our description of a many-objective optimization algorithm given in Section II-C and closely matches with our proposed algorithm. MOEA/D [10] uses a predefined set of weight vectors to maintain a diverse set of trade-off solutions. For each weight vector, the resulting problem is called a sub-problem. To start with, every population member (with size same as the number of weight vectors) is associated with a weight vector randomly. Thereafter, two solutions from neighboring weight vectors (defined through a niching parameter (T )) are mated and an offspring solution is created. The offspring is then associated with one or more weight vectors based on a performance metric. Two metrics are suggested in the study. A penalized distance measure of a point from the ideal point is formed by weighted sum (weight θ is another algorithmic parameter) of perpendicular distance (d2) from the reference direction and distance (d1) along the reference direction: PBI(x, w) = d1 + θd2. (1) We call this procedure here as MOEA/D-PBI. The second approach suggested is the use of Tchebycheff metric using a utopian point z∗ and the weight vector w: TCH(x, w, z∗) = M max i=1 wi|fi(x) − z∗ i |. (2) In reported simulations [10], the ideal point was used as z∗ and zero-weight scenario is handled by using a small number. We call this procedure as MOEA/D-TCH here. An external population maintains the non-dominated solutions. The first two difficulties mentioned earlier are negotiated by using an explicit set of weight vectors to find points and the third difficulty is alleviated by using a mating restriction scheme. Simulation results were shown for two and three-objective test problems only and it was concluded that MOEA/D-PBI is better for three-objective problems than MOEA/D-TCH and the performance of MOEA/D-TCH improved with an objective normalization process using population minimum and maxi- mum objective values. Both versions of MOEA/D require to set a niching parameter (T ). Based on some simulation results on two and three-objective problems, authors suggested the use of a large fraction of population size as T . Additionally, MOEA/D-PBI requires an appropriate setting of an additional parameter – penalty parameter θ, for which authors have suggested a value of 5. A later study by the developers of MOEA/D suggested the use of differential evolution (DE) to replace genetic recombination and mutation operators. Also, further modifi- cations were done in defining neighborhood of a particular solution and in replacing parents in a given neighborhood by the corresponding offspring solutions [46]. We call this method as MOEA/D-DE here. Results on a set of mostly two and a three-objective linked problems [47] showed better performance with MOEA/D-DE compared to other algorithms. 4 As mentioned, MOEA/D is a promising approach for many- objective optimization as it addresses some of the difficulties mentioned above well, but the above-mentioned MOEA/D studies did not quite explore their suitability to a large number of objectives. In this paper, we apply them to problems having up to 15 objectives and evaluate their applicability to truly many-objective optimization problems and reveal interesting properties of these algorithms. Another recent study [14] follows our description of a many-objective optimization procedure. The study extends the NSGA-II procedure to suggest a hybrid NSGA-II (HN algorithm) for handling three and four-objective problems. Combined population members are projected on a hyper- plane and a clustering operation is performed on the hyper- plane to select a desired number of clusters which is user- defined. Thereafter, based on the diversity of the population, either a local search operation on a random cluster member is used to move the solution closer to the Pareto-optimal front or a diversity enhancement operator is used to choose population members from all clusters. Since no targeted and distributed search is used, the approach is more generic than MOEA/D or the procedure suggested in this paper. However, the efficiency of HN algorithm for problems having more than four objectives is yet to be investigated to suggest its use for many-objective problems, in general. We now describe our proposed algorithm. IV. PROPOSED ALGORITHM: NSGA-III (or NSGA-III) remains similar The basic framework of the proposed many-objective NSGA-II to the original NSGA-II algorithm [5] with significant changes in its selec- tion mechanism. But unlike in NSGA-II, the maintenance of diversity among population members in NSGA-III is aided by supplying and adaptively updating a number of well-spread reference points. For completeness, we first present a brief description of the original NSGA-II algorithm. Let us consider t-th generation of NSGA-II algorithm. Suppose the parent population at this generation is Pt and its size is N , while the offspring population created from Pt is Qt having N members. The first step is to choose the best N members from the combined parent and offspring population Rt = Pt ∪ Qt (of size 2N ), thus allowing to preserve elite members of the parent population. To achieve this, first, the combined population Rt is sorted according to different non-domination levels (F1, F2 and so on). Then, each non-domination level is selected one at a time to construct a new population St, starting from F1, until the size of St is equal to N or for the first time exceeds N . Let us say the last level included is the l-th level. Thus, all solutions from level (l + 1) onwards are rejected from the combined population Rt. In most situations, the last accepted level (l-th level) is only accepted partially. In such a case, only those solutions that will maximize the diversity of the l-th front are chosen. In NSGA-II, this is achieved through a computationally efficient yet approximate niche-preservation operator which computes the crowding distance for every last level member as the summation of objective-wise normalized distance between two Copyright (c) 2013 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. 5 neighboring solutions. Thereafter, the solutions having larger crowding distance values are chosen. Here, we replace the crowding distance operator with the following approaches (subsections IV-A to IV-E). A. Classification of Population into Non-dominated Levels The above procedure of identifying non-dominated fronts using the usual domination principle [17] is also used in NSGA-III. All population members from non-dominated front level 1 to level l are first included in St. If |St| = N , no further operations are needed and the next generation is started with Pt+1 = St. For |St| > N , members from one to (l − 1) fronts are already selected, that is, Pt+1 = ∪l−1 i=1Fi, and the remaining (K = N −| Pt+1|) population members are chosen from the last front Fl. We describe the remaining selection process in the following subsections. B. Determination of Reference Points on a Hyper-Plane As indicated before, NSGA-III uses a predefined set of reference points to ensure diversity in obtained solutions. The chosen reference points can either be predefined in a structured manner or supplied preferentially by the user. We shall present results of both methods in the results section later. In the absence of any preference information, any predefined structured placement of reference points can be adopted, but in this paper we use Das and Dennis’s [48] systematic approach1 that places points on a normalized hyper-plane – a (M − 1)- dimensional unit simplex – which is equally inclined to all objective axes and has an intercept of one on each axis. If p divisions are considered along each objective, the total number of reference points (H) in an M -objective problem is given by: H = !M + p − 1 p ". (3) For example, in a three-objective problem (M = 3), the reference points are created on a triangle with apex at (1, 0, 0), (0, 1, 0) and (0, 0, 1). If four divisions (p = 4) are chosen for each objective axis, H = #3+4−1 4 $ or 15 reference points will be created. For clarity, these reference points are shown in Figure 1. In the proposed NSGA-III, in addition to emphasiz- ing non-dominated solutions, we also emphasize population members which are in some sense associated with each of these reference points. Since the above-created reference points are widely distributed on the entire normalized hyper- plane, the obtained solutions are also likely to be widely distributed on or near the Pareto-optimal front. In the case of a user-supplied set of preferred reference points, ideally the user can mark H points on the normalized hyper-plane or indicate any H, M -dimensional vectors for the purpose. The proposed algorithm is likely to find near Pareto-optimal solutions cor- responding to the supplied reference points, thereby allowing this method to be used more from the point of view of a combined application of decision-making and many-objective optimization. The procedure is presented in Algorithm 1. 1Any other structured distribution with or without a biasing on some part of the Pareto-optimal front can be used as well. f3 1 Reference point Reference line Normalized hyperplane 1f1 1 f2 Ideal point Fig. 1. a three-objective problem with p = 4. 15 reference points are shown on a normalized reference plane for Algorithm 1 Generation t of NSGA-III procedure Input: H structured reference points Z s or supplied aspira- tion points Z a, parent population Pt Output: Pt+1 St = St ∪ Fi and i = i + 1 1: St = ∅, i = 1 2: Qt = Recombination+Mutation(Pt) 3: Rt = Pt ∪ Qt 4: (F1, F2, . . .) = Non-dominated-sort(Rt) 5: repeat 6: 7: until |St|≥ N 8: Last front to be included: Fl = Fi 9: if |St| = N then 10: 11: else 12: Pt+1 = St, break 13: 14: 15: 16: 17: j=1Fj Pt+1 = ∪l−1 Points to be chosen from Fl: K = N −| Pt+1| Normalize objectives and create reference set Z r: Normalize(f n, St, Z r, Z s, Z a) Associate each member s of St with a reference point: [π(s), d(s)] =Associate(St, Z r) % π(s): closest reference point, d: distance between s and π(s) Compute niche count of reference point j ∈ Z r: ρj = %s∈St/Fl Choose K members one at a time from Fl to construct Pt+1: Niching(K, ρj,π, d, Z r, Fl, Pt+1) ((π(s) = j) ? 1 : 0) 18: end if C. Adaptive Normalization of Population Members i 1 , zmin First, the ideal point of the population St is determined by identifying the minimum value (zmin ), for each objective function i = 1, 2, . . . , M in ∪t τ =0Sτ and by constructing the ideal point ¯z = (zmin M ). Each objective value of St is then translated by subtracting objective fi by zmin , so that the ideal point of translated St becomes a zero vector. We denote this translated objective as f $ i(x) = fi(x) − zmin . Thereafter, the extreme point in each objective axis is identi- fied by finding the solution (x ∈ St) that makes the following achievement scalarizing function minimum with weight vector , . . . , zmin 2 i i Copyright (c) 2013 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. 6 w being the axis direction: ASF (x, w) = M max i=1 f $ i(x)/wi, for x ∈ St. (4) For wi = 0, we replace it with a small number 10−6. For i-th translated objective direction f $ i , this will result in an extreme objective vector, zi,max. These M extreme vectors are then used to constitute a M -dimensional linear hyper-plane. The intercept ai of the i-th objective axis and the linear hyper- plane can then be computed (see Figure 2) and the objective functions can be normalized as follows: f n i (x) = f $ i(x) ai − zmin i = fi(x) − zmin i ai − zmin i , for i = 1, 2, . . . , M . (5) Note that the intercept on each normalized objective axis is now at f n i = 1 and a hyper-plane constructed with these intercept points will make %M i=1 f n i = 1. 3a f’ 3 3 2 1 0 0 a1 1 3,max z 2a 2,max z 1,max z 2 f’ 1 0 1 f’ 2 2 Fig. 2. Procedure of computing intercepts and then forming the hyper-plane from extreme points are shown for a three-objective problem. In the case of structured reference points (H of them), the original reference points calculated using Das and Dennis’s [48] approach already lie on this normalized hyper-plane. In the case of preferred reference points by the user, the reference points are simply mapped on to the above-constructed normal- ized hyper-plane using equation 5. Since the normalization procedure and the creation of the hyper-plane is done at each generation using extreme points ever found from the start of the simulation, the proposed NSGA-III procedure adaptively maintains a diversity in the space spanned by the members of St at every generation. This enables NSGA-III to solve problems having a Pareto-optimal front whose objective values may be differently scaled. The procedure is also described in Algorithm 2. Algorithm 2 Normalize(f n, St, Z r, Z s/Z a) procedure Input: St, Z s (structured points) or Z a (supplied points) Output: f n, Z r (reference points on normalized hyper-plane) 1: for j = 1 to M do 2: 3: 4: Compute ideal point: zmin Translate objectives: f $ Compute argmins∈St  = 10−6, and wj j = 1 j = mins∈St fj(s) j(s) = fj(s) − zmin points: zj,max j ∀s ∈ St : extreme ASF(s, wj), where wj = (, . . . , )T = s 5: end for 6: Compute intercepts aj for j = 1, . . . , M 7: Normalize objectives (f n) using Equation 5 8: if Z a is given then 9: Map each (aspiration) point on normalized hyper-plane using Equation 5 and save the points in the set Z r 10: else 11: 12: end if Z r = Z s D. Association Operation After normalizing each objective adaptively based on the extent of members of St in the objective space, next we need to associate each population member with a reference point. For this purpose, we define a reference line corresponding to each reference point on the hyper-plane by joining the reference point with the origin. Then, we calculate the perpendicular distance of each population member of St from each of the reference lines. The reference point whose reference line is closest to a population member in the normalized objective space is considered associated with the population member. This is illustrated in Figure 3. The procedure is presented in 1.5 f3 1 0.5 0 0 0.5 f2 1 1.5 1.5 1 f1 0.5 0 Fig. 3. Association of population members with reference points is illustrated. Algorithm 3. E. Niche-Preservation Operation It is worth noting that a reference point may have one or more population members associated with it or need not have any population member associated with it. We count the Copyright (c) 2013 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: Algorithm 3 Associate(St, Z r) procedure Input: Z r, St Output: π(s ∈ St), d(s ∈ St) 1: for each reference point z ∈ Z r do 2: 3: end for 4: for each s ∈ St do 5: Compute reference line w = z for each w ∈ Z r do Compute d⊥(s, w) = s − wT s/’w’ end for Assign π(s) = w : argminw∈Zr d⊥(s, w) Assign d(s) = d⊥(s,π (s)) 6: 7: 8: 9: 10: end for number of population members from Pt+1 = St/Fl that are associated with each reference point. Let us denote this niche count as ρj for the j-th reference point. We now devise a new niche-preserving operation as follows. First, we identify the reference point set Jmin = {j : argminjρj} having minimum ρj. In case of multiple such reference points, one (¯j ∈ Jmin) is chosen at random. If ρ¯j = 0 (meaning that there is no associated Pt+1 member to the reference point ¯j), there can be two scenarios with ¯j in set Fl. First, there exists one or more members in front Fl that are already associated with the reference point ¯j. In this case, the one having the shortest perpendicular distance from the reference line is added to Pt+1. The count ρ¯j is then incremented by one. Second, the front Fl does not have any member associated with the reference point ¯j. In this case, the reference point is excluded from further consideration for the current generation. In the event of ρ¯j ≥ 1 (meaning that already one mem- ber associated with the reference point exists in St/Fl), a randomly2 chosen member, if exists, from front Fl that is associated with the reference point ¯j is added to Pt+1. The count ρ¯j is then incremented by one. After niche counts are updated, the procedure is repeated for a total of K times to fill all vacant population slots of Pt+1. The procedure is presented in Algorithm 4. F. Genetic Operations to Create Offspring Population After Pt+1 is formed, it is then used to create a new offspring population Qt+1 by applying usual genetic operators. In NSGA-III, we have already performed a careful elitist selec- tion of solutions and attempted to maintain diversity among solutions by emphasizing solutions closest to the reference line of each reference point. Also, as we shall describe in Section V, for a computationally fast procedure, we have set N almost equal to H, thereby giving equal importance to each population member. For all these reasons, we do not employ any explicit selection operation with NSGA-III. The population Qt+1 is constructed by applying the usual crossover and mutation operators by randomly picking parents from Pt+1. However, to create offspring solutions closer to 2The point closest to the reference point or using any other diversity preserving criterion can also be used. Algorithm 4 Niching(K, ρj,π, d, Z r, Fl, Pt+1) procedure Input: K, ρj, π(s ∈ St), d(s ∈ St), Z r, Fl Output: Pt+1 7 1: k = 1 2: while k ≤ K do 3: Jmin = {j : argminj∈Zr ρj} ¯j = random(Jmin) I¯j = {s : π(s) = ¯j, s ∈ Fl} if I¯j )= ∅ then if ρ¯j = 0 then Pt+1 = Pt+1 ∪&s : argmins∈I¯j d(s)’ else Pt+1 = Pt+1 ∪ random(I¯j) end if ρ¯j = ρ¯j + 1, Fl = Fl\s k = k + 1 else Z r = Z r/{¯j} end if 16: 17: end while parent solutions (to take care of third difficulty mentioned in Section II-A), we suggest using a relatively larger value of distribution index in the SBX operator. G. Computational Complexity of One Generation of NSGA-III The non-dominated sorting (line 4 in Algorithm 1) of a population of size 2N having M -dimensional objective vec- tors require O(N logM−2 N ) computations [49]. Identification of ideal point in line 2 of Algorithm 2 requires a total of O(M N ) computations. Translation of objectives (line 3) requires O(M N ) computations. However, identification of extreme points (line 4) require O(M 2N ) computations. De- termination of intercepts (line 6) requires one matrix inversion of size M × M , requiring O(M 3) operations. Thereafter, normalization of a maximum of 2N population members (line 7) require O(N ) computations. Line 8 of Algorithm 2 requires O(M H) computations. All operations in Algorithm 3 in associating a maximum of 2N population members to H reference points would require O(M N H) computations. Thereafter, in the niching procedure in Algorithm 4, line 3 will require O(H) comparisons. Assuming that L = |Fl|, line 5 requires O(L) checks. Line 8 in the worst case requires O(L) computations. Other operations have smaller complexity. How- ever, the above computations in the niching algorithm need to be performed a maximum of L times, thereby requiring larger of O(L2) or O(LH) computations. In the worst case scenario (St = F1, that is, the first non-dominated front exceeds the population size), L ≤ 2N . In all our simulations, we have used N ≈ H and N > M . Considering all the above considerations and computations, the overall worst-case complexity of one generation of NSGA-III is O(N 2 logM−2 N ) or O(N 2M ), whichever is larger. Copyright (c) 2013 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. 8 III is set as the smallest multiple of four3 higher than H. For MOEA/D procedures, we use a population size, N $ = H, as suggested by their developers. For three-objective problems, NUMBER OF REFERENCE POINTS/DIRECTIONS AND CORRESPONDING POPULATION SIZES USED IN NSGA-III AND MOEA/D ALGORITHMS. TABLE I No. of objectives (M ) 3 5 8 10 15 Ref. pts./ Ref. dirn. (H) 91 210 156 275 135 popsize NSGA-III MOEA/D popsize (N !) 91 210 156 275 135 (N ) 92 212 156 276 136 12 $ or we have used p = 12 in order to obtain H = #3−1+12 91 reference points (refer to equation 3). For five-objective problems, we have used p = 6, so that H = 210 reference points are obtained. Note that as long as p ≥ M is not chosen, no intermediate point will be created by Das and Dennis’s systematic approach. For eight-objective problems, even if we use p = 8 (to have at least one intermediate reference point), it requires 5,040 reference points. To avoid such a situation, we use two layers of reference points with small values of p. On the boundary layer, we use p = 3, so that 120 points are created. On the inside layer, we use p = 2, so that 36 points are created. All H = 156 points are then used as reference points for eight-objective problems. We illustrate this scenario in Figure 4 for a three-objective problem using p = 2 for boundary layer and p = 1 for the inside layer. For 10-objective H. Parameter-Less Property of NSGA-III Like in NSGA-II, NSGA-III algorithm does not require to set any new parameter other than the usual genetic parameters such as the population size, termination parameter, crossover and mutation probabilities and their associated parameters. The number of reference points H is not an algorithmic parameter, as this is entirely at the disposal of the user. The population size N is dependent on H, as N ≈ H. The location of the reference points is similarly dependent on the preference information that the user is interested to achieve in the obtained solutions. We shall now present simulation results of NSGA-III for various many-objective scenarios and compare its performance with MOEA/D and classical methods. V. RESULTS In this section, we provide the simulation results of NSGA- III on three to 15-objective optimization problems. Since our method has a framework similar to MOEA/D in that both types of algorithms require a set of user-supplied reference points or weight vectors, we compare our proposed method with different versions of MOEA/D (codes from MOEA/D web- site [50] are used). The original MOEA/D study proposed two procedures (MOEA/D-PBI and MOEA/D-TCH), but did not solve four or more objective problems. Here, along with our algorithm, we investigate the performance of these MOEA/D algorithms on three to 15-objective problems. As a performance metric, we have chosen the inverse generational distance (IGD) metric [51], [47] which as a single metric which can provide a combined information about the convergence and diversity of the obtained solutions. Since reference points or reference directions are supplied in NSGA-III and MOEA/D algorithms, respectively, and since in this section we show the working of these methods on test problems for which the exact Pareto-optimal surface is known, we can exactly locate the targeted Pareto-optimal points in the normalized objective space. We compute these targeted points and call them a set Z. For any algorithm, we obtain the final non-dominated points in the objective space and call them the set A. Now, we compute the IGD metric as the average Euclidean distance of points in set Z with their nearest members of all points in set A: IGD(A, Z) = 1 |Z| |Z| (i=1 |A| min j=1 d(zi, aj), (6) Fig. 4. The concept for two-layered reference points (with six points on the boundary layer (p = 2) and three points on the inside layer (p = 1)) is shown for a three-objective problem, but are implemented for eight or more objectives in the simulations here. where d(zi, aj) = ’zi − aj’2. It is needless to write that a set with a smaller IGD value is better. If no solution associated with a reference point is found, the IGD metric value for the set will be large. For each case, 20 different runs from different initial populations are performed and best, median and worst IGD performance values are reported. For all algorithms, the population members from the final generation are presented and used for computing the performance measure. Table I shows the number of chosen reference points (H) for different sizes of a problem. The population size N for NSGA- problems as well, we use p = 3 and p = 2 for boundary and inside layers, respectively, thereby requiring a total of H = 220 + 55 or 275 reference points. Similarly, for 15-objective problems, we use p = 2 and p = 1 for boundary and inside layers, respectively, thereby requiring H = 120 + 15 or 135 reference points. 3Since no tournament selection is used in NSGA-III, a factor of two would be adequate as well, but as we re-introduce tournament selection in the constrained NSGA-III in the sequel paper [52], we keep population size as a multiple of four here as well to have a unified algorithm. Copyright (c) 2013 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org.
分享到:
收藏