An Attribute Reduction Algorithm Based on Genetic Algorithm and Discernibility Matrix

Wu Zhengjiang, Zhang Jingmin and Gao Yan
School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, China
Email: {wuzhengjiang, gaoyan}@hpu.edu.cn

Abstract—In order to effectively resolve the tension between genetic algorithm convergence and local optimal solutions, this paper presents an attribute reduction algorithm based on a genetic algorithm with an improved selection operator and the discernibility matrix. In the algorithm, from the point of view of granular computing, rough set decision tables based on partition and on covering are studied by re-measuring granularity. The practical results show that the average convergence generation of the modified algorithm is clearly superior to that of the unmodified algorithm, and that the algorithm is generally applicable to rough set decision tables based on both partition and covering.

Index Terms—rough set, genetic algorithm, discernibility matrix, selection operator, attribute reduction

I. INTRODUCTION

Rough set theory, proposed by Pawlak in 1982, is a mathematical theory that deals with imprecise and uncertain information [1]. It provides a systematic approach to the classification of objects through an indiscernibility relation. Many examples of applications of the rough set method to pattern recognition, expert systems, medical diagnosis, environmental science, biochemistry, chemistry, psychology, conflict analysis, economics, process control, and elsewhere can be found in [2-7]. Further investigation into rough set theory and its extensions will yield new applications and new theories [8].

The classical rough set theory is based on equivalence relations, but this requirement is too restrictive for many applications. Therefore, rough set theory has been extended to similarity relations [9], tolerance relations [10], arbitrary binary relations [11-15], coverings [16-17] and others [18-20]. The notion of a covering generalized rough set is regarded as a meaningful extension of the traditional rough set model for dealing with more complex practical problems. The literature [16-17, 21-32, 46] has already provided several models of covering based rough sets, and multiple fuzzy rough set models based on coverings have been established by some researchers [18-20].

In rough set theory, attribute reduction is one of the most important research topics. Most attribute reduction algorithms are based on the discernibility matrix developed by Skowron and Rauszer [33-35]. Both the rows and columns of the matrix correspond to the objects. An element of the matrix is the set of all attributes that distinguish the corresponding object pair, namely, the set of all attributes on which the two objects have distinct values. One can construct a Boolean discernibility function from a discernibility matrix, with attributes as Boolean variables. Skowron and Rauszer [33] showed that the set of attribute reductions is in fact the set of prime implicants of the reduced disjunctive form of the discernibility function. This provides a logic foundation for the study of reducts. On this basis, many researchers have studied reduction construction by using the discernibility information in the discernibility matrix [25], [36], [8] and [10]. The attribute reduction of the covering rough set, one of the rough set extension models, can also adopt the discernibility matrix; for example, Tsang [4] et al. proposed a covering attribute reduction algorithm on the basis of the discernibility matrix.
The prerequisite for reduction in this paper is to guarantee the invariant classification capacity of the knowledge base; a reduction satisfying this prerequisite is called a weak reduction. In practice, we often hope to ensure this prerequisite. K-nearest neighbor (KNN) is an effective, powerful and easy-to-implement lazy learning algorithm. KNN finds the nearest k neighbors of a query point in a given data set and is widely used in pattern classification applications [37, 38]. Therefore, we can evaluate the quality of a weak reduction through the accuracy of a KNN model.

In knowledge discovery, genetic algorithms have been used for classification, model selection, and other optimization tasks. Practically, genetic algorithms provide a general-purpose search methodology that uses principles of natural evolution [39]. They can effectively handle problems of large scale, complicated structure, and low computational efficiency, and are therefore suitable for problems like the NP-hard reduction problem. The genetic algorithm was first used for solving rough set attribute reduction by Wroblewski in [40]; afterwards, many scholars studied it one after another [47]. So this paper proposes an attribute reduction algorithm based on a genetic algorithm and the discernibility matrix for rough set decision tables based on partition and covering. However, sometimes the algorithm above cannot converge within 200 generations, and thus cannot reach the global optimal solution. We know that the behavior and performance of genetic algorithms are directly affected
by the interaction between their parameters [41], which have fixed values in the simple genetic algorithm [42]. Poor parameter settings therefore usually lead to problems such as premature convergence. We can consider the three basic genetic operators (the selection operator, the crossover operator and the mutation operator) to address this problem; how these parameters are chosen directly affects the algorithm's performance and search speed. This paper uses an improved selection operator that increases the probability of the algorithm converging on the global optimal solution, so that it does not easily fall into a local optimum. With this modification, the average convergence generation is improved.

The rest of the paper is organized as follows. Section 2 reviews some basic concepts of Pawlak's rough set and the genetic algorithm. Section 3 describes the attribute reduction algorithm based on the genetic algorithm and the discernibility matrix. The improved algorithm is described in Section 4. Section 5 presents the examples and analyses of the algorithms; the relationship and performance of the two algorithms are studied and tabulated. Section 6 concludes the paper.

II. PRELIMINARY

In this section, we briefly introduce the basic ideas of rough sets, the genetic algorithm and the k-nearest neighbor algorithm.

A. Rough Set

In rough set theory, the data is organized in a table called a decision table. Rows of the table correspond to objects, and columns correspond to attributes. In the data set, a class label indicates the class to which each row belongs. The class label is called the decision attribute; the rest of the attributes are the condition attributes. Let $(U, A, F)$ be a knowledge expression system, where $A = C \cup D$, $C$ denotes the condition attributes, $D$ the decision attributes, and $C \cap D = \emptyset$. A knowledge expression system possessing both condition attributes and decision attributes is called a decision table.

Pawlak defines the rough set as in the following Definition 2.1.

Definition 2.1 [43]. Let $U$ be a finite set and $R$ an equivalence relation on $U$. $R$ generates a partition $U/R = \{Y_1, Y_2, \ldots, Y_m\}$, where $Y_1, Y_2, \ldots, Y_m$ are the equivalence classes generated by the equivalence relation $R$; in rough set theory these are also called the elementary sets of $R$. Let $\emptyset$ denote the empty set. For any $X \subseteq U$, we can describe $X$ by the elementary sets of $R$ and the two sets

$R_*(X) = \bigcup \{Y_i \in U/R \mid Y_i \subseteq X\}$,
$R^*(X) = \bigcup \{Y_i \in U/R \mid Y_i \cap X \neq \emptyset\}$,

which are called the lower and the upper approximation of $X$, respectively.

Definition 2.2 [43]. An information system is a triplet $(U, A, F)$, where $U = \{x_1, x_2, \ldots, x_n\}$ is a nonempty finite set of objects called the universe of discourse, $A = \{a_1, a_2, \ldots, a_m\}$ is a nonempty finite set of attributes, and $F = \{f_j \mid j = 1, 2, \ldots, m\}$ is a set of information functions such that $f_j(x_i) \in V_j$ for all $x_i \in U$, where $V_j$ is the domain of attribute $a_j$.

Theorem 2.1 [44]. Let $(U, A, F)$ be an information system and $B \subseteq A$. The indiscernibility relation is defined as $R_B = \{(x_i, x_j) \in U \times U \mid f_l(x_i) = f_l(x_j), \forall a_l \in B\}$. Then $R_B$ is an equivalence relation, and produces a partition on $U$: $U/R_B = \{[x_i]_B \mid x_i \in U\}$, in which $[x_i]_B = \{x_j \in U \mid f_l(x_j) = f_l(x_i), \forall a_l \in B\}$.

The classical rough set analysis depends on the indiscernibility relation that describes the indistinguishability of objects. Indiscernibility relations are equivalences interpreted so that two objects are equivalent if one cannot distinguish them by using the available information.
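To make Definition 2.1 and Theorem 2.1 concrete, the following is a minimal sketch in Python; the toy universe and attribute projection are hypothetical illustrations, not data from the paper.

```python
def indiscernibility_partition(U, key):
    """Group the universe U into equivalence classes of the
    indiscernibility relation induced by the attribute projection `key`."""
    classes = {}
    for x in U:
        classes.setdefault(key(x), set()).add(x)
    return list(classes.values())

def lower_upper(U, key, X):
    """Lower and upper approximation of X by the elementary sets of U/R."""
    lower, upper = set(), set()
    for Y in indiscernibility_partition(U, key):
        if Y <= X:    # Y_i is a subset of X   -> contributes to the lower approximation
            lower |= Y
        if Y & X:     # Y_i intersects X       -> contributes to the upper approximation
            upper |= Y
    return lower, upper

# Hypothetical toy universe: four objects described by two condition attributes.
table = {1: (0, 0), 2: (0, 0), 3: (0, 1), 4: (1, 1)}
U = set(table)
X = {1, 3}
low, up = lower_upper(U, lambda x: table[x], X)
print(low, up)   # lower = {3}, upper = {1, 2, 3}
```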
Definition 2.3 [44]. Let $(U, A, F)$ be an information system and $D \subseteq A$. $D$ is referred to as a consistent set of $(U, A, F)$ if $R_D = R_A$. If $D$ is a consistent set and $R_{D - \{d\}} \neq R_A$ for all $d \in D$, then $D$ is referred to as a reduction of $(U, A, F)$. The set of all reductions of an information table $S$ is denoted RED($S$).

Since $R_D = R_A \Leftrightarrow U/R_D = U/R_A$, the attribute reduction of an information system preserves the partition (classification) of the object set $U$.

Definition 2.4. Let $(U, A, F)$ be an information system and $D \subseteq A$. If $R_D = R_A$, then $D$ is referred to as a weak reduction of $(U, A, F)$. Compared with a reduction, the weak reduction defined in this paper does not require a minimal condition attribute set; it only requires that the two attribute sets have the same classification ability.

Definition 2.5 [16]. Let $U$ be a finite universe of discourse and $C$ a family of subsets of $U$. If none of the subsets in $C$ is empty and $\bigcup C = U$, then $C$ is called a covering of $U$. Clearly, a partition of $U$ is a covering of $U$, so the concept of a covering is an extension of the concept of a partition.

Definition 2.6 [33] (Discernibility matrix). Two objects are discernible if their values differ in at least one attribute. Skowron and Rauszer suggested a matrix representation, called a discernibility matrix, for storing the sets of attributes that discern pairs of objects.
Given an information table $S$, its discernibility matrix $M = (M(x, y))$ is a $|U| \times |U|$ matrix, in which the element $M(x, y)$ for an object pair $(x, y)$ is defined by

$M(x, y) = \{a \in A \mid f_a(x) \neq f_a(y)\}$.

The physical meaning of the matrix element $M(x, y)$ is that objects $x$ and $y$ can be distinguished by any attribute in $M(x, y)$. The pair $(x, y)$ can be discerned if $M(x, y) \neq \emptyset$. A discernibility matrix $M$ is symmetric, i.e., $M(x, y) = M(y, x)$. Therefore, it is sufficient to consider only the lower triangle or the upper triangle of the matrix. An attribute set $B \subseteq A$ can discern an object pair $(x, y)$ if $B \cap M(x, y) \neq \emptyset$.

Definition 2.7 [33] (Discernibility function). The discernibility function of a discernibility matrix is defined by

$f(M) = \bigwedge \{\bigvee M(x, y) \mid \forall (x, y) \in U \times U,\ M(x, y) \neq \emptyset\}$.

The expression $\bigvee M(x, y)$ is the disjunction of all attributes in $M(x, y)$, indicating that the object pair $(x, y)$ can be distinguished by any attribute in $M(x, y)$. The expression $\bigwedge \{\bigvee M(x, y)\}$ is the conjunction of all $\bigvee M(x, y)$, indicating that the whole family of discernible object pairs can be distinguished by a set of attributes satisfying $\bigwedge \{\bigvee M(x, y)\}$.

The discernibility function can be used to state an important result regarding the set of reductions of an information table, as shown by the following theorem from Skowron and Rauszer.

Theorem 2.2 [33]. The reduction set problem is equivalent to transforming the discernibility function to a reduced disjunctive form. Each conjunctor of the reduced disjunctive form is called a prime implicant. Given the discernibility matrix $M$ of an information table $S$, an attribute set $R = \{a_1, a_2, \ldots, a_p\}$ is a reduction if and only if the conjunction of all attributes in $R$, denoted as $\wedge R = a_1 \wedge a_2 \wedge \cdots \wedge a_p$, is a prime implicant of $f(M)$.

In order to derive the reduced disjunctive form, the discernibility function $f(M)$ is transformed by using the absorption and distribution laws. Accordingly, finding the set of reductions can be modelled as the manipulation of a Boolean function. The set RED($S$) of reductions of an information table is equivalent to the set of prime implicants of the discernibility function.

Based on the results of Theorem 2.2, Skowron and Rauszer also suggested an alternative characterization of a reduction in terms of the discernibility matrix, as shown by the next theorem.

Theorem 2.3 [33]. Given the discernibility matrix $M$ of an information table $S$, an attribute set $R$ is a reduction if and only if
(i) for all $(x, y) \in U \times U$, $M(x, y) \neq \emptyset \Rightarrow R \cap M(x, y) \neq \emptyset$;
(ii) for each $a \in R$, there exists $(x, y) \in U \times U$ such that $M(x, y) \neq \emptyset$ and $(R - \{a\}) \cap M(x, y) = \emptyset$.

Property (i) shows that $R$ is jointly sufficient for distinguishing all discernible object pairs; in fact, the set of attributes formed by the union of all elements of the discernibility matrix satisfies property (i). Property (ii) shows that each attribute in $R$ is individually necessary. The result of Theorem 2.3 provides a convenient way to test whether a subset of attributes is a reduction. However, it does not directly offer a method to compute a reduction. Many authors have proposed and studied various algorithms to construct a reduction based on the discernibility matrix [45].
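As an illustration of Definition 2.6 and the reduction test of Theorem 2.3, here is a minimal sketch in Python; the table layout (rows of attribute values indexed by position) is a hypothetical assumption, not taken from the paper.

```python
def discernibility_matrix(rows):
    """Upper-triangle discernibility matrix: for each object pair (i, j),
    the set of attribute indices on which the two objects differ."""
    n = len(rows)
    return {(i, j): {a for a in range(len(rows[0])) if rows[i][a] != rows[j][a]}
            for i in range(n) for j in range(i + 1, n)}

def is_reduction(R, matrix):
    """Theorem 2.3: R is a reduction iff it hits every nonempty entry (i)
    and every attribute in R is necessary for some entry (ii)."""
    entries = [m for m in matrix.values() if m]
    if any(not (R & m) for m in entries):                                 # condition (i)
        return False
    return all(any(not ((R - {a}) & m) for m in entries) for a in R)      # condition (ii)

# Hypothetical 4-object, 3-attribute table.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
M = discernibility_matrix(rows)
print(is_reduction({0, 1}, M))  # True: attributes 0 and 1 jointly discern all pairs
```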
B. Genetic Algorithm

The genetic algorithm provides a general-purpose search methodology that uses principles inspired by natural genetics to evolve solutions to problems [39]. The simple genetic algorithm starts off with a population of randomly generated chromosomes, each representing a candidate solution to the concrete problem being solved, and advances towards better chromosomes by applying genetic operators that correspond to those occurring in nature. This population evolves over time through a successive iteration process of competition and controlled variation; each state of the population is called a generation. Associated with each chromosome at every generation is a fitness value, which indicates the quality of the solution the chromosome leads to. Based upon these fitness values, the selection of the chromosomes that form the new generation takes place. As in nature, the new chromosomes are created using genetic operators such as crossover and mutation.

The traditional genetic algorithm generally includes a selection operator, a crossover operator, a mutation operator and an optimal preservation (elitist) strategy. The one-point crossover operator is used in this paper. The concrete implementation process is as follows:
a) Randomly match the individuals of the population in pairs.
b) Randomly choose a crossover point for each paired couple.
c) With the set crossover probability, exchange the chromosome segments of the two paired individuals at the crossover point.

We take a basic bit mutation operator. Firstly, each gene locus of each individual is designated as a mutation point with the set mutation probability, and then the value at each designated mutation point is inverted, so as to form a new individual.

The elitist strategy is that, after obtaining the contemporary generation, if the fitness value of its worst individual (the minimum fitness value) is less than the fitness value of the best individual (the maximum fitness value) of the last generation, then the best individual of the last generation replaces the worst individual of the contemporary generation.
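A minimal sketch of these three operators follows (Python); the default parameter values and helper names are illustrative assumptions, not prescribed by the paper.

```python
import random

def one_point_crossover(p1, p2, pc=0.7):
    """With probability pc, swap the tails of two bit-string parents
    at a randomly chosen crossover point."""
    if random.random() < pc:
        point = random.randrange(1, len(p1))
        return p1[:point] + p2[point:], p2[:point] + p1[point:]
    return p1, p2

def bit_mutation(chrom, pm=0.01):
    """Flip each gene independently with probability pm."""
    return [1 - g if random.random() < pm else g for g in chrom]

def elitist_replace(prev_best, population, fitness):
    """If the worst new individual is weaker than last generation's best,
    replace it with that best individual (optimal preservation strategy)."""
    worst = min(range(len(population)), key=lambda i: fitness(population[i]))
    if fitness(population[worst]) < fitness(prev_best):
        population[worst] = prev_best
    return population
```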
So far, genetic algorithms have demonstrated impressive results in a wide range of domains. However, it has also been found that genetic algorithm behavior is strongly determined by the interaction between the parameters, which have fixed values in the simple genetic algorithm. Poor parameter settings therefore usually lead to problems such as premature convergence.

The genetic algorithm generally uses the roulette wheel scheme for selection. Although it is widely used, it still has two problems. One is that in early evolution a high-fitness individual may be selected with high probability and thus reproduce many offspring, so the local optimal situation easily arises because a single individual is unable to continue to evolve. The other is that in later evolution, when the gap between individual fitness values is small, this method no longer has selective power and so does not reflect the superiority of individuals. Hence, this paper improves the selection operator: firstly, the individuals whose fitness value is 1 are selected directly from the initial population into the mating pool, and then the roulette wheel scheme is applied. By taking this improved selection operator, the number of individuals with better fitness values is increased, and the above problems are solved to some extent.

C. K-nearest Neighbor Algorithm

The KNN method [37] is widely used in pattern recognition. This method is useful both for the estimation of densities and for classification. As a classifier, this algorithm involves three steps: (1) calculating the distances between a query sample and all training samples, (2) choosing the k training samples nearest to the query sample, and (3) assigning a class label by applying the majority rule to the k nearest samples.

The process of the KNN algorithm is as follows: given a test document $x$, find the $K$ nearest neighbors of $x$ among all the training documents, and score the candidate categories based on the categories of the $K$ neighbors. The similarity of $x$ and each neighbor document is the score of the category of that neighbor document. If several of the $K$ nearest neighbor documents belong to the same category, then the sum of the scores of that category is the similarity score of the category with regard to the test document $x$. By sorting the scores of the candidate categories, the system assigns the candidate category with the highest score to the test document $x$. The decision rule of KNN can be written as

$f(x) = \arg\max_j \mathrm{Score}(x, C_j) = \arg\max_j \sum_{d_i \in \mathrm{KNN}} \mathrm{sim}(x, d_i)\, y(d_i, C_j)$,

where $f(x)$ is the label assigned to the test document $x$; $\mathrm{Score}(x, C_j)$ is the score of the candidate category $C_j$ with respect to $x$; $\mathrm{sim}(x, d_i)$ is the similarity between $x$ and the training document $d_i$; and $y(d_i, C_j) \in \{0, 1\}$ is the binary category value of the training document $d_i$ with respect to category $C_j$ ($y = 1$ indicates that document $d_i$ is part of $C_j$, otherwise $y = 0$).

This approach is effective, non-parametric and easy to implement. However, its classification time is long, and its accuracy is severely degraded by the presence of noisy training documents.
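A small sketch of this decision rule in Python; the similarity function (here, a plain dot product) and the toy data are illustrative assumptions.

```python
from collections import defaultdict

def knn_classify(x, training, k=3,
                 sim=lambda a, b: sum(u * v for u, v in zip(a, b))):
    """training: list of (vector, label) pairs. Score each category by the
    summed similarity of the k nearest neighbors; return the top category."""
    neighbors = sorted(training, key=lambda d: sim(x, d[0]), reverse=True)[:k]
    scores = defaultdict(float)
    for vec, label in neighbors:
        scores[label] += sim(x, vec)
    return max(scores, key=scores.get)

# Hypothetical two-class toy data.
train = [((1.0, 0.1), 'A'), ((0.9, 0.2), 'A'),
         ((0.1, 1.0), 'B'), ((0.2, 0.8), 'B')]
print(knn_classify((0.95, 0.15), train, k=3))  # 'A'
```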
III. AN ATTRIBUTE REDUCTION ALGORITHM BASED ON GENETIC ALGORITHM AND DISCERNIBILITY MATRIX

A. The Coding of Chromosome

Considering the practical characteristics of attribute reduction, binary encoding is adopted. The coding strategy is as follows: let $C = \{c_1, c_2, \ldots, c_n\}$ be the condition attribute set; the condition attribute space maps naturally onto the chromosome of the genetic algorithm. A chromosome is a binary string of length $n$, each bit of which corresponds to a condition attribute. If a bit is 1, the corresponding condition attribute is selected; otherwise the corresponding condition attribute is not selected. For example, the code 0001100100001 shows that the condition attributes of this reduction have 13 bits, and that the fourth, the fifth, the eighth and the thirteenth condition attributes are selected. Since binary encoding is simple and convenient, the genetic algorithm as a whole is easy to operate.

B. The Fitness Function

In the genetic algorithm, the quality of candidate solutions is measured by the fitness function, which is used to evaluate each individual in the population during the optimization calculation. In this paper, the fitness function is defined as follows:

$F(x) = |C_n| / |f| \qquad (1)$

where $|f|$ indicates the number of terms of the discernibility function, and $|C_n|$ indicates the number of terms of the discernibility function that are covered by chromosome $n$.

C. Selection Operator

We take the roulette wheel scheme to select, copying the excellent individuals selected from the contemporary population into the next generation population. The concrete implementation process is as follows:
a) Calculate the sum of all individuals' fitness values.
b) Calculate the relative fitness of each individual by formula (2), that is to say, the probability that each individual is inherited into the next generation:

$C(x_i) = F(x_i) / \sum_{i=1}^{n} F(x_i), \quad i = 1, 2, \ldots, n \qquad (2)$

c) Take the roulette wheel scheme to determine the number of times each individual is selected.
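The following sketch (Python) shows the fitness of formula (1) and the roulette wheel selection of formula (2); representing the discernibility function as a list of attribute-index sets is an assumption made for illustration.

```python
import random

def fitness(chrom, terms):
    """Formula (1): fraction of discernibility-function terms covered by
    the chromosome. `terms` is a list of sets of attribute indices."""
    covered = sum(1 for t in terms if any(chrom[a] for a in t))
    return covered / len(terms)

def roulette_select(population, terms):
    """Formula (2): sample individuals with probability proportional
    to their relative fitness."""
    weights = [fitness(c, terms) for c in population]
    if sum(weights) == 0:
        return population[:]          # degenerate case: uniform copy
    return random.choices(population, weights=weights, k=len(population))

# Hypothetical discernibility function with 3 terms over 4 attributes.
terms = [{0, 2}, {1}, {2, 3}]
pop = [[1, 0, 0, 0], [0, 1, 1, 0]]
print([fitness(c, terms) for c in pop])  # [0.333..., 1.0]
```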
D. Algorithm Description

A weak reduction is called an advantage weak reduction if its k-nearest neighbor accuracy is greater than the original k-nearest neighbor accuracy. A weak reduction is called a reliable weak reduction if its k-nearest neighbor accuracy is greater than or equal to the original k-nearest neighbor accuracy. According to the analyses above, the attribute reduction algorithm based on the genetic algorithm and the discernibility matrix is described as follows.

Algorithm 1 (also shown in Figure 1):
Input: a decision table $S = (U, A, V, f)$, $A = C \cup D$, where $C$ is the condition attribute set and $D$ is the decision attribute set
Output: one or more advantage weak reductions
Step 1: Calculate the number of terms of the discernibility function, $|f|$.
Step 2: Initialize the population: randomly generate an initial population of $m$ individuals of length $n$.
Step 3: Calculate the fitness value of each individual by formula (1); if the fitness values of all individuals are 1, turn to step 9.
Step 4: Calculate the selection probability of each individual by formula (2), then apply the roulette wheel scheme.
Step 5: Apply the one-point crossover operator with crossover probability $p_c$.
Step 6: Apply the basic bit mutation operator with mutation probability $p_m$.
Step 7: Calculate the fitness values of the new individuals by formula (1), then apply the optimal preservation strategy.
Step 8: If the fitness values of all individuals are 1, turn to step 9; otherwise, turn to step 3.
Step 9: Judge whether the optimal individual's k-nearest neighbor accuracy is greater than the original k-nearest neighbor accuracy; if so, output it; otherwise, turn to step 2.

Figure 1. Algorithm flowchart.

IV. AN ATTRIBUTE REDUCTION ALGORITHM BASED ON GENETIC ALGORITHM WITH IMPROVED SELECTION OPERATOR AND DISCERNIBILITY MATRIX

A. Improved Selection Operator

The process of the improved selection operator is as follows:
a) Select the individuals whose fitness value is 1 directly from the initial population into the mating pool.
b) Take the roulette wheel scheme for all individuals.

B. Algorithm Description

According to the analyses above, the attribute reduction algorithm based on the genetic algorithm with the improved selection operator and the discernibility matrix is described as follows:

Algorithm 2:
Input: a decision table $S = (U, A, V, f)$, $A = C \cup D$, where $C$ is the condition attribute set and $D$ is the decision attribute set
Output: one or more advantage weak reductions
Step 1: Calculate the number of terms of the discernibility function, $|f|$.
Step 2: Initialize the population: randomly generate an initial population of $m$ individuals of length $n$.
Step 3: Calculate the fitness value of each individual by formula (1); if the fitness values of all individuals are 1, turn to step 9.
Step 4: Apply the improved selection operator.
Step 5: Apply the one-point crossover operator with crossover probability $p_c$.
Step 6: Apply the basic bit mutation operator with mutation probability $p_m$.
Step 7: Calculate the fitness values of the new individuals by formula (1), then apply the optimal preservation strategy.
Step 8: If the fitness values of all individuals are 1, turn to step 9; otherwise, turn to step 3.
Step 9: Judge whether the optimal individual's k-nearest neighbor accuracy is greater than the original k-nearest neighbor accuracy; if so, output it; otherwise, turn to step 2.
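A condensed sketch of the improved selection step inside the evolutionary loop follows (Python). It reuses `fitness`, `roulette_select`, `one_point_crossover`, `bit_mutation` and `elitist_replace` from the earlier sketches, and the surrounding loop structure is one plausible reading of Algorithms 1 and 2, not verbatim from the paper.

```python
def improved_select(population, terms):
    """Algorithm 2, step 4: individuals whose fitness is 1 enter the mating
    pool directly; roulette wheel selection fills the remaining slots."""
    elite = [c for c in population if fitness(c, terms) == 1.0]
    filler = roulette_select(population, terms)[:len(population) - len(elite)]
    return elite + filler

def evolve(population, terms, pc=0.7, pm=0.01, max_gen=200):
    """A plausible reading of the main loop (steps 3-8 of Algorithm 2).
    Assumes an even population size."""
    fit = lambda c: fitness(c, terms)
    for gen in range(max_gen):
        if all(fit(c) == 1.0 for c in population):
            return population, gen       # every term covered: weak reductions
        best = max(population, key=fit)  # kept for the optimal preservation strategy
        pool = improved_select(population, terms)
        nxt = []
        for a, b in zip(pool[::2], pool[1::2]):
            nxt.extend(one_point_crossover(a, b, pc))
        population = [bit_mutation(c, pm) for c in nxt]
        population = elitist_replace(best, population, fit)
    return population, max_gen
```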
C. The Analysis of Algorithm Complexity

The time complexity of the algorithm consists of three parts. The first part is the time complexity of calculating the discernibility function by the discernibility matrix, $O(|C||U|^2)$. The second part is the time complexity of the genetic algorithm, $O(mn|f|)$, where $m$ is the population size and $n$ is the number of termination generations of the genetic algorithm. The third part is the time complexity of the comparison with the original accuracy of the k-nearest neighbor classifier, $O(m|C||U|^2)$. So the total time complexity is

$O(|C||U|^2) + O(mn|f|) + O(m|C||U|^2) = O(\max(mn|f|,\ m|C||U|^2))$.
In algorithm 2, step 4 is the only difference from algorithm 1. So, in theory, the complexities of algorithm 1 and algorithm 2 are the same; in practice, however, algorithm 2 runs about 48% faster than algorithm 1 on average. This is the effect of the improved selection operator of Section 4.1.

V. ALGORITHM EXAMPLES AND ANALYSIS

In this section, we present the examples and analyses of algorithm 1 and algorithm 2; the experimental results of the two algorithms are studied and tabulated. We use the wine data set (13 condition attributes and a decision attribute, 178 records) from the UCI machine learning repository for testing. In order to reduce the influence of the different dimensions of the attributes on the outcomes, we standardize the values of all continuous attributes to the [0, 1] interval with the max-min method.

The rough set based on partition is measured with the condition attributes divided by particle size; the particle size selected is 0.21. If two condition attribute values, divided by 0.21 and rounded, are the same, then we consider them to belong to the same category; otherwise they do not belong to the same category. The rough set based on covering is measured by using the particle size to change the neighborhood size of the covering elements; the coverings formed by these neighborhoods satisfy one property compared with a general covering: for any neighborhood, there always exist some elements that cannot be covered by any neighborhood other than itself. The intervals selected are [0, 0.22), [0.18, 0.42), [0.38, 0.62), [0.58, 0.82), [0.78, 1]. If two condition attribute values fall in the same interval, then we consider them to belong to the same category; otherwise they do not belong to the same category.

In all the experiments, we take a test set TE and a training set TR which are completely disjoint, with the proportion of TR to TE randomly set to 2:1. The experimental parameters are as follows: the initial population size $m$ is 10, the maximum generation $n$ is 200, the crossover probability $p_c$ is 0.7, and the mutation probability $p_m$ is 0.01.
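A sketch of this preprocessing in Python: max-min normalization, the partition-style granulation at granularity 0.21, and the overlapping covering intervals. The function names are ours, and the rounding interpretation is our reading of the setup described above.

```python
def max_min_normalize(column):
    """Scale a list of continuous attribute values into [0, 1]."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in column]

def partition_label(v, grain=0.21):
    """Partition-based granulation: values with the same rounded
    quotient v / grain fall into the same category."""
    return round(v / grain)

COVER_INTERVALS = [(0.0, 0.22), (0.18, 0.42), (0.38, 0.62),
                   (0.58, 0.82), (0.78, 1.0)]

def covering_labels(v, intervals=COVER_INTERVALS):
    """Covering-based granulation: the intervals overlap, so a value
    may belong to more than one category."""
    return {i for i, (lo, hi) in enumerate(intervals)
            if lo <= v < hi or (hi == 1.0 and v == 1.0)}
```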
A. The Experimental Result of the Algorithm 1

In this subsection, we state the experimental results of algorithm 1 in detail. To make them easier to analyse, the experimental results are presented as line charts and tables. To visualize the effect, algorithm 1 is run 100 times repeatedly to make the line charts shown in Figure 2 and Figure 3. The advantages of the advantage weak reduction can be seen clearly and visually from Figure 2 and Figure 3.

There are two cases in which an advantage weak reduction cannot be obtained in the experiments. Case 1: when the original accuracy of k-nearest neighbor is 100%, by the definition of the advantage weak reduction, this situation has no advantage weak reduction. Case 2: the algorithm cannot converge within 200 generations, so the global optimal solution cannot be found. It is because of these two cases that Figure 2 and Figure 3 do not contain 100 records.

Figure 2. Accuracy contrast diagram of the rough set based on partition.
Figure 3. Accuracy contrast diagram of the rough set based on covering.

To express the result of algorithm 1 more specifically, we extract one experimental result from Figure 2 and Figure 3, as shown in Table I and Table II. The effectiveness of algorithm 1 can be seen from Table I, Table II, Figure 2 and Figure 3.

TABLE I. THE ROUGH SET BASED ON PARTITION
Original accuracy of k-nearest neighbor: 96.61%
Weak reductions: 1111011101010, 1011111100010, 1111011101000, 1010010011100, 1011101111111, 1011001111111
Advantage weak reductions and their accuracy: 1011001111111, 1011101111111 (98.31%)
Convergence generation: 7
TABLE II. THE ROUGH SET BASED ON COVERING
Original accuracy of k-nearest neighbor: 94.92%
Weak reductions: 0011111111100, 0111111111100, 1101001101100, 0010101101101, 1101001110111, 1101111010100, 1101001101101
Advantage weak reductions and their accuracy: 1101001101101 (98.31%), 1101001110111 (100%)
Convergence generation: 4

B. The Experimental Results Contrast of Algorithm 1 and 2

In this subsection, we state the contrast between the experimental results of algorithm 1 and algorithm 2 in detail. To make the analysis and contrast easier, the experimental results are presented as line charts and tables. In order to contrast the convergence of algorithm 1 and algorithm 2 based on partition and covering, we performed 100 contrast experiments between algorithm 1 and algorithm 2 with the same initial individuals, as shown in Figure 4 and Figure 5.

Figure 4. Convergence generation contrast diagram of the rough set based on partition.
Figure 5. Convergence generation contrast diagram of the rough set based on covering.

The reason Figure 4 and Figure 5 do not contain 100 records is the same as for Figure 2 and Figure 3. In Figure 4, the average convergence generation of algorithm 1 is 25 and that of algorithm 2 is 9; algorithm 2 is faster than algorithm 1 by 16 generations on average, that is to say, algorithm 2 converges about 2.8 times faster than algorithm 1. Similarly, in Figure 5, the average convergence generation of algorithm 1 is 25 and that of algorithm 2 is 10; algorithm 2 is faster than algorithm 1 by 15 generations on average, that is to say, algorithm 2 converges about 2.5 times faster than algorithm 1. So we can see that the convergence of algorithm 2 is better than that of algorithm 1. In order to reflect the advantage of algorithm 2 in convergence more specifically, we extract one experimental result from Figure 4 and Figure 5, as shown in Table III and Table IV.

TABLE III. ROUGH SET BASED ON PARTITION
Original accuracy of k-nearest neighbor: 98.31% (Alg. 1), 98.31% (Alg. 2)
Alg. 1 weak reductions: 1111011001100, 1111011011100, 1110011011100, 1110010011101, 0011111111101; convergence generation: 200; advantage weak reduction: null
Alg. 2 weak reductions: 1011111111101, 0011001011111, 0011110110101, 0011011110100, 1010101111101; convergence generation: 24; advantage weak reduction and its accuracy: 1010101111101 (100%)

TABLE IV. ROUGH SET BASED ON COVERING
Original accuracy of k-nearest neighbor: 98.31% (Alg. 1), 98.31% (Alg. 2)
Alg. 1 weak reductions: 1101010110011, 1101010110010, 1111110111111, 1111000110011, 1101000110011, 1101110110011, 1101110111011; convergence generation: 200; advantage weak reduction: null
Alg. 2 weak reductions: 1100010111101, 1011001011101, 1100100111101, 0111000111101, 1100011111101, 1100001111001, 0111000110101; convergence generation: 10; advantage weak reduction and its accuracy: 1011001011101 (100.0%)

From Table III and Table IV, we can see that with the same initial population algorithm 1 cannot converge within 200 generations, whereas algorithm 2 can quickly converge to the global optimal solution; in this case, the advantage of algorithm 2 in convergence is even more apparent.
The discussion above concerns the advantage weak reduction; the following briefly discusses the reliable weak reduction according to the same analysis method. For the rough set based on partition, the average convergence generation of algorithm 1 is 8 and that of algorithm 2 is 4; algorithm 2 is faster than algorithm 1 by 4 generations on average, that is to say, algorithm 2 converges about 2 times faster than algorithm 1. For the rough set based on covering, the average convergence generation of algorithm 1 is 12 and that of algorithm 2 is 5; algorithm 2 is faster than algorithm 1 by 7 generations on average, that is to say, algorithm 2 converges about 2.4 times faster than algorithm 1. The time needed to obtain an advantage weak reduction is about 5 or 6 times that needed for a reliable weak reduction; in practice, we can choose according to the decision table's size and needs.

VI. CONCLUSION

In this paper, we propose an attribute reduction algorithm based on the genetic algorithm and the discernibility matrix, and then improve it. The convergence speed of the algorithm before improvement is slow, and it easily falls into a local optimal solution. So we use an improved selection operator, which increases the probability of converging on the global optimal solution, to solve this problem. The experimental results show that the improved algorithm converges on the global optimal solution faster than the algorithm before improvement. We also contrast the proportion of advantage (reliable) weak reductions among weak reductions for rough set decision tables based on partition and covering; the experimental results show that this proportion does not improve compared with the algorithm before improvement. Hence, in future work, we will study how to improve this proportion.

ACKNOWLEDGMENT

This work was supported by grants from the National Science Foundation of China under Grant No. 61175047, the Soft Science Research Program of Henan Province under Grant No. 122400450212, and the Innovation Foundation of Henan Polytechnic for the Master Thesis under Grant No. 2001-M-36.

REFERENCES

[1] Z. Pawlak, Rough sets, International Journal of Computer and Information Sciences, 1982, 11(5): 341-356.
[2] F. Angiulli, C. Pizzuti, Outlier mining in large high-dimensional data sets, IEEE Transactions on Knowledge and Data Engineering, 2005, 17(2): 203-215.
[3] F. Hu, G.Y. Wang, H. Huang, et al., Incremental attribute reduction based on elementary sets, RSFDGrC 2005, vol. 3641, 2005, pp. 185-193.
[4] C.T. Su, J.H. Hsu, An extended chi2 algorithm for discretization of real value attributes, IEEE Transactions on Knowledge and Data Engineering, 2005, 17(3): 437-441.
[5] L. Polkowski, A. Skowron (Eds.), Rough Sets and Current Trends in Computing, vol. 1424, Springer, Berlin, 1998.
[6] L. Polkowski, A. Skowron (Eds.), Rough Sets in Knowledge Discovery, vol. 1, Physica-Verlag, Berlin, 1998.
[7] L. Polkowski, A. Skowron (Eds.), Rough Sets in Knowledge Discovery, vol. 2, Physica-Verlag, Berlin, 1998.
[8] Z. Pawlak, A. Skowron, Rough sets: some extensions, Information Sciences, 2007, 177(1): 28-40.
[9] R. Slowinski, D. Vanderpooten, A generalized definition of rough approximations based on similarity, IEEE Transactions on Knowledge and Data Engineering, 2000, 12(2): 331-336.
[10] A. Skowron, J. Stepaniuk, Tolerance approximation spaces, Fundamenta Informaticae, 1996, 27: 245-253.
[11] G.L. Liu, W. Zhu, The algebraic structures of generalized rough set theory, Information Sciences, 2008, 178(21): 4105-4113.
[12] W. Zhu, Generalized rough sets based on relations, Information Sciences, 2007, 177(22): 4997-5011.
[13] Y.Y. Yao, On generalizing Pawlak approximation operators, LNAI, vol. 1424, 1998, pp. 298-307.
[14] Y.Y. Yao, Relational interpretations of neighborhood operators and rough set approximation operators, Information Sciences, 1998, 111(1-4): 239-259.
[15] Y.Y. Yao, Constructive and algebraic methods of theory of rough sets, Information Sciences, 1998, 109: 21-47.
[16] Z. Bonikowski, E. Bryniarski, U.W. Skardowska, Extensions and intentions in the rough set theory, Information Sciences, 1998, 107: 149-167.
[17] E. Bryniarski, A calculus of rough sets of the first order, Bull. Pol. Acad. Sci., 1989, 36(16): 71-77.
[18] T.J. Li, Y. Leung, W.X. Zhang, Generalized fuzzy rough approximation operators based on fuzzy coverings, International Journal of Approximate Reasoning, 2008, 48: 836-856.
[19] T.J. Li, W.X. Zhang, Fuzzy rough approximations on two universes of discourse, Information Sciences, 2008, 178: 829-906.
[20] T. Deng, Y. Chen, W. Xu, Q. Dai, A novel approach to fuzzy rough sets based on a fuzzy covering, Information Sciences, 2007, 177: 2308-2326.
[21] W. Zhu, Topological approaches to covering rough sets, Information Sciences, 2007, 177: 1499-1508.
[22] W. Zhu, Relationship between generalized rough sets based on binary relation and coverings, Information Sciences, 2008.
[23] E.C.C. Tsang, D. Chen, D.S. Yeung, Approximations and reducts with covering generalized rough sets, Computers and Mathematics with Applications, 2008, 56: 279-289.
[24] W.H. Xu, W.X. Zhang, Measuring roughness of generalized rough sets induced by a covering, Fuzzy Sets and Systems, 2007, 158: 2443-2455.
[25] D. Chen, C. Wang, Q. Hu, A new approach to attribute reduction of consistent and inconsistent covering decision systems with covering rough sets, Information Sciences, 2007, 177: 3500-3518.
[26] Z. Xu, Q. Wang, On the properties of covering rough sets model, Journal of Henan Normal University (Natural Science), 2005, 33(1): 130-132.
[27] E. Tsang, D. Chen, J. Lee, D.S. Yeung, On the upper approximations of covering generalized rough sets, in: Proceedings of the 3rd International Conference on Machine Learning and Cybernetics, 2004, pp. 4200-4203.
[28] W. Zhu, F.Y. Wang, A new type of covering rough sets, in: IEEE IS 2006, London, 4-6 September 2006, pp. 444-449.