CIRG - Research - Data Mining

OVERVIEW

The data and text mining focus area aims to develop new techniques for knowledge discovery and to improve existing ones. The focus area also applies data mining techniques to real-world problems in consultation with South African industries.

Questions being addressed include how to mine knowledge from data with continuous classes, how to cope with extremely large databases, how to cluster data more efficiently, and how to extract knowledge in environments where data changes over time. Tools that address these questions are currently under development.

GROUP PUBLICATIONS


A Building Block Approach to Genetic Programming for Rule Discovery
Engelbrecht, AP. Rouwhorst, S. Schoeman, L. 2001.
In: Data Mining: A Heuristic Approach, HA Abbass, R Sarkar, C Newton (eds), Chapter IX, pp 175-189, Idea Group Publishing

Abstract:

This chapter presents a building-block approach to evolving compact decision trees using genetic programming. The algorithm extracts crisp and accurate rule sets, in comparison with C4.5 and CN2.


Searching the Forest: Using Decision Trees as Building Blocks for Evolutionary Search in Classification Databases
Rouwhorst, S. Engelbrecht, AP. 2000.
IEEE International Congress on Evolutionary Computation, San Diego, USA, pp 633-638, IEEE

Abstract:

A new evolutionary search algorithm, called BGP, to be used for classification tasks in data mining, is introduced. It is different from existing evolutionary techniques in that it does not use indirect representations of a solution, such as bit strings or grammars. The algorithm uses decision trees of various sizes as individuals in the populations and operators, e.g. crossover, are performed directly on the trees. When compared to C4.5 and CN2 on a benchmark of problems, BGP shows very good results.
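
The abstract gives no implementation details, so the following is only a minimal sketch of what crossover performed directly on decision trees can look like. The `Leaf` and `Node` classes, the binary threshold tests, and the uniform choice of crossover points are all assumptions made for illustration; they are not taken from BGP.

```python
import copy
import random
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str  # class predicted at this leaf


@dataclass
class Node:
    attribute: str      # attribute tested at this node
    threshold: float    # go left when value <= threshold, else right
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def classify(tree, example):
    """Follow the tests from the root down to a leaf and return its label."""
    while isinstance(tree, Node):
        go_left = example[tree.attribute] <= tree.threshold
        tree = tree.left if go_left else tree.right
    return tree.label


def paths(tree, prefix=()):
    """Yield the path to every subtree (the root has the empty path)."""
    yield prefix
    if isinstance(tree, Node):
        yield from paths(tree.left, prefix + ("left",))
        yield from paths(tree.right, prefix + ("right",))


def size(tree):
    """Number of nodes and leaves in the tree."""
    return sum(1 for _ in paths(tree))


def get_subtree(tree, path):
    for step in path:
        tree = getattr(tree, step)
    return tree


def replace_subtree(tree, path, new):
    """Return the tree with the subtree at `path` replaced by `new`."""
    if not path:
        return new
    setattr(get_subtree(tree, path[:-1]), path[-1], new)
    return tree


def crossover(parent_a, parent_b, rng):
    """Swap one randomly chosen subtree between copies of the two parents."""
    a, b = copy.deepcopy(parent_a), copy.deepcopy(parent_b)
    path_a = rng.choice(list(paths(a)))
    path_b = rng.choice(list(paths(b)))
    sub_a, sub_b = get_subtree(a, path_a), get_subtree(b, path_b)
    return replace_subtree(a, path_a, sub_b), replace_subtree(b, path_b, sub_a)
```

Because the operator works on copies and swaps whole subtrees, each child is itself a valid decision tree and can be evaluated with `classify` without any decoding step, which is the property the abstract contrasts with bit-string or grammar representations.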


A Hybrid Exhaustive and Heuristic Rule Extraction Approach
Rodich, D. Engelbrecht, AP. 1999.
In: Development and Practice of Artificial Intelligence Techniques, VB Bajic, D Sha (eds), pp 25-28, Proceedings of the International Conference on Artificial Intelligence, Durban, South Africa

Abstract:

This paper presents a new exhaustive-heuristic hybrid approach to the discovery of rules from data sets containing binary attributes. Principles from evolutionary computing are used to design heuristics to reduce the complexity of the search for crisp and accurate rules. A comparison of this new approach with a benchmark genetic algorithm approach shows the proposed method to be more efficient.


Rule Improvement through Decision Boundary Detection using Sensitivity Analysis
Engelbrecht, AP. Viktor, HL. 1999.
International Working Conference on Artificial Neural Networks, Alicante, Spain, 1607:78-84, in the Springer-Verlag series Lecture Notes in Computer Science

Abstract:

Rule extraction from artificial neural networks (ANN) provides a mechanism to interpret the knowledge embedded in the numerical weights. Classification problems with continuous-valued parameters create difficulties in determining boundary conditions for these parameters. This paper presents an approach to locate such boundaries using sensitivity analysis. Inclusion of this decision boundary detection approach in a rule extraction algorithm resulted in significant improvements in rule accuracies.
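
As a rough illustration of the idea, and not the paper's algorithm, the sketch below assumes that a decision boundary for a continuous input shows up where the network output is most sensitive to that input, that is, where the magnitude of the derivative of the output with respect to the input peaks. The `toy_network` function is a hypothetical stand-in for a trained ANN, and the grid scan with finite differences is an assumed, simplified search procedure.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def toy_network(x):
    # Hypothetical stand-in for a trained ANN with one continuous input.
    # Its output switches from near 0 to near 1 around x = 3.0.
    return sigmoid(4.0 * (x - 3.0))


def sensitivity(f, x, h=1e-4):
    """Central-difference estimate of d(output)/d(input) at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def locate_boundary(f, lo, hi, steps=200):
    """Scan [lo, hi] and return the input value with the largest |sensitivity|."""
    best_x, best_s = lo, -1.0
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        s = abs(sensitivity(f, x))
        if s > best_s:
            best_x, best_s = x, s
    return best_x


# The located value can serve as the threshold in a rule condition
# such as "IF x > boundary THEN class 1".
boundary = locate_boundary(toy_network, 0.0, 6.0)
```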


Incorporating Rule Extraction from ANNs into a Cooperative Learning Environment
Viktor, H. Engelbrecht, AP. Cloete, I. 1998.
Neural Networks and Their Applications, Marseilles, France, pp 385-391

Abstract:

Rule extraction from artificial neural networks (ANNs) addresses the need of domain experts to obtain insight into the decision making process of an ANN. This paper presents the ANNSER approach to extracting rules from continuous data, using sensitivity analysis to locate decision boundaries in determining the thresholds of attributes. In addition, the paper discusses the incorporation of the ANNSER approach into a rule-based cooperative learning environment. In the cooperative learning environment, the ANNSER approach co-exists with other inductive learning techniques. Therefore, the domain expert may use more than one approach to verify the classification of the concepts that are contained in the data.


Reduction of Symbolic Rules from Neural Networks using Sensitivity Analysis
Viktor, H. Engelbrecht, AP. Cloete, I. 1995.
IEEE International Joint Conference on Neural Networks, Perth, Australia, pp 1022-1026

Abstract:

This paper shows how sensitivity analysis identifies and eliminates redundant conditions from the rules extracted from trained neural networks, by eliminating irrelevant inputs. This leads to a reduction in the number and size of the rules. The reduced rule set accurately and minimally reflects the classification problems presented. Also, the elimination of redundant input units significantly reduces the combinatorics of the rule extraction algorithm. The resultant rule set compares favorably with traditional symbolic machine learning algorithms.
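
The abstract does not state how input relevance is measured, so the following is a hedged sketch of one possible reading: average the magnitude of the output's sensitivity to each input over a set of patterns, and flag inputs whose average falls far below the largest one. The `toy_network` function, the sample patterns, and the `ratio` cutoff are all assumptions chosen for illustration.

```python
import math


def toy_network(x):
    # Hypothetical trained network with three inputs.
    # The third input has a near-zero weight and barely affects the output.
    z = 2.0 * x[0] - 1.5 * x[1] + 0.001 * x[2]
    return 1.0 / (1.0 + math.exp(-z))


def input_sensitivities(f, patterns, h=1e-4):
    """Mean |d(output)/d(input_i)| over all patterns, by central differences."""
    n_inputs = len(patterns[0])
    totals = [0.0] * n_inputs
    for pattern in patterns:
        for i in range(n_inputs):
            up, down = list(pattern), list(pattern)
            up[i] += h
            down[i] -= h
            totals[i] += abs((f(up) - f(down)) / (2.0 * h))
    return [t / len(patterns) for t in totals]


def irrelevant_inputs(sens, ratio=0.05):
    """Indices of inputs whose sensitivity is below `ratio` times the maximum."""
    cutoff = ratio * max(sens)
    return [i for i, s in enumerate(sens) if s < cutoff]


patterns = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 1.0),
    (0.0, 1.0, 1.0),
    (1.0, 1.0, 0.0),
]
sens = input_sensitivities(toy_network, patterns)
pruned = irrelevant_inputs(sens)  # inputs that could be dropped before rule extraction
```

Dropping the flagged inputs before rule extraction means no rule can contain a condition on them, and the extraction step has fewer input combinations to consider.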


Computational Intelligence Research Group
University of Pretoria
Copyright © 2017