CIRG - Research  -  Data Mining 




OVERVIEW

The data and text mining focus area aims to develop new techniques for knowledge discovery and to improve existing ones. The focus area also applies data mining techniques to real-world problems in consultation with South African industries.

Some of the questions being addressed are how to mine knowledge from data with continuous classes, how to cope with extremely large databases, how to cluster data more efficiently, and how to extract knowledge in environments where data changes over time. Tools that address these questions are currently under development.

ALUMNI MEMBERS

P Lutu

PhD Completed in 2010

A Louis

M.Sc Started in 2006

E Dean

M.Sc Started in 2002

G Nel

M.Sc Completed in 2005

E Papacostantis

M.Sc Started in 2004
Hons-B.Sc Completed in 2003

G Potgieter

M.Sc Completed in 2003
Hons-B.Sc Completed in 2001

D Rodic

PhD Completed in 2005
M.Sc Completed in 1999

MEMBER PROFILE



 Name:

 Gavin Potgieter

 E-mail:

 engel@cs.up.ac.za

 Group(s):

 Evolutionary Computation
 Data Mining

 

 Degree specific information: M.Sc

 Title:

 Mining continuous classes using evolutionary computing.

 Abstract:

Data mining is the term given to knowledge discovery paradigms that attempt to infer knowledge, in the form of rules, from structured data using machine learning algorithms. Specifically, data mining attempts to infer rules that are accurate, crisp, comprehensible and interesting. There are not many data mining algorithms for mining continuous classes. This thesis develops a new approach for mining continuous classes. The approach is based on a genetic program, which utilises an efficient genetic algorithm approach to evolve the non-linear regressions described by the leaf nodes of individuals in the genetic program's population. The approach also optimises the learning process by using an efficient, fast data clustering algorithm to reduce the training pattern search space. Experimental results from both algorithms are compared with results obtained from a neural network. The experimental results of the genetic program are also compared against those of a commercial data mining package (Cubist). These results indicate that the genetic algorithm technique is substantially faster than the neural network, and produces comparable accuracy. The genetic program produces substantially less complex rules than those of both the neural network and Cubist.
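
As a rough illustration of the idea of evolving a non-linear regression with a genetic algorithm, the following minimal sketch fits the coefficients of a single polynomial to training patterns. It is not the thesis's algorithm: the polynomial model, the function names (predict, mse, evolve_regression), the operators (elitism, uniform crossover, Gaussian mutation) and all parameter values are assumptions chosen for brevity, and the genetic program tree and the clustering step are omitted entirely.

```python
import random


def predict(coeffs, x):
    """Evaluate the polynomial c0 + c1*x + c2*x^2 + ... at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def mse(coeffs, data):
    """Mean squared error of the polynomial over (x, y) training patterns."""
    return sum((predict(coeffs, x) - y) ** 2 for x, y in data) / len(data)


def evolve_regression(data, degree=2, pop_size=40, generations=200, seed=0):
    """Evolve polynomial coefficients with a simple elitist genetic algorithm.

    Hypothetical helper for illustration only; parameter values are arbitrary.
    """
    rng = random.Random(seed)
    # Each individual is a list of degree + 1 real-valued coefficients.
    pop = [[rng.uniform(-1, 1) for _ in range(degree + 1)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by fitness (lower error is better) and keep the top quarter.
        pop.sort(key=lambda c: mse(c, data))
        elite = pop[: pop_size // 4]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover: take each coefficient from either parent.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation: perturb every coefficient slightly.
            child = [c + rng.gauss(0, 0.1) for c in child]
            children.append(child)
        pop = elite + children
    return min(pop, key=lambda c: mse(c, data))


# Usage: recover a quadratic from noise-free samples on [-1, 1].
data = [(x / 10, 1 + 2 * (x / 10) - 0.5 * (x / 10) ** 2) for x in range(-10, 11)]
best = evolve_regression(data)
print("coefficients:", best, "error:", mse(best, data))
```

Because the elite individuals are carried over unchanged, the best error in the population never increases from one generation to the next; in the approach described above, a regression of this kind would sit at a leaf node of a genetic program individual rather than stand alone.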

 Supervisor / Co-Supervisor:

 AP Engelbrecht







Computational Intelligence Research Group
University of Pretoria
Copyright © 2017