Scheduled for Measurement: Digging the Gold (Knowledge) from the Data Mine, Friday, April 12, 2002, 10:15 AM - 12:15 PM, San Diego Convention Center: Room 7B


Intelligent Data Analysis Using Artificial Neural Networks

Sharon Y. Hsu, University of Illinois at Urbana-Champaign, Urbana, IL and Weimo Zhu, Mahomet, IL

A new set of data analysis tools that combines traditional statistical methods with machine learning, called "Intelligent Data Analysis" (IDA; Berthold & Hand, 1999), has been developed recently. Many IDA techniques have been used in data mining, and artificial neural networks, or simply "neural networks" (NN), are among the most sophisticated ones being employed. NN originated in 1943 with Warren McCulloch, a neurophysiologist, and Walter Pitts, a logician, who planned to postulate a simple model to explain how biological neurons work. Eventually, the model became a method for solving problems outside the realm of neurobiology. With the development of digital computers and the growth in their power, NN became popular in the 1980s. Currently, NN is considered one of the most active areas in artificial intelligence research, and successful applications have been reported in many fields, such as medicine, finance, psychology, and education. NN simulates the processes of a neuron, in which each input activity is multiplied by a number called the weight. The "unit" adds together the weighted inputs and computes the output activity using an input-output function. In applying NN to solve real-life problems, a neural net has to be created with the following major steps: (a) identify the input and output features, (b) massage the inputs and outputs so their range is between 0 and 1, (c) set up the network with an appropriate topology, (d) train the network on a representative set of training examples, which are similar to the validation examples in regression, (e) test/evaluate the network using a cross-validation sample, and (f) apply the network. Training the network is at the heart of these steps; it is where the best weights on the inputs are determined. A very simple neural network can have only two layers, inputs and output, which, with a logistic input-output function, is equivalent to a logistic regression.
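The unit computation, the rescaling in step (b), and the training in step (d) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the logistic input-output function, the bias term, the update rule, the learning rate, and the toy data are all hypothetical choices, not details given in the abstract.

```python
import math

def min_max_scale(values):
    """Step (b): rescale numbers so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def logistic(z):
    """Assumed input-output function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def unit_output(inputs, weights, bias):
    """Multiply each input by its weight, add them up, apply the function.
    The bias term is an assumption; the abstract mentions only weights."""
    return logistic(bias + sum(w * x for w, x in zip(weights, inputs)))

def train_unit(examples, epochs=2000, rate=0.5):
    """Step (d): nudge the weights after each example to shrink the error.
    This is one simple gradient-style rule, chosen here for illustration."""
    n_inputs = len(examples[0][0])
    weights, bias = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - unit_output(inputs, weights, bias)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias

# Hypothetical toy data: the output is 1 when either input is 1.
examples = [([0.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]
weights, bias = train_unit(examples)
predictions = [round(unit_output(x, weights, bias)) for x, _ in examples]

print(min_max_scale([20, 35, 50, 80]))
print(predictions)
```

Because this network has only an input layer and one logistic output unit, the fitted weights play the same role as the coefficients of a logistic regression.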
A middle layer called the "hidden layer," however, is usually included in the network, and it makes the network more powerful by enabling it to recognize more patterns. Using the data from the 1995 Adult Disability Follow-Back Survey (N = 9,691), this presentation will introduce and illustrate NN and its potential applications in data mining, including its basic concepts, the major steps in conducting NN research, key features, major references, and software.
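One way to see why a hidden layer adds power is the exclusive-or (XOR) pattern, which a single unit cannot represent but a network with two hidden units can. The sketch below is a hypothetical illustration: the weights are picked by hand rather than learned, and a simple threshold is assumed as the input-output function. It is not drawn from the survey data or from the presentation itself.

```python
def step(z):
    """Assumed threshold input-output function: fire (1) only if z is positive."""
    return 1 if z > 0 else 0

def xor_network(x1, x2):
    """Two inputs, two hidden units, one output unit, with hand-picked weights."""
    h_or = step(x1 + x2 - 0.5)    # hidden unit 1: on if at least one input is on
    h_and = step(x1 + x2 - 1.5)   # hidden unit 2: on only if both inputs are on
    return step(h_or - h_and - 0.5)  # output: on for "one or the other, not both"

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

Each hidden unit detects a simpler pattern, and the output unit combines them. In practice the weights would be found by training rather than set by hand.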

Back to the 2002 AAHPERD National Convention and Exposition