Autonomy in a Neural Network – Psychology Essay
The idea of creating artificially intelligent life is not a new concept, but rather one that dates back to the end of the sixteenth century. Inventors fashioned mechanical constructs deemed "automata" to entertain the wealthy (Jones, n.d.). Many of these
hydraulic devices were constructed to simulate human and animal behaviours, with the ultimate goal of creating artificial people, as in the story of Pinocchio. Although the inventors succeeded in making their machines perform simple tasks, such as writing a limited number of phrases, their science and technology were woefully underdeveloped for the task of creating something with the cognitive faculties of an animal, let alone a human being (Jones, n.d.). Skipping ahead to the 21st century, we find scientists in a situation not far removed from that of their predecessors. Although there have been enormous strides in the field of human psychology, scientists are still quite a ways off from creating an autonomous agent that can think and learn for itself. The main goal of those working in the field of artificial intelligence is the simulation of an artificial brain. The "easiest" way to do this is to model one on the human brain. After all, we know that it works, that it is capable of learning, and that as part of a whole it results in a being that is self-aware and intelligent. The first step in this task is to understand how the brain works.
The human brain consists of roughly 10^12 neurons, which serve almost all of its functions (Reingold, n.d.). They are the information carriers and are responsible for all of our cognitive functions. Each neuron consists of a cell body called the soma and usually has two kinds of stem-like extensions, the dendrites and the axon. Information is transmitted from the axon to the dendrites across the synaptic cleft whenever the neuron is excited to the point of firing, an event called an action potential (Klerfors, 1998). These action potentials then propagate from neuron to neuron, causing each connected neuron to become either more or less likely to fire (Reingold, n.d.). In the case of excitation, a firing perceived from a connected neuron makes the receiving neuron readier to produce an action potential of its own, while an inhibitory connection makes it less likely to fire (Reingold, n.d.). Action potentials are all-or-none and convey only the information that a specific neuron has fired. The groupings of and connections between neurons are what is referred to as a neural network.
A neural network typically consists of three layers. The first is the "input layer," which is connected to the sensory organs and provides data for the next layer. Connected to these initial neurons is a multitude of neurons in what is called the "hidden layer." The purpose of these neurons is to identify the input data and render it into meaningful information that the brain can understand, which is then passed to the "output layer" (Klerfors, 1998). There are three fundamental concepts necessary for understanding the relationships between neurons in a neural network. The first, connection strength, refers to how strongly one neuron influences the neurons connected to it. Because this connection strength can vary enormously, it is thought that it is here that all information in the brain is stored. The second concept is the excitation/inhibition distinction between neurons. This refers to whether a neuron will cause an excitatory or an inhibitory response in a neighbouring neuron; in either case, the magnitude of the response will vary with the connection strength. The final component is the transfer function of the neuron, which describes how a neuron's firing rate varies with the input it receives (Reingold, n.d.). Together, these three components determine how much of the activation value is passed on to the next node.
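To make these three concepts concrete, here is a minimal Python sketch of a single artificial neuron. The sigmoid transfer function and the particular weight values are illustrative assumptions on my part, not something prescribed by the sources cited above.

```python
import math

def neuron_output(inputs, weights):
    """Weighted sum of inputs passed through a sigmoid transfer function.

    Positive weights model excitatory connections, negative weights
    inhibitory ones; the magnitude of each weight is the connection strength.
    """
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-activation))  # sigmoid transfer function

# Two excitatory inputs and one inhibitory input (illustrative values)
print(neuron_output([1.0, 0.5, 1.0], [0.8, 0.4, -0.6]))
```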
Artificial neural nets were created based on the aforementioned principles and were constructed to simulate the brain as closely as possible through software run on computers. Although neural nets are inherently parallel processors, in practice they are simulated by programs running on serial computers. While this sacrifices the speed of parallel processing, it still mimics the fundamental properties of the neural network. As with biological neurons, the data in artificial neural nets is held in the connection strengths between individual neurons, referred to as weights. These weights are located in the hidden layer of a network and are given a value between -1 and +1; the values determine how much activity from a connecting neuron is required to trigger an artificial action potential. Each node sums the activation values it receives, arrives at its own activation value by applying its transfer function, and then passes that value along to the next nodes in the network (Reingold, n.d.). The neurons in an artificial neural network work in the same way as biological ones, with input coming in one end, passing through a middle computational layer, and then exiting through the output layer. The actual setup of the network was the simple part for scientists; the difficulty came from harnessing its power. For a system to be fully autonomous it must be able to learn without outside support and adapt to its environment. This type of learning is referred to as unsupervised learning (Klerfors, 1998).
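The layered sum-then-transfer behaviour described above can be sketched as a tiny forward pass. The layer sizes, the sigmoid transfer function, and the weight values (kept in the -1 to +1 range) are all made-up choices for illustration.

```python
import math

def layer_forward(inputs, weight_matrix):
    """Each node sums weighted activations from the previous layer,
    then applies its transfer function (sigmoid here)."""
    outputs = []
    for node_weights in weight_matrix:
        s = sum(x * w for x, w in zip(inputs, node_weights))
        outputs.append(1.0 / (1.0 + math.exp(-s)))
    return outputs

# Illustrative weights in [-1, +1]: 3 inputs -> 2 hidden nodes -> 1 output node
hidden_weights = [[0.2, -0.5, 0.9], [0.7, 0.1, -0.3]]
output_weights = [[0.6, -0.8]]

hidden = layer_forward([1.0, 0.0, 1.0], hidden_weights)
output = layer_forward(hidden, output_weights)
print(output)
```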
In unsupervised learning, the weights in an artificial neural net start off randomized, which means that any data sent through the system will initially be output as randomized values. To produce meaningful information from this, the system must teach itself to recognize the information, remember it, and produce an appropriate response. Because the system is to be self-sufficient, it must do this on its own through the internal adjustment of its weights (Reingold, n.d.). As stated before, an artificial neural network begins in a randomized state where the values of the weights do not represent anything. As data is sent through the system, the output layer feeds its results back so that the network can rework the information into something a little more representative of the original input. This is not a process that can be completed in one, ten, or even a hundred runs; it can take as many as forty thousand runs to generate proper data. From this perspective it is difficult to understand how anything gets done, but when one understands that information is passed from neuron to neuron at an average speed of 268 mph, it seems a little more feasible (Klerfors, 1998).
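The following toy loop is not any of the algorithms discussed later; it only illustrates, under a deliberately simple made-up update rule, how weights that begin as random noise can drift toward something representative of the input after many thousands of presentations.

```python
import random

random.seed(0)

# Start from a randomized state: the weights carry no information yet.
weights = [random.uniform(-1.0, 1.0) for _ in range(4)]

pattern = [1.0, 0.0, 1.0, 0.0]   # an input the net sees over and over
learning_rate = 0.001

# Tens of thousands of presentations slowly pull the weights toward
# the input, so the final weights come to "represent" the pattern.
for _ in range(40_000):
    weights = [w + learning_rate * (x - w) for w, x in zip(weights, pattern)]

print([round(w, 3) for w in weights])  # now close to the pattern itself
```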
Because an artificial neural network that learns through unsupervised learning begins as a randomized "clean slate," it requires two things to function properly. The first is a large data set to learn from. To function properly the network must teach itself, and it does this by using old, confirmed data sets. With old data sets, both the inputs and the desired outputs are known, so when the information is given to the network it can learn for itself by taking in the old inputs and comparing and adjusting its own weights to bring its outputs in line with those of the data set (Want to Try Neural Nets?, n.d.). Again, this procedure does not occur over a few trials, but rather over trials numbering in the tens of thousands.

The second, and most important, component necessary for an artificial neural network to function properly is its learning algorithm. This refers to the mathematical software that runs in the hidden layer and dictates exactly how the weights should respond to a specific target input (Klerfors, 1998). Although many differing kinds of algorithms currently exist, most share one of a few different learning laws. The learning laws are the backbone of the algorithm and can be seen as its DNA, as they give instructions on how the mathematics should be performed. Although scientists continue to construct new learning laws in abundance, there are three main laws for unsupervised learning.

One of the first was constructed by Donald Hebb and consequently became known as Hebb's rule. Hebb's rule states that if a neuron receives an input from another neuron, and if both are highly active (mathematically, have the same sign), the weight between the neurons should be strengthened. This corresponds to the physiological finding that greater activation between connected neurons results in easier activations (Artificial Neural Networks Technology, n.d.). Similar to Hebb's rule is Hopfield's law, which distinguishes itself by specifying the magnitude of the strengthening and weakening of said weights. It states that if the desired output and the input are both active or both inactive, the connection weight should be incremented by the learning rate; otherwise, it should be decremented by the learning rate (Artificial Neural Networks Technology, n.d.). The third and final rule, Kohonen's learning law, developed by Teuvo Kohonen, was also inspired by learning in biological systems. In this procedure, the processing elements compete for the opportunity to learn, or update their weights. The processing element with the largest output is declared the winner and has the capability of inhibiting its competitors as well as exciting its neighbours. Only the winner is permitted an output, and only the winner plus its neighbours are allowed to adjust their connection weights (Artificial Neural Networks Technology, n.d.). Although there are many more learning laws out there, these three are considered to be the ones upon which the others are based.
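The first two laws translate directly into simple weight-update functions. The sketch below assumes bipolar (+1/-1) activity values and an arbitrary learning rate of 0.1; both choices are illustrative rather than taken from the sources.

```python
def hebb_update(weight, x, y, rate=0.1):
    """Hebb's rule: strengthen the connection when input x and output y
    are active together (their product is positive)."""
    return weight + rate * x * y

def hopfield_update(weight, x, desired, rate=0.1):
    """Hopfield's law: increment by the learning rate when the input and
    the desired output agree (both active or both inactive, i.e. the
    same sign), otherwise decrement by the learning rate."""
    agree = (x > 0) == (desired > 0)
    return weight + rate if agree else weight - rate

print(hebb_update(0.0, x=1.0, y=1.0))            # both active: weight grows
print(hopfield_update(0.0, x=1.0, desired=-1.0)) # disagree: weight shrinks
```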
As with the learning laws, there are a multitude of algorithms being developed. These algorithms, which can be seen as the genetic makeup built on the learning laws, are constructed based on the application they are to work with. Although many have been developed, the most successful include Kohonen's Self-Organizing Feature Maps, Grossberg's Adaptive Resonance Theory, and Fukushima's Neocognitron.
The basic premise of the Self-Organizing Feature Map is that of a feature detector. It is based on competitive learning in a topology-preserving map that can be adjusted to represent the nature of the inputs (Self-Organizing Feature Maps, n.d.). In the SOFM, neurons located physically next to each other respond to inputs that are also next to one another, which in turn allows for lateral inhibition and excitation. What this means is that if a target neuron is activated, all of its immediate neighbours will also become activated, but to a lesser degree (Kaski, 1997). This provides a way to avoid totally unlearned neurons, and it helps enhance certain topological properties that should be preserved in the feature mapping. In every following cycle in which that same target neuron is still activated, the size of the neighbourhood decreases, thereby mapping the feature in question more and more precisely (Self-Organizing Feature Maps, n.d.).
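A minimal sketch of this competitive, neighbourhood-shrinking update, assuming a one-dimensional map of ten nodes and two-dimensional inputs; the node count, learning rate, and shrink schedule are all illustrative choices rather than the canonical SOFM parameters.

```python
import random

random.seed(1)

# A 1-D map of 10 nodes, each holding a 2-dimensional weight vector.
nodes = [[random.random(), random.random()] for _ in range(10)]

def train_step(x, rate, radius):
    # Winner: the node whose weights are closest to the input.
    winner = min(range(len(nodes)),
                 key=lambda i: sum((nodes[i][d] - x[d]) ** 2 for d in range(2)))
    # The winner and its map neighbours move toward the input; nodes
    # outside the (shrinking) neighbourhood are left untouched.
    for i in range(len(nodes)):
        if abs(i - winner) <= radius:
            for d in range(2):
                nodes[i][d] += rate * (x[d] - nodes[i][d])

data = [[random.random(), random.random()] for _ in range(100)]
for epoch in range(50):
    radius = max(1, 3 - epoch // 20)   # neighbourhood shrinks over time
    for x in data:
        train_step(x, rate=0.1, radius=radius)

print(nodes[0], nodes[-1])  # map ends of the node line span the input space
```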
The Neocognitron is an algorithm developed by Fukushima in an attempt to construct a neural network architecture based explicitly on knowledge about real brains (Neocognitron, n.d.). It is considered to be among the most complex neural network architectures ever developed, and it is also among the most limited in its capabilities. The Neocognitron was developed to recognize handwriting and convert it to digital signals (Neocognitron, n.d.). Its complexity comes from the immense variation in different persons' handwriting styles; as a result, it was necessary to make the algorithm as robust as possible. The design of the Neocognitron came from extensive study of the human visual system. Neuropsychological studies have shown that there are relatively few cells that receive input directly from the retina and that these cells are limited in function. Studies have also shown that the local relationships between input neurons in the retina are topologically preserved in the organization of the neural pathways (Neocognitron, n.d.). Although non-mathematical information on how the Neocognitron actually works is sparse, it is assumed that when a handwriting sample is presented to the neural network, it attempts to locate the boundaries of the individual letters and match the patterns to similar ones stored in memory. When the samples are not very similar to the target letters in the database, the neural net must approximate the unconfirmed features to confirmed ones and then probabilistically determine a correct output.
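Since detailed information on the Neocognitron's internals is sparse, the sketch below does not implement the Neocognitron itself; it only illustrates the assumed idea of matching a noisy sample against the closest stored pattern, using hypothetical letter templates and a crude pixel-overlap similarity measure.

```python
# Hypothetical stored letter templates as flattened 3x3 binary pixel grids.
templates = {
    "I": [0, 1, 0, 0, 1, 0, 0, 1, 0],
    "L": [1, 0, 0, 1, 0, 0, 1, 1, 1],
}

def classify(sample):
    """Pick the stored pattern the sample resembles most, scored by
    the fraction of matching pixels (a crude similarity measure)."""
    def similarity(template):
        return sum(a == b for a, b in zip(sample, template)) / len(template)
    best = max(templates, key=lambda k: similarity(templates[k]))
    return best, similarity(templates[best])

# A slightly distorted "L" still maps to the closest confirmed pattern.
print(classify([1, 0, 0, 1, 0, 0, 1, 1, 0]))
```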
Unlike the previous two algorithms, Adaptive Resonance Theory was not developed for a specific task so much as to deal with a fundamental concern with artificial neural networks in general. Developed by Stephen Grossberg and Gail Carpenter, ART was designed to address the stability-plasticity dilemma (Adaptive Resonance Theory, n.d.). The stability-plasticity dilemma is a learning-instability problem suffered by neural networks: scientists and programmers were unsure how a neural network would know when to apply its existing knowledge to new inputs, and when it should actually learn and adopt the new inputs as "learning material" (Adaptive Resonance Theory, n.d.). The weights, which have captured some knowledge in the past, continue to change as new knowledge comes in, so there is a danger of losing the old knowledge with time. The weights have to be flexible enough to accommodate the new knowledge, but not so flexible as to lose the old (Adaptive Resonance Theory, n.d.). Adaptive Resonance Theory deals with this problem by accepting input and classifying it into a category depending on the stored pattern it resembles most. Once the pattern is located, it is trained to resemble the input. If, however, the input does not match any stored pattern within a given range, then a new category is created by storing a new pattern similar to the input. Because of this, no stored pattern is ever modified unless it matches the input vector within a given fault range (Grossberg's, n.d.). With this method, ART has both stability, in that it can remember and recognize patterns, and plasticity, in that it can learn new material without forgetting the old. The original ART, known as ART1, only performed unsupervised learning on binary input, but the newer ART2 has been modified to handle both digital and analog (fuzzy) inputs (Grossberg's, n.d.).
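A schematic sketch of the ART idea described above: match the input against stored prototypes, update the best match if it lies within the vigilance threshold, and otherwise create a new category. The similarity measure, vigilance value, and learning rate here are illustrative stand-ins, not the actual ART1 equations.

```python
def art_classify(x, categories, vigilance=0.8, rate=0.5):
    """Return the index of the category the input x is assigned to.

    If the closest stored prototype matches x within the vigilance
    threshold, it is nudged toward x (plasticity); otherwise x founds
    a brand-new category, leaving old prototypes untouched (stability).
    """
    def match(proto):
        # Similarity in [0, 1]: 1 means identical vectors.
        return 1.0 - sum(abs(a - b) for a, b in zip(x, proto)) / len(x)

    if categories:
        best = max(range(len(categories)), key=lambda i: match(categories[i]))
        if match(categories[best]) >= vigilance:
            categories[best] = [p + rate * (a - p)
                                for p, a in zip(categories[best], x)]
            return best
    categories.append(list(x))   # no close enough match: new category
    return len(categories) - 1

cats = []
print(art_classify([1, 0, 1, 1], cats))  # first input founds category 0
print(art_classify([1, 0, 1, 0], cats))  # too far from category 0: founds 1
print(len(cats))                         # two categories stored
```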
The emergence and continued development of unsupervised learning in artificial neural networks is an important backbone of the development of true autonomous artificial intelligence. The human mind is itself the self-contained seat of our consciousness, and it is from its own evolutionary development that humans have reached the heights that they have. Because of this, it makes complete sense that the easiest way to create an alien intelligence would be to understand our own physiology and then model it after ourselves. However, it is my belief that the current artificial neural networks that we have are but a child's step towards our ultimate goal. It is difficult for me to believe that who we are is made up solely of the firing of electrical impulses that travel from synapse to synapse. Although I do not dispute that this is an integral part of our cognitive processes, it just seems too basic. What role, then, do our hormones and neurotransmitters play in our cognitive faculties? I believe that in order for us to begin to learn about what makes us tick, we need a larger, crisper picture. The different topographic regions of the brain cater to various processing and cognitive functions, and it is the interaction between the different electrical and chemical action potentials and the various regions of the brain that gives us the ability to grow, learn, create, and think. Once we understand more of ourselves, we will be in a better position to apply that knowledge and give birth to something in our likeness.
References
1. Adaptive Resonance Theory. (n.d.). Retrieved November 22, 2001, from https://www.maths.uwa.edu.au/~rkealley/ann_all/node170.html
2. Artificial Neural Networks. (n.d.). Retrieved November 19, 2001, from https://www.gc.ssr.upm.es/inves/neural/ann1/unsupmodels/unsupmodels.html
3. Artificial Neural Networks Technology. (n.d.). Retrieved November 21, 2001, from https://www.dacs.dtic.mil/techs/neural/neural5.html#RTFToC17
4. Grossberg's Adaptive Resonance Theory. (n.d.). Retrieved November 20, 2001, from https://www.icsi.berkeley.edu/~jagota/NCS/VOL1/P3_html/node27.html
5. Jones, S. (n.d.). Neural Networks and the Computational Brain. Retrieved November 20, 2001, from https://www.culture.com.au/brain_proj/neur_net.htm
6. Kaski, S. (1997). Data Exploration Using Self-Organizing Maps. Retrieved November 19, 2001, from https://www.cis.hut.fi/~sami/thesis/node18.html#SECTION00074000000000000000
7. Klerfors, D. (1998, November). Artificial Neural Networks. Retrieved November 22, 2001, from https://hem.hj.se/~de96klda/NeuralNetworks.htm#2.2.3%20Learning
8. Neocognitron. (n.d.). Retrieved November 21, 2001, from https://www.maths.uwa.edu.au/~rkealley/ann_all/node225.html
9. Reingold, E., & Nightingale, J. (n.d.). Artificial Intelligence Tutorial Review. Retrieved November 19, 2001, from https://psych.utoronto.ca/~reingold/courses/ai/
10. Self-Organizing Feature Maps. (n.d.). Retrieved November 20, 2001, from https://www.nd.com/models/sofm.htm
11. Want to Try Neural Nets? (n.d.). Retrieved November 23, 2001, from https://www.zsolutions.com/soyou.htm