I’ve been a fan of neural networks since my friend’s computer scientist dad first explained them to me when I was in elementary school.
He was doing research to map the brain of a roundworm inside a computer; why build a neural network from scratch when you could just copy one already in existence?
How are neural networks different?
As I grew up, I continued to be fascinated by this alternative form of computing.
In a standard computer, everything is built up from logic. A computer chip is a collection of logic gates (AND, OR, NOR, and so on), and programs ultimately pass instructions through these gates to produce a specific, expected output.
A similar minimalism shows up higher in the stack: the programming languages we know and love can be reduced to a surprisingly small set of characters, much as machine language rests on a tiny number of logic gates. A few developers have even demonstrated how just about every command in modern JavaScript can be represented with just 11 characters: [cci][]()+!{}/.,[/cci]
Neural networks, on the other hand, don’t use logic gates. Instead, they use the weighted connections between virtual “neurons” to determine whether a bit is on or off, whereas a logic gate uses a hard-coded truth table.
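To make the contrast concrete, here is a minimal sketch in Python that puts a hard-coded truth table next to a single threshold “neuron.” The function names, weights, and bias values are my own hand-picked assumptions for illustration; a real network would learn its weights rather than have them typed in.

```python
# A logic gate: the answer for every input pair is fixed in a truth table.
AND_TABLE = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}


def gate_and(a, b):
    """Look the result up in the hard-coded table."""
    return AND_TABLE[(a, b)]


def neuron(inputs, weights, bias):
    """Fire (return 1) when the weighted sum of the inputs plus the bias is positive."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


# Hypothetical, hand-picked weights. Only the bias differs between the two
# behaviors, which is the point: the "logic" lives in the numbers.
def neuron_and(a, b):
    return neuron((a, b), weights=(1.0, 1.0), bias=-1.5)


def neuron_or(a, b):
    return neuron((a, b), weights=(1.0, 1.0), bias=-0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, gate_and(a, b), neuron_and(a, b), neuron_or(a, b))
```

The same neuron acts like AND or OR depending only on its bias, so changing the behavior means changing numbers rather than rewiring a table.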
Machine Learning
The power of neural networks lies in the fact that they can be trained. You can build a neural network by creating an array of inputs, adding a number of hidden layers that transform the data, and exposing an array of outputs. You then give it a set of known data and, thanks to an algorithm called “backpropagation” (short for backward propagation of errors), you can teach the neural network to produce the correct output array.
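Here is a rough sketch of that structure in plain Python: an input array, one hidden layer, an output array, and a training step that nudges the weights. Everything in it is an assumption on my part, including the class name `TinyNetwork`, the sigmoid activation, the learning rate, and the XOR training data. It is meant to show the shape of the idea, not a production implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNetwork:
    """One hidden layer, sigmoid activations, trained by backpropagation."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # One row of incoming weights per neuron; the last entry is the bias.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    def forward(self, inputs):
        x = list(inputs) + [1.0]  # the constant 1.0 feeds the bias weight
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x)))
                  for row in self.w_hidden]
        h = hidden + [1.0]
        outputs = [sigmoid(sum(w * v for w, v in zip(row, h)))
                   for row in self.w_out]
        return hidden, outputs

    def train_step(self, inputs, targets, rate=0.5):
        hidden, outputs = self.forward(inputs)
        x = list(inputs) + [1.0]
        h = hidden + [1.0]
        # Output error, scaled by the slope of the sigmoid at each output.
        d_out = [(o - t) * o * (1 - o) for o, t in zip(outputs, targets)]
        # Pass the error backwards through the output weights.
        d_hidden = [hv * (1 - hv) * sum(d_out[k] * self.w_out[k][j]
                                        for k in range(len(d_out)))
                    for j, hv in enumerate(hidden)]
        # Adjust every weight against its share of the error.
        for k, row in enumerate(self.w_out):
            for j in range(len(row)):
                row[j] -= rate * d_out[k] * h[j]
        for j, row in enumerate(self.w_hidden):
            for i in range(len(row)):
                row[i] -= rate * d_hidden[j] * x[i]


def mean_squared_error(net, samples):
    total = 0.0
    for inputs, targets in samples:
        _, outputs = net.forward(inputs)
        total += sum((o - t) ** 2 for o, t in zip(outputs, targets))
    return total / len(samples)


# Known data: XOR, a classic toy problem that needs a hidden layer.
XOR_SAMPLES = [((0, 0), [0]), ((0, 1), [1]), ((1, 0), [1]), ((1, 1), [0])]

net = TinyNetwork(n_in=2, n_hidden=4, n_out=1)
error_before = mean_squared_error(net, XOR_SAMPLES)
for _ in range(5000):
    for inputs, targets in XOR_SAMPLES:
        net.train_step(inputs, targets)
error_after = mean_squared_error(net, XOR_SAMPLES)
print(f"error before: {error_before:.4f}, after: {error_after:.4f}")
```

The error should shrink as the training loop runs. How far it drops depends on the starting weights and the learning rate, so treat the numbers it prints as illustrative.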
For example, say our input is a 10×10 grid of pixels. We use this grid to analyze a handwritten English character, assigning the value 1 to each input bit where a pixel contains ink and 0 to each bit with no ink (or a small amount of ink). The network exposes an array of 26 potential output bits – with each bit representing a specific character in the English alphabet (ignoring case for now).
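The encoding described above might look something like this in Python. The `INK_THRESHOLD` cutoff, the helper names, and the idea of storing ink levels as numbers between 0.0 and 1.0 are assumptions I’m making for the sketch.

```python
import string

GRID_SIZE = 10
INK_THRESHOLD = 0.5  # assumed cutoff: anything lighter counts as "no ink"


def grid_to_inputs(grid, threshold=INK_THRESHOLD):
    """Flatten a 10x10 grid of ink levels (0.0 to 1.0) into 100 input bits."""
    if len(grid) != GRID_SIZE or any(len(row) != GRID_SIZE for row in grid):
        raise ValueError("expected a 10x10 grid")
    return [1 if level >= threshold else 0 for row in grid for level in row]


def letter_to_targets(letter):
    """Build the 26-bit output array with a single 1 for the given letter."""
    index = string.ascii_lowercase.index(letter.lower())  # case is ignored
    return [1 if i == index else 0 for i in range(26)]


def outputs_to_letter(outputs):
    """Pick the letter whose output bit is strongest."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return string.ascii_lowercase[best]


# A made-up sample: one vertical stroke in column 4, plus a faint smudge.
stroke = [[0.0] * GRID_SIZE for _ in range(GRID_SIZE)]
for row in stroke:
    row[4] = 0.9
stroke[0][0] = 0.1  # too light to count as ink

inputs = grid_to_inputs(stroke)
targets = letter_to_targets("I")
print(sum(inputs), outputs_to_letter(targets))
```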
We train the neural network by giving it a sample character, checking its output against our expectations, and propagating the error backwards to adjust the weight of each neural connection. After a number of training runs, the neural network will be able to view the training set and correctly identify each character.
What’s more, it will be able to analyze unknown test data and, with high accuracy, match previously unseen characters to their alphabet equivalents.
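One common way to check that claim is to hold back some labeled samples during training and score the network on them afterwards. The sketch below assumes a hypothetical `classify` function standing in for a trained network; the helper names and the 20% split are my own choices.

```python
import random


def split_samples(samples, test_fraction=0.2, seed=0):
    """Shuffle labeled samples and hold back a fraction for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def accuracy(classify, samples):
    """Fraction of (inputs, label) pairs that `classify` gets right."""
    if not samples:
        return 0.0
    correct = sum(1 for inputs, label in samples if classify(inputs) == label)
    return correct / len(samples)


# Stand-in data and classifier, purely for illustration.
labeled = [(n, n % 2) for n in range(10)]
train_set, test_set = split_samples(labeled)
print(len(train_set), len(test_set), accuracy(lambda n: n % 2, test_set))
```

Scoring on the held-back set is what tells you whether the network has learned the characters or merely memorized the training samples.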
Potential Applications
I’ve used neural networks to successfully break CAPTCHAs by building my own OCR software. I’ve used basic networks to classify emails as spam or not spam based on tokenized keywords. I’d like to use networks to help digest and automatically classify articles based on categories, keywords, and relevance.
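To give a feel for the spam case, here is a toy sketch using a single neuron (a perceptron), which is about the simplest “network” there is. It is not the code I actually used; the keyword list, the sample emails, and the function names are all invented for illustration.

```python
import re

# Hypothetical keyword vocabulary; a real filter would build this from data.
VOCABULARY = ["free", "winner", "prize", "meeting", "invoice", "lunch"]


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def to_features(text):
    """One input bit per keyword: 1 if the email contains it, else 0."""
    tokens = set(tokenize(text))
    return [1 if word in tokens else 0 for word in VOCABULARY]


def predict(weights, bias, features):
    total = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if total > 0 else 0  # 1 = spam, 0 = not spam


def train_perceptron(samples, epochs=20, rate=1.0):
    """Nudge the weights whenever the neuron misclassifies a sample."""
    weights = [0.0] * len(VOCABULARY)
    bias = 0.0
    for _ in range(epochs):
        for text, label in samples:
            features = to_features(text)
            error = label - predict(weights, bias, features)
            if error:
                weights = [w + rate * error * f
                           for w, f in zip(weights, features)]
                bias += rate * error
    return weights, bias


# Made-up training emails, labeled 1 for spam and 0 for not spam.
TOY_EMAILS = [
    ("You are a winner, claim your free prize", 1),
    ("Free prize inside", 1),
    ("Lunch meeting moved to noon", 0),
    ("Invoice attached for the meeting", 0),
]

weights, bias = train_perceptron(TOY_EMAILS)
print(predict(weights, bias, to_features("Claim your free prize now")))
print(predict(weights, bias, to_features("Team lunch tomorrow")))
```

A single neuron can only separate classes with a straight line through the keyword space, so anything subtler would need hidden layers.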
I find this field incredibly fascinating and am hunting for an excuse to use neural networks in my day job, or just to build something on top of WordPress or a CMS in general. What potential applications can you think of?