Training Artificial Intelligence

This is a story about computer brains...


Let us imagine that I want to set my computer a simple task: I want it to point to the end of my nose. My computer doesn't know what a face is, what a nose is, or even that it can and should point to a certain part of my anatomy. I could write a program which tells my dumb computer exactly what to do: how to divine the position of my nose from an image, followed by the relatively straightforward job of asking the computer for the co-ordinates of the end of my nose, once the nose has been located. Or I could use artificial intelligence and, more importantly, machine learning.

Why does milk taste sweet?

You might not think of yourself as having a very sweet tooth, but in fact you used to have an insatiable appetite for sweet things. Your body is programmed to seek the wet, fatty, sweet goodness of milk, which provides the perfect sustenance for your growing brain and body. The reward circuitry is self-reinforcing and gives you a hit of dopamine every time you suck on the tit and get a mouthful of your mother's milk, which causes the neural pathways in your brain to become stronger, while others are pruned away. Eventually, you become hard-wired to stuff fat and sugar into your mouth, when to begin with you had only the reward part.

That's machine learning.

We need to give our machine - our computer - a reward. Let's say that for a high-resolution digital photograph of perhaps 8 megapixels, assuming that most of the photos we give the computer will have the nose tip somewhere around the middle, the worst possible guess would be somewhere near the edges. We can set up a super simple reward for our artificial intelligence, giving it a hit of computer dopamine every time it guesses a point somewhere in the middle of the photo.
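Here's a minimal sketch of that super simple reward, written in Python. The function name `centre_reward` and the 3264 x 2448 photo size (roughly 8 megapixels) are my own assumptions for illustration; the only idea taken from the paragraph above is that guesses near the middle get the biggest hit of computer dopamine and guesses near the edges get the smallest.

```python
import math


def centre_reward(guess, width, height):
    """Return a reward between 0 and 1 for a guessed (x, y) pixel.

    1.0 means the guess is exactly in the centre of the photo.
    0.0 means the guess is in one of the four corners.
    """
    centre_x, centre_y = width / 2, height / 2
    # The furthest a guess can be from the centre is a corner.
    worst_distance = math.hypot(centre_x, centre_y)
    distance = math.hypot(guess[0] - centre_x, guess[1] - centre_y)
    return 1.0 - distance / worst_distance


# Hypothetical 8 megapixel photo: 3264 x 2448 pixels.
WIDTH, HEIGHT = 3264, 2448
middle_guess = centre_reward((1632, 1224), WIDTH, HEIGHT)
corner_guess = centre_reward((0, 0), WIDTH, HEIGHT)
```

Notice that the reward never looks at the photo itself, which is exactly why it's so flawed.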

Obviously this is very flawed.

Very quickly, our artificial computer brain will learn to make guesses in the middle. Even though our computer doesn't know what "middle" is, it will quickly become hard-wired to guess in the middle of the photos we show it, because that's how we set up our reward system.

The guesses are not close to the tip of the nose, but they're a lot closer than if we just let the computer guess completely at random.

We need to refine our reward system.

So, we take a library of thousands and thousands of photos of people's faces, then we record the location of the tip of the nose by manually clicking on the tip of the nose ourselves. We create a huge database full of the correct locations of nose tips, created by humans.

Then, we set up the reward system to reward guesses which are close to the correct location of the nose tip. The closer the guess, the bigger the reward.
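As a rough sketch, the refined reward might look something like this in Python. The name `nose_reward` and the choice to scale by the photo's diagonal are my assumptions; the only thing that really matters is that the closer the guess is to the human-clicked location, the bigger the number that comes back.

```python
import math


def nose_reward(guess, label, width, height):
    """Reward a guessed (x, y) pixel by how close it is to the label.

    `label` is the nose tip location a human clicked on.
    Returns 1.0 for a perfect guess, falling towards 0.0 as the
    guess gets as far away as the photo allows (corner to corner).
    """
    diagonal = math.hypot(width, height)
    distance = math.hypot(guess[0] - label[0], guess[1] - label[1])
    return 1.0 - distance / diagonal


# Hypothetical example: a human clicked the nose tip at (1650, 1300).
label = (1650, 1300)
near = nose_reward((1660, 1290), label, 3264, 2448)
far = nose_reward((200, 300), label, 3264, 2448)
```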

Now, we train our computer system with the big database of photos and nose tip locations. Every time the computer guesses, it gets a reward based on how close the guess was. We can run the training millions and millions of times. We keep doing the training until the computer gets really really good at guessing the location of the nose tip.
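To make the training loop concrete, here's a toy sketch in Python. It is emphatically not a neural network: I've swapped in a much simpler technique (random nudges, keeping only the nudges that earn a bigger reward) and I've assumed each 'photo' has already been boiled down to the centre of the face, with the nose tip sitting at some unknown offset from it. Every name here (`make_dataset`, `predict`, `train`) and every number is made up for illustration. Real systems adjust millions of numbers using gradient descent rather than blind nudging, but the guess, reward, adjust, repeat cycle is the same.

```python
import math
import random


def reward(guess, label):
    """The closer the guess is to the human-clicked label, the bigger the reward."""
    return -math.hypot(guess[0] - label[0], guess[1] - label[1])


def predict(params, face_centre):
    """Toy 'brain': guess the nose tip as the face centre plus a learned offset."""
    return (face_centre[0] + params[0], face_centre[1] + params[1])


def average_reward(params, dataset):
    """Average reward over every (face_centre, nose_label) pair in the dataset."""
    total = sum(reward(predict(params, face), nose) for face, nose in dataset)
    return total / len(dataset)


def make_dataset(count, seed=1):
    """Fake 'human-labelled' data: nose tips roughly (12, 85) pixels from the face centre."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(count):
        face = (rng.uniform(1000, 2200), rng.uniform(800, 1600))
        nose = (face[0] + 12 + rng.gauss(0, 3), face[1] + 85 + rng.gauss(0, 3))
        dataset.append((face, nose))
    return dataset


def train(dataset, steps=2000, step_size=5.0, seed=0):
    """Nudge the offset at random, keeping a nudge only if the reward improves."""
    rng = random.Random(seed)
    params = (0.0, 0.0)  # start out knowing nothing
    best = average_reward(params, dataset)
    for _ in range(steps):
        candidate = (
            params[0] + rng.uniform(-step_size, step_size),
            params[1] + rng.uniform(-step_size, step_size),
        )
        score = average_reward(candidate, dataset)
        if score > best:  # bigger reward: keep the new wiring
            params, best = candidate, score
    return params, best
```

Calling `train(make_dataset(100))` should end up with an offset somewhere near the (12, 85) that I hid in the fake data, even though nothing in `train` was ever told that number.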

Remember, the computer has no idea what a face is, and it has no idea what it's really doing. Nobody wrote a program instructing the computer how to do anything. The truth is, nobody really knows how exactly the computer is getting better and better at figuring out where the nose tip is. Nobody could predict how the computer brain is going to wire itself up. The computer sees all those thousands and thousands of photos, which are all very different in immeasurable ways, and somehow it begins to make associations between what it 'sees' and how it should intelligently guess the location of the nose tip.

That's a neural network.

The really interesting thing that happens next is that we show the computer a photo of a face it hasn't seen before, and it's able to guess where the nose tip is. We use the same artificial intelligence with a brand new face which the computer hasn't been trained to locate the nose tip of, and it's still able to figure it out, because the neural network has hard-wired itself to be really good at fulfilling the rewarding task of pointing to nose tips.

There's nothing particularly amazing or hard to understand about machine learning and artificial intelligence. We're simply training our computer slaves to do simple tasks, by setting a measurable reward system, so that the neural network can practice for long enough to get good.

The predictive text suggestions on your phone come from machine learning, which has seen vast quantities of stuff written by people, such that it's fairly easy to guess the word that's likely to follow, based on what you've just written.
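I have no idea what actually runs inside anybody's phone, so treat this as a sketch of the general idea only: count which word tends to follow which, then suggest the most common followers. The function names `train_bigrams` and `suggest`, and the tiny made-up corpus, are my own inventions for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word in the text, which words come straight after it."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1
    return following


def suggest(following, word, how_many=3):
    """Return up to `how_many` of the words most often seen after `word`."""
    counts = following.get(word.lower(), Counter())
    return [next_word for next_word, _ in counts.most_common(how_many)]


# A tiny, made-up pile of banal messages to learn from.
corpus = (
    "thanks and kind regards "
    "thanks and see you tomorrow "
    "see you in the office tomorrow "
    "see you soon"
)
model = train_bigrams(corpus)
after_see = suggest(model, "see", how_many=1)
```

Feed it a vast quantity of stuff written by people instead of four banal messages, and the guesses start to look uncannily appropriate.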

So, what about training a computer to be more human and be able to have a conversation? How are we ever going to pass the Turing test and trick somebody into thinking they're talking to a real person?

We need to come up with a way to train artificial intelligence to speak just like a person.

Every time you use a text-messaging service to have a conversation, that data is harvested and analysed. Trillions of messages are sent between people every year, and all that data can be fed into a machine learning system to train it to come up with typical responses to things people say. Google Mail makes absolutely brilliant "canned response" suggestions, which are usually totally appropriate for the context, because Google has seen quadrillions of banal emails saying little more than "thanks and kind regards". Google employees don't read your emails because they don't have to - a machine does it and it distils the very essence of your exchange, such that it knows whether you should reply "love you too" or "see you in the office tomorrow".

It's of no particular use - beyond the present applications - to have so much aggregated data, unless we want to have a very bland, homogeneous and unsatisfying experience of life. If we slavishly obey the conclusions of vast pools of data which have been analysed, we'll end up in some sort of dystopian nightmare where our life outcomes are decided at birth, using available data, and reality will become like a piece of text composed by the predictive suggestions your phone came up with.

As an example, I'm going to generate a random number between 1 and 437, which corresponds to the page number of the novel I'm reading at the moment, then I'm going to generate another random number between 1 and 50 and use that word as the 'seed' for the predictive text feature on my fancy brand new iPhone Xs.

Let's go...

Ok... page 35 has been randomly selected.

And... word 17 has been randomly selected.

The 17th word on page 35 of my novel was "of".

OF COURSE IT WAS "of".

Chances are, it was going to be "a", "an", "of", "the", "is", "as", "to" or any one of a zillion different super common words. Let's use the word "sycamore" because it was on page 35.

Here's what my phone just generated:

"Sycamore is a good one and I have had to go back and I get the feeling of being able and then they are taught to work and they have had to do it a little while I’m not gonna was a very long and I have a very good relationship"

Clearly, machine-generated text leaves a lot to be desired.

Critically though, do we really want a single machine mind which can spit out decent text, or do we actually want personalities? Do we want a single generic face which is composed of the average set of features from all 7.6 billion people on the planet, or do we want variation?

Thus, we arrive at the conclusion that we should all be training an artificially intelligent system capable of machine learning, to be just like us, as an individual. It's no use that Google harvests all our data, because it aggregates it all together. All the messages you've typed on apps from Apple, Facebook and Google, all the emails you've written and all the documents you've created, are absolutely fucking useless because they contain very little of your personality. Most of the messages you wrote were about food, sex, your children and your pets. Most of the emails you wrote were about the bullshit made-up numbers you type into spreadsheets all day long, which you call your "job". None of it is any use to train an artificially intelligent system to think and act like you.

I haven't figured out the reward system yet, but I'm building up a huge database of stuff I've thought. This stream-of-consciousness seems like utter madness, but I've very deliberately expressed myself in a certain manner: pouring my inner monologue out onto the page. It's ridiculous egotism and something which lots of writers have fallen foul of over the centuries, believing they're immortalising themselves with their words, but we are in an unprecedented era of exponentially growing computer power, yet most of our efforts are diverted into meaningless exchanges which expose very little of the interior of our minds.

173 years ago, Henry David Thoreau built himself a cabin in the woods, lived alone with his thoughts and wrote perhaps 2 million words in the journal he kept for 24 years. It's highly unlikely that his handwritten text will ever be digitised because of the incredible effort involved. By contrast, my 1.1 million words have been extensively search indexed by Google and other search engines, and my digital legacy is conveniently stored in 'the cloud' with perfect fidelity. While most people have been wasting the gifts of the information age by asking their partner if they need to buy bread and milk, I've been gaining what can only be described as a head start in the race to be immortalised by advances in machine learning and artificial intelligence.

How the hell did you think they were going to get the contents out of your brain and into another [artificial] one? Did you think it was going to require no effort at all on your part? Did you think that somebody was going to invent some kind of data-transfer cable?

Yes, it's horribly arrogant to think that anybody would have any interest in spinning up a digital version of me, but you remember that bullied kid at school who everybody hated and ostracised? You remember that you called that kid "nerd" and "geek" and generally abused them because they were good at maths and physics? You remember how you made it your mission in life to make their life as thoroughly miserable as possible?

They're your boss now. They're rich and you're poor.

Those geeks and nerds are suddenly on top of the pile.

You thought you were top dog when you were a kid, but now you're getting left for dust. You're being left to fester in your own filth. You're the underclass.

All of those skills you developed in bullying and abuse quickly became redundant, when all that geek stuff became highly lucrative.

Those late-gained skills of using Facebook groups to share your vile bigotry amongst your fellow thick-skulled dunderheads have done nothing except line the pockets of the geeks. The geeks have been using the internet for decades to discuss the creation of a better world, where the knuckle-dragging primitives who thought they owned the playground have somehow been left unaccounted for by 'accident'.

I'm not a big fan of the social exclusion and elitism which is emerging at the moment, but I'm damned if I'm going to stop keeping my technology skills up to date and investing my time and energy in my digital persona. Having put up with a lifetime's worth of bullying during my childhood, I'll be damned if I'm gonna meekly shuffle off into a quiet corner now. I'm sorry that you weren't paying attention when the world went digital and now it's super hard for you to catch up, but that's what happens when you're too busy making vulnerable people's lives a misery to notice that you're wasting valuable time.

Every word I write is harvested by thousands of computers which comprise part of 'the cloud' and although billions of webpages are lost every single day, content is king and my 1.1 million words can be easily copied from one place to another, unlike the contents of your brain.

This might be a core dump of the contents of my mind, hurriedly written down in a state of kernel panic, but it's taken a huge investment of time and effort, which unfortunately has always been required to achieve anything. Without the large databases manually created by humans, the machines would have no datasets to learn from and artificial intelligence wouldn't even be a thing.

Tags: #computing #writing