At Smashing Boxes, we want to be an agency at the forefront of next-generation tech platforms such as the Internet of Things (IoT), Artificial Intelligence (AI), and Machine Learning. How can one leverage Machine Learning for a marketing website? Read on!
If you visit our homepage or the labs page, you’ll see pink and blue shapes moving fluidly in the background. No two visitors will see the same thing — and over time this animation will hopefully refine itself to be more pleasant to our users. That’s because this animation is learning to increase visitor interaction with every visitor we get!
At Smashing Boxes, an important part of our company culture is to allow innovation and creativity through our Labs program. Labs is a special time every week where we get to explore initiatives or projects of our own choosing. Djarvis (the code behind the undulating background on smashingboxes.com) started out as a labs project. But what is it exactly? Djarvis is a Machine Learning program whose mission is to increase conversions and user interaction on our site by directing the background animation.
As is the case with many labs projects, the direction I started in turned out to be very different from where I ended up. The initial concept behind Djarvis was simple: develop a program based on Machine Learning (ML) techniques and train that program to make artwork, preferably artwork that looks pleasing to the eye. And that's where the problem started.
If you’re a follower of the developments in machine learning, you might have broken into a cold sweat as I described my intentions. Machine Learning algorithms are very dependent on quality training data. As you can probably guess, getting a large enough dataset to make an intelligent artist might be difficult — because, after all, what is good art? More importantly, how do you measure good art? Art, almost by definition, could be considered the most subjective thing ever created by humans.
Before I dive further into the winding road I took to create Djarvis, I'd like to cover a bit of ML vocabulary first, for clarity's sake. There are two important types of Machine Learning algorithms to know: supervised and unsupervised. The practical difference between the two is simple: supervised ML algorithms require labeled training data to be collected ahead of time, before the algorithm is trained. Unsupervised ML algorithms, on the other hand, don't require labeled data up front, but they do require the engineer to programmatically define the conditions for success. This means that if we're training a supervised ML algorithm, we need to be sure we've got high-quality training data before we can train it; whereas if we're training an unsupervised algorithm, we have to know how to score its performance before we even know what the incoming data might look like. Each class of machine learning algorithm has its own drawbacks, and each has use cases to which it is better suited.
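To make that distinction concrete, here's a toy sketch in Python (illustrative only; none of this is Djarvis' code). The supervised "model" needs labeled examples before it can train, while the reward-driven approach needs no data up front but must be able to score any candidate it's handed. The data, threshold trick, and scoring condition are all invented for the example.

```python
# Supervised: labeled examples must exist *before* training.
labeled_data = [([0.1], 0), ([0.2], 0), ([0.8], 1), ([0.9], 1)]

def train_supervised(data):
    # Trivial "model": find a threshold separating the two labels.
    zeros = [x[0] for x, label in data if label == 0]
    ones = [x[0] for x, label in data if label == 1]
    return (max(zeros) + min(ones)) / 2

# Reward-driven: no data up front, but success must be defined in code.
def score(candidate):
    # Hypothetical success condition: the closer to 0.5, the better.
    return -abs(candidate - 0.5)

def best_by_score(candidates):
    return max(candidates, key=score)

threshold = train_supervised(labeled_data)  # learned from labeled data
winner = best_by_score([0.1, 0.4, 0.9])     # picked by the scoring rule
```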
Initially, I wasn’t going to make an artist; I was going to make an “art judge.” My first sketch of an approach was to build a Convolutional Neural Network (CNN) that would judge whether a particular piece of art was “good” or not. CNNs are typically trained with supervised learning, so I realized I would need data beforehand. After a quick reality check, it was apparent that sourcing data for this idea was not easy; I’d need not only a substantial library of artwork, but also metadata assigning a score to each piece. A good quality training dataset would need several thousand such artwork entries for any kind of decent return.
Nobody’s got time for that. Back to the drawing board.
What about a ML algo that could create artwork? In theory that could be unsupervised as long as I had a way to score the output; a voting system where a user could score an image might do the trick. I wrote a proof of concept along those lines and showed it around. It quickly became obvious that actually training the network this way was going to take a prohibitively long time. Machine learning algorithms typically need a lot of rounds of training before the model becomes solid; a fairly conservative number would be somewhere in the neighborhood of 100,000 rounds. Doing some quick guesstimation yielded unacceptable results: assuming it would take a user 30 seconds to vote, and the training user had nothing to do but vote on images for 10 hours a day, it would take 83 solid days to fully train the network!
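That 83-day figure is just back-of-the-envelope arithmetic:

```python
rounds = 100_000       # conservative number of training rounds
seconds_per_vote = 30  # time for a human to score one image
hours_per_day = 10     # an extremely dedicated volunteer

days = rounds * seconds_per_vote / 3600 / hours_per_day
print(round(days, 1))  # -> 83.3
```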
Again — nobody’s got time for that.
That’s when another type of ML algo started to make more sense: an unsupervised algorithm subclass called “Reinforcement Learning,” or more specifically, an algorithm called “Deep Q Learning.” Deep Q Learning is a go-to strategy for creating autonomous game-players. Essentially, Deep Q Learning algorithms are dropped into a scenario, given a set of inputs and a method of scoring their performance, and then are prompted to select an action out of a predefined set of options. After that, the algorithm attempts to learn from the result of its action, taking into account any change in inputs it observes, and its single score metric.
For Deep Q Learning, the score metric is critical to the algorithm’s behavior: for the first few hundred training rounds, the algorithm is essentially choosing random actions and observing the output. But after a while, it starts to recognize that certain actions, taken when certain input parameters are present, increase its score (while others decrease it). Perfect for an autonomous game-playing robot. But how could we gamify the creation of art?
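The learning loop described above can be sketched in a few lines. This is a deliberately tiny stand-in, not Djarvis itself: tabular Q-learning instead of a deep network, and a made-up one-dimensional "interaction point" (an integer 0-9) with a hidden sweet spot instead of real user signals. The mechanics are the same, though: act (mostly at random at first), observe the score, and nudge the stored value estimate toward what was observed.

```python
import random

SWEET_SPOT = 7      # hidden target the agent must discover
ACTIONS = (-1, +1)  # nudge the controlled value down or up

def reward(state):
    return -abs(state - SWEET_SPOT)  # 0 at the sweet spot, negative elsewhere

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(10) for a in ACTIONS}
    for _ in range(episodes):
        state = rng.randrange(10)
        for _ in range(20):  # steps per episode
            # Epsilon-greedy: mostly exploit what we know, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt = min(9, max(0, state + action))
            # The Q-learning update: move our estimate toward
            # (observed reward + discounted best future value).
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward(nxt) + gamma * best_next
                                           - q[(state, action)])
            state = nxt
    return q

q = train()
# The greedy policy after training: which way to nudge from each state.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(10)}
```

After training, the policy should push the value toward the sweet spot from either side, which is exactly the behavior we want from an agent adjusting a visual parameter to chase a score.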
I’m going to fast forward to what Djarvis actually is and does now, because the in-between parts here are really just a lot of back and forth involving implementation details and technical stuff. Let’s get to the meat of the issue — what is Djarvis?
Djarvis is a Deep Q Learning algorithm with a display built in WebGL, primarily using GLSL Shaders. You can see it running in the background on the smashingboxes.com home page, the labs page, and one or two other pages around the site. Djarvis has 10 different “interaction points”: numbers under its control that affect the visual characteristics of the background. The algorithm itself is scored based on user behavior on the page; we aggregate how long the user interacts with the page, how far they scroll, whether they engage with interactive elements of the page, and most importantly whether they click on the “contact us” button, into a single score metric. Djarvis keeps its eye on this score and makes micro-adjustments to those interaction points with the intent of increasing it.
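Folding several behavioral signals into a single score might look roughly like this. To be clear, every weight, cap, and signal name below is invented for illustration; Djarvis' real scoring formula is its own.

```python
def page_score(seconds_on_page, scroll_depth, widget_clicks, clicked_contact):
    """Collapse several engagement signals into one number.

    All weights here are hypothetical, not Djarvis' real values.
    """
    score = 0.0
    score += min(seconds_on_page, 120) / 120   # cap time's contribution at 1.0
    score += scroll_depth                      # fraction of page scrolled, 0..1
    score += min(widget_clicks, 5) * 0.2       # diminishing returns on widgets
    score += 5.0 if clicked_contact else 0.0   # a conversion dominates the score
    return score

casual = page_score(60, 0.5, 2, False)     # modest engagement
converted = page_score(30, 0.9, 1, True)   # conversion outweighs everything else
```

Weighting the conversion far above the other signals keeps the agent from optimizing for, say, long dwell time at the expense of the one action the site actually exists to produce.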
Djarvis’ deep learning “brain” actually runs directly in the browser on the end user’s computer, so left at that, anything Djarvis learns would only apply to that one user. However, we’ve built in the capability for Djarvis to save its “brain” to a server every now and then. Then, when Djarvis boots up for the first time each session, that saved “brain” is imported into the new session, so each user who visits the site starts their visit with Djarvis’ “memories” from previous visitors intact. This also means that each visitor to the site actually improves Djarvis (well, that’s the theory at least). Djarvis is in the beginning stages of learning how to interact with visitors to the website, so we’re expecting future updates to improve its ability to interact with users.
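Conceptually, that save/restore cycle is just a serialization round trip, something along these lines (the payload shape here is an assumption for illustration, not Djarvis' actual wire format):

```python
import json

def export_brain(weights):
    # Serialize the network's weights so a server can store them.
    return json.dumps({"version": 1, "weights": weights})

def import_brain(payload):
    # A new session rehydrates the saved weights before learning resumes.
    return json.loads(payload)["weights"]

weights = [0.12, -0.8, 0.33]
restored = import_brain(export_brain(weights))  # restored == weights
```

The new visitor's session then resumes training from the restored weights instead of starting from scratch, which is what lets every visit compound the previous ones.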
Djarvis is an on-going labs project. To stay up-to-date on the most recent changes, you can visit the labs page for Djarvis.
We are Smashing Boxes. We don’t just build great products, we help build great companies. LET’S CHAT.