
Building a Neural Network from Scratch

Date: 7/30/2019

Tools: Python & Tkinter

To learn Python, I decided to code a board game called “Camel Up”. The game consists of five camels racing around a board; every round, players bet on which camel they believe will be in the lead at the end of the leg. Using Tkinter, I designed a GUI to help visualize the game. When I later decided to design a neural network, I revisited this code to build an AI opponent around one.


Building the Network

Viewing the game state at a single point in time, I started selecting pieces of it as inputs for the network. Tracking each of the five camels’ board positions gave me five inputs. During each leg of the race, each camel’s die is rolled exactly once; adding this roll status brought the total to ten (five camel positions + five roll statuses). Any time a camel lands on a space already occupied by another camel, it is placed on top, and if the camel underneath moves, it carries the camels above it along. Tracking the number of camels above each camel added five more inputs, bringing the total to fifteen. Each camel can receive three bets per leg; tracking these brought me to twenty inputs. Each player can also bet on the overall winner and overall loser of the entire race, making twenty-four. Tracking the two players’ scores increased the inputs to twenty-six. Finally, one input indicated whether it was player 1’s or player 2’s turn, for a total of twenty-seven.
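The twenty-seven inputs described above can be sketched as a single flat vector. This is a hypothetical encoding (the function and argument names are illustrative, not the original code):

```python
import numpy as np

def build_inputs(spaces, rolled, stacked_above, leg_bets,
                 overall_bets, scores, is_player2_turn):
    """Assemble the 27 inputs described above (illustrative encoding).

    spaces          - 5 board positions, one per camel
    rolled          - 5 flags: has each camel's die been rolled this leg?
    stacked_above   - 5 counts of camels stacked above each camel
    leg_bets        - 5 counts of bets placed on each camel this leg (max 3)
    overall_bets    - 4 flags: each player's overall-winner/loser bets
    scores          - 2 player scores
    is_player2_turn - 1 flag for whose turn it is
    """
    x = np.concatenate([spaces, rolled, stacked_above, leg_bets,
                        overall_bets, scores, [is_player2_turn]]).astype(float)
    assert x.shape == (27,)  # 5 + 5 + 5 + 5 + 4 + 2 + 1
    return x
```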


Layers and Outputs

A player can choose to roll, to bet on any of the five camels as the overall winner, to bet on any of the five camels as the overall loser, or to bet on any of the five camels to win the current leg. This means the final output layer would yield sixteen options. To keep the program from taking too long, I decided on three hidden layers of twenty neurons each, with a sigmoid function as the activation for each layer.
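A minimal sketch of that 27 → 20 → 20 → 20 → 16 architecture in plain NumPy, with zero-initialized parameters just for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Forward pass applying a sigmoid activation at every layer."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a

# Layer sizes from the article: 27 inputs, three hidden layers of 20, 16 outputs.
sizes = [27, 20, 20, 20, 16]
weights = [np.zeros((n_out, n_in)) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

# With all-zero parameters, every layer outputs sigmoid(0) = 0.5.
out = forward(np.zeros(27), weights, biases)
```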


Initialize Weights and Biases

I created a new program in Python to generate random values for the weights and biases, anywhere between -10 and 10.


Writing the new code

After some time coding, I had a simple set of functions to handle backpropagation and training. With the code in place, I began playing games against myself. If player2 won the game, I treated player2’s choices as the correct decisions and backpropagated (and vice versa for player1). After five games, I would take the winning player’s decisions and train the network.


Initial Results

After playing over thirty games, I was seeing little to no improvement. The cost coming out of the program was not decreasing and the decisions made by the AI were poor.


Changes

My first concern was that I was too inconsistent in my own choices: was I always making the same decision in the same scenario? While I did want some stochasticity, I felt my choices were too irregular. I wrote a new program that took a Monte Carlo-esque approach. It would simulate the leg of the race five times and record the lead camel each time, then divide the number of times each camel won by five to get a rough probability of that camel winning. Finally, it would multiply that probability by whether a wager could still be made on that camel. This solved two problems: it provided more consistent answers (that were also more accurate), and it sped up the time needed to play a game.
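The estimate above can be sketched as follows. Here `simulate_leg` is a hypothetical stand-in for the game's leg simulator, which plays out the remaining dice rolls and returns the leading camel:

```python
def leg_win_probability(simulate_leg, game_state, camel, trials=5):
    """Estimate the chance that `camel` leads at the end of the leg by
    simulating the leg `trials` times (5 in the first version)."""
    wins = sum(simulate_leg(game_state) == camel for _ in range(trials))
    return wins / trials

def bet_value(simulate_leg, game_state, camel, bet_available):
    """Multiply the rough win probability by whether a wager can still
    be made on this camel (1 if available, 0 if not)."""
    return leg_win_probability(simulate_leg, game_state, camel) * bet_available
```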


Second Results

After playing another thirty games, I still saw no improvement.


Second Round of Changes

I now had the program feed its answers back into itself. Once the program started, it no longer required input to play an entire game. I set the program to play itself for 1,000 games, training after every five. I also removed the step of only taking the winner’s inputs, since both players would be making almost identical moves given the opportunity. I now started to see some improvement: with the randomized weights and biases, the AI started with an average cost of 6 over the first five games, and by the end the cost was down to the low 2s.


Another Set of Results

I became stuck here for some time. I optimized the code (switching to NumPy arrays and matrices), which made it easier to adjust values, but the network could only achieve a cost as low as 0.75.


Analyzing Issues

After a few weeks of troubleshooting, I found that the output for ‘roll’ was ending up at 0.5 while all other outputs went to zero. Since ‘roll’ was the most common correct choice, all the other values would drift toward zero. During backpropagation, the values close to zero had only a marginal effect on the previous layers’ weights and biases. The input values were also having issues: a change in a camel’s “roll status” (from 1 to 0) had almost no effect on the final output layer, despite being a large change in terms of the game.
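The marginal effect comes directly from the sigmoid derivative, σ′(z) = σ(z)(1 − σ(z)), which collapses toward zero once an activation has drifted near 0 or 1. A quick illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    # Derivative of the sigmoid: s * (1 - s).
    s = sigmoid(z)
    return s * (1.0 - s)

# At z = 0 (activation 0.5) the gradient is at its maximum, 0.25.
# For an activation stuck near 0 (large negative z), the gradient is
# tiny, so backpropagated updates to earlier layers are marginal.
at_center = sigmoid_grad(0.0)   # 0.25
near_zero = sigmoid_grad(-8.0)  # roughly 0.0003
```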


Solutions

Over the course of a game of Camel Up, each player can select only one overall winner and one overall loser, while ‘roll’ must be chosen around fifteen times for the game to end. I wrote a new function to run thousands of games, saving off the inputs and the correct answers without training the network.

To solve the second issue, I normalized the generated inputs so that all my values fell between 0 and 1. I also used ReLU weight initialization to keep my weights and biases from getting stuck at zero or one.
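A simple sketch of that normalization step, assuming the saved inputs are stacked into a `(samples, 27)` NumPy array and each column is min-max scaled into [0, 1]:

```python
import numpy as np

def normalize(inputs):
    """Min-max scale each input column into [0, 1].

    Constant columns (max == min) are mapped to 0 to avoid dividing
    by zero.
    """
    lo = inputs.min(axis=0)
    hi = inputs.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (inputs - lo) / span
```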

Using the saved-off inputs, I chose randomly among the sixteen options when selecting training examples. This let the network train on a diverse set of examples and prevented it from overtraining on the most common choices in a game.
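One way to sketch that balanced sampling: first pick one of the answer classes at random, then pick an example of that class, so ‘roll’ no longer dominates the batch. The function and example structure here are illustrative, not the original code:

```python
import random
from collections import defaultdict

def balanced_batch(examples, batch_size):
    """Draw training examples evenly across the answer classes.

    `examples` is a list of (inputs, answer) pairs; each draw first
    picks an answer class at random, then an example of that class.
    """
    by_answer = defaultdict(list)
    for inputs, answer in examples:
        by_answer[answer].append((inputs, answer))
    answers = list(by_answer)
    return [random.choice(by_answer[random.choice(answers)])
            for _ in range(batch_size)]
```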


Results after Overhaul

After running 10,000 training sets, I saw a drastic improvement. The neural network was working!


Fine Tuning

To increase the accuracy, I expanded the number of neurons in each hidden layer to 90. To further improve accuracy, I generated a new set of inputs, this time running the Monte Carlo simulation 25 times per turn. While this did prolong the program’s run time, I only needed to do it once since the inputs were being saved.

When analyzing the results, I found that the network was using a poor approach to betting on the overall winner and overall loser. When generating inputs, the program chooses to bet on the overall winner when the lead camel is 4 spaces ahead of the next camel. When it became the next player’s turn, the lead camel would still be 4 spaces ahead, so that player would also bet on the overall winner. The neural network therefore started looking at whether the other player had bet on the overall winner while it had not, and if so, it would bet on the overall winner itself. This created a scenario where, if I bet on the overall winner or overall loser, the network would as well. I removed the inputs showing whether the other player had wagered on the overall winner or overall loser, and this resolved the issue.


Game Results

I played the AI in ten games, with the AI winning five, losing four, and tying one.


Further Improvements

Since I reused my previous Camel Up code, a lot of it needed to be fixed: I had over seventy global variables and a confusing structure. I managed to cut most of the unnecessary global variables, but some of the code could still be improved.

A softmax function may be better suited for the final layer. Unlike the sigmoid, which pushes each output toward a binary decision, the softmax returns a probability distribution. The neural network confidently returns bets on the correct overall winner, bets on the correct overall loser, and the choice to roll, but it is not always confident when returning a bet on a camel to win the leg. The network consistently has the right camel as the highest output in the final layer, yet that output is usually below 0.9. Intuitively, this makes sense: the inputs used to train the network are based on a probability, and do not always return the same value for the same scenario.
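For reference, a numerically stable softmax over the sixteen final-layer outputs would look like this sketch:

```python
import numpy as np

def softmax(z):
    """Softmax over the output layer: results are positive and sum
    to 1, so they can be read as probabilities over the actions.
    Subtracting the max first avoids overflow in exp."""
    e = np.exp(z - z.max())
    return e / e.sum()
```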


Link to the code on GitHub

[Image: CamelCupinAction_edited.png — Tkinter GUI screenshot]
