I recently spent some time going through “Neural Networks from Scratch,” where a guy goes over how to build an AI without any of the additional tools usually at your disposal. I’ve learned a lot about what building neural networks looks like. Learning about it has in some sense led to more fascination, while at the same time demystifying it to the point where I can understand why some people are totally unimpressed with what AI can do.
Rather than go over the minutiae of it, I think it’s more informative to reflect on the general vibe of what I’m seeing.
Essentially, what’s happening is that the computer is creating a group of matrices to multiply together in clever ways that do something useful. That is my assessment of what AI is. Once you get past how backpropagation works, and once you get past how matrix multiplication works, the rest of it is just finding ways of massaging data to fit your needs. I think it’s, in its own way, sort of fascinating how such a simple process can lead to such substantial results.
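To make the "matrices multiplied together" idea concrete, here is a minimal sketch of a forward pass through a tiny two-layer network in plain Python. The weights, the input, and the function names (`matmul`, `relu`, `forward`) are made up for illustration and are not taken from the book. In a real network, the numbers in `w1` and `w2` would be found by backpropagation rather than typed in by hand.

```python
def matmul(a, b):
    """Multiply matrix a (n x m) by matrix b (m x p); both are lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def relu(m):
    """Replace every negative entry with zero (a common 'massaging' step)."""
    return [[max(0.0, x) for x in row] for row in m]

def forward(x, w1, w2):
    """One pass through the network: multiply, clip negatives, multiply again."""
    hidden = relu(matmul(x, w1))
    return matmul(hidden, w2)

# Hypothetical numbers: one input row with two features.
x = [[1.0, 2.0]]
w1 = [[0.5, -1.0],
      [0.25, 0.25]]   # first layer: 2 inputs -> 2 hidden values
w2 = [[1.0],
      [2.0]]          # second layer: 2 hidden values -> 1 output

output = forward(x, w1, w2)
print(output)  # [[1.0]]
```

The first multiplication gives `[1.0, -0.5]`, the clipping step turns that into `[1.0, 0.0]`, and the second multiplication produces the single output `1.0`. Everything else in a larger network is, roughly speaking, more of the same with bigger matrices.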
Having said that, “massaging the data” is a process that takes years of research to figure out how to do right. In reading the book, I learned about some of the findings from that research. What I mean by massaging is that you take a relatively basic mathematical operation and layer copies of it over each other in a way that still lets the parameters be set properly. But I always believe that thinking about this in terms of principles is going to take you further than, like I said, getting into the minutiae.
Let’s take the simplest setup I can think of. You are a being sitting on a square, with one square next to it that contains food. Your goal is to get the food. You essentially have one choice: move in the only direction available, towards the food. That’s a pretty straightforward proposition.
But let’s imagine for a second that you have no concept of what you’re doing. All you know is that getting food is good, and what you’re able to do. Let’s say you have the option to either stay in one place or move in any direction available, and you choose one of those options randomly. You might get lucky and decide to move, thereby receiving the reward, but that cannot be assumed. And in this universe, you have no concept that there’s something next to you. You can’t see, hear, or feel anything around you. The only perception you have is that you’ve made it to the food, and that that is good.
Now, let’s complicate it a bit. You are a dot on a 3-dot-wide line, and your goal is to find the electronic food located somewhere else on the line in as few moves as possible. Your only options are to move left or right, so in this example you essentially have 2 choices. Almost. If we include staying still as an option, and if we include moving up or down as options, then you have 5 on your first move. Remember that if you make a “move,” you have no concept that the move you made did anything. So if you’re in the center of a 1×3 grid and you decide to move up, you will have no concept that you haven’t moved. All you can do is make a decision at random for what to do, count the number of moves you make before you reach the food, then change how you move next time to accommodate that.
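Here is a rough sketch of that 1×3 example in Python. A few things in it are my own assumptions for the sake of having something runnable: the dot starts in the center, the food sits on the right end, and the "change how you move next time" step is a simple weighting rule where the most recent moves before reaching the food get the most credit (`decay` controls how fast the credit falls off). That rule is just one way to do it, not the method from the book, and the names `run_episode` and `learn` are made up.

```python
import random

ACTIONS = ["left", "right", "up", "down", "stay"]
START, FOOD, WIDTH = 1, 2, 3  # assumed layout: start in the center, food on the right

def run_episode(weights, rng):
    """Pick weighted-random actions until the food is reached; return the actions taken."""
    position, taken = START, []
    while position != FOOD:
        action = rng.choices(ACTIONS, weights=[weights[a] for a in ACTIONS])[0]
        taken.append(action)
        if action == "left":
            position = max(0, position - 1)
        elif action == "right":
            position = min(WIDTH - 1, position + 1)
        # "up", "down" and "stay" change nothing, and the dot cannot tell
    return taken

def learn(episodes=200, decay=0.5, seed=0):
    """Repeat the task, nudging action weights toward whatever preceded the food."""
    rng = random.Random(seed)
    weights = {a: 1.0 for a in ACTIONS}  # start with no preference at all
    history = []                         # number of moves used in each episode
    for _ in range(episodes):
        taken = run_episode(weights, rng)
        history.append(len(taken))
        credit = 1.0
        for action in reversed(taken):   # the last move gets the most credit
            weights[action] += credit
            credit *= decay
    return weights, history

weights, history = learn()
print(max(weights, key=weights.get))  # right
```

The dot never learns where the food is or what "right" means. It only ends up preferring "right" because that is always the last thing it did before the reward showed up, so that action collects the most credit. The `history` list holds the move count for each attempt, which is the number the dot is trying to shrink.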
I’ve been going with the most rudimentary examples I can think of for the sake of this exercise, but you could keep building up until you get to something that resembles intelligence. You could move this example up to a 10×10 grid, or change the grid each time, or change the location of the food based on some unknown rule, or based on no rule at all, simply doing it randomly. And then you could have it make decisions based not on what movements to make, but on how to decide which movements to make. Let’s say we modify the rules so that you can see the squares around you and you can remember what moves you made in the past. Or perhaps you’re given a set of hints as to where the food is. Once the number of parameters and the complexity of the hints reach a certain level, you start getting into the area of ChatGPT.
But the fascinating thing about this for me is that while these programs are able to do these almost magical things, they do so on such simple principles taken to a certain extreme. There’s a conversation that could be had about how this idea, that simple principles can be scaled to great heights, applies to other fields. The basic principles of capitalism, for example, are mostly very simple.