RC W6D2 - Be careful what you wish for

The next video I watched in Andrej Karpathy’s Neural Networks: Zero to Hero series is where he starts building `makemore`, a model that takes in words and ‘makes more’ words like it. In the video, the model uses human names as training data and generates words that sound like human names.

It’s a bigram character-level language model, in which a single character is used to predict the next character. For example, with the name Emma, the model would use E to predict M, M to predict M, and M to predict A. The video starts with a Markov transition-probability model as a baseline, then progresses to a 1-layer neural network model. The model architecture gets more complex in later videos.
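To make the baseline concrete for myself, here is a minimal sketch of the counting approach in plain Python. It is not the code from the video: the tiny `names` list, the `.` boundary marker, and the helper names `bigrams`, `next_char_probs` and `sample_name` are my own assumptions for illustration. The Emma example above leaves out the word boundaries; I add them here so the sampler knows where to start and stop.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy data; the video trains on a much larger list of names.
names = ["emma", "olivia", "ava", "isabella", "sophia"]

# Assumed marker for both the start and the end of a word.
BOUNDARY = "."


def bigrams(word):
    """Return the (previous, next) character pairs of a word, with boundaries."""
    chars = [BOUNDARY] + list(word) + [BOUNDARY]
    return list(zip(chars, chars[1:]))


# counts[prev][nxt] = how often nxt follows prev in the training names.
counts = defaultdict(Counter)
for name in names:
    for prev, nxt in bigrams(name):
        counts[prev][nxt] += 1


def next_char_probs(prev):
    """Normalize the counts for one row into transition probabilities."""
    total = sum(counts[prev].values())
    return {ch: n / total for ch, n in counts[prev].items()}


def sample_name(rng):
    """Walk the Markov chain from the boundary until it emits the boundary again."""
    out = []
    prev = BOUNDARY
    while True:
        probs = next_char_probs(prev)
        nxt = rng.choices(list(probs), weights=list(probs.values()))[0]
        if nxt == BOUNDARY:
            return "".join(out)
        out.append(nxt)
        prev = nxt


if __name__ == "__main__":
    rng = random.Random(0)
    print(bigrams("emma"))
    print([sample_name(rng) for _ in range(5)])
```

With only five names the samples mostly stitch together fragments of the inputs, but the mechanics are the same: count pairs, normalize each row, and sample one character at a time.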

His approach emphasizes the simple building blocks of neural network models. From the engineering perspective, it appears that much of the heavy lifting done by libraries and frameworks is getting the calculations to run fast at scale. I haven’t used TensorFlow and PyTorch at length to make a comparison, but I get the impression the latter is loved for having better UX. I admit it’s also cute seeing Karpathy do YouTube influencer poses on the video covers.

As with functional programming, it’s fun to read around the topic. I enjoyed Lex Fridman’s framing of deep learning as the extraction of useful patterns from data, and the analogy to heliocentrism vs geocentrism in forming simpler and simpler representations of ideas.

What’s interesting about the broader topic of AI is that, unlike functional programming, everyone has a say. This is understandable given the broader societal implications. According to Politico, the release of ChatGPT has pushed the EU back to the drawing board when it comes to regulation. When asked if it's dangerous to release ChatGPT to the public before we fully understand the risks, Sam Altman responded by saying it’s even more dangerous to develop in secret and release GPT-7 to the world.

This highlights how difficult it is to make predictions about the future, as Sam Altman points out in this podcast interview.

I think it’s interesting that if you ask people 10 years ago about how AI was going to have an impact, with a lot of confidence from most people, you would’ve heard, first, it’s going to come for the blue collar jobs working in the factories, truck drivers, whatever. Then it will come for the low skill white collar jobs. Then the very high skill, really high IQ white collar jobs, like a programmer or whatever. And then very last of all and maybe never, it’s going to take the creative jobs. And it’s going exactly the other direction.

Later that day I chatted with a friend who wants to find use cases for GPT but hasn’t been motivated to dig up the more boring tasks at work. I responded by saying how my compass at RC is “is this fun or does this feel like work”, and the dream is to have work be fun! The conversation continued as follows.