Have we reached peak GPT yet? It’s only been out for a month and it seems to have completely devoured the zeitgeist. OpenAI has had a ten billion dollar investment from Microsoft, Google is freaking out and minute by minute there are new examples of text-based AI weirdness emerging.
God, it’s a lot.
My own experience with Chat-GPT has, I must admit, briefly turned me into one of those annoying tech-bro types who talks only about the possibilities and none of the downsides of this technology. Here’s a basic rundown of my early experience using it and a (hopefully) slightly more clear-eyed take on what it means.
Like a lot of people who first use ChatGPT I asked it to write me a couple of haikus about various topics, and then re-write them in the style of Shakespeare, then Hunter S Thompson.
Following these “creative” examples, which remind me a lot of style transfer models in an image domain, I started using it as a “first google” replacement to solve basic programming tasks. I’ve never used Copilot (and I hear that is excellent for programming tasks), but ChatGPT (and particularly GPT4) does a great job of writing basic code (particularly python).
It does a reasonable job! About halfway through my testing with it they released GPT4 and the step up in quality was noticeable. Quality in this case means code that runs almost first time, it clearly comments the code section by section and when you ask it to follow up or explain what it’s doing it’s able to talk you through what each bit does.
It’s important to note that the output often has errors – and as I haven’t come up with the code in the first place, debugging presents a different challenge. GPT4 is much better at it, but still comes up with the occasional incorrect bit of syntax. Or, more often, once the program gets to say a hundred lines or so, it starts to forget the earlier parts of the code. I think this is to do with the context window of tokens filling up although I might be wrong about that. This makes debugging weird as all the code looks correct, it will just output variable names that haven’t been used before, or refer to functions that it hasn’t actually coded.
Ultimately, this is just a Large Language Model built to predict the next character in a sequence of characters. It’s easy to get trapped into the hype, and the language used by OpenAI and AI advocates in general (and while I don’t consider myself a wide-eyed optimist / salesperson for it, I do definitely see positive use cases for the technology) around ChatGPT is problematic and leans into the talismanic idea of AI as actually some kind of intelligence. Words like “Hallucinate”, “Remember”, “Forget”, “Explain”, “Show” point towards a theory of mind which needs to be carefully watched out for. It’s very useful for OpenAI and the other main actors in this space to diminish responsibility for the output by using words like “hallucinate” (while also drawing on the language of mental illness!) when these companies will take all the money and try to shape regulation in their favour whenever there’s a positive use case discovered for the tech.
That being said – it is definitely a new experience to be able to ask a language model to act as a socratic questioner and get it to pose questions to you about whatever topic you like, and follow up on the questions. I’ve used it to think out loud on my dissertation topic, and it “feels”, in a completely subjective aesthetic way, good to interact with. It’s clear that OpenAI has a solid “human in the loop” training process for the network as the cadence of the output feels like a conversation. Even as I describe this qualitatively, it feels like I’m falling back into the trap of anthropomorphising the model. I think this is a natural response to the nature of neural networks in general, and something I’ve always found using them. When they glitch out and error, the errors take an aesthetic form that feels evocatively organic. ChatGPT is a model designed to emulate the text-based interactions taking the form of question and answer. No wonder if you play by these rules it functions as what NK Hayles might call a “cognitive assemblage”.
On the nature of errors, I think some more interesting stuff is happening with the under-cooked products that Microsoft and Google are building and beta-testing (almost simultaneously) to try to catch up with what OpenAI is doing. Bing, in particular, is exhibiting fantastic examples of the weird aesthetic of machine-learning errors. Bing has been trained on huge dataset, including a lot of Reddit, and it seems that emojis have been included in the token (alphabet) database. When a machine learning model goes wrong, it can get trapped in stochastic loops – and because the pattern is based on language, those recursive spirals can look like, well..
Anyway there’s so much to digest with this technology, and it’s moving so fast that it’s making the dissertation really difficult to write as the dimensions of the technology keep changing. I’m sure I’ll rant more about this soon.