Yes you GAN (part 1)
This post is more technical than most on the blog. If you’re interested in learning more about how to make images or videos with Machine Learning, then you’re in the right spot.
One of the main reasons I’m studying the MA in Computational Arts at Goldsmiths is that it was one of the few universities to offer specialty classes in machine learning and computer vision. More specifically, the artist Memo Akten is a resident there, and I’ve followed his work for years, watching his transition from motion graphics to computational art with keen interest.
His recent work has focused on GANs (Generative Adversarial Networks), a class of neural network in which two networks are trained against each other: one generates images while the other judges them. Ultimately, it’s about teaching computers to see and understand in something like the way humans do.
For our first research assignment, Eddie Wong, James Quinn and I worked together to explore an avenue of research and create an artefact as part of the process. The artefact could take the form of a research document, a documentary of sorts, or some manifestation of the theoretical practice being investigated.
Our group was interested in Hauntology, a cultural theory coined by Jacques Derrida in the 1990s and later brought to prominence by theorists including the former Goldsmiths academic Mark Fisher. ”Hauntology is here not only concerned with the past’s manifestation and presentation in the present, but in how the present will become the past in the future.” (Hauntologies: The Ghost, Voice and the Gallery by Claire M. Holdsworth)
We selected a partly demolished social housing estate, Robin Hood Gardens in London, to explore. Using archival photographs and residents’ accounts, we reconstructed speculative visions of the estate with generative adversarial networks. Paired with these were residents’ testimonial perspectives, recorded by narrators. The final form of the artefact was a deconstructed documentary.
Our group’s abstract and final artefact can be viewed here; in this post, however, I’d like to go a bit deeper into the technical side. Hopefully this will enable a reader who has a similar passion, but lacks the technical skills I’ve picked up along this (short!) journey, to start playing with GAN-based image making.
Firstly, I’ve tried this on both OSX and Windows and the method works on both. It is greatly accelerated if you have access to an Nvidia graphics card, which allows you to use CUDA. Unfortunately CUDA doesn’t work with AMD cards, so anyone with a recent MacBook Pro doesn’t get the speed boost, even with an eGPU. Also, please note I’m writing this in late 2018. There’s a very good chance the tech has evolved dramatically by the time you’re reading this, and you can probably just download a phone app called “Alexa Pics” that does all of this automatically. Hopefully not – in any case, keep reading!
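If you’re not sure whether your machine has a CUDA-capable card, the Nvidia driver ships with a small command line utility you can run from a terminal. This assumes the driver is already installed; if the command isn’t found, you almost certainly can’t take the GPU route:

    # Lists any Nvidia GPUs, along with the driver version and current memory usage
    nvidia-smi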
This process is all based on TensorFlow, a machine learning library for Python created by Google. There are other machine learning libraries, but this one is good, fast, and it’s all I know how to use right now, so let’s go with it. Christopher Hesse has ported pix2pix to TensorFlow (pix2pix-tensorflow) and also provided handy tools to export the trained models and reuse them in live examples.
There is a TensorFlow addon for openFrameworks on Linux and OSX, helpfully created by Memo. Unfortunately I haven’t yet become proficient enough at OF to get it working, so instead I went the basic route and used Python to run pix2pix.
The method I used to get this working was:
– Install Anaconda with Python 3.6 on your computer, on either OSX or Windows. Anaconda is a great Python distribution that wraps everything up in a friendly bundle so you don’t need to be a ninja to install all the pieces separately.
– After it has installed, run Anaconda Navigator and create a new Environment. Give it a name, say “tensorflow” or similar, and install Python 3.6 in the environment. This will take a while as it processes all the included libraries; let it do its thing.
– Once this is complete, you’ll see another environment listed in Anaconda Navigator, in the column below the default one.
– Highlight the “tensorflow” environment name; it will take a while to become active.
– Right click on the name “tensorflow” and open a new terminal. This opens a contained environment with all the correct environment variables set up, so Python 3.6 will work correctly and you can use the terminal command “conda install ..” (where the .. is a library name). It’s important to understand that if you just open a new terminal (OSX) or command prompt (Windows) in the usual fashion, it won’t be a conda environment, so any Python bits and pieces you install there may not be recognised and things won’t work. You can tell you’re in the right environment if your prompt shows “(tensorflow)” next to it. (If you prefer the command line to the GUI, see the sketch after this list.)
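As promised, here’s roughly what the same setup looks like done entirely from the command line. This is a sketch, not gospel: “tensorflow” is just the environment name we chose above, and older conda versions use “source activate tensorflow” (OSX) or “activate tensorflow” (Windows) instead of “conda activate”:

    # Create a fresh environment pinned to Python 3.6
    conda create -n tensorflow python=3.6

    # Step into the environment; your prompt should now show "(tensorflow)"
    conda activate tensorflow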
Once you’re there, type “conda install tensorflow” to get Anaconda to install the TensorFlow libraries. If you have a Windows machine with a decent Nvidia GPU (a 1050 or above; a 20xx card for best results), type “conda install tensorflow-gpu” instead. It’s very satisfying to watch the command line install bars tick over.
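To confirm the install actually worked, you can ask Python to import TensorFlow and print its version, and on the GPU build, check whether it can see your card. This is just a sanity check; tf.test.is_gpu_available() is the call in the 1.x releases current as I write this:

    # Run these from the (tensorflow) conda prompt
    python -c "import tensorflow as tf; print(tf.__version__)"

    # GPU build only: prints True if TensorFlow can reach a usable CUDA device
    python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"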
Now that we’ve got the environment set up and TensorFlow installed, we’ll use Christopher Hesse’s implementation of pix2pix. Find it here and follow his install instructions. (For more detail on this process I recommend watching Gene Kogan’s lecture, where he steps through it in full. This blog post is getting very long and I am not a very technically savvy instructor!)
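To give a flavour of what those instructions involve, the basic shape is: clone the repository, download (or later, build your own) paired dataset, and start training. The commands below are adapted from the pix2pix-tensorflow README as it stood in late 2018, so treat them as a sketch and defer to the repo if anything has changed:

    # Grab Christopher Hesse's TensorFlow port of pix2pix
    git clone https://github.com/affinelayer/pix2pix-tensorflow.git
    cd pix2pix-tensorflow

    # Download one of the example paired datasets (building facades)
    python tools/download-dataset.py facades

    # Train; BtoA means "map the label images to the photos"
    python pix2pix.py --mode train --output_dir facades_train --max_epochs 200 --input_dir facades/train --which_direction BtoA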
In the next GAN post we’ll prep the images and get training.