Yes you GAN (part 2)

Preparing images for training

For our artefact we used paired images, creating a conditional GAN (cGAN). "Conditional" means we supply extra conditions (labels, contours, anything) that help the GAN learn what's important.

Images must be 256×256 pixels, so when paired side by side with the condition (label) image you end up with a single 512×256 A + B image. The A image corresponds to the B image but differs in some way that tells the network something about it.
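
Assembling a pair is just pasting the two images side by side into one 512×256 canvas. Here's a minimal sketch of the idea using Pillow (the function name is our own; Hesse's repo also includes its own helper scripts for combining images, which are worth using instead):

```python
from PIL import Image

def make_pair(a_path, b_path, out_path):
    """Paste the condition image (A) and the photo (B) side by side
    into a single 512x256 training image."""
    a = Image.open(a_path).convert("RGB").resize((256, 256))
    b = Image.open(b_path).convert("RGB").resize((256, 256))
    pair = Image.new("RGB", (512, 256))
    pair.paste(a, (0, 0))      # left half: condition / label image
    pair.paste(b, (256, 0))    # right half: original photo
    pair.save(out_path)
```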

We used two image pair sets. In the first, the A image used label colours (windows one colour, doors another, windowsills another, etc.) alongside the photo those labels were traced from.

The second pair set was contour-based: the A image had an edge-detection effect applied and the B image was the original photo.
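
Deriving a contour-based A image from each photo can be automated. A quick Pillow approximation is below; the exact edge-detection effect we used may differ (a Canny filter or an image-editor effect would work equally well), so treat this as one way to produce the condition image:

```python
from PIL import Image, ImageFilter

def edge_condition(photo_path, out_path):
    """Derive a contour (A) condition image from the original (B) photo
    using a simple edge-detection filter."""
    photo = Image.open(photo_path).convert("L")       # greyscale first
    edges = photo.filter(ImageFilter.FIND_EDGES)      # basic edge detection
    edges.convert("RGB").save(out_path)               # back to RGB for pairing
```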

Training image pairs – (R) Coloured architectural elements and (L) Outlines

Once you’ve assembled the paired images, you put them in the appropriate folders (again, following the step-by-step guide on Hesse’s GitHub).
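
Part of that setup is splitting the combined A + B images into training and validation folders. A sketch of that step is below; the exact folder names (`train/`, `val/`) are assumptions here, so check them against the layout Hesse's guide asks for:

```python
import os
import random
import shutil

def split_pairs(pair_dir, out_dir, val_fraction=0.1, seed=0):
    """Shuffle the combined A+B images into train/ and val/ subfolders.
    Folder names are hypothetical; match them to the guide you follow."""
    files = sorted(f for f in os.listdir(pair_dir) if f.endswith(".png"))
    random.Random(seed).shuffle(files)                 # reproducible shuffle
    n_val = max(1, int(len(files) * val_fraction))     # hold out a small set
    for split, names in (("val", files[:n_val]), ("train", files[n_val:])):
        os.makedirs(os.path.join(out_dir, split), exist_ok=True)
        for name in names:
            shutil.copy(os.path.join(pair_dir, name),
                        os.path.join(out_dir, split, name))
```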

Here is where it helps to have a good GPU, or several if possible; it makes training a genuinely engaging process. You can rapidly prototype image pairs and see how the GAN responds: you might be giving it too little information, or too much. It's fascinating to watch the results start to pour out; it's like watching someone learn how to see in real time.

Our cGAN training results using a labelled (with colour codes) conditional image.

After around 50 training epochs you'll be able to judge whether your GAN is giving interesting results. For our experiments, we found the labelled set worked much better than the edge-detection set. This is probably down to how few distinct images we had: ideally you'd train a GAN on at least thousands of unique images, whereas we only had around 10–20 photos that we took crops from.
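
Taking multiple crops from each photo was our way of stretching a small dataset. A sketch of that idea (a hypothetical helper, not the exact script we used):

```python
import random
from PIL import Image

def random_crops(photo_path, n=8, size=256, seed=0):
    """Cut n random size x size crops from one large photo,
    turning a handful of images into many training samples."""
    rng = random.Random(seed)
    img = Image.open(photo_path)
    w, h = img.size
    crops = []
    for _ in range(n):
        x = rng.randint(0, w - size)   # random top-left corner
        y = rng.randint(0, h - size)
        crops.append(img.crop((x, y, x + size, y + size)))
    return crops
```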

You can see the results of our GAN training here.

Our cGAN training results using an edge detect conditional image.

Once you’ve trained a GAN you can test it with new images, and this is where you can do some interesting things. We created a new set of templates based on high-rise buildings, and had the model trained on RHG recreate these.
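
Those templates are just fresh label images, so they can even be drawn programmatically. The sketch below generates a hypothetical high-rise label image; everything about it (grid layout, the specific colour values) is a placeholder, and the colours would need to match the palette used in your training labels:

```python
from PIL import Image, ImageDraw

def building_template(floors=12, cols=4,
                      win_colour=(200, 60, 60),     # placeholder window code
                      lit_colour=(240, 220, 60)):   # placeholder "lit window"
    """Draw a hypothetical high-rise label image: a grid of coloured
    window rectangles, with one 'lit' window in a different colour."""
    img = Image.new("RGB", (256, 256), (90, 90, 90))  # facade colour
    d = ImageDraw.Draw(img)
    for r in range(floors):
        for c in range(cols):
            x, y = 20 + c * 56, 10 + r * 20
            colour = lit_colour if (r, c) == (3, 1) else win_colour
            d.rectangle([x, y, x + 40, y + 12], fill=colour)
    return img
```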

It's striking how strongly the GAN picked up on the bright windows from the night photos. These were labelled yellow, and whenever one appeared in a new label image the GAN would instantly decide the whole image was at night and change everything to fit the lit window in.

Once trained, we can show the model new labels and it will generate what it thinks the “original frame” would have looked like, based on its “memory” of the training data.

Once you’ve got some models trained you can export them as “static” models, which can then be used in interactive applications. I’m still learning how to do this myself, but for more information I’d recommend starting here.