Undergraduate Thesis Deep Learning Architectural Style Transfer

NOTE: This is my undergraduate thesis project supervised by Dr. Rynson LAU.

Source code on GitHub

Full report for download

(To be honest, all kinds of diffusion models currently on trend have made this project a waste of time lol)

The examples by various visual artists on the linked website perfectly illustrate what 'transferring architectural styles' means, which is what I would like to automate with deep learning techniques.

An example from the link above, showing a Bauhaus Style (Right) transferred onto Buckingham Palace (Left).

Current progress:

The project is currently focusing on defining what "transferring" means for architectural styles, which is a key challenge in this field. The definition of style transfer in the context of architecture is not straightforward, as it involves both the visual appearance and the structural characteristics of buildings.

Tests on existing models including Artflow and Pix2pix have been conducted to evaluate their suitability for this task. The results show that popular style transfer models may not be directly applicable to architectural style transfer, as they tend to focus on color and texture rather than structural features.

This example by Artflow + WCT shows that popular style transfer models may not be suitable for this task, as only the color and a faint trace of texture from the style image (left) are transferred (bottom right) onto the content image (top right).
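The WCT mentioned above is the whitening-coloring transform: content features are first whitened to strip their own second-order statistics, then re-colored with the style features' mean and covariance. Because it only matches channel statistics, it explains why mostly color (and little structure) carries over. Below is a minimal NumPy sketch of the transform on flattened feature maps; the shapes, the `eps` regularizer, and the toy data are my own illustrative choices, not ArtFlow's actual code.

```python
import numpy as np

def wct(content_feat, style_feat, eps=1e-5):
    """Whitening-Coloring Transform on flattened feature maps.

    content_feat, style_feat: arrays of shape (C, H*W).
    Returns content features re-colored with the style's statistics.
    """
    # Center the content features and whiten them via eigendecomposition
    fc = content_feat - content_feat.mean(axis=1, keepdims=True)
    cov_c = fc @ fc.T / (fc.shape[1] - 1) + eps * np.eye(fc.shape[0])
    wc, vc = np.linalg.eigh(cov_c)
    whitened = vc @ np.diag(wc ** -0.5) @ vc.T @ fc

    # Color the whitened features with the style mean and covariance
    ms = style_feat.mean(axis=1, keepdims=True)
    fs = style_feat - ms
    cov_s = fs @ fs.T / (fs.shape[1] - 1) + eps * np.eye(fs.shape[0])
    ws, vs = np.linalg.eigh(cov_s)
    colored = vs @ np.diag(ws ** 0.5) @ vs.T @ whitened
    return colored + ms

# Toy check: after WCT, the output carries the style's channel means
rng = np.random.default_rng(0)
c = rng.normal(size=(4, 100))
s = rng.normal(loc=2.0, scale=3.0, size=(4, 100))
out = wct(c, s)
print(np.allclose(out.mean(axis=1), s.mean(axis=1), atol=1e-6))
```

The key limitation for architecture is visible in the math: the transform is a per-channel linear remap, so windows, cornices, and other structural elements of the content image are preserved almost unchanged.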
This example by Pix2pix and CMP facade labeling (top right) creates unrecognizable images (bottom right) because of the large difference in window appearance between the sample buildings in the database (left) and the input house (top right, the same house as before).
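For context, pix2pix's reference implementations train on paired images stored side by side as a single {A|B} image; for the CMP facade experiments, that pair is a label map next to its photo. A tiny sketch of building such a pair (array sizes are illustrative stand-ins, not the real dataset dimensions):

```python
import numpy as np

def make_pix2pix_pair(label_img, photo_img):
    """Concatenate a facade label map and its photo side by side,
    the paired {A|B} layout that pix2pix training scripts expect."""
    assert label_img.shape == photo_img.shape, "pair must be pixel-aligned"
    return np.concatenate([label_img, photo_img], axis=1)

# Toy 8x8 RGB arrays standing in for a CMP label map and a facade photo
label = np.zeros((8, 8, 3), dtype=np.uint8)
photo = np.full((8, 8, 3), 255, dtype=np.uint8)
pair = make_pix2pix_pair(label, photo)
print(pair.shape)  # (8, 16, 3)
```

Because the model only ever sees the label map at test time, any facade whose windows look unlike the database examples gets hallucinated from the training distribution, which is consistent with the unrecognizable output above.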
This example video, from Pix2pix trained on 450 depth images generated by DiverseDepth, is currently the best result; its smoothness is already impressive. However, there are still significant glitches and unusual movements.

Enlarging the training set is the next step toward improving the quality of the generated images and reducing the glitches and unusual movements in the current results.
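One cheap way to grow the 450-image set without extra DiverseDepth runs is paired augmentation: every flip or crop must be applied identically to the photo and its depth map so the pix2pix pairs stay aligned. A minimal sketch, where the 90% crop ratio and toy array sizes are arbitrary illustrative choices:

```python
import numpy as np

def augment_pair(photo, depth, rng):
    """Apply the same random flip and crop to a photo and its
    depth map, keeping the training pair pixel-aligned."""
    if rng.random() < 0.5:                 # random horizontal flip
        photo, depth = photo[:, ::-1], depth[:, ::-1]
    h, w = depth.shape[:2]
    ch, cw = int(h * 0.9), int(w * 0.9)    # random 90% crop
    y = rng.integers(0, h - ch + 1)
    x = rng.integers(0, w - cw + 1)
    return photo[y:y + ch, x:x + cw], depth[y:y + ch, x:x + cw]

rng = np.random.default_rng(1)
photo = rng.integers(0, 256, size=(30, 30, 3), dtype=np.uint8)
depth = rng.random((30, 30)).astype(np.float32)
p2, d2 = augment_pair(photo, depth, rng)
print(p2.shape, d2.shape)  # (27, 27, 3) (27, 27)
```

Each source image can yield many distinct crops and flips, so even simple transforms like these multiply the effective size of a 450-image set severalfold.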

Sample 1
Sample 2
Sample 3
Sample 4