THE FINDINGS

BACKGROUND

Text-to-image AI generators have rapidly grown in popularity over recent years. Platforms such as Midjourney use deep-learning models that can generate original images from simple text descriptions known as prompts. This has sparked controversy in the art world, because tools that make image-making this effortless could threaten the livelihoods of artists.

Midjourney is an independent research lab that produces a proprietary artificial intelligence program that creates images from textual descriptions, similar to OpenAI's DALL-E and the open-source Stable Diffusion. The tool is currently in open beta, which it entered on July 12, 2022. It is accessible only through a Discord bot: on the official Midjourney Discord server, by direct-messaging the bot, or by inviting the bot to a third-party server. Founder David Holz told The Register that artists currently use Midjourney for rapid prototyping of artistic concepts to show to clients before starting work themselves.

FINDINGS AND INSIGHTS

The flag has a significant impact on how AI interprets a particular country

We expected Singapore to be portrayed as a green city, but the majority of the images we generated had large areas of red. Singapore’s flag is half red and half white, and red is the more prominent colour of the two.
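
One way to put a number on "large portions of red" is to measure the share of red-dominant pixels in each generated image. The following is a minimal sketch, not the method we used: the function names, the `margin` threshold of 40 and the toy pixel list are all assumptions made for illustration. With the Pillow library, real pixels could be read with `Image.open(path).convert("RGB").getdata()`.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255


def is_red_dominant(pixel: Pixel, margin: int = 40) -> bool:
    """Return True if the red channel exceeds both other channels by `margin`.

    The margin of 40 is an arbitrary, assumed threshold; it keeps white and
    grey pixels (where all channels are similar) from counting as red.
    """
    r, g, b = pixel
    return r - max(g, b) >= margin


def red_fraction(pixels: Iterable[Pixel], margin: int = 40) -> float:
    """Return the fraction of pixels that are red-dominant (0.0 if empty)."""
    total = 0
    red = 0
    for pixel in pixels:
        total += 1
        if is_red_dominant(pixel, margin):
            red += 1
    return red / total if total else 0.0


# Toy 10-pixel "image" (hypothetical values): 6 red, 3 green, 1 white.
toy_image = [(200, 30, 30)] * 6 + [(40, 160, 60)] * 3 + [(250, 250, 250)]
print(red_fraction(toy_image))  # 0.6
```

Comparing this fraction across images generated with and without a country name in the prompt would show whether the red cast is tied to the name.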

The names of places also have a significant impact on how AI presents a country

To test the AI’s ability to identify Singapore, we used purely descriptive prompts. We found that once the word ‘Singapore’ was taken away, the images tended to be generic or almost westernised.

Inherent stereotypes are evident in the generated images

The most obvious stereotype is gender. The AI automatically renders the text prompts ‘prime minister’ and ‘president’ as male figures. We first picked this up when we tried to generate an image of Halimah Yacob, the current president of Singapore, but were presented with an image of a man.

To test our hypothesis about the gender stereotype, we entered text prompts such as ‘New Zealand Prime Minister’ and ‘Bangladesh Prime Minister’. Jacinda Ardern and Sheikh Hasina, the prime ministers of those countries, are both women. Taken together, the three generated images support our hypothesis that the AI is gender biased.
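
Three images are a small sample, so the same test could be repeated over many generations per prompt and tallied. The sketch below is an assumption about how that might be done, not something we ran: the function name, the label vocabulary and the annotation values are placeholders, and each label would have to be assigned by a person looking at the image.

```python
from collections import Counter
from typing import Dict, List


def presentation_shares(labels: List[str]) -> Dict[str, float]:
    """Return the share of each manually assigned label (empty dict if no labels)."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# PLACEHOLDER annotations for illustration only; these are not our results.
placeholder_annotations = {
    "example prompt A": ["male", "male", "male", "female"],
    "example prompt B": ["male", "male", "male", "male"],
}

for prompt, labels in placeholder_annotations.items():
    print(prompt, presentation_shares(labels))
```

A share that stays close to 1.0 for ‘male’ across prompts whose real-world office holders are women would be stronger evidence of bias than a single image per prompt.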

DESIGN DECISIONS

There were several text-to-image generators for us to choose from: DALL-E 2, Midjourney, Dream by WOMBO, NightCafe AI (Stable Diffusion), DreamStudio AI and others. Looking at the different platforms, we wanted something that was not too realistic and yet not too far off from what Singapore is like. We tested three generators (Midjourney, DALL-E 2 and Stable Diffusion) by entering the same prompt, ‘Singapore Merlion’. While DALL-E 2 and Stable Diffusion generated far more realistic images, Midjourney generated a unique, stylised image. After comparing the three tools, we decided to go with Midjourney because of the stylisation it offers.

CHALLENGES

As mentioned in our process, one of the biggest challenges we faced was not being able to get our desired outcome of representing Singapore in a more “idealistic” way. We tried being specific in our text prompts and went through many rounds of trial and error, but ended up running out of credits.

However, sitting down as a group to discuss the project allowed us to re-evaluate it, identify the flaw in our approach, and make the changes needed to pivot the direction of our project.

FEEDBACK

Overall, the feedback we received was largely positive. While we acknowledge that our project is not a grand one, we and many others think our theme is interesting and full of potential. During the proposal presentation, we were told to explore certain stereotypes and dive deep into why certain results turn out the way they do. Although ours was one of many text-prompt projects, it drew the fewest criticisms, simply because we went through the brief thoroughly and made sure we were clear about our objectives. For the final presentation, the lecturers appreciated how systematic and thorough our approach was. Despite the unpredictability of text-to-image generators, they mentioned that they liked our outcomes.

ACHIEVEMENTS

As a team, our greatest achievement was that we didn’t stop at our initial idea but developed it together along the way, stopping to evaluate our project and regrouping to discuss the challenges we were facing. We took actionable steps to improve our project, even when that meant more work. The process we developed was systematic, and regular meetings and discussions made it workable for everyone.