NOTE: This is a feature that was originally published in the December 2022 issue of the Dee Dee Zine. For much, much more deep insight into everything to do with video games, art, and Japanese culture buy the magazine here (or any of our other magazines from our online store). And don’t forget, until January 1, 2023, you can get ALL of our magazines 50% off, simply by using the code “MERRYXMASALL” at checkout.
Over the past year, the public presence of image generation technology powered by artificial intelligence has exploded, going from a niche research topic to – according to some proponents – the inevitable future of artistic production.
Powered by artificial neural networks that allow computer programs to output images based on written prompts from the user, engines like DALL-E, Stable Diffusion, and Midjourney can produce convincing-looking images depicting almost anything imaginable in practically any style. With just a reasonably high-end graphics card, or a web service that can provide such processing power for a small fee, anyone can start turning their written prompts – ‘anime girl, boobs, big boobs, super detailed, ultra-detailed, octane render, 4k, Unreal Engine, trending on Artstation’ – into images with next to no technical ability required.
Such technology represents a massive paradigm shift in artistic production, taking labour out of the hands of individual artists and automating the production of images. The art world has seen similar apocalyptic events in the past. However, AI represents the ultimate realisation of a larger shift, a change in the way in which we see the world itself that has been present in our collective subconscious since the advent of the Web.
The AI technology used in these generators works by loosely simulating the processes of learning that occur in the brain. One approach turns these processes into a zero-sum game: a GAN – generative adversarial network – consists of two simulated neural networks that compete against each other to carry out a particular task. One network generates images while the other judges them, and the generator succeeds when its image is realistic-looking enough that its rival is ‘fooled’ into ‘recognising’ it as real. To accomplish this the neural networks are trained on vast amounts of data: billions of individual images in the case of some image generation engines.
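For readers who want to see the zero-sum game in miniature, here is a rough sketch in Python. It is a toy, not the code behind any of the engines named above: the ‘images’ are single numbers, each ‘network’ has only two parameters, and every name in it (generator, discriminator, value, slope and the two step functions) is invented for this illustration. The one idea it borrows from the adversarial set-up is the shared score: the discriminator adjusts itself to push the score up, and the generator adjusts itself to push the same score down.

```python
import math

def sigmoid(t):
    # Clamp the input so math.exp never overflows.
    t = max(-30.0, min(30.0, t))
    return 1.0 / (1.0 + math.exp(-t))

def generator(z, g):
    # Hypothetical toy generator: stretches and shifts a noise value z.
    scale, shift = g
    return scale * z + shift

def discriminator(x, d):
    # Hypothetical toy discriminator: estimated probability that x is real.
    weight, bias = d
    return sigmoid(weight * x + bias)

def value(d, g, real, noise):
    # The shared score: high when the discriminator tells real from fake.
    real_term = sum(math.log(discriminator(x, d)) for x in real) / len(real)
    fake_term = sum(math.log(1.0 - discriminator(generator(z, g), d))
                    for z in noise) / len(noise)
    return real_term + fake_term

def slope(f, params, eps=1e-6):
    # Numerical gradient by central differences (slow, but easy to read).
    grads = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads

def discriminator_step(d, g, real, noise, lr=0.05):
    # The discriminator nudges its parameters to push the score up.
    grads = slope(lambda p: value(p, g, real, noise), d)
    return [p + lr * dp for p, dp in zip(d, grads)]

def generator_step(d, g, real, noise, lr=0.05):
    # The generator nudges its parameters to push the same score down.
    grads = slope(lambda p: value(d, p, real, noise), g)
    return [p - lr * dp for p, dp in zip(g, grads)]

real = [3.5, 4.0, 4.5]     # made-up stand-ins for "real images"
noise = [-1.0, 0.0, 1.0]   # made-up random inputs to the generator
d, g = [0.1, 0.0], [1.0, 0.0]

before = value(d, g, real, noise)
d = discriminator_step(d, g, real, noise)
after_d = value(d, g, real, noise)
g = generator_step(d, g, real, noise)
after_g = value(d, g, real, noise)
print(before < after_d, after_g < after_d)
```

The final line simply checks that the two steps pull the score in opposite directions. A real system repeats this tug-of-war over enormous networks and datasets, and nothing here should be read as a description of how any particular product is built.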
Some zealous proponents of this technology claim that, given enough tweaking and advancement, AI will quickly make human artists obsolete. In turn, many critiques of AI image generation have been authored by artists and other concerned parties. Critics of AI are quick to point out that the billions of images used to train the neural networks were not ethically sourced. Images from all over the web were scraped and fed into the network to create connections. This includes artwork by artists who didn’t give permission for their work to be used in such a manner. Cultural and racial biases are also reinforced in these training sets: asking for an image of a ‘scientist’ or ‘engineer’ with no further qualifiers will usually produce an image of a white man, for example.
In the swirling discourse surrounding AI and its effects on the arts and society in general, the emergence of image-generation engines has been compared to the advent of photography and its effect on painting. With photography, forms of art like portraiture and landscapes became democratised. Sitting for a portrait in the early days of photography took far less time than sitting for a painting: painters took up to a year to deliver a finished product, while a photographic print could be produced far more quickly.
After photography, though, painters didn’t become completely obsolete. They adapted and eventually accomplished what the camera could not. Impressionism, expressionism, cubism, and other schools and styles emerged in the negative space left by photography’s colonisation of conventional image-making.
Proponents of AI suggest that new image generation technologies could have a similar effect, dominating conventional production and forcing artists to completely change their practices in order to stay relevant. Whether there is any conceptual space left that AI cannot appropriate or colonise has yet to be determined, though.
One frequently-cited text that can provide insight into this shift is The Work of Art in the Age of Mechanical Reproduction, a 1935 essay by Walter Benjamin, a German Jewish philosopher and critic who lived and worked up to his untimely death in 1940 while fleeing capture by the Nazis. Benjamin mentions the conflict between photography and the traditional arts in his essay, stating that its significance was to be found in the change in the function of art heralded by photographic technology. To Benjamin, photography – and its development into film – was revolutionary because it allowed for the mass production and distribution of images that, previously, might have only been seen by a select few.
The difference between traditional works of art and those that could be mobilised for mass production was something that Benjamin called ‘aura.’ In his essay, he states that “even the most perfect reproduction of a work of art is lacking in one element: its presence in time and space, its unique existence at the place where it happens to be.” This singular instance of a work’s presence and history is its aura, and Benjamin goes on to say: “that which withers in the age of mechanical reproduction is the aura of the work of art.” Reproduction by mechanical and, today, digital processes rips an artwork out of its original setting, context, and history. The reproduced work may appear identical to the original, but it is lacking all the subsurface elements that made the original a unique object.
Here is what many commentators get wrong: Benjamin thought that aura was a bad thing, an impediment to artistic and societal progress. To Benjamin, the aura was an artifact of the religious or cult function of art: idols worshipped as gods, sacred images of religious stories or saints and, in more modern times, ‘art for art’s sake’ made without concern for its social function. Aura impedes a work’s ability to have a social impact, as a single painting in someone’s collection can be seen by far fewer people than, say, a film or a photograph reproduced in a newspaper. The point of Benjamin’s essay wasn’t to valorise the aura of unique works of art but rather to tear down the cult of the aura in order to allow for the acceptance of mechanically reproducible media, like film, that could be distributed on a mass scale. Even so, many people cite the essay’s definition of aura in defence of traditional forms of art, particularly in the discourse surrounding AI image generation.
The fetishisation of aura, to Benjamin, leads down a slippery slope towards the empty, self-destructive aesthetics of fascism: preserving some imagined aura of the past at the expense of the present and future. This protectionist impulse runs parallel to other aspects of fascist ideology, such as the embellishment of an idealised past that could allegedly be reclaimed by severing certain ‘degenerate’ elements from society. In the epilogue of his essay, Benjamin coined a dictum encompassing his view of fascist aesthetics: “fiat ars – pereat mundus,” or “let art be created, even should the world perish.”
To Benjamin, art was to be mass-produced for the sake of social mobilisation and, ultimately, for ideological purposes, serving as a tool of mass education or as propaganda depending on one’s point of view. AI generators have no such overt ideological goal, and this vacuum brings to mind the work of another thinker: Japanese philosopher and theorist Hiroki Azuma.
What the otaku tells us about AI-generated images
In his 2001 book, Otaku: Japan’s Database Animals, Azuma lays out a complex view of the otaku subcultures of the time. His arguments are centred around three interconnected points: that otaku are no longer interested in narratives, that the anime-style characters favoured by otaku are just fragmented collections of interchangeable elements, and that otaku engage with the world through a database-style mode of interaction.
Azuma posits that grand narrativity in Japanese media died out in the 1960s and 70s, with subsequent works floating untethered from any firm ideological moorings, being produced for immediate profit rather than social advancement. Media of the 1980s, like Mobile Suit Gundam, saw fans drawn more towards aspects such as mecha design and the intricate Universal Century timeline, using these details to forge their own sense of narrative from the media franchise. The mid-90s, however, saw a paradigm shift in how both fans and producers approached narrative. Azuma can narrow down the moment at which this shift occurred: the final broadcast episode of Neon Genesis Evangelion, specifically the moment when Rei Ayanami runs down the street holding a piece of bread in her mouth.
This moment presents her character not as the result of her circumstances and narrative development throughout the course of the series but as something completely different. This moment is detached from the rest of the story entirely. It proves that people adore Rei not necessarily for her character but for the interchangeable fragments from which she is constructed: blue hair, red eyes, pale skin; what Azuma calls ‘affective elements’ or ‘moe elements.’
The language of prompting, listing what one wants to see in a generated image, recalls Azuma’s claim that anime characters have become, rather than distinct personalities, collections of affective elements. “The characters […] are not unique designs created by the individual talent of the author but an output generated from preregistered elements and combined according to the marketing program of each work.” Azuma goes on to mention an otaku search engine, TINAMI, that allows users to search for illustrations with particular characteristics: ‘cat ears’ or ‘maid costumes,’ for example. Prompts perform a similar function, except instead of searching a database for existing content that meets the user’s parameters, the generator makes new content out of simulated memories of its training data. (TINAMI is still around today, and recently updated its rules to disallow unaltered AI-generated content.)
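The resemblance between a tag search and a prompt is easy to see when both are written out. The sketch below is a toy in Python and makes no claim about how TINAMI or any image generator is actually built: the three-item catalogue, its titles and tags, and the function names search and split_prompt are all invented for illustration. A search takes a list of desired elements and returns existing works that contain every one of them; a prompt, once split on its commas, is the same kind of list.

```python
# A made-up catalogue: each entry is an existing illustration with its tags.
CATALOGUE = [
    {"title": "Illustration A", "tags": {"cat ears", "maid costume", "blue hair"}},
    {"title": "Illustration B", "tags": {"cat ears", "glasses"}},
    {"title": "Illustration C", "tags": {"maid costume", "red eyes"}},
]

def search(catalogue, wanted):
    # Return the titles of entries that carry every requested element.
    wanted = set(wanted)
    return [item["title"] for item in catalogue if wanted <= item["tags"]]

def split_prompt(prompt):
    # Break a comma-separated prompt into its individual elements.
    return [part.strip() for part in prompt.split(",") if part.strip()]

elements = split_prompt("cat ears, maid costume")
print(elements)
print(search(CATALOGUE, elements))
print(search(CATALOGUE, ["wings"]))
```

The last call is where the two diverge. A database can only come back empty when nothing on file matches, whereas a generator handed the same list of elements will always produce something.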
To Azuma, the database is the form of organisation by which otaku see the world. Previous generations organised their sense of reality according to the grand narratives of ideology and belief, and when those became untenable the small narratives of fiction served as stand-ins to compensate for the lack of meaning in the wider world. What Azuma calls ‘database consumption’ emerged from the collapse of narrative as a meaningful cultural form and its replacement by collections of discrete elements that can be accessed and organised in any order.
New media theorist Lev Manovich noticed this trend as well, around the same time as Azuma did. In a 1998 paper, Database as a Symbolic Form, Manovich proclaimed: “Indeed, if after the death of God (Nietzsche), the end of grand Narratives of Enlightenment (Lyotard), and the arrival of the Web (Tim Berners-Lee), the world appears to us as an endless and unstructured collection of images, texts, and other data records, it is only appropriate that we will be moved to model it as a database.”
AI image generation is, perhaps, the most powerful expression of this database-centric worldview yet developed: every imaginable image can be reduced down to a set of interchangeable prompts that can be reordered, replaced, or tweaked to suit one’s desires. Humanity has invented a digital meat grinder that can churn out a perfectly engineered paste of affective elements with next to no human intervention. Generated characters can pop into existence for a single image and evaporate into nothingness in an instant, with no grounding in any form of narrative whatsoever.
Azuma’s bleak vision of otaku subculture has been amplified by AI and extrapolated to encompass the entire world, threatening to delete the human element from image-making almost entirely. Such mass production of content has nothing in common with Benjamin’s vision of mechanical reproduction: without a firm ideological base that could mobilise affective images for the sake of social and artistic progress, we are quickly accelerating towards a hyperflat environment in which no one image is any more significant than any other in the endless churn of fragments.
Like NFTs and cryptocurrency, AI image engines use vast amounts of processing power to perform tasks that could be easily accomplished otherwise. The impulse to consume energy and resources to endlessly roll virtual dice, in the vain hope that the machine will eventually output a ‘better’ image with properly drawn hands, recalls Benjamin’s dictum on fascist aesthetics: fiat ars – pereat mundus. The earth is scorched and the seas are boiling but at least the computer is making pictures that look like they’re ‘trending on Artstation.’
For Further Reading:
- Hiroki Azuma, Otaku: Japan’s Database Animals
- Walter Benjamin, The Work of Art in the Age of Mechanical Reproduction
- Lev Manovich, The Language of New Media, pages 218 – 243