Friday, 29 November 2024

AI and drawing

I have just found a new AI application for drawing. The promotion for this application begins by stating, 'around 10,000 hours of practice are needed to be able to become proficient with a pencil and paper and most of us don't have the time (or the patience).' The promotion goes on to say that, 'However we will all be pleased to know that Google has released a free web-app that allows for terrible digital drawings to be turned into recognisable objects, it is called Autodraw.'

It turns images such as the hand-drawn one on the left into smoothed-out, Disneyesque versions such as the one on the right. I tried it and drew a lion, not a bad version, I thought, with a spring in its step, but Autodraw then replaced it with a flat image of a tiger drawn from the front. It works by having an image bank of 'recognisable' things, all drawn in a similar way. When you draw, it assesses what type of thing your drawing might be a representation of, decides it could be one of several things, and then offers you a scroll of small images to click on, each a candidate for what you might be trying to achieve (a mechanism I have sketched in code below). Its 'clever' AI technology obviously brings up these images in much the way Amazon brings up new books for you to read based on 'likes', but it can only deal in stereotypes. How horrid. The anodyne rhino has nothing interesting about it at all, whilst the image it replaces has all sorts of ambiguities that make it an interesting read. Not least is the fact that it is a generic 'animal', a beast on four legs that only has two, coupled with the question of whether that is an ear or a horn, which makes it far more intriguing.

I find this all part and parcel of a syndrome that includes the standardisation of how fruit and vegetables should look, the ones you get in supermarkets becoming more and more like photographs of perfect versions of themselves, and the fact that people no longer just want to look like the versions of human beings they find on magazine covers and in films, but are now prepared to have themselves operated upon to become clones of these models. It is as if the whole world has been infected by Plato's idea that underneath everyday reality there is a perfect, ideal version of everything, and these versions are what everyone now aspires to. Aristotle would have been appalled to see the rise of such a culture and would argue that it needs a good dose of reality.
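As far as I can tell, the suggestion step is essentially a classifier followed by a pick-list: your strokes are reduced to a set of features, compared against a prototype for each 'recognisable' category, and the closest matches are offered back to you. A toy sketch of that idea in Python, with entirely invented numbers and nothing to do with Google's actual code:

import numpy as np

# Toy 'image bank': one prototype feature vector per recognisable category.
# In reality these would come from a trained sketch-recognition model;
# here they are made-up four-dimensional vectors, purely for illustration.
PROTOTYPES = {
    "rhino": np.array([0.9, 0.2, 0.4, 0.1]),
    "lion":  np.array([0.3, 0.8, 0.5, 0.2]),
    "tiger": np.array([0.4, 0.7, 0.6, 0.3]),
    "dog":   np.array([0.2, 0.5, 0.9, 0.4]),
}

def suggest(doodle_features, k=3):
    """Return the k categories whose prototypes best match the doodle."""
    scores = {}
    for name, proto in PROTOTYPES.items():
        # Cosine similarity between the doodle's features and the prototype.
        scores[name] = float(
            np.dot(doodle_features, proto)
            / (np.linalg.norm(doodle_features) * np.linalg.norm(proto))
        )
    return sorted(scores, key=scores.get, reverse=True)[:k]

# A made-up feature vector standing in for my ambiguous four-legged beast.
print(suggest(np.array([0.5, 0.6, 0.5, 0.2])))

Whatever the real implementation, the principle is the same: the tool can only hand back something already in its bank, which is precisely why it deals in stereotypes.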

Adobe has released a new AI feature in Photoshop. It can, it seems, do almost anything that an experienced Photoshop artist can do and "often do it better". Other types of AI-powered image manipulation software are also coming onto the market, and everyone now has the opportunity to produce a huge range of images to add to the millions already available on the web, which have in turn all become AI collage fodder. Here are just a few image generation tools, alongside their various marketing blurbs.

Imagen: analyzes your previous photo edits to create your Personal AI Profile, which you can then apply to your Lightroom Classic catalog at less than half a second per photo.
Photoleap: a mobile app that transforms your landscape and interior photos into works of art.
DALL·E 2: an easy-to-use AI image generator.
Artiphoria: create thousands of images with just one click.
Midjourney: for the best quality AI image results.
DreamStudio (Stable Diffusion): for customisation and control of your AI images.
AI Image Generator by Fotor: Fotor, an online photo editor with millions of users worldwide.
NightCafe: one of the most popular AI text-to-image generators on the market.
Dream by WOMBO: created by the Canadian artificial intelligence startup WOMBO.
Craiyon: formerly called DALL·E mini; simply type a text description and it will generate nine different images.
Deep Dream: a popular online AI art generator; very easy to use and comes with a set of AI tools for creating visual content.
StarryAI: an automatic AI image generator that turns images into NFTs.
Artbreeder: creates creative and unique images by remixing images; you can use it to create landscapes, animated characters, portraits and various other images.
Photosonic: a web-based AI image generator that lets you create realistic or artistic images from any text.
DeepAI: an AI text-to-image generator whose model is based on Stable Diffusion.

AAAAGH!!!

Reading the available literature, it would seem that DALL·E 2, Midjourney and Stable Diffusion are the top three to try. So I will. As I'm working on images that attempt to visualise interoception, and have been generating these by having conversations with people, I shall try to build in the same process of holding a conversation, but this time with a software program.

DALL·E 2: 
If you type in a phrase, such as "a photo of an astronaut riding a horse", it will generate an image based on its understanding of what "astronaut", "riding" and "horse" mean. It fills in details using its ability to associate related concepts; astronauts, for instance, tend to appear against a backdrop of stars.

DALL-E 2’s interpretation of “A photo of an astronaut riding a horse.”
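For anyone who prefers to work programmatically rather than through the web interface, the same model can be called from a few lines of code. A minimal sketch, assuming the official openai Python package (v1.x) and an API key set in the environment; the prompt and settings are simply the example above, not a record of what I actually ran:

# Illustrative only: the official OpenAI Python client, DALL·E 2 model.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.images.generate(
    model="dall-e-2",
    prompt="a photo of an astronaut riding a horse",
    n=1,                # number of images to return
    size="1024x1024",   # square output
)

# The API returns a URL for each generated image.
print(response.data[0].url)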

The results, for me, were very poor. Because it deals in stereotypes, it was a total failure when it came to actual invention.

Midjourney. It costs money even to run a trial, and you can suddenly find yourself paying a lot every month for this software, but it is very good at what it does, especially if you can find the right words. Nick St. Pierre has obviously spent a lot of time practising how to put the right words together, and when you do, the results are very convincing. However, to get what you want you need to be very articulate in AI-speak in order to construct the right prompt.


The image above is an AI-generated image created using the prompt: “Cinematic, off-centre, two-shot, 35mm film still of a 30-year-old french man, curly brown hair and a stained beige polo sweater, reading a book to his adorable 5-year-old daughter, wearing fuzzy pink pyjamas, sitting in a cozy corner nook, sunny natural lighting, sun shining through the glass of the window, warm morning glow, sharp focus, heavenly illumination, unconditional love,”
 A prompt written by Nick St. Pierre for Midjourney V5.

This is, I suppose, a type of collaged drawing, the software being designed to use photographic imagery rather than drawn imagery. Once again:

AAAAGH!!!

It's such a knowing lie. The soft warmth of Nick St. Pierre's Midjourney image ticks all our family buttons but has nothing to do with family dynamics, only with a constructed idea of what we would all like to think a father/daughter relationship should be like. Oh dear, I dread to think where this constructed reality will eventually lead us. We know what old-fashioned photographic retouching methods led to in the hands of totalitarian regimes.


Leon Trotsky and Lev Borisovich Kamenev were airbrushed out of history

Although similar to all the others, Stable Diffusion is open source, so you feel that at least the code is something you can freely explore and add to.

You can generate images with Stable Diffusion by using the DreamStudio web app. To use DreamStudio.ai:
Navigate to the DreamStudio site and create an account.
Once you are in, type your text into the textbox at the bottom, next to the Dream button.
Click the Dream button to create the image.
Your image will be generated within 5 seconds.
Download the image you created by clicking the download icon in the middle of the generated image.
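Because the model itself is open source, you can also skip the web app and run it locally. A minimal sketch, assuming the Hugging Face diffusers and torch packages and a machine with a CUDA GPU; the checkpoint name, prompt and settings are illustrative rather than a record of my own set-up:

# Illustrative only: running a Stable Diffusion checkpoint locally with diffusers.
import torch
from diffusers import StableDiffusionPipeline

# Load a publicly released Stable Diffusion checkpoint from the Hugging Face Hub.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # move the model onto the GPU

prompt = "a drawn image of stomach ache, felt as slow peristaltic waves"
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]

image.save("stomach_ache.png")

Running it this way at least makes the knobs visible: guidance_scale, for instance, controls how strictly the output follows the prompt, with higher values tending towards more literal and less varied images, which chimes with the tendency I complain about below.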

Again, I thought the images very clichéd, but I was beginning to realise that this was the point. Everyone wants to look the same, the lip pout in selfies is now universal, and anything seemingly not about what everyone else 'likes' is useless. The paradox is that this is all sold as being about 'freedom' and 'creativity', but it is really about statistical averages and stereotyping.

Stomach ache

The image above of stomach ache is typical of the AI-generated products I was able to access when I began the process, even though I asked for a drawn image. Yes, it does communicate a pain in the belly, but the image is more akin to a road sign: there is none of the complexity of feeling you get when someone is trying to communicate how pain feels or how the pain might be visualised as an experience. However, as I began to get the hang of things, keywords became more important, and, whether I wanted to use it or not, I realised that photo-realism was this technology's style of preference. Backache in particular is, within the field of describing pain, an area that is richly illustrated and can lead to quite powerful results.

Backache, back, pain, backbone, muscle, anatomy, spine, human, bone, science, health, injury, medicine, inflammation, ache and torso are all related words that help in a search, and these words, added to 'show back body glow with dark background', can get you images such as the one below:

AI generated image of back pain

The image above is, though, for all its anatomical conviction, still a cliché, but I can see why people would use it: it has gathered together several visual tropes and, in order to operate rhetorically, put them together to offer an almost superhuman image. The man's face needs a tweak, perhaps the addition of 'painful expression', and the body structure is that of someone in the best of health; this man is plainly an athlete, and the image could be more effective if the body language were that of an ageing body, or if the responses to a severely slipped disc were visualised.

Stomach pain

The image above is one of my own, generated from a mix of hand-drawn images and their computer manipulation in Photoshop once they were scanned. It was made over several weeks in response to a one-to-one conversation that at one point focused on an awareness of peristaltic waves and at another on the feeling of compaction that you get with constipation. I'm not averse to using computers and have been using them as part of a printmaking process since their introduction, so I ought to embrace the new possibilities on offer. My image was an attempt to convey the complexity of body awareness and how a somatic feeling is a moveable, live thing and not an object, as well as not being the only inner body sensation you are aware of; in this case the compacted feeling of needing to relieve yourself when you can't was not about pain, it was more about a feeling of 'stuckness', and it had a visual nod towards more geological representations. I'm not very experienced in using AI image-generating tools, so perhaps I need more training in how to use them.

Now that the gates are open to AI use, there is no way they will be closed unless some catastrophic event is caused by it. As artists we will have to accept that it is available and either use it as best we can or carry on with what we already understand as our go-to media. Personally, I'm of an age where I had to respond to the introduction of the computer as an image-generating tool because I was an art teacher and had to keep abreast of technical innovation. I began by learning to write code for a BBC computer so that I could draw a circle on a screen, I learnt AutoCAD on a PC, I made animations in HyperCard on a Mac, and I put together multimedia interactive art pieces using various types of software that are now all out of date, saved on floppy disks that are redundant and, like the computers these things used to run on, now mostly buried in landfill sites. There came a time when I decided that I just couldn't keep up with the rate of change and I returned to handmade image making, with some occasional computer manipulation when I felt it was useful. Knowing one piece of software (Photoshop) seemed enough to get me by, but I was very aware of younger artists, designers and illustrators around me using more and more complex software packages, and even more aware that their technological learning curves were becoming never-ending; they were constantly having to attend training on a new this or that. All of which is very expensive, and there seems to me to be an ever-widening digital divide between those who can afford the constant training, the rental of ever more expensive software, and computers with the memory and processing power to cope with it, and those who have no access to the required funds.

As the digital divide widens there will be more and more people who have no idea of how images are made. Media literacy is falling, and as it does, the rhetoric used by those who own the media will go more and more unquestioned. How will someone unversed in AI's capacity to generate convincing images be able to tell real images of events from constructed ones? As everything becomes a potential fiction, I just have to hope that the stories with the most traction are those that take us towards a future that embraces sustainability and respects this planet and all the life on it. As an artist who has also always been a teacher, even though I have now retired, my role has to embrace the fact that the story I would like us all to contribute to is one of sensitivity to the planet's needs and an accommodation of difference, so that a future harmony is aimed for, one that embraces the interconnectedness of everything. We need to remember that all images are fictions; they are narrated into being, and the reality is that they always were. But some stories are more wholesome than others, some help us to heal the world and others will harm it, and the most important issue is how we respond to them.
How many thousands of megawatt-hours of electricity it takes to run this type of machine thinking I don't know, but I suspect it costs the Earth far more in energy terms than we think.
