AI & Creativity – #3: What a Year in AI

by Dennis Oswald - December 19, 2023

What a year it has been for AI! On 30 November 2022, we witnessed the so-called “iPhone Moment of AI” – some might say the “Oppenheimer Moment” – with the public release of ChatGPT. Within a year, OpenAI stirred things up again with new models and a major release introducing GPTs. Not to mention Sam Altman’s crazy week in November, which gave us a glimpse behind the curtain of AI & power ...

The last one to two years have been an unexpected and ongoing wild ride for me. Since my first blog post in October 2022, I have had the opportunity to explore AI further and to write and speak about it regularly. As a designer, I gave small presentations, webinars, talks, and speeches at conferences (WSC23, Boye23) on “AI & Creativity” to inform people about the opportunities and challenges it will bring to our work & lives.

The pace at which new AI tools, models, and news are released keeps accelerating, and this might be just the beginning. So, let’s have a look at what is cooking in AI for creatives and everyone interested:

Training yourself in AI, its tools, and how to use them to advance your job & career is more important than ever. You might start with some (free) courses or resources.
The “established” AI tools like ChatGPT, Midjourney, Runway, etc., have evolved and expanded their features, e.g., Motion Brush.
The next generation of AI tools offers new ways to interact and gives you more control over the outputs.
New tools for generating texts, sounds, images, and videos shake things up and claim their spots, e.g., Heygen, Pika, Lyria, Krea.AI.

1. AI will not replace you; however, a person using AI will.

This mantra, quoted in the headline, can be found in endless variations from countless sources. In a nutshell, the statement (or the hope) is that AI will take over “tasks,” not “jobs” – for now... as Sam Altman puts it, encouraging and discouraging at the same time:

important point here: these systems are much better at doing tasks than jobs.

and giving people better tools to do their work faster often leads to qualitative changes in what they can do.

(of course, over the long run, we expect these systems will be able to do all of some of… https://t.co/ZAGkGjLST3
— Sam Altman (@sama) September 29, 2023

AI is expected to heavily impact all of our jobs in one way or another (see the WEF Future of Jobs Report 2023). With the new tools, you are expected to leverage AI to solve your tasks more quickly and efficiently, as the results of an experiment with 758 consultants across 18 tasks, run by the Boston Consulting Group together with the Harvard Business School, point out:

"For each one of a set of 18 realistic consulting tasks within the frontier of AI capabilities, consultants using AI were significantly more productive (they completed 12.2% more tasks on average and completed tasks 25.1% more quickly) and produced significantly higher quality results (more than 40% higher quality compared to a control group)."

Harvard Business School Working Paper, No. 24-013, September 2023

AI tools will level up your skill set and career regardless of your profession. Therefore, we should all understand the basic concepts, terms, and processes, so we can discuss what the upcoming “Human-Centred AI” should look like: one in which we enable people with AI rather than replace them.

To start and gain knowledge, you might look into some of these non-technical (free) online courses or knowledge sources:

DeepLearningAI

I am a big fan of Andrew Ng. His ability to explain complex things comprehensively and easily is a gift.

AI for Everyone – Even though this course was released before ChatGPT, it explains the general concepts of AI, its opportunities, and its impact on society very well.
Generative AI for Everyone – This course focuses on generative AI, what it can and cannot do, and gives an overview of tools and how to use them in life and work.

AI Transformation Playbook

Landing AI offers an AI Transformation Playbook as a free download. It helps you start transforming your enterprise with AI and outlines five steps:

Execute pilot projects to gain momentum
Build an in-house AI team
Provide broad AI training
Develop an AI strategy
Develop internal and external communications

Google: Introduction to Generative AI

Google Cloud Skills Boost offers a free beginner course, “Introduction to Generative AI,” for which you can earn a badge.

People + AI Guidebook by PAIR

This set of methods, best practices, and examples for designing with AI is based on insights from Googlers, industry experts, and academic research. A summarized update on the guidebook is available on Medium.com.

LinkedIn & LinkedIn Learning

Besides using your LinkedIn feed for input on what is cooking in AI, LinkedIn Learning also offers a number of AI courses, such as:

I am sure there are more out there; however, the core message remains: Go and get familiar with AI for your work and career!

For myself, I always look for fun or small side projects to begin with, to play around and learn (about AI), as the fear of failure does not inhibit your creativity while you are in the playful “Open Mode” (in this context, please watch John Cleese on Creativity in Management). So I try to follow these steps:

2. Updates and new tools and features are all over the place

The leading big players, such as OpenAI, Microsoft, Meta, Google, and Adobe, have all taken the next step to implement AI in their ecosystems and applications. On the application level, a lot of improvements have also happened inside the “established” tools, so let’s have a look at some:

ChatGPT and GPTs

On Monday evening, 6th November, just a couple of hours before my presentation about UX & AI at the Boye Conference in Aarhus the following morning, OpenAI dropped a major update at their first developer conference, and I needed to update a few slides quickly. In a nutshell, they announced:

A new model (faster and cheaper) and developer products (Assistants API, Code Interpreter).
Customizable versions of ChatGPT, called GPTs, which can be tailored to specific tasks with custom instructions and data (a GPT store was announced but seems to be delayed).

At Netgen, we started experimenting with GPT ideas, e.g., we built a virtual, internal teammate called “Net.Genius”. We gave it some basic knowledge (nothing confidential) about our services, our teams, and the things we like and do. So based on that, chatting with it might help to find the colleague with the skill set you are looking for, a fitting selection of project cases for your presentation, or just translations into Croatian to say “happy birthday.”

These specialized GPTs are easy to build and could be valuable sparring partners for a lot of tasks, from planning and organizing to writing or visualizing to teaching or analyzing, especially as OpenAI plans to roll out a GPT Store in which you might sell your bot (and earn money with it), like in the Apple App Store. Just a few hours after the release, people started building their ideas. Here are three examples:

Midjourney

When it comes to visualizing ideas and creating images, my weapon of choice from the beginning, and still at the moment, is Midjourney.

The evolution in image generation at Midjourney (and other tools, more on that later) is still mind-blowing, as this version overview of the same two prompts from v1 (Feb 2022) to v5.2 (June 2023) demonstrates:

Or an even more impressive look back by Jacques Alomo (meanwhile, you can upscale to 4096x4096px):

Therefore, I am looking forward to the capabilities of the announced Version 6 (late December) and the new Midjourney web UI (no more Discord), called “Midjourney Alpha.” At the moment, you can only access it if you have generated more than 10,000 images, so David Blum was kind enough to provide screenshots of it:

Runway

Runway is a fantastic tool for adding motion to visualize concepts. The quality and the options to direct the outcomes have leveled up, as you can see in demo videos or at Runway’s film competition Gen:48.

The new motion brush is a great tool and breathes life into concept visualizations. If you want to learn more about the capabilities of Runway, check out the Runway Academy explanation videos.

3. Controlling your prompt outputs

One trend across all tools is giving the user better control over the prompted output, especially for generated visuals such as images or videos.

Stable Diffusion – ControlNet

ControlNet enables Stable Diffusion to receive conditional input – scribbles, edge maps, pose key points, depth maps, segmentation maps, etc. – that steers the image generation process, so the output follows the given structure rather than the text prompt alone.
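
To make the idea of a conditional input a bit more tangible, here is a minimal sketch of what an “edge map” is: a second image that only marks where brightness changes sharply. This is not ControlNet code and not how Stable Diffusion is called; the function name `edge_map`, the threshold value, and the tiny test image are assumptions made purely for illustration. Real workflows often use a dedicated edge detector such as Canny for this step.

```python
# Illustrative sketch only: a toy edge map built with plain Python.
# The helper name, threshold, and sample image are hypothetical.

def edge_map(image, threshold=50):
    """Return a binary map marking pixels with a strong brightness change."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Compare each pixel with its right and lower neighbour
            # (indices are clamped at the image border).
            right = image[y][min(x + 1, width - 1)]
            below = image[min(y + 1, height - 1)][x]
            gradient = abs(right - image[y][x]) + abs(below - image[y][x])
            edges[y][x] = 1 if gradient >= threshold else 0
    return edges


# A 4x4 grayscale image: dark left half, bright right half.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]

for row in edge_map(image):
    print(row)
```

The resulting map marks only the column where dark meets bright. A map like this, produced from a photo or a sketch, is the kind of structural hint that is passed alongside the text prompt.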

In September 2023, MrUgleh shared a workflow on Reddit that gave rise to a new genre in AI art with his optical illusions.

Out now, The Original Spiral Town and Checkered Village. Posters, Canvas, Phone Case, Clock?!?! https://t.co/ksu4Rd1rDz

Keep your eyes out for altered, more aesthetic versions. Fixed windows, people, etc. Can't buy from Printify? Here are the blown-up copies. pic.twitter.com/puCeA1eQ39
— MrUgleh (@MrUgleh) September 22, 2023

Fascinated by this new kind of image, I tried out “Illusion Diffusion,” a web GUI on huggingface.co, using the shape of the Swiss Cross as input to fake a Swiss Tourism Campaign I called “imagine {switzerland}.”

Animate Anyone

This controllability is not limited to static images but has also found its way into animation. Animate Anyone allows consistent and controllable image-to-video synthesis for character animation. The tool can generate videos of anyone from just a single static image.

Real-time Reference

Other image tools, e.g., Leonardo.ai or Krea.ai, offer another way to control the relation between input and output, such as Live Canvas or Real-time Generation.

⚡ Leonardo Real Time Canvas

Have you already tested it? It's mind blowing! 🤯 pic.twitter.com/KHlBZmrJTq
— Javi Lopez ⛩️ (@javilopen) December 5, 2023

To jump from static images to video again: Pika.art is expected to be “The ChatGPT moment for video,” bringing creativity to generated motion and to editing videos by prompting.

Frankly, there is no reason why it should stop there. A 2022 experiment by technical AI artist Sean Simon demonstrates how real-time inputs might change the future of fields like architecture or engineering.

AI is quickly getting more and more powerful. It will soon help to finally bridge the gap from the real world into immersive augmented reality by creating 3D objects quickly from almost anything – text, sound, sketches, images, or scans.

Combined with a smart wearable like the new glasses from Meta & Ray-Ban or, if you want to go up a price level, the Apple Vision Pro, we might get close to the point where these virtualities enter our daily lives. Here are some tool examples that the AI scout Grit Wolany shared with me:

Blockade Labs: creates infinite 360° worlds via text prompts, e.g., “foggy purple rainforest with colorful fauna and fireflies.”

Genie: a 3D foundation model research preview from Luma, running on Discord, for text-to-3D generation of smaller objects – e.g., “a fantasy tree.”

Luma.ai: Luma aims for easy 3D generation via NeRFs or Text-to-3D. By typing a simple text prompt, you can generate 3D models - e.g., a visit to the museum.

We will see many improvements and acceleration coming up in 2024; for now, I will stop here. Have fun exploring the links in this post and the mentioned AI tools over the holidays – Enjoy!

I wish you all a “mAIrry xmas” and a Happy New Year!
