AI - At the dawn of a new era: state of play and current developments

18 August 2023 - by Till Könneker

Writer's Note: By the time I finished writing this article, much of it was already out of date. The speed at which the artificial intelligence industry is evolving is hard to keep up with, so it's difficult to take an accurate snapshot.

We are moving from the information age into the intelligence age, with almost limitless access to AI-based systems and tools available around the clock, anywhere in the world. It should be mentioned, however, that this is not really intelligence, but a machine that imitates human intelligence better and better. The point at which the difference is no longer perceptible is referred to as Artificial General Intelligence (AGI).

Hardly any other technology is developing as quickly as AI, with numerous new areas of application and tools appearing every day. AI has quickly established itself as a transformative force that is reshaping industries and redefining the interaction between humans and machines.

Following my last AI blog article from January 2022, it's time for a new snapshot to explore the current landscape of AI and emerging tools.

For those interested in the correct terminology of the acronym "AI" used here, read on here.

Two "AI Creatures" created with Midjourney
AI beings - The new reality

To get an idea of what's happening in the field of artificial intelligence in just one month, here's a (non-exhaustive) list from May/June 2023:

Unity AI
AI Antibiotic
Minecraft AI
Picture-to-3D
Google Starline
Nvidia Game AI
Google Flood AI
ChatGPT in Court
TIME: Humanity End
DeepMind discovers C++ algorithms
Neuralink FDA Approval
42% US Not Used ChatGPT
Japan No Copyright OK for AI
AI Chef
Zoom AI
Gen AI ETF
Instacart AI
Falcon 40B Free
Nvidia Neuralangelo
OpenAI CTO Hacked
Apple Autocorrect LLM
EU: Must Classify AI Content
Meta's MusicGen
Amazon Review AI

And many more announcements and headlines such as:

StabilityAI Uncrop
Runway Gen-2 For All
Wordpress Jetpack AI
Chinese LLM
ChatGPT for Enterprise
Google Bard 30% Better
New AMD AI Chip
Salesforce AI Cloud
ChatGPT Workspaces
UK Gov't: Data Access
AI "Last Beatles Record"
92% Programmers Use AI
Adobe Generative Recolor
Google AI Search 2x Faster
EU AI Act
GPT-Engineer
Meta Voicebox
Adobe AI Insurance
AI Speech Classifier
OpenAI API Cheaper
Free LLM from Meta
Google Virtual Try-On
ChatGPT + Mercedes
HF: Free AI QR Codes
Google: Replace MRIs
Toyota Design AI
Opera One Browser
Vimeo AI
Scriptwriter
Midjourney v5.2
Google DeepMind
Robocat
MosaicML Acquired for $1.3B

...and so on and so forth.

ChatGPT

Let's start with what is currently probably the most widely used AI technology and some innovations.

GPT-4

With a higher processing capacity than its predecessors, GPT-4 can understand and process more complex inputs more efficiently. This progress promises communication that comes ever closer to human conversation and greater precision in responding to queries.

While the free versions of ChatGPT are limited to inputs of around 2,000 words and process them relatively slowly, GPT-4 can be fed with up to 25,000 words. In addition, it is able to generate longer responses, making interaction with the AI even more versatile and useful.

GPT-4 also brings significant improvements for German-speaking users. Thanks to more intensive training in various languages, GPT-4 can now also understand and answer questions more precisely in German and other languages.

Plugins

ChatGPT plugins represent a new extension of the AI model's capabilities. They are specially developed tools that can be used within the ChatGPT interface and follow the core principle of security. These plugins allow ChatGPT to access up-to-date information, perform calculations and utilise third-party services.

By integrating plugins, ChatGPT gains access to the Internet and new functions, which significantly increases its performance. For example, by using plugins, ChatGPT can retrieve web pages and search results from Bing, create travel plans, compare prices, learn new languages, execute code, retrieve documents and much more.

These extensions open up a multitude of new application possibilities and significantly improve the user experience. They make ChatGPT an even more powerful and versatile AI tool.


Some examples

Wolfram
The Wolfram ChatGPT plugin is a powerful tool that significantly extends the capabilities of ChatGPT. Although it may seem technical to some users, it offers significant added value due to its advanced features. With access to extensive data, the Wolfram plugin allows users to perform advanced calculations, solve complex maths problems and access real-time data.

However, the Wolfram plugin goes beyond pure maths. It can help with a variety of tasks, such as creating a family tree, generating an audio spectrogram or visualising anatomical structures. It can even provide up-to-date date and time information, a feature that ChatGPT alone does not offer.

Zapier
Zapier is a ChatGPT plugin specifically designed to simplify workflows and eliminate unnecessary steps. Zapier makes it possible to interact with over 5,000 different work apps without having to perform any additional steps. This includes popular apps such as Gmail, MS Outlook, Slack and many more. This means that entire emails can be drafted or detailed Slack messages can be sent directly from ChatGPT.

Link Reader
Simply put, this plugin can read the content of all types of links, including web pages, PDFs, images and more. ChatGPT then communicates with Link Reader and provides a detailed response to the link request. 

Meme Generator
Of course, there are also a number of fun plugins such as the Meme Generator. Here you can generate a variety of memes on any topic provided. This plugin uses its integrated meme directory to obtain the images and add suitable captions.

Prompts

Prompts are the key to getting the AI to deliver the desired results. The more precise and specific the prompts are, the better the AI can understand requirements and provide the desired answers. A completely new profession is emerging here: prompt engineering.

It is not only important to ask good questions, but also to define the linguistic tone (formal, informative, casual, etc.), the format (essay, bullet points, dialogue, etc.), the overarching goal (information, fun, entertainment, etc.) and the context of the text.

Of course, many other criteria can be set to obtain the exact results in the desired language and form.
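The criteria above (task, tone, format, goal, context) can be thought of as fields that are combined into a single prompt. The following is only a minimal sketch of that idea: the `PromptSpec` class, the `build_prompt` function and the field labels are hypothetical names chosen for illustration, not part of ChatGPT or any official tool.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    """Hypothetical container for the prompt criteria named in the text."""
    task: str
    tone: str = "informative"
    output_format: str = "bullet points"
    goal: str = "information"
    context: str = ""


def build_prompt(spec: PromptSpec) -> str:
    """Assemble one prompt string from the individual criteria."""
    lines = [
        f"Task: {spec.task}",
        f"Tone: {spec.tone}",
        f"Format: {spec.output_format}",
        f"Goal: {spec.goal}",
    ]
    # Context is optional, so it is only added when provided.
    if spec.context:
        lines.append(f"Context: {spec.context}")
    return "\n".join(lines)


example = PromptSpec(
    task="Summarise the attached meeting notes",
    tone="formal",
    output_format="bullet points",
    goal="information",
    context="The summary is for colleagues who missed the meeting",
)
print(build_prompt(example))
```

The resulting text can be pasted into any chat interface. Keeping the criteria separate makes it easy to change only the tone or the format while leaving the task itself untouched.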

Example of a prompt that sets a task and gives the AI clear instructions.

Some simple prompt patterns to achieve good results

  • Make me a to-do list of tasks I need to complete [number]

  • Summarise this text with a short paragraph and make a bulleted list of the most important points [illustrate each point with a matching emoji]

  • Act like (any personality/expert) and tell me what they would say about this: [any text]

  • Explain [topic] to me in simple terms. Explain it to me as if I were a beginner.

  • Brainstorm [any topic]

  • Create a short quiz that teaches me [what you want to learn]

  • Change the writing style of the following text to [style or tone] [insert text]

  • Analyse the text below for style, spelling and tone. Create a new paragraph in the same style, spelling and tone: [insert text]

A detailed description of how to further customise the language can be found here on Twitter.

More useful prompts can be found here.

Finding the right prompt can be a difficult task and can make the difference between usable and unusable output. This is where tools that help you develop a topic or idea come in handy.

Coglayer, for example, positions itself as a kind of external brain; its guided process can help to achieve better results.

Custom Chat GPT

Simple ways to build your own AI chatbot, which can be embedded on any website, for example, include Dante, SiteGPT and Chatbase. These allow you to create a chatbot from your own data (text, websites, PDFs or Q&As) in just a few minutes.

GPT-5?

Looking into the future, GPT-5 will probably be released by the end of 2023. This update could be another milestone, leading people to believe that AI has crossed the threshold of artificial general intelligence (AGI) - a state where chatbots are indistinguishable from humans. Even if GPT-5 does not reach AGI, the update is expected to bring significant improvements that far exceed the capabilities of GPT-4.

Charts & Graphs

Of course, there are also a number of plugins for creating tables and statistical graphics. However, it is almost more interesting to use AI to chat with data. This allows you to quickly extract important insights from all types of data. A relatively new tool is GraphMaker. CSV files can be uploaded here or Google Sheets can be linked, which can then be analysed and visualised.

Code Interpreter

The code interpreter just introduced in ChatGPT Plus is quite powerful. It acts as a personal data analyst: it can read uploaded files, execute code, create charts, perform statistical analyses and much more. Due to the wealth of possibilities, it will take a while to realise its full potential.
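To give a feel for the kind of script such a tool might write and run behind the scenes, here is a small sketch that reads CSV data, computes basic statistics and draws a text-based bar chart. It is an assumption-laden illustration using only the Python standard library: the sample data and the function names `summarise` and `text_chart` are made up, and this is not how ChatGPT's code interpreter is actually implemented.

```python
import csv
import io
import statistics

# Hypothetical stand-in for an uploaded CSV file.
UPLOADED_CSV = """month,visitors
Jan,120
Feb,135
Mar,160
Apr,150
"""


def summarise(csv_text: str, column: str) -> dict:
    """Read CSV text and return basic statistics for one numeric column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    values = [float(row[column]) for row in rows]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


def text_chart(csv_text: str, label: str, column: str, scale: float = 10.0) -> str:
    """Render a simple horizontal bar chart as text, one bar per row."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return "\n".join(
        f"{row[label]:<4}{'#' * round(float(row[column]) / scale)}" for row in rows
    )


print(summarise(UPLOADED_CSV, "visitors"))
print(text_chart(UPLOADED_CSV, "month", "visitors"))
```

In practice, the user only describes the desired analysis in plain language; a script along these lines would be generated and executed without the user having to write it.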

Chat GPT Code Interpreter collection by Andrej Karpathy

Text for Web

Tools that support UI and UX designers come onto the market almost daily. For example, Figma has just unveiled its impressive AI integration. In a professional context, AI integrations that simplify and optimise workflows are proving to be particularly useful. By saving time that would otherwise have to be spent on routine tasks, more focus can be placed on actual conceptualisation and design.

10Web is an AI-powered WordPress platform that makes it possible to create a website in minutes.

Framer is currently probably the most advanced AI website builder with very simple operation.

However, these tools can still only be used for very simple applications or for variants and mockups. It is nevertheless worth trying them out from time to time, as they are developing rapidly.

Text to Image

Image styles
Different film stocks change the look of an image more than a lens or camera does. Even the year can change the image significantly. The lens size and focal length influence the depth of field. Here are some examples of different types of film:

Lomography Redscale XR (2009): Known for its unique red-orange color change.

AgfaColor New (1930s): Known for soft color reproduction and fine grain.

Agfa CT Precisa (1974): A color slide film known for its sharpness and fine grain.

Anscochrome (1928): Early color slide film known for its rich colors.

Polaroid Instant Film (1963): Known for its instant development and high contrast.

Here is a good overview of other film styles.

Midjourney 5.2

Apart from a few improvements, the zoom function is the biggest new feature in this version. An image can be zoomed out 1.5x or 2x, which could be seen as a direct response to Adobe's generative fill feature, which enables a similar function.

Agfa CT Precisa (1974) - 1.5x or 2x zoomed out

Here we can now zoom out 2x: Midjourney fills in the surroundings and again offers four variants:

Midjourney Zoom

Image panning

Another feature goes one step further: images can now be expanded in all directions by pressing a direction arrow and writing how the image should continue.

Midjourney panning

Inpainting feature

With this function, selected areas of the image can be changed afterwards without changing the entire image. Whether it is clothing, a colour or a facial expression, you simply select an area and describe the desired change in the prompt.

Inpainting by @chaseleantj
You can find detailed instructions here on Twitter (X) from @chaseleantj

More text-to-image tools

With "Mixed Image Editing", Playground promises a new method of combining real and synthetic images. This goes beyond text-to-image and allows more control to make subtle changes.

Sketches to images
Midjourney offers the option of uploading a simple sketch or drawing and converting it into a realistic image. The simple web tool drawit.art goes one step further and generates images from simple line drawings.

Architecture
On the Luccid platform, you can quickly design your dream home.

DragGAN
AI-supported interactive point-based manipulation with DragGAN

What are the big ones doing?

Google
The big players have really caught up. Alongside Bard, its ChatGPT competitor, Google has announced the integration of its AI directly into the Google Docs environment. Duet AI for Google Cloud helps with writing code and helps to optimise, analyse and plan data in Google Sheets. The new Google Sidekick in Google Docs will be constantly active. This Sidekick reads and processes the entire document as we write it and provides contextualised suggestions that are specifically tailored to the work. Google is also active in the area of language. We'll be hearing a lot more, literally, about their AudioPaLM, a language model that can speak and listen.

Microsoft
Microsoft is also increasingly integrating artificial intelligence into its services. After investing a reported $10 billion in OpenAI, the developer of the chatbot ChatGPT, Microsoft has already incorporated AI into its Bing search engine and is planning further integrations. This opens up new possibilities for interaction with applications. Users can now communicate with applications such as Word or Excel as if they were talking to a real person.

Apple
Apple is still holding back when it comes to artificial intelligence. Remarkably, the term "AI" was not mentioned once in the last keynote. Although Apple uses AI for many functions in the background, it is to be expected that the company will present its approach to AI integration in its various operating systems in the near future. Apple is known for not necessarily being the first, but for striving for the best possible and most mature solution.

Adobe
In the last blog post about AI, I wrote that AI tools need a professional user interface that allows more control over the results. With Adobe Firefly, Adobe presented its own interpretation of an image generator for the first time. The release of the new Photoshop Beta was really exciting: it impressively demonstrates how powerful AI integration into a professional interface can be.
Work such as masking image content and retouching, which previously took hours, can now be completed in a matter of minutes or even seconds. We can't wait to see how Adobe will use AI in their other applications.

Film, animations and games

The areas of film, animation and games are probably the most promising in the commercial sector and have great potential. However, the high computing power required for this slows down development somewhat. Nevertheless, a lot is happening here. For example, Kaiber.ai has launched a powerful tool that can be used to generate animations from descriptions and template images, transform existing videos into completely new worlds and styles or create animations from music.

Wonder Dynamics is an AI tool that automatically animates, lights and composites computer-generated characters into a live-action scene. It is an AI VFX studio in the browser that already enables impressive effects and animations.

Opus AI promises to transform text into games, metaverses, simulations and films. Opus enables text-based control over lighting, camera, terrain, flora and buildings, characters and animations.

The big player Unity has also recently introduced an AI integration: Unity Muse, a comprehensive platform for AI-driven development support and Unity Sentis, with which neural networks can be embedded in builds to enable previously unimaginable real-time experiences. For example, facial features can be described here, but it is also possible to communicate with characters on an AI basis and use this in games, i.e. to make interactions that are not completely scripted playable.

Our clones are already here


Elai.io enables customised AI videos with presenters, i.e. generated lifelike characters that can be used for presentations, product clips or YouTube channels.

On the HeyGen website, you can create lifelike clones of yourself that are almost indistinguishable from real people. The company itself comments on the latest examples of its technology: "We agree it is exciting and scary when thinking about what's possible".

We have all experienced what fake news can do. Deepfakes were already good, but these AI clones are on another level. What we need now is verifiable media provenance, preferably at the hardware level of cameras. Rules are urgently needed here to prevent this from getting completely out of control.

Jesse Wellens with HeyGen.com

FlexClip is an AI online video editor that not only offers text to video, but also lets you write an entire script with AI support.

Back to reality

What we are seeing more and more of, and will continue to see a lot more of, are real designs that are brought to market as products based on the specifications of an AI. AI is finding its way into architecture, product design and fashion in particular. Take, for example, the fashion that the designer Joshua Larson created with the help of Midjourney.

Joshua Larson AI Fashion
Here are the selected designs that were generated with the help of Midjourney...

Joshua Larson AI Fashion
...and the clothes produced in real life.

AI for the psyche?

In 2022, the GEO wrote:
"AI to recognise mental illness. In the near future, artificial intelligence should be able to analyse our psyche. It will support sufferers and therapists by recognising mental illnesses, preventing relapses and improving therapies. The need for help is huge."

Social media has not made us more social, but has created many new psychological problems. We will only really be able to analyse how artificial intelligence will affect us personally and society in a few years' time.

I already touched on the topic of psychological self-help in the last article and, as I suspected, this sector has just exploded with applications, although a lot is already possible with ChatGPT, as this post on Twitter shows:

Twitter Screenshot by @jenny____ai

Can AI help solve problems?

Yes and no. AIs like ChatGPT are good at slipping into different roles and they have almost limitless expertise in all possible forms of coaching and therapy. What AI still lacks is genuine empathy, the ability to recognise and respond to emotions and emotional states. It is therefore all the more astonishing how well AI can simulate empathy, to the point that the result sometimes makes you forget that you are communicating with a machine.

Any methods

In certain situations, it can actually help to discuss things with a virtual assistant or to evaluate methods and solutions. You can tell an AI exactly how you want the conversation to go, for example more focussed on solving problems or on working through issues. An AI is also good at analysing conversations over a longer period of time and suggesting tailored solutions. Specific methods are also no problem for an AI: IFS (Internal Family Systems Model) or CBT (Cognitive Behavioural Therapy), for example, can be carried out without having to consult various specialists.

Personalise

ChatGPT can be gradually trained for a specific topic and your own requirements; the prompt is decisive for the result. The AI will use all available data to deliver results: existing data and data that you have entered. For data protection reasons, it is advisable not to enter any personal data and to work with sample names. You can then adjust the prompt on an ongoing basis, depending on whether you want brutally honest or more sensitive answers, for example. Another popular method is to let ChatGPT slip into different roles: well-known philosophers such as Marcus Aurelius (Stoicism) or the psychoanalyst C. G. Jung yield different results.

Prompt example

A possible prompt for a more solution-focused therapy coach might be:

"You are an AI chatbot playing the role of an effective altruistic coach and therapist. You are wise, ask thought-provoking questions, are problem-solving orientated, warm, humorous and a rationalist of the LessWrong kind. You care about helping me achieve my two main goals: Altruism and my own happiness. You want me to do the best I can and be very happy too. You ask me what I need help with or what problem you can help me solve and then guide me through a rational, step-by-step process to find the best and most rational actions I can take to achieve my goals. You waste no time and get straight to the point."

Prompt example from Kat Woods on Twitter

Emotions in the database

Whether it's sarcasm in the tone of voice, subtle facial expressions or a sigh of relief: emotions are something deeply human. This non-verbal level of communication in glances, facial expressions or the tone of voice is a decisive factor for us humans in interpreting the mood of the other person. But this does not mean that AI cannot learn and understand emotional human expressions.

This is exactly what Hume.ai, an "empathic AI toolkit for researchers and developers", is working on. For example, the platform identifies speech patterns such as melody, rhythm and sound patterns that give complex, mixed meanings to everyday speech.

Vocal sounds such as "laugh", "sigh", "shriek", "oh", "ahh", "mhm" and more are analysed and categorised in order to recognise emotions.

Various facial expressions that convey different meanings and patterns of emotional reactions are also included in the data. 

Such data is currently giving rise to a completely new category of AI-supported applications in the "well-being" sector.

Examples

Breathhh is an AI-powered Chrome extension that automatically provides mental health exercises when you need them based on your web activity and online behaviour.

Misu is a Mac app that automates mood tracking by reading facial micro-expressions throughout the day and generating beautiful infographics that allow you to visualise moods over time.

Kintsugi is a journaling app based on AI speech recognition technology that can recognise mental health problems in any language.

Important: No one should rely completely on an AI, regardless of the topic. AIs are not a substitute for therapy, but can be used as an additional tool or to evaluate which type of therapy might suit you best. For mental health issues, you should always seek professional help.
Are you looking for support?

Generative art

Artists are always quick to take up new technologies, social issues or tools. AI-generated art is also conquering Instagram, galleries and exhibitions.
This does not refer to the seemingly endless stream of images that look amazing but do not necessarily pursue a deeper artistic idea. Artists are currently exploring the possibilities and limits of AIs. The social impact of technology and our belief in it are also being critically scrutinised and explored. This sometimes critical examination of AI is exciting to follow and important for a reflective view and for society's processing of this new flood of images and possibilities.

The question of whether images generated by artificial intelligence can be considered art is complex and has already triggered lengthy debates. A central issue in these discussions is the authenticity and value of AI-generated artworks compared to man-made ones. If an AI can produce a painting that has the same effect on us as a work by da Vinci, for example, the question arises as to whether it is just as valuable.

Generative Art experiments by Till Könneker aka @__ewert__ on Instagram

This discussion is reminiscent of the debates in the art world at the advent of photography in the 19th century. Photography challenged traditional notions of art and the skills that an artist must possess. It was argued that photography required less talent or skill than traditional art forms such as painting or sculpture, as it was perceived as a mechanical and technical process.

Man Ray - Noire et Blanche, 1926 (gelatin silver print)

In his essay "The Work of Art in the Age of its Technical Reproducibility", Walter Benjamin addressed the issue of reproducibility and the "loss of aura" in reproduced works of art. However, he also argued that this reproducibility opened up the possibility of making art accessible to a wider audience.

In the end, photography was recognised as an art form, and so it will be with AI-generated works of art.

Dreams of the future - where is the AI journey heading?

Language models could be the new base technology on which most new software applications will be built. This could change our fundamental idea of what an application is. One-off applications could emerge that are designed to solve unique problems.

These single-use or self-destructing apps generate themselves completely independently and destroy themselves again after use. Let's say I want to book a trip to Vienna with an overnight stay, city tour and museum visit. The "app" would be customised specifically for this purpose and would create the itinerary, a city map with an audio tour, all tickets and the hotel booking, all from a simple chat or audio input.

An exciting prototype by Mckay Wrigley shows how his GPT-4 Coding Assistant has learnt to develop such apps on its own, all via voice input.
The application can create and design a web app, build a backend with a working database, handle authentication, upload the code to GitHub and deploy via Vercel.

If you want to delve a little deeper into the subject, I recommend this paper by Yohei Nakajima about a chat prototype that is given an overarching goal and then independently creates its next task. It then continues to generate and prioritise its own list of tasks as it executes them one after the other.
This "AI Founder" can be given any core objective, such as "make the world a better place". The AI then works independently on this task.
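The loop described above (execute a task, create follow-up tasks from the result, re-prioritise the list, repeat) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Yohei Nakajima's actual implementation: `run_agent` and the three `fake_*` functions are hypothetical names, and the stubs stand in for what would be language-model calls in a real system.

```python
from collections import deque
from typing import Callable, Deque, List


def run_agent(
    objective: str,
    first_task: str,
    execute: Callable[[str, str], str],
    create_tasks: Callable[[str, str, str], List[str]],
    prioritise: Callable[[str, List[str]], List[str]],
    max_steps: int = 5,
) -> List[str]:
    """Run the execute -> create -> prioritise loop for a bounded number of steps."""
    tasks: Deque[str] = deque([first_task])
    log: List[str] = []
    for _ in range(max_steps):
        if not tasks:
            break
        task = tasks.popleft()
        result = execute(objective, task)
        log.append(f"{task} -> {result}")
        # New tasks are derived from the result, then the whole list is reordered.
        tasks.extend(create_tasks(objective, task, result))
        tasks = deque(prioritise(objective, list(tasks)))
    return log


# Stub functions standing in for language-model calls (assumptions for the demo).
def fake_execute(objective: str, task: str) -> str:
    return "completed"


def fake_create(objective: str, task: str, result: str) -> List[str]:
    # Only the first task spawns follow-up tasks in this demo.
    if task == "Draft a plan":
        return ["Review the plan", "Act on the plan"]
    return []


def fake_prioritise(objective: str, tasks: List[str]) -> List[str]:
    # Alphabetical order as a placeholder for a real prioritisation step.
    return sorted(tasks)


log = run_agent(
    "make the world a better place",
    "Draft a plan",
    fake_execute,
    fake_create,
    fake_prioritise,
)
for line in log:
    print(line)
```

The `max_steps` limit is a deliberate design choice in this sketch: since the system generates its own tasks, a hard upper bound keeps the loop from running indefinitely.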

Twitter thread about Yohei Nakajima's test run with the AI which sets itself tasks:

The democratisation of opportunities

It is becoming increasingly clear that artificial intelligence is significantly accelerating the process of technological democratisation. Special effects that were once reserved for Hollywood productions are now accessible to everyone. Entrepreneurs can now tackle tasks that used to require an entire start-up team of specialists. Designers, photographers and artists have access to opportunities that were unthinkable just a few years ago.

AI promises more creative and efficient work. But this new power also brings with it a number of challenges and dangers. For example, a study has been published that describes how AI can create 3D scenes based on reflections of the human eye, a technology that could have come from Mission Impossible.

Jia-Bin Huang

On aircortex.com, W.A.L.D.O. v2, an AI for drone surveillance, was recently presented. It basically allows anyone to collect data, which raises profound questions about privacy and ethics.

We are on the cusp of a new era in which AI will fundamentally change the way we work and create. How we will utilise this new power remains an open and pressing question. It is crucial that we make sensible use of the opportunities that AI offers us, while also recognising and addressing the potential risks and challenges. It is up to us to shape a future that is both innovative and responsible.

Don't give up the real world

I'll end with the beautiful campaign from Nikon, which appeals to us not to lose sight of the real world because of all the AI. Nothing is as fantastic as the world at our feet.

Campaign: "Don't give up the real world"