Contemporary Music and AI: a historical overview and future predictions.
On the 12th of July 2022, almost two years ago, the launch of an innovative piece of software shook the web and society alike: MidJourney was first released as an open beta. The first steps into consumer-available AI applications had been taken, and with them the first controversies ensued, as artists worldwide raised concerns about the legitimacy of the software's training database and its possible applications.
MidJourney is what is called a Generative AI: a form of artificial intelligence that, trained on an enormous database of images, can analyse the patterns it contains and use them to create new content from a prompt sent by the user.
The release of another LLM-based application, ChatGPT, on the 30th of November 2022, marked another milestone in AI's technological space race: with its ability to converse almost naturally with the user, its capacity to seemingly invent anything from short stories to detailed narratives, and its enormous information base, ChatGPT instantly set the bar for AI development across the scientific world.
ChatGPT, in its newest iteration, GPT-4o (released on the 13th of May 2024), accepts seemingly any kind of text, audio and visual input (both pictures and video) and, according to OpenAI's description, can generate any combination of text, audio and image outputs; it has also become the foundation for many AI assistants, such as Microsoft's Copilot and the new iteration of Apple's Siri.
Of course, this marks an incredible milestone in scientific research, and brings us closer to the science fiction novels and stories we have been consuming since early works such as the 1868 dime novel "The Steam Man of the Prairies" by Edward S. Ellis; robots and AIs then became common in twentieth-century science fiction thanks to works like Isaac Asimov's "I, Robot" and the Russian-American author's first formulation of the Laws of Robotics.
Why all the fuss, then, if this is an incredible scientific milestone, comparable to the invention of the wheel, or the discovery and “taming” of electricity?
The introduction of Generative AIs into the consumer market has, right from the beginning, presented itself as a two-sided coin:
On one side, users can now access information faster than ever before, thanks to the almost boundless knowledge contained in ChatGPT's database, built from an enormous number of indexed websites, realizing what the internet has always been praised for (the free sharing of knowledge);
On the other hand, it has become almost common knowledge that Generative AIs have NOT been trained on proprietary databases: they draw their material from the web, where copyright-protected content makes up a large slice of a pie the AI companies never paid for.
Of course, this is ethically unacceptable, and many lawsuits have already been filed against the teams behind MidJourney, OpenAI, Stable Diffusion, Suno and many others.
This lack of economic compensation for artists of every discipline represents the first of many problems regarding the ethics of AI (or, to be fair, of AI companies).
Another issue weighing more and more heavily on the art market is how easily companies can access high-quality AI models: their interests point towards a very low-cost quality-to-quantity equation in which human artists represent the worst option cost-wise.
As an ex-startupper myself, I cannot honestly say that companies are totally wrong.
The art market is full of low-quality workers who ask for more than they bring in value, and small companies (like indie game development studios) have a very small pool of choices if they want to survive in an oversaturated market:
Hire a human artist, with typically long production times and at great cost, but who brings high-quality assets to the studio;
Or subscribe to a Generative AI service like DALL-E or Stable Diffusion and pay a small monthly fee (usually hundreds of dollars cheaper than a human), accepting less coherence and an overall lower quality in the final assets in exchange for very fast production rates.
Apparently, small and large companies alike have chosen the second option, and we are now entering a digital dark age of the arts, in which artists are gradually fading away as an economic class for lack of small and mid-tier work, while the scripts and visuals for commercials and entertainment are AI-generated.
Sadly, this phenomenon has not occurred only in artistic fields: it has affected journalism and the scientific press alike.
It comes as no surprise, then, that all over the web users are recommending to one another that they add custom strings such as "before:2023" to their Google searches, which keeps pages created from 2023 onwards out of the results:
In an AI-saturated web, where people cheaply paste GPT output into news articles to save time and money, ignoring the assistant's many hallucinations that often make the answers factually incorrect or incomplete, it is no surprise that people choose to rely on years-old information.
The problem lies partly in the typical consumer's lack of education, and partly in the greedy nature of Capital.
Now, far be it from me to take a socialist or communist line on AI and companies, given the startup past I mentioned earlier. But the growing demand for more products, in both the entertainment and press markets, calls for lower qualitative standards. Until now, this has been made possible by the consumer's dulled sensitivity to art and literature, by the growth of functional illiteracy (the practical inability to understand texts and information), and by the shrinking of the consumer's attention span (now shorter than a goldfish's, according to a 2015 study conducted by Microsoft's neuroscience team). Qualitative standards have fallen so far that companies can cut out human workers and rely almost entirely on AI assistants, saving money in the process.
How does this affect music?
The rise of "AI artist" movements over the past two years represents, to me, a classical composer, a frightening step in the world's artistic history.
I do not condemn AI at all as a tool an artist can use to develop their work, as in the case of Adobe Photoshop's generative AI tools;
I condemn the use of Generative AI as a complete substitute for human thought and technique.
In the late 1800s the phonograph and the gramophone were invented, and critics rose to condemn those new technologies, which ultimately, with the hardware and software progress of the 1900s, led to the rise of electroacoustic instruments and synthesizers.
There is a subtle yet enormous difference between the pulse generators of WDR's Studio für elektronische Musik in Cologne and OpenAI's Generative AI:
While the pulse generators, paired with oscilloscopes, did indeed create sound, they did not substitute for the composer's thought process. Karlheinz Stockhausen himself chose how the pulse generators would work alongside the other hardware available in the Cologne studio, and that hardware never replaced his compositional processes, based on serialism and complex combinations of parameters.
Suno AI (a music-oriented Generative AI), on the other hand, arbitrarily creates something from a more or less complex prompt supplied by the user, mixing and matching elements from its database, and eventually generates something the user finds acceptable after a longer or shorter iteration of the process.
The user basically limits themselves to letting the machine do all the work, saying "yea" or "nay" to whatever result comes out of the machine's womb.
This constitutes, in my opinion, a heavy loss for any artist, who has no power in the creative process beyond a final judgment on the finished product, and cannot determine the outcome in any way except through a handful of tags (heavy metal song, distorted guitar, male vocalist, growl, etc.).
What is worse is the self-appointment of the "artist" title by people who do not participate in any way in the creation of the finished piece, and who act more as commissioners than anything else, asking for "classical music, symphonic, happy, rhythmic, 1700s-like".
What is being hailed as "a new way to make art accessible to anyone, without the walls imposed by the difficulty of learning complex techniques" is, in my opinion, making people even less able to express themselves, turning them into slaves of a robotic master.
Being subject to limitations imposed by software is something I would never wish on anyone, not even my worst enemy.
I myself have experienced the abyssal difference between writing while listening to Sibelius's playback function and writing while knowing the actual acoustic result of what I am composing.
To give a simple example, I am thinking of a suite on themes by John Williams that I wrote in high school, at seventeen, with my friend and colleague Tommaso Bencini (a great jazz saxophonist), and which was played by the school orchestra: although it sounded great in Sibelius, at the concert it came across like an amateur elementary-school brass band's rendition of Williams's masterpieces.
What is happening here is that new creatives are limiting themselves by choosing to ignore the skills necessary for even the most basic artistic endeavor, letting an algorithm do the job for them and eliminating any possibility of uniqueness and personality in the artwork they want to create.
If the art market is to die sooner rather than later, except for multi-million-dollar productions, then at least the artists who will no longer be able to make a living from their passion should set their personal quality bar high enough not to let themselves be controlled, in creating their own art, by the very AI software that is replacing them.
Published 07/01/2024 by Luca Ricci. All Rights Reserved.