Voice acting plays an important but often overlooked role in gaming, featuring in around half of the 400 games released every year. The average game contains around 50,000 lines of dialogue, which equates to about 50 hours of speech. Typically a slow and tedious process, voice acting in video games can take years, owing to frequent iteration cycles and the accompanying logistics of casting, scheduling, recording, directing and editing.
Sonantic, a venture-backed machine learning startup, has developed a new text-to-speech tool which creates uniquely human-sounding voices that can be generated in seconds. As well as dramatically shortening the pre-production phase in a voice-based game’s development, the tool unlocks creative potential across wider entertainment for any project that requires quality voice, fast. The actors whose artificial voices feature in commercially released projects also receive a share of the content license fees.
Voice tech is nothing new, and recent developments in deep learning have brought it into all of our living rooms. However, the voices we hear from the likes of Siri and Alexa are stilted and robotic, missing the subtle nuances and inconsistencies that are intrinsic to human speech.
Sonantic specializes in advanced speech synthesis, recording human actors’ performances and leveraging proprietary deep learning algorithms to augment the data captured from their voices. The resulting computer-generated speech sounds convincing and expressive, and can be rendered instantly. Games designers can select a voice based on gender, personality, accent, tone, and emotional state.
As part of Sonantic’s new brand identity, Pentagram’s design team created a wordmark, a tool to manipulate it, a set of expressive gradients and a marketing website. Sonantic’s voices are created procedurally, and all of the brand assets echo this. Both the wordmark and the gradients are driven by the same underlying system of generated noise. The dynamic wordmark echoes the nuanced, emotional delivery of Sonantic’s voices and the vivid abstract gradients emulate Sonantic’s high-quality expressive speech synthesis.
The wordmark is never shown in its original form, but appears in many different permutations.
The logo changes according to four distinct expressive states; each of these states is based on a quadrant of the circumplex model of emotions (echoing the distinctive emotional qualities of Sonantic’s digital voices). Using a ‘displacement map’, the shape of the letter O changes according to the combination of states, e.g. ‘pleasant + activated’ or ‘unpleasant + deactivated’. In digital applications, an animated version based on the same criteria is used.
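The mapping described above can be illustrated with a minimal sketch. It is not Pentagram’s actual tool, whose implementation the article does not describe. It assumes each state is a point on two axes (valence and arousal, each in the range -1 to 1) and uses a simple sine wave as a stand-in for the displacement map. The names `expressive_state` and `displaced_outline`, the parameter ranges and the lobe counts are all hypothetical.

```python
import math

# Hypothetical labels for the four quadrants of the circumplex model.
# Key: (valence is non-negative, arousal is non-negative)
QUADRANTS = {
    (True, True): "pleasant + activated",
    (False, True): "unpleasant + activated",
    (False, False): "unpleasant + deactivated",
    (True, False): "pleasant + deactivated",
}


def expressive_state(valence: float, arousal: float) -> str:
    """Return the quadrant label for a point in valence/arousal space."""
    return QUADRANTS[(valence >= 0, arousal >= 0)]


def displaced_outline(valence: float, arousal: float,
                      points: int = 64, base_radius: float = 1.0):
    """Return (x, y) points of a circle with a perturbed radius.

    Assumption: a sine wave stands in for the displacement map.
    Arousal (in [-1, 1]) sets how strongly the outline is displaced;
    valence picks a gentle (3-lobe) or agitated (7-lobe) pattern.
    """
    amplitude = 0.05 + 0.15 * (arousal + 1) / 2  # ranges from 0.05 to 0.20
    lobes = 3 if valence >= 0 else 7
    outline = []
    for i in range(points):
        theta = 2 * math.pi * i / points
        radius = base_radius * (1 + amplitude * math.sin(lobes * theta))
        outline.append((radius * math.cos(theta), radius * math.sin(theta)))
    return outline
```

Calling `displaced_outline(0.5, 0.8)` yields a softly lobed outline for a ‘pleasant + activated’ state, while a negative valence gives a more jagged one. A production tool would more likely sample a generated noise texture, as the article suggests, than a fixed sine wave.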
A core colour palette of cool greys and pale pink was established; in digital contexts these colours can be set to 40%, 60% or 80% opacity for increased flexibility. Three highlight colours (orange, a mid-blue and a deep red) are used for calls-to-action, titles, and within the background gradients. The gradients mix hot colours with digitally generated forms, reflecting the warmth of Sonantic’s voice outputs and their origins in AI. The gradient assets can be cropped in different ways to give numerous options, and are supplied with and without noise to suit different applications.
Yassin Baggar’s Beausite Classic typeface in Clear and Light is employed throughout the identity, with the low-contrast Neo-Grotesk adding a clean and modern look which contrasts with the colourful gradients.
Sonantic has used machine learning to create a voice product which is set to positively impact the highly successful gaming and entertainment industry in the UK and beyond. Pentagram’s sophisticated identity reflects its unique product, perfectly capturing the expressiveness and emotion of Sonantic’s computer-generated voices.