Joshua Howe: The Sound of Text

In Part I, Styling the Sound of Text, I covered how web developers can currently style the look of text: its font, size, color and other visual features. The W3C also has specifications, not yet approved, that would let developers style the sound of text on a page.

Here in Part II, I'll cover some of the main benefits and assets of using aural style sheets with a focus on usability, accessibility and engagement.

Accessibility

One of the reasons I get excited about this idea is that, to take advantage of it, developers will have to write their code so that the browser can read and translate the words rather than purely display them on the page. Instead of putting a menu or site content in an image, they'd need to mark it up as real text that the browser can read. Text formatted this way is more likely to be accessible to people with disabilities who rely on technology to read the page to them, such as someone who is blind or visually impaired or who has a learning disability.

Text would also have to be arranged in a logical reading order so that the browser reads information in sequence, rather than relying on current conventions, which assume the reader is scanning visually.

However, this isn’t to say that the rest of the page will be coded in a way that allows individuals with disabilities to navigate it or access other content. For example, the page would still need to be navigable without a mouse, as it must be for individuals who are blind, and controls for rich media such as video would still need to be made accessible.

Texture

As for using the aural properties in a web page, there are a number of ways a developer might add richness and texture to their content. They could alternate two voices on the page to simulate a conversation, whether it’s male / female or two of the same sex. The specification mentions the ability to specify different versions of the same kind of voice, such as more than one adult male. It is also possible to have the voice be that of a child.

Additionally, they could have each voice come solely or primarily from either the right or left channel, which would enhance the effect for someone using headphones. This is what the specification refers to as “voice balance”. Anyone who has listened to Whole Lotta Love by Led Zeppelin can testify to the impact that switching channels can have.
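As a rough illustration of the two-voice idea, here is a small Python sketch that writes out the kind of style rules involved. The property names `voice-family` and `voice-balance` come from the CSS Speech Module draft; the class names `.speaker-a` and `.speaker-b` and the helper `css_rule` are made up for this example, and whether a given browser or screen reader honors the properties is a separate question.

```python
def css_rule(selector: str, declarations: dict) -> str:
    """Render one CSS rule as text (hypothetical helper for this sketch)."""
    body = "\n".join(f"  {prop}: {value};" for prop, value in declarations.items())
    return f"{selector} {{\n{body}\n}}"


# Hypothetical class names for the two sides of a conversation.
# voice-family / voice-balance are taken from the CSS Speech Module draft.
dialogue = {
    ".speaker-a": {"voice-family": "female", "voice-balance": "left"},
    ".speaker-b": {"voice-family": "male", "voice-balance": "right"},
}

stylesheet = "\n\n".join(css_rule(sel, decls) for sel, decls in dialogue.items())
print(stylesheet)
```

Each speaker gets a voice and a side, so a listener on headphones would hear the conversation move from ear to ear.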

A developer could also use a voice that matches the target audience, such as a woman speaking to women or, conversely, a woman selling products to men (as is currently done when models are used to sell men’s hygiene products or jewelry).

An alternative would be to modify the speed or volume of a voice, or add a dramatic pause as the message becomes more poignant. The specification notes the ability to set volume as a number from 0 to 100, and to control the speed of speech as well as where a sentence pauses and for how long. They could subtly increase the speed and volume as the message increases in intensity, much like the background music does in a movie.
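To make the gradual build-up concrete, here is a hedged Python sketch that generates one rule per paragraph, raising volume and speaking rate in even steps and adding a pause before the final paragraph. It uses the `volume` (0 to 100), `speech-rate` (words per minute) and `pause-before` properties from the CSS 2.1 aural appendix; the `.pitch` selector, the start and end values, and the 800ms pause are assumptions chosen only for illustration.

```python
def ramp(start: float, end: float, steps: int) -> list:
    """Evenly spaced whole numbers from start to end, inclusive."""
    if steps == 1:
        return [round(start)]
    return [round(start + (end - start) * i / (steps - 1)) for i in range(steps)]


def intensifying_rules(steps: int, volume=(50, 80), rate=(180, 220)) -> list:
    """Build one rule per paragraph, getting louder and faster each time.

    The '.pitch' container class is hypothetical; the property names are
    from the CSS 2.1 aural style sheets appendix.
    """
    volumes = ramp(volume[0], volume[1], steps)
    rates = ramp(rate[0], rate[1], steps)
    rules = [
        f".pitch p:nth-of-type({i}) {{ volume: {v}; speech-rate: {r}; }}"
        for i, (v, r) in enumerate(zip(volumes, rates), start=1)
    ]
    # A dramatic pause just before the closing paragraph.
    rules.append(f".pitch p:nth-of-type({steps}) {{ pause-before: 800ms; }}")
    return rules


print("\n".join(intensifying_rules(4)))
```

Keeping the steps small is the point: the change should be felt more than noticed, the way a film score swells under a scene.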

Search Engine Optimization (SEO)

As the web evolves, search engines continue to find ways to gather meaning from the signals and content out there. In addition to down-ranking "black hat" SEO techniques, Google looks for additional signals from pages to determine what your content may be, so as to bring better, more relevant results. To this end, they've been using image file names as a signal of image content. So rather than keeping the default file name assigned by the camera, which is usually a combination of letters and numbers such as 3242r3.jpg, developers are encouraged to use meaningful file names that represent the content of the image, such as cat.jpg or ford_mustang.jpg.

With text styled for sound, Google and other search engines may be able to derive a greater understanding of content from the tone and feel of the text. Using the voices chosen, speed, pauses and other information, they might identify not only content, but perhaps emotion.

The web remains a young animal and is changing and adapting. We continue to add tools which allow us to add texture and relevancy to our content. It's how we visually and viscerally engage readers.

See also

Part I Styling the Sound of Text
CSS Speech Module
Appendix A Aural Style Sheets (part of Cascading Style Sheets Level 2 Revision 1 (CSS 2.1) Specification, June 2011)

Monday, October 15, 2012
