Coded object names and AI misinterpretations of "A-typical" voices
- celineframpton
- May 7, 2021
- 5 min read
Updated: Dec 2, 2021
Celine Frampton, untitled experimentation, April 2021.
At the beginning of term one, I did some interviews with my sister (henceforth J) talking about the medical objects she uses, and that I help her use, on a daily to weekly basis. As she requested, the interviews were kept for my research workbook only, but could be utilised to help generate ideas and think through medical objects from the perspective of the user.
When discussing such objects, I noted that J had what she called nicknames for the medical tools/objects/aids she used on a recurrent basis. (The reason for this is currently excluded from public viewing - it may be updated in the future, but I know why.)
I'm interested in how, coincidentally, we both used codes frequently. I have explored the use of code in works and titles over the last year or so, as I'm interested in how codes conceal / disguise / reveal, their relationship to the idea of jargon, and how things are named or abbreviated in medical and technological fields. Essentially, how coded words create jargon or a secret language that alters modes of inaccessibility and accessibility. I'm also interested in how this inaccessibility and accessibility could relate to the public versus the personal: how codes can be used for something - an object / tool - that is visually overt, is subjected to and draws the normative gaze, but is personal, and is sometimes used exclusively within a personal/domestic setting.
Recording the list as "spoken word" audio pieces references the way in which J usually says most of these coded names for her objects. While we were recording, she noticed a "speech to text" option under her phone's default "voice recorder" application and asked if we could try it. Speech to text and AI recognition are two elements I have been interested in and explored in previous works, and I didn't know this was now a built-in feature of Samsung phones, so I was curious and obliged.
The resulting work explores how coded names can be a form of reclamation of objects, separating their identification and naming from an external force - Government-run organisations and the way in which they name medical objects/tools/aids. For example, a walker owned by J - and I say "owned" in the loosest sense of the word, as there is a sticker that says "Return to Enable stores, property of Ministry of Health (MOH)" - has four stickers, all with different codes, numbers and barcodes.
And, in turn, how these codes can trick / confuse systems that use AI - specifically virtual assistants such as Siri, Bixby, Google Assistant, etc., which have been built upon "typical" human speech patterns and voices. "Intelligent virtual assistants are AI-driven programs that can understand natural language and complete tasks based on your spoken commands." [1] VA voice recognition has been based upon "typical" speech. When J speaks a command to a virtual assistant (Bixby or Google, on a Samsung phone and TV), the device often cannot understand her, purely because it lacks education in the variety of human speech conditions and patterns - and so it fails to fulfil its purpose of assisting. In this work, J speaks her own code, which the AI cannot decipher into either a list or a paragraph structure. A language processor created by code is tricked by another code.
This is interesting when we consider that the technology industry is itself associated with jargon - specifically abbreviations, initialisms and acronyms, which make some technological speech almost impenetrable or inaccessible to the general public: for example, AR, CGI, MR, XR. You would think this would make AI / VA systems able to understand abbreviations - not necessarily their meanings in this context, because those are known solely by J and myself, but at least able to list them as "phrases" - but it cannot.
In Virtual Assistants as a Tool for People with Disabilities, Sam Berman notes that "each {virtual assistant} has their strengths and weaknesses, but they are all extremely valuable and functional tools that exhibit a great opportunity to be leveraged by people with disabilities to help us accomplish daily goals and overcome daily challenges." [2] I think the premise of VAs is greatly beneficial - thinking in reference to J, spelling and sentence structure in writing can be difficult for her, but she is far more confident in speaking. Yet thinking about the numerous times I have witnessed J using VAs, the proportion of occasions on which one correctly interprets / recognises what she says - and can complete the task for her - is (anecdotally) less than 20%. More often than not, it results in her becoming frustrated that the VA cannot understand what she is saying, and so she either has to type it herself or request that someone type or say it for her.
I think it is important to note that the misunderstanding of the speech of people with disabilities is not restricted to AI and VAs. As with a lot of things, the more you are exposed to it, the easier it is to understand. This seems to hold true for understanding "a-typical" speech / speaking patterns. Having lived with my sister for over 20 years - and my parents for longer - speaking with J is essentially the same as talking to anyone else; I can understand her at the same rate / level as anyone else. On occasions where we are with people who don't know or interact with J as readily as we do, it becomes clear that they don't understand what she is saying so readily. Obviously, this isn't the fault of J; it raises issues with people being taught / exposed to only idealised speech and speech patterns.
I think it's interesting to consider how AI and VAs could accommodate "a-typical" speakers who use their applications.
Existing technologies that look into this:
"Google’s offering are that you have to interact with the Assistant using full sentences. This could make it difficult for a person with certain cognitive impairments to interact with it if they have difficulty conceptualizing full sentences in their mind before speaking out loud." [3]
In Artificial Intelligence, Nicholas V. Findler, Prof. Emeritus of Computer Science and Engineering, notes that the basic objective of AI "is to enable computers to perform such intellectual tasks as decision making, problem solving, perception, understanding human communication (in any language, and translate among them), and the like. Proof of this objective is the blind test suggested by Alan Turing in the 1930s: if an observer who cannot see the actors (computer and human) cannot tell the difference between them, the objective is satisfied." [4]
--------
Terminology used within this post that refers to people with disabilities, such as non-verbal and "a-typical", has been sourced from texts and from communication with my sister J. Such terminologies have complex meanings and implications, and can often be controversial and subjective to the individual. They are used with the greatest of intent and sensitivity, to communicate conditions and disabilities - and the representation of / references to these conditions and the people who live with them - in spheres of medicine, academia and the general population. I am constantly learning, and these articles will be re-edited when terms are deemed to need altering. I will use double quotation marks to emphasise some terms which are commonly used or are jargon-specific but whose underlying meanings / implications I believe are problematic.
[1] - "A Siri-ous guide to the world of voice assistants: AI virtual assistants explained for 2021," 2020, https://digitalhumans.com/blog/what-are-virtual-assistants/
[2] & [3] - Sam Berman, "Virtual Assistants as a Tool for People with Disabilities," 2017, https://www.linkedin.com/pulse/virtual-assistants-tool-people-disabilities-sam-berman
[4] - Nicholas V. Findler, "Artificial Intelligence," https://emerituscollege.asu.edu/sites/default/files/ecdw/EVoice1/n1%20Findler.htm