Jumio is detailing some of the steps developers can take to minimize the amount of bias present in their AI algorithms. In that regard, the company emphasized the importance of the datasets used to train those algorithms, noting that a database that does not reflect the entire population has blind spots, and that any AI model trained on that database will reproduce them.
With that in mind, Jumio argues that AI algorithms should only be trained on large datasets that are representative of the population they will be applied to. Developers who fail to take that precaution will end up with biased algorithms that perform poorly for many members of the public. Speech recognition is a case in point: an algorithm trained on voice samples from white, upper-class Americans will struggle to identify the accents and speech patterns of anyone who falls outside that narrow demographic.
In plain terms, that means that developers who want to eliminate bias need to create and implement a plan for doing so long before they start training their algorithm. If they are getting their data from a third party, they need to evaluate the origins and integrity of that dataset; if they are collecting their own, they need to make sure it is representative.
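That kind of representativeness check can be automated before training begins. The sketch below is a minimal illustration, not anything Jumio has published: it assumes hypothetical demographic group labels for each sample and a reference distribution (e.g., from census data), and flags groups whose share of the training data deviates from the reference by more than a tolerance.

```python
from collections import Counter

def representation_gaps(group_labels, reference, tolerance=0.05):
    """Flag demographic groups whose share of the training data deviates
    from a reference population share by more than `tolerance`.

    Returns {group: (observed_share, expected_share)} for flagged groups.
    """
    counts = Counter(group_labels)
    total = sum(counts.values())
    gaps = {}
    for group, expected in reference.items():
        observed = counts.get(group, 0) / total
        if abs(observed - expected) > tolerance:
            gaps[group] = (observed, expected)
    return gaps

# Hypothetical accent labels for a speech dataset, heavily skewed
# toward one group relative to the reference population.
samples = ["general_american"] * 80 + ["southern"] * 10 + ["aave"] * 10
reference = {"general_american": 0.45, "southern": 0.25, "aave": 0.30}
print(representation_gaps(samples, reference))  # flags all three groups
```

A real audit would use finer-grained demographics and a statistical test rather than a fixed tolerance, but the principle is the same: measure the gap between the dataset and the population before the model ever sees the data.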
They also need to apply the same level of rigor to the labeling process. If problems like glare and blur are mislabeled in the database, the algorithm will learn and apply the wrong labels when it gets turned loose on the real world. Jumio advises developers to handle labeling internally rather than outsourcing it or using an automated program, and to introduce quality controls so that any errors can be caught and corrected.
Finally, Jumio noted that the team developing the algorithm and doing the labeling should be just as diverse as the population in the dataset, with members of different races, nationalities, genders, ages, and professional backgrounds. A diverse team is less likely to recreate the biased assumptions that go overlooked when everyone on the team has similar life experiences.
The takeaway is that biased AI is created when a biased dataset gets encoded into an algorithm. Left unaddressed, that bias produces an algorithm that performs poorly, which can hurt business and lead to discrimination and other legal issues. Jumio believes that many organizations will start to demand AI solutions that minimize demographic bias moving forward, so developers who fail to adapt could fall behind competitors that are more careful with their data.