As Andrew Ng, adjunct professor at Stanford, describes it, “AI is the next electricity, and there is no going back.” Just as electricity has changed many industries since it was introduced, so will AI. Therefore, it is important for individuals and businesses to understand this powerful technology, to ask the right questions, and to make the right tradeoffs. A thoughtful, intentional approach will lead us beyond AI’s downsides, and toward a better future.
Neural Networks 101
AI is not new, and it has had many “winters.” However, we are entering a stage of perpetual “spring,” primarily thanks to advances in the scale and performance of neural networks. The commonly cited analogy is that neural networks are loosely inspired by the mechanisms of the human brain. In reality, however, we do not fully understand how the human brain works, so the analogy doesn’t hold up under scrutiny.
Consider the classic example of predicting a house’s price from its square footage. We all know that square footage is not the only factor that affects the price. There are others, including the number of bedrooms, zip code, school district, and walkability.
With a neural network, there is no need to hand-engineer these factors into the system. Instead, it’s as simple as gathering real estate data and feeding it to the network, which learns on its own to estimate the price of a house.
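To make this concrete, here is a minimal sketch of such a network: a single hidden layer trained by gradient descent on made-up housing data. The figures, features, and architecture below are all illustrative assumptions, not real market data or any particular product’s implementation:

```python
import numpy as np

# Illustrative data only: [square feet, bedrooms] -> price in units of $100k.
X = np.array([[1000, 2], [1500, 3], [2000, 3], [2500, 4]], dtype=float)
y = np.array([[2.0], [2.8], [3.4], [4.2]])

# Normalize features so gradient descent behaves well.
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(2, 8))   # input -> hidden
b1 = np.zeros((1, 8))
W2 = rng.normal(scale=0.5, size=(8, 1))   # hidden -> output
b2 = np.zeros((1, 1))

lr = 0.01
for _ in range(5000):
    h = np.tanh(X_norm @ W1 + b1)         # forward pass
    pred = h @ W2 + b2
    dpred = 2 * (pred - y) / len(X)       # gradient of mean squared error
    dW2 = h.T @ dpred
    db2 = dpred.sum(axis=0, keepdims=True)
    dh = dpred @ W2.T * (1 - h ** 2)      # backprop through tanh
    dW1 = X_norm.T @ dh
    db1 = dh.sum(axis=0, keepdims=True)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

mse = float(np.mean((pred - y) ** 2))     # training error after fitting
```

Nowhere did we tell the network how much a bedroom is “worth”; it infers the relationships from the data, which is exactly why the quality of that data matters so much.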
This is how AI systems work, and there is an inherent challenge with this “black box” approach.
Bias in ML Training Data
AI algorithms are only as good as the quality and accuracy of the training data. If the data is biased or skewed in any way, the resulting algorithm will be, as well.
It is worth noting that the word “bias” has more than one meaning — it means one thing to the technical ML community, and something slightly different in a legal sense. Kate Crawford, a researcher at Microsoft, explains it this way:
In an ML technical sense, bias refers to systematic differences between sample data and the target population. It refers to errors of estimation, where some members of the population are more likely to be sampled than others.
Statistically speaking, the goal of an ML system is to train a model that “fits,” or mirrors, the reality underlying the data. In practice, the model may suffer from “underfitting” or “overfitting.”
The classic visual representation of statistical bias is underfitting, where the model fails to capture the underlying trend in the data. In this situation, there is low variance but high bias.
Contrast that with overfitting, where there is high variance. Here, the models are extremely sensitive to slight variations, capturing all the noise in data along with the signal.
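A small sketch makes the contrast concrete. Below, polynomials of different degrees are fit to invented noisy quadratic data: a constant model underfits (high bias), a quadratic matches the underlying trend, and a degree-9 polynomial through ten points overfits (high variance), driving training error to near zero while test error stays high. All numbers are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Invented ground truth: a quadratic trend plus noise.
true_fn = lambda x: 1.0 + 2.0 * x - 3.0 * x ** 2
x_train = np.linspace(0.0, 1.0, 10)
x_test = np.linspace(0.05, 0.95, 10)
y_train = true_fn(x_train) + rng.normal(scale=0.1, size=x_train.size)
y_test = true_fn(x_test) + rng.normal(scale=0.1, size=x_test.size)

def fit_and_score(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    test_mse = float(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
    return train_mse, test_mse

underfit = fit_and_score(0)  # constant: misses the trend entirely (high bias)
good_fit = fit_and_score(2)  # matches the true quadratic
overfit = fit_and_score(9)   # ten coefficients, ten points: memorizes the noise
```

The overfit model looks perfect on its own training data; only the held-out test set reveals that it has captured noise rather than signal.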
In a legal sense, bias means undue prejudice. It means judgment based on preconceived notions or prejudices, as opposed to the impartial evaluation of facts. Impartiality is fundamental to many of our legal processes.
When historical bias exists in real life, the data collection can be statistically unbiased — a perfect fit — and yet the training data can still lead to biased results in the legal sense, because structural inequalities are encoded in the data itself.
Google’s AI lead John Giannandrea highlights the issue of bias in the legal sense in the MIT Technology Review article Forget Killer Robots — Bias Is the Real AI Danger.
He references a study by ProPublica that best illustrates the human impact of bias in training data in the criminal justice system.
Courtrooms today are using AI systems to predict future criminals. The programs help inform decisions about everything, from bail to sentencing. They are meant to make the criminal justice system fairer — and to weed out human biases.
ProPublica tested one such program, COMPAS, and found that its predictions were unreliable and biased against African Americans.
Here is an example of what that bias looks like in real life:
- Vernon Prater, 41. COMPAS score: 3 (low risk). Subsequent offenses: broke into a warehouse and stole $7,700 worth of electronics; 30 felony counts, including burglary, grand theft in the third degree, and dealing in stolen property when he pawned the stolen goods.
- Brisha Borden, 18. COMPAS score: 8 (high risk). Subsequent offenses: none.
It is worth noting that courts introduced these AI systems precisely to remove human bias from decision making. However, they failed to recognize that systems that learn from the past are destined to repeat it. The algorithms are not biased in and of themselves; they amplify the biases inherent in their training data.
Business Beyond Bias
Organizations are starting to introduce AI and ML capabilities into various parts of their businesses, from recruiting, hiring, and promotions to supply chain and vendor management.
In all of these scenarios, businesses have the opportunity not to blindly recreate the past, but to proactively create a better future.
At SAP, we have an initiative called Business Beyond Bias, designed to help organizations encode fair practices into their business processes. For example, the Job Analyzer product tackles the bias inherent in our language: it evaluates the effectiveness of a job description, including whether the wording will introduce gender bias.
Data scientist Weiwei Shen explains that words like “ambitious” and “aggressive” appeal more to men, while words like “kind” and “compassionate” appeal more to women. The goal is to use gender-neutral terms to attract the best candidates.
This approach is inspired by academic research on detecting gender bias in English-language word embeddings by researchers at Boston University and Microsoft Research. The results of their study are published in the paper Man Is to Computer Programmer as Woman Is to Homemaker? Debiasing Word Embeddings.
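The paper’s core idea can be sketched with toy numbers: identify a “gender direction” in the embedding space (for example, the vector from “she” to “he”) and measure, or remove, each word’s component along it. The four-dimensional vectors below are invented for illustration; real systems use embeddings such as word2vec or GloVe, learned from large corpora:

```python
import numpy as np

# Toy vectors, invented for illustration only.
emb = {
    "he":            np.array([ 1.0, 0.0, 0.2, 0.1]),
    "she":           np.array([-1.0, 0.0, 0.2, 0.1]),
    "aggressive":    np.array([ 0.6, 0.5, 0.1, 0.0]),
    "compassionate": np.array([-0.7, 0.4, 0.1, 0.0]),
    "spreadsheet":   np.array([ 0.0, 0.1, 0.9, 0.2]),
}

# A "gender direction" in the spirit of the paper: he - she, normalized.
g = emb["he"] - emb["she"]
g = g / np.linalg.norm(g)

def gender_score(word):
    """Projection onto the gender direction: positive leans male, negative female."""
    return float(emb[word] @ g)

def debias(word):
    """Remove a word's component along the gender direction (hard-debiasing sketch)."""
    v = emb[word]
    return v - (v @ g) * g
```

With these toy vectors, “aggressive” scores positive and “compassionate” negative, while a neutral word like “spreadsheet” scores near zero; after `debias`, a word’s projection onto `g` is zero. A tool like the Job Analyzer could, in principle, flag words whose score exceeds some threshold, though its actual internals are not described here.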
Business beyond bias is not only the right thing to do; it is also the smart thing to do, safeguarding the long-term interests of the organization. In most cases, business leaders know this and subscribe to goals of fairness and equality, but they may lack the tools to steer the organization in this positive direction effectively. Approached from that perspective, AI can have a positive impact on the humans in our organizations and on society at large.
AI-based, machine-learning systems amplify our intentions. Therefore, it is imperative that we examine the intentions encoded in this technology to ensure that we co-create a future we can all enjoy. Here are some recommendations organizations can put in place to move beyond bias.
- Articulate organizational values of fairness and equality in addition to business goals of growth, revenue, and profitability. Highlight their importance to long-term business success in a clear, accountable, and actionable way.
- Be transparent and communicate this to all employees, including data scientists, so everyone can take this into account while building or deploying machine-learning systems. Communicate this to technology vendors and understand their efforts in eliminating algorithmic bias while making technology purchasing decisions.
- Benchmark training data against established standards of statistical or legal bias. Adapt the development process to introduce steps to test training data before system deployment.
- Once deployed, continue to validate the algorithms periodically, and adjust them as needed. These are early days for such technologies, and we must actively “train and supervise” them to safeguard the interests of the organization.
- Hire a diverse and empowered workforce and enlist the entire employee base’s help to eliminate bias in the workplace and product offerings. This will reduce the chances of organizational blind spots, as there will be people present who can help prevent unintentional bias from creeping in.
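As one concrete way to act on the benchmarking recommendation above, here is a minimal sketch of the “four-fifths rule,” a common disparate-impact benchmark from US employment law: the selection rate for any group should be at least 80% of the rate for the most-favored group. The group names and outcomes below are invented for illustration:

```python
def selection_rates(records):
    """records: iterable of (group, selected) pairs -> selection rate per group."""
    totals, hits = {}, {}
    for group, selected in records:
        totals[group] = totals.get(group, 0) + 1
        hits[group] = hits.get(group, 0) + int(selected)
    return {g: hits[g] / totals[g] for g in totals}

def passes_four_fifths(records, threshold=0.8):
    """True if every group's rate is at least `threshold` of the best group's rate."""
    rates = selection_rates(records)
    best = max(rates.values())
    return all(rate >= threshold * best for rate in rates.values())

# Hypothetical screening outcomes for two applicant groups.
data = (
    [("group_a", True)] * 60 + [("group_a", False)] * 40    # 60% selected
    + [("group_b", True)] * 30 + [("group_b", False)] * 70  # 30% selected
)
print(passes_four_fifths(data))  # prints False: 0.30 < 0.8 * 0.60
```

A check like this does not settle legal questions, but it gives teams a concrete, repeatable test that can run in a development pipeline before and after a model is deployed.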
We cannot erase the mistakes of the past, but we can learn from them and avoid repeating them in the future. After all, we humans are naturally intelligent learning systems, so the systems we create can be too.
Credit to Sue Ju for illustrations.