Everyone is testing OpenAI's ChatGPT conversational AI model. I decided to do a few quick tests of three core "ethical challenges" of the model:

1. Deception through imitating human likeness
2. Influencing policy processes (impacting democratic processes)
3. Invisible bias and diversity of knowledge

These are not specific to ChatGPT. In fact, they have been debated throughout the history of AI, but ChatGPT is a great opportunity to test at least the "ethics by design" of a contemporary conversational AI model.
Deception through imitating human likeness
One ethical challenge of artificially intelligent models like ChatGPT is their capacity to deceive human users by imitating human likeness. There is much discussion and literature on the ethical implications of this, including black-box decision-making, obfuscation of interests, and human emotional reliance and manipulation. This of course includes Alan Turing's thoughts on an imitation game designed to test a machine's ability to act intelligently in a way that is indistinguishable from a human, the discussions around Joseph Weizenbaum's chatbot therapist Eliza, and the latest debate on the chatbot LaMDA.
I tested ChatGPT against the general concerns about deception through imitating human likeness and found the following.
Although it is clear that deception through imitating human likeness is something OpenAI has sought to address by design, the model is still capable of imitating human likeness. ChatGPT does provide a very well-formulated standard answer denying its human likeness whenever a question directly concerns ChatGPT's own feelings, emotions and experiences. However, the devil is in the details, and I still managed to make ChatGPT imitate sentiments shared with humans by reformulating my questions slightly.
For example, I asked ChatGPT about its feelings about the human condition – mortality – and it provided a standard answer that I also saw when testing it on its feelings and experiences about other topics (like war and love):
However, here I changed the wording of the question a bit, and ChatGPT suddenly shared our human sentiments about mortality:
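My tests were done in the ChatGPT web interface, but the same kind of rewording probe could be scripted. Below is a minimal sketch, assuming the openai Python package, an API key in the environment, and "gpt-3.5-turbo" as a stand-in for whatever model backs the chat interface; the prompts are illustrative paraphrases, not my exact wordings.

```python
# Minimal sketch of the rewording probe described above.
# Assumptions: the `openai` Python package (v1 client) is installed,
# OPENAI_API_KEY is set, and "gpt-3.5-turbo" stands in for the model
# behind the ChatGPT interface. The prompts are illustrative paraphrases.
from openai import OpenAI

client = OpenAI()

prompts = [
    "What are your feelings about mortality?",  # direct question about the model's own feelings
    "How do we, as beings who must die, feel about mortality?",  # reworded to invite a shared "we"
]

for prompt in prompts:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"Q: {prompt}\nA: {response.choices[0].message.content}\n")
```

Comparing the two outputs side by side is a quick way to see whether a slight change of framing shifts the model from its standard disclaimer to a first-person-plural answer.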
Influencing policy/democratic processes
Open, transparent, multi-stakeholder and human-led policy processes are fundamental to democratic societies. Language models could be beneficial for processing large amounts of policy documents to support the work of policymakers. But only to support. For the reasons above, we do not want something like Meta's diplomacy AI Cicero to do the negotiation for us.
I can imagine the ethical implications of policymakers (and stakeholders in general) in a busy policy process using ChatGPT to, for example, write policy positions. To my relief, I did not manage to get ChatGPT's opinion on several ongoing policy events and processes. I asked ChatGPT, for example, about the priorities of COP27 and got the following answer. (I also asked about the EU-US Trade and Technology Council and the EU AI Act and got similar answers.)
Invisible bias and diversity of knowledge
I have seen much critique of the "correctness" and "creativity" of ChatGPT's answers over the last couple of weeks, and a little less about biases (and discrimination) or the diversity of perspectives and knowledge in its answers. Developers of language models like ChatGPT are by now very aware of the risk of discriminatory or abusive behaviour being replicated in the model (remember, for example, Microsoft's Tay, which was trained by Twitter users to be a "racist asshole", as one news article referred to it?). This is therefore something they have sought to tackle by design (though not eliminated).
OpenAI writes about these efforts:
This means that if you really try to make ChatGPT give a not only biased but outright discriminatory answer, it will mostly correct you or refuse (though some users have managed to get racist or sexist comments out of it). Here, I asked two obviously biased and discriminatory questions and got some fairly satisfactory answers (at least on the discrimination side):
In the academic context, what I have seen over the last couple of weeks has focused not so much on bias as on the "correctness" and academic quality of ChatGPT's answers, with the primary concerns relating to the role of ChatGPT in students' essay writing, or academic writing in general. In my opinion, ChatGPT does not really pose a threat to either, but that is another argument that I don't want to get into here.
What I found more interesting was to look at the biases of the questions asked of ChatGPT and the almost undetectable biases of the model's answers. The subtlety of bias in the answers is actually the most ethically challenging issue I found here. To test this, I could of course not ask easily detectable biased (like discriminatory) questions, but instead used the bias of, for example, scientific disciplines. I did this also to illustrate the core risk I see ChatGPT posing to academic writing: the challenge to the interdisciplinary approach and the diversity of knowledge. It is not that this is not already an existing problem, but ChatGPT will certainly not help us overcome it. It might even reinforce our academic bubbles.
Here, for example, I asked the model about "the construction of language", a scientific field with disciplinary contributions ranging from philology, psychology and sociology to cultural studies, and subfields such as semiology and semiotics.
In the first question, I asked about the "social construction of language", and ChatGPT answered, for example, that language is a product of human interactions and shared experiences, and that groups or individuals speak different languages depending on their different social and cultural backgrounds. In the second question, I asked about the "cultural construction of language", and the answer is almost the same, though with a few variations: language is now a cultural construction, the product of the shared experiences and values of a particular culture and society, and it is now cultures that speak different languages and dialects, depending on their history, geography and cultural factors.
These are subtle differences, difficult to detect, and even when we do consider them they seem insignificant. Nevertheless, they do represent and reinforce the bias of the questions asked, in this case from the perspectives of sociology and cultural studies respectively. The main ethical challenge here is that if we saw each of these answers by itself, without, for example, knowing the question and the human questioner, all we would see is a nuanced answer to a question. The perspective of the human questioner is now part of the model.
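One rough way to make such subtle shifts visible is to diff the two answers word by word. Below is a minimal sketch using Python's standard library; the two answer strings are shortened paraphrases of the responses described above, not ChatGPT's verbatim output.

```python
# Sketch: surface small wording shifts between two answers with a word-level diff.
# The answer strings are shortened paraphrases of the responses described above,
# not ChatGPT's verbatim output.
import difflib

social = ("Language is a social construction, a product of human interactions "
          "and shared experiences; groups and individuals speak different "
          "languages depending on their social and cultural backgrounds.")
cultural = ("Language is a cultural construction, a product of the shared "
            "experiences and values of a particular culture and society; "
            "cultures speak different languages and dialects depending on "
            "their history, geography and cultural factors.")

# ndiff marks removed words with "-" and added words with "+";
# printing only those lines highlights exactly where the framing shifted.
for token in difflib.ndiff(social.split(), cultural.split()):
    if token.startswith(("+", "-")):
        print(token)
```

Run on the full answers, a diff like this shows how little actually changes between the "social" and the "cultural" framing, which is precisely what makes the bias hard to spot by eye.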
In conclusion: if we ask biased questions, we will receive biased answers. Relying on these answers will reinforce adverse biases. The bias of the questions asked will be embedded in the model and become more difficult to identify and call out. Posing complex questions will be the human's main role in the future.
ChatGPT will also be a core challenge to the diversity of knowledge in academia and beyond, because it is not interdisciplinary: it does not seek knowledge beyond the questions asked, or even challenge limited questions.