
Time To Teach Students To Understand Machine Learning

A new project at Aarhus University, funded by the Novo Nordisk Foundation, will teach students to understand machine learning

In 2021, I submitted my PhD dissertation. The concluding paragraph read like this:
“Emphasizing the discussion on present perspectives and the analysis of the use of digital technologies in society today, it is evident that students need not only to learn to understand programming in terms of step-by-step algorithms and mathematical concepts, but also to understand machine-learning models and to critically reflect on such models. Research on this area is lacking and much needed.”

The argument rests on the observation that machine learning systems have a significant impact on everyone’s life, and that children interact with such systems more and more. Children therefore need to learn how such systems work and how they are developed, including how the systems affect both the children themselves and society, so that they can participate in and take responsibility for the development of society.

In collaboration with colleagues, I have now been given the opportunity to carry out research in precisely this area. The Novo Nordisk Foundation has granted just over DKK 2.1 million for our research project, which investigates how lower secondary school students can learn to understand the use of machine learning models in science contexts. This includes learning to understand and take a critical stance on how such models are actually used, so that students can participate responsibly in their development.

Our intention is to develop a course and an accompanying evaluation tool in collaboration with two biology teachers in lower secondary school, and to test both in four classes. Specifically, the students will develop a machine learning model that can ‘recognize’ and classify biological images. The students will collect the data themselves, in this case by taking a series of photos that a machine can ‘learn’ to recognize, and they will work in a scenario-based way, that is, by solving a problem relevant to society. We plan for them to work in groups to create a machine learning model that can classify leaves on trees or plants, skin types, waste, or food. For example, using photos of different skin types, they could build a model that helps people know how much sun they can tolerate and which sunscreen to use; using photos of waste, they could build a model that helps sort waste.

The course will provide opportunities for class debates on ethics, data protection, data privacy, algorithmic errors, overtraining, and so on. Along the way, students will encounter the kinds of problems that also exist in the real world today, for example biased data if they do not train their machine with photos of all skin tones, or if they only take photos of food from a particular store. Their work should provide an opportunity to develop competences that relate to the real world, including an understanding of how large-scale machine learning technologies actually work in society: for example, how machine learning can help identify diseases at earlier stages than would be possible without it, or help farmers assess the suitability of their land for particular crops.
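To make the point about biased data a little more concrete, here is a minimal sketch in Python of the kind of check a group could run on its own photo collection before training. It is only an illustration and not part of the course material: the function name `underrepresented_classes`, the threshold `min_share`, and the example labels are all assumptions introduced here.

```python
from collections import Counter


def underrepresented_classes(labels, min_share=0.15):
    """Return the classes whose share of the photos falls below min_share.

    `labels` holds one class label per collected photo. The default
    threshold of 15% is an arbitrary assumption chosen for illustration.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return sorted(
        label for label, count in counts.items()
        if count / total < min_share
    )


# Hypothetical labels for 20 waste photos collected by one group.
photo_labels = ["paper"] * 10 + ["plastic"] * 8 + ["glass"] * 2

# Glass makes up only 2 of 20 photos (10%), so it is flagged.
print(underrepresented_classes(photo_labels))  # prints ['glass']
```

A check like this only catches classes that are present but rare. A class that was never photographed at all, such as a skin tone missing from the collection, does not appear in the labels, so students would still need to compare their data against a list of what they expected to cover.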

We expect the output of the project to be a research-based science course and an associated evaluation tool that other teachers and students can use directly or redesign. In addition, through our experiences with teaching in practice, the project will contribute to the development of science didactics at the intersection of biology and computer science education. For example, through observation of teaching and evaluation of students’ professional development, we expect to identify challenges that prevent students from developing their competences as intended.

The project runs for three years, from January 2023 to December 2025, and it can be followed here.