Incorporating AI tools in an introductory course in Object-Oriented Programming

April 8, 2025
Arne Styve
Associate Professor and study program coordinator for the Bachelor of Engineering in Computing Science at the Department of ICT and Natural Sciences, Faculty of Information Technology and Electrical Engineering, NTNU
The advent of generative AI tools, particularly since the launch of ChatGPT in November 2022, has significantly impacted academia. Concerns regarding the potential misuse of these AI tools by students for cheating in exams and assignments have been widely debated. While the possibility of misuse exists, it is essential to recognize that these tools are here to stay. As with any technological advancement, it is incumbent upon educators to teach students to use these new tools with a critical and reflective mindset.

In university programming courses, it has traditionally been common for students to seek assistance and guidance online, either as a supplement to or in place of consulting teaching staff. Students frequently turn to online forums for solutions to programming assignments, and when they encounter code that appears to address parts or the entirety of the assignment, they often copy it without fully comprehending its functionality. This practice can result in a lack of understanding and impede the students’ learning process. Furthermore, students often fail to evaluate the quality, robustness, or readability of the code before integrating it into their own work. This behavior is particularly concerning when students transition into the industry after three to five years at university and continue to uncritically copy and paste solutions found online into production code. The advent of generative AI tools has exacerbated this issue, as students can now generate code without fully understanding its functionality or assuming full responsibility for it. 

Therefore, how can courses and assignments be designed to encourage students to practice critical thinking and reflection while seeking assistance, whether through AI tools or random internet searches? Additionally, at what point in the syllabus can we reasonably expect students to critically evaluate code from diverse sources? 

Introduction to OOP

Developing complex IT systems requires more than just mastering a programming language. To create sustainable, reusable, and maintainable solutions, developers must also incorporate established best practices in software engineering. Unfortunately, many IT/CS systems suffer from low-quality code, which can be extremely costly for society. Reports indicate that both practitioners and educators consider good code quality an essential competency for prospective engineers. However, most introductory computer science courses (CS1) focus more on programming languages than on good code quality and robust code, leaving the topic of robust and high-quality code to be addressed in later courses. 

As part of the Bachelor of Engineering in Computing Science program at NTNU, a course in introductory object-oriented programming is given in the first semester. More than 14 years ago, this course adopted a top-down approach, introducing the concepts of objects and classes from the very first lecture, based on the textbook "Objects First with Java" by Michael Kölling and David Barnes. We believe that learning to program is more about developing an attitude towards what makes code great and learning best practices than solely focusing on the syntax of a programming language. Hence, the course at present focuses on:

  • Early objects – so that students learn from the very first day how collaborating objects are used to solve complex problems.
  • Early code quality – if you do not learn what high-quality code is from the very beginning, you will develop bad habits that are difficult to change later.
  • Early robustness – validation of parameters in method calls, using and handling exceptions, and writing good unit tests (see the sketch after this list).
  • Early design principles – establishing best practices in OOP design as early as possible (coupling, cohesion, SOLID, etc.).
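
To illustrate the level of robustness referred to above, here is a minimal sketch of the style of code first-semester students are expected to write. The class and its rules are hypothetical examples, not taken from the actual course material:

    /**
     * Represents a simple bank account holding a non-negative balance.
     */
    public class BankAccount {
        private int balanceInCents;

        /**
         * Deposits the given amount into the account.
         *
         * @param amountInCents the amount to deposit, must be positive
         * @throws IllegalArgumentException if the amount is zero or negative
         */
        public void deposit(int amountInCents) {
            if (amountInCents <= 0) {
                throw new IllegalArgumentException("Deposit amount must be positive");
            }
            this.balanceInCents += amountInCents;
        }

        /**
         * Returns the current balance of the account, in cents.
         */
        public int getBalanceInCents() {
            return this.balanceInCents;
        }
    }

A matching unit test would, for example, verify both that a valid deposit increases the balance and that a zero or negative deposit triggers the exception.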

Because of this focus, the students develop, at a very early stage, the understanding needed to perform a critical and reflective evaluation of third-party code.

Early AI 

With the advent of AI, students suddenly had access to tools capable of generating substantial portions of code, surpassing the capabilities of existing code-completion tools. In line with our belief in early adoption, an assignment was designed to be introduced to students as early as possible, but only after they had acquired a sufficient understanding of what constitutes high-quality code. Two common practices in the software industry, pair programming and peer review (pull requests), were incorporated into the assignment. Consequently, the assignment was structured to utilize two different AI tools to simulate the roles of the pair programmer and the peer reviewer.

Figure 1 Proposed approach 


While both GitHub Copilot and ChatGPT are based on the same underlying generative AI concepts, GitHub Copilot has been trained on massive amounts of textual data and open-source code from public sources and public GitHub repositories. Although ChatGPT has primarily been trained on natural language, it is also quite good at both generating code and at validating and evaluating code.

Design of the AI assignment 

The assignment was designed to guide the students through a structured set of activities and is given in week 5 of their first semester. The students were asked to follow these steps:

  1. Use the GitHub Copilot plugin in their IDE to generate code for a class based solely on a short class description such as "Represents a person with name and age" (using GitHub Copilot as a pair programmer).
  2. Analyse the code suggested by Copilot with respect to the quality and robustness criteria taught in the course, and write down their reflections (in a text field in the assignment).
  3. Ask ChatGPT to analyse the code generated by Copilot by asking the following question: "How would you evaluate the following Java code?" followed by the code (using ChatGPT as a peer reviewer).
  4. Compare the analysis given by ChatGPT against their own analysis of the code, and then write down their observations and reflections.
  5. Refactor the generated code to the level of quality and robustness taught in the course (an illustrative sketch follows this list).
  6. Describe which changes and additions were made to the code during the refactoring, and explain why these changes increase its quality and robustness.
  7. Finally, answer some questions reflecting on the experience gained from using AI tools as a pair programmer and peer reviewer.
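
To make the exercise concrete, the following sketch shows the kind of code a student might encounter in step 1 and produce in step 5. Both versions of the Person class are illustrative assumptions, not actual Copilot output or an actual student solution:

    // Step 1 – the kind of class Copilot might suggest (illustrative only):
    /**
     * Represents a person with name and age.
     */
    public class Person {
        private String name;
        private int age;

        public Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        public String getName() {
            return name;
        }

        public int getAge() {
            return age;
        }
    }

    // Step 5 – the same class refactored to the robustness level taught in the course:
    /**
     * Represents a person with a name and an age.
     */
    public class Person {
        private final String name;
        private final int age;

        /**
         * Creates a person.
         *
         * @param name the name of the person, must not be null or blank
         * @param age  the age of the person, must not be negative
         * @throws IllegalArgumentException if the parameters are invalid
         */
        public Person(String name, int age) {
            if (name == null || name.isBlank()) {
                throw new IllegalArgumentException("Name must not be null or blank");
            }
            if (age < 0) {
                throw new IllegalArgumentException("Age must not be negative");
            }
            this.name = name;
            this.age = age;
        }

        public String getName() {
            return name;
        }

        public int getAge() {
            return age;
        }
    }

The refactored version validates its parameters, fails fast with descriptive exceptions, and makes the fields immutable – exactly the kind of differences the students are expected to identify and explain in step 6.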

Evaluation of the AI exercise 

The assignment was given for the first time in September 2023. To evaluate whether the assignment had an effect on the students' critical thinking practices, we conducted a pre- and a post-survey. The survey was developed based on established theories of critical thinking, and 19 questions were designed. The format employs a five-point Likert scale with choices indicating the frequency of critical thinking practices, ranging from never to always. The questions were reviewed and refined by an expert panel of three university professionals in IT/CS. Participation in the pre- and post-surveys was voluntary and anonymous and had no effect on the course assessment.

Figure 2 Changes in critical thinking practices

Figure 2 presents the changes in students' opinions and behavior. The first numerical column shows the mean of the students' answers in the pre-survey, while the second numerical column shows the mean of their answers in the post-survey. Note that many of the statements are positive, reflecting behaviors we encourage in students, but some are negative. Therefore, the last (green) column displays the difference between practices before and after the intervention, with the sign adjusted so that a positive value always indicates a desirable change.
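
A minimal sketch of how such a sign-adjusted difference can be computed follows. The helper name and the example values are hypothetical, not taken from the actual survey data:

    /** Hypothetical helper for computing the change per survey statement. */
    public final class SurveyDiff {

        /**
         * Returns the difference between post- and pre-survey means,
         * with the sign flipped for negatively phrased statements so
         * that a positive result always indicates a desirable change.
         */
        public static double positiveDifference(double preMean, double postMean,
                                                boolean positivelyPhrased) {
            double diff = postMean - preMean;
            return positivelyPhrased ? diff : -diff;
        }

        public static void main(String[] args) {
            // Negative statement: "I use a piece of code in my program even if
            // I do not fully understand it". A drop from a (hypothetical) mean
            // of 3.2 to 2.5 is a desirable change of +0.7.
            System.out.println(positiveDifference(3.2, 2.5, false));
        }
    }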

It was interesting to observe that out of the 19 statements, only 5 showed a negative (undesired) change. The most significant change was in the mean of the statement "I use a piece of code in my program even if I do not fully understand it". How much of these changes is directly related to the assignment, and how much stems from the teaching in class, is difficult to say. However, the overall improvement in the students' critical thinking practices was reassuring.

In addition to the pre- and post-surveys, the assignment contained open-ended questions asking the students to reflect on their observations and their experience of using AI for coding in the manner suggested by the assignment.

Figures 3-6 Results from the students’ reflections 

As can be seen, the students were in general critical of the robustness and quality of the code generated by GitHub Copilot. It was also reassuring to see that the students understand that they are the ones responsible for the code they deliver, whether it is produced by the students themselves or with AI tools.

Summative Assessment by portfolio 

In the course discussed in this article, written examinations were replaced with portfolio assessments over four years ago. The portfolio comprises a substantial project that the student works on for approximately ten weeks, accompanied by a written report in which the student reflects on and discusses the solution in relation to the theoretical concepts taught in the course. Additionally, the student is required to discuss the use of AI tools and reflect on their experience with these tools. During the ten-week period, students are provided with three oral feedback sessions with the instructor or learning assistant. The final submission is then graded. 

The move from exams to portfolios has also been a move from purely assessing the students to making the assessment an important part of their learning process.

Despite the transition to portfolio assessments, which permits students to extensively utilize AI tools and other resources for assistance, there has not been a significant improvement in grades compared to the previously used written examinations. 

Moving on 

Since its introduction in 2023, the AI assignment has become a mandatory component of the first-semester course. The use of AI tools is actively encouraged by instructors, who demonstrate and discuss both effective and ineffective methods of utilizing these tools with students. One of the primary advantages of training students to use AI tools effectively is that it allows instructors to focus more on the critical concepts of designing high-quality code and robust software, rather than on syntax and language features. Additionally, these AI tools function as extensions of the instructor and teaching assistants, as students frequently use AI tools as their initial resource before seeking feedback from instructors or learning assistants. 

New tools have also emerged, such as the Code Tutor from Khan Academy, and more will come. Code Tutor GPT assists students in thinking through problems by offering hints, strategies, and conceptual explanations, although it will not write the code unless explicitly requested. Several students have found the Code Tutor to be a valuable supplement to the feedback and supervision provided by instructors and learning assistants.

As educators, we need to reconsider both our teaching methods and our assessment strategies. With the widespread availability of new AI tools, it no longer makes sense to test students on their ability to reproduce theory in written exams, as these can easily be completed using AI tools. The same applies to programming course exams. Instead of testing students’ ability to write correct code, we should assess their understanding and ability to demonstrate concepts and design principles within software development at a higher level, while simultaneously maintaining and further developing their skills in critical thinking and reflection.