Evaluating the Effects of Virtual Pair Programming on Students' Achievement and Satisfaction

Pair programming is a lightweight software development technique in which two programmers work together at one computer. In literature, many benefits of pair programming have been proposed, such as increased productivity, improved code quality, enhanced job satisfaction and confidence. Although pair programming provides clear pedagogical benefits, its collocation requirement and the limited time during a lab session are serious barriers in the full deployment and evaluation of this programming technique. 
 
This paper reports on a study that investigated the effectiveness of Virtual Pair Programming (VPP) on student performance and satisfaction in an introductory Java course where students worked collaboratively in pairs on homework programming assignments, using online tools that integrated desktop sharing and real time communication. The results of this study support previous research findings and suggest that VPP is an effective pedagogical tool for flexible collaboration and an acceptable alternative to individual/solo programming experience, regarding productivity, code quality, academic performance and student satisfaction.

INTRODUCTION Originating from industrial settings where collaborative software development is the norm, pair programming has come recently in the centre of interest of computer science educators, in an effort to transform the solitary activity of learning to program into a collaborative problem solving process. The principal idea behind problem-based learning is that students are presented with a problem and they begin working in small groups and negotiate in defining the problem precisely, assess what they know, identify what they need to know, plan how to proceed, gather information, collaborate on the evaluation of hypotheses, brainstorm possible solutions, choose one solution, look back and reflect and arrive at clearly stated solutions.
Collaborative exchanges in problem solving tasks extend cognitive activity and team members are able to monitor individual thinking, cope with different opinions, give and take feedback that results in clarification and change, and provide social support and encouragement to each other [1]. Pair programming, as well as test-driven development and refactoring, lies at the core of extreme programming, a discipline of software development based on values of courage, communication, feedback, and sim-plicity. Extreme programmer teams initially build code according to a simple design and through testing, continuous feedback and design improvement, they keep it that way.
Pair programming especially is a practice in which two programmers work at one computer, collaborating on the same design, algorithm, code or test. Sitting side-by-side and assuming the roles of "driver" and "navigator" or "observer" the two programmers discuss about the current implementation, possible alternatives and errors, searching for a better algorithm to use, optimising parts of the code, creating functional tests for every piece of code and finding better functions or libraries to call. The driver has the control of the keyboard and actively implements the code while the navigator looks at the driver's work and identifies tactical and strategic defects and issues [2]. From time to time, the developers switch their roles, so that both equally develop code.
II. WHY TO USE PAIR PROGRAMMING? Williams at NCSU, examining the effectiveness of student pair programmers, references many studies that have shown that pair programming creates an environment conducive to more advanced, active learning and collaboration, leading to students being less frustrated, more confident, and more interested in information technology [3] [4]. Although computer programming is known to be a complex skill that is difficult to master, pair programming has been shown to produce improved outcomes: better quality software, faster code production, fewer defects, increased programmer confidence in solutions and greater enjoyment [4][5] [6].
Williams and Kessler observing effective pairs concluded on several behaviors that tend to happen naturally and contribute to the achievement of the benefits of pair programming: pair pressure, negotiation, courage, reviews, debugging, learning and pair trust [6]. The collective ownership of all that is produced, the need not to let down their partner, the encouragement when the other is stuck and the collaborating review and testing keep partners much more focused on the task at hand and help them improve their programming skills.
III. VIRTUAL PAIR PROGRAMMING Communication is the core function of negotiation and cooperation that allows information and expertise to be exchanged between team members. The close physical proximity, considered a basic element in the delivery of prompt responses during pair programming, is frequently a serious limitation e.g. when teams have geographical barriers or scheduling conflicts that prevent them from being collocated. The inevitability of distributed work in industry and education forces software developers to adopt online technologies and implement VPP in order to have the benefits of pair programming in the online environment.

A. Issues and solutions for VPP support
Rapid advances in computer networks and internet technologies have made it possible for developers from different geographical locations to form virtual teams in a distributed setting and jointly develop the same artefact of software in a collaborative way [7]. To allow effective implementation of VPP it is necessary to introduce team awareness support that is closely comparable to collocated pair programming practices, where physical proximity facilitates easy and quick communication, collaboration and coordination.
Dourish and Bellotti define team awareness as "an understanding of the activities of others, which provides a context for your own activity" [8]. Team awareness allows each user to be informed about others users' activities and track the changes that other collaborators have made to a group project. Maintaining awareness in virtual settings requires additional effort and the extensive use of various communication tools such as newsgroups, MUDs, email, text chat, and instant messaging [9].
Effective real time communication is necessary for the virtual pair members to obtain information on how each other react on something the other says or does. To have a physical sense of a reaction, information on body gestures, the faces or even the tone of the voice of the other person is required.
Collaboration is a term that can be used to indicate any form of interaction between software developers, in colocated or distributed settings, working on the same set of artefacts. In the context of VPP, collaboration involves regular synchronous meetings, planning and negotiation, design decisions that are translated into code, regular code integration, and ongoing communication between pair members during the development lifecycle. A crucial point for successful collaboration in general, is the manner in which individual work is related to the team as a whole.
The collaboration among people who are engaged in a common task requires the coordination of the activities related with the task and of the resources used during its execution. Concerning coordination, being in two different locations, the challenge for virtual pairs consists on how to synchronize their availability, adjust the time differences, and integrate their activities [10].
Successful implementation of VPP assumes the existence of a collaborative environment designed to support communication and awareness with a collection of tools that allow pair members to access, manage, and share information through an integrated information infrastructure. In the domain of computer-supported cooperative work (CSCW), virtual collaboration is supported mostly by groupware applications that provide the tools for synchronous/real-time team activities such as pair programming.
There are two basic approaches for VPP via the Internet. The first one is to send screen-buffer information through the network, broadcasting the display of any application from a member to all the others. Using screen sharing applications such as Microsoft NetMeeting, Symantec PCAnywhere, VNC (or one of its derivatives RealVNC, TightVNC, etc) or the built in tool "Remote Assistance" of Windows XP, virtual pairs can view a common desktop and control remotely an IDE (Integrated Development Environment) compiler, writing code in turns. Free and reliable instant messaging/ video conferencing applications like Paltalk, Skype and ooVoo, integrating video, audio and text chatting, accompany screen sharing applications and connect pair members providing presence awareness and immediate interaction. Free web conferencing applications such as Microsoft NetMeeting (installed in both members' boxes) and Dimdim or vRoom from Elluminate, combining audio/ video conference with desktop shearing and interactive whiteboard, provide an integrated information facility for effective collaboration.
The second approach toward VPP is the design of collaborative environments that are domain-specific and integrate collaborative editors with versioning and configuration control tools (such as CVS), databases, videoconference facilities and email systems [10]. Stand alone programs like SubEthaEdit, UNA or NetEdit, and any IDE that support collaboration, such as Eclipse with various plug-ins such as Sangam, QuickShare or Eclipse Wiki Editor, can be used to facilitate the collaborative aspects of VPP. In contrast to desktop sharing systems, these collaborative editors are event driven programs, transmitting only messages that are important for pair programming, and thus they don't need high bandwidth connections. Using a stand alone collaborative editor like UNA, all users who have opened a particular project space see the same set of opened files, the same chat history, the same notes, the same whiteboard, and can freely move in and edit all parts of a document, without locking. Workspace awareness, i.e. the knowledge of the state or actions of other participant, is evenly high in collaborative editors using plug-ins like Sangam, DocShare or Jazz for Ecliplse IDE. These plug-ins benefit from all platforms features such as syntax highlighting and auto-indentation, adding their own capabilities for version control, tracking of local changes and defects, simultaneous editing even in the same line, easy floor handling and role switching.

B. Previous experiments on VPP
Baheti et al studied the effectiveness of VPP measuring the quality and productivity of distributed pairs. They used tools like VNC, Microsoft NetMeeting and instant messengers and found that VPP was a feasible way of developing object-oriented software and as effective as collocated pair programming. The virtual pairs produced comparable code to that of collocated teams with respect to the productivity (lines of code per hour) and quality (test subjects' grades) and showed a higher level of communication and collaboration [11].
Hanks conducted an experiment on VPP using VNC with a modification, to allow for a second cursor that can be controlled by the navigator to point at areas of the screen without affecting the driver's state. Two groups of students, one remote and one collocated, were compared on performance on assignments and final exam. No statistical significant differences were found on students' achievement between collocated and virtual pairs [12].
Stotts et al experimented for 5 weeks with 4 distributed pairs of graduate students, having two of them work as virtual synchronous pairs (utilizing VPP) and the remaining two work as more traditional virtual teams (no pair programming). All pairs used Microsoft NetMeeting for code development and Yahoo Messenger for voice communication. The number of test cases passed was the metric used for program quality while productivity was calculated as the mean total time for development, without any concern for lines of code measures. The researchers found that VPP teams wrote 70% more unit test cases than the non-pair programming virtual teams, satisfied all the test cases and competed their projects in about 60% less time [13].
Natsu et al conducted an experiment with 9 pairs (5 distributed and 4 collocated in the laboratory) using their COPPER system, a synchronous source code editor that allows two distributed software engineers to write a program using pair programming. After a 90 minutes session students evaluated the usability of the tools provided by the editor, the usefulness of the floor control and the awareness mechanisms and the adequacy of the communication. All pairs had near equal perception about the usefulness of the system, indicating the feasibility of VPP with tools that provide the means for simultaneous code editing and presence and workspace awareness [14].
The experiments described above indicate that effective collaborative software development is possible with a few simple, widely-available tools (screen sharing, Internetbased audio communications) [13]. The aim of the present study was to evaluate the implementation of VPP in an introductory programming course and the impact that would have on students' achievement and satisfaction. In the following sections we describe the course organization, the experimental design, the research questions, the metrics that were used, and the results that were obtained.
IV. COURSE ORGANIZATION Introduction to Computer Programming COMP 120 is a beginning level programming course. The primary goal is to teach students basic elements of programming, objectoriented programming and problem-solving skills. This course is taught in the second semester of the first year and is appropriate for students with no prior programming experience. There are no strict prerequisites, but a basic background in math and computer skills is required. Students are supposed to feel comfortable using a computer as an everyday tool (e.g., using a web browser, writing email, using word processing applications, downloading and installing software).
The Java language is used to introduce foundations of structured, procedural, and object-oriented programming. Topics include I/O, data types, operators, operands, expressions, conditional statements, iteration, recursion, arrays, functions, parameter passing, and returning values. Students are also introduced to classes, objects, object references, inheritance, sorting, polymorphism, exception handling, searching, Java Collections, and Applets. COMP 120 is required for computer science majors, electrical engineering majors, and also for the Computer Programming Certificate, a program designed to enable students with undergraduate degrees or working professionals to upgrade their programming skills or make a career change.
V. THE EXPERIMENT An experiment was conducted in the COMP 120 course, among the 129 students enrolled in 2007 fall semester, aiming to assess the effectiveness of VPP in an effort to move the course to a more learner centred and collaborative direction. Traditionally, the course is taught during 12 weeks, with two hour lectures and one two-hour lab each week, and student grades are based on one midterm exam, one final exam and 8 homework programming assignments completed by students working on their own. In each lab room there are about 20 students working independently to modify the code and extent the functionalities of a sample program, after the explanations and goals given by the instructor. Usually, the lab time is never enough for the majority of students to finish the tasks assigned them, such as compiling the code, testing it, debugging it and refining it, and much work remains to be done at home, augmented by additional tasks defined in every homework programming assignment after the end of the lab period.
This semester, half of the students (65) completed all their assignments individually as usual (solo section), while the others (64) used pair programming and collaborated upon the last 4 assignments (VPP section). During the first 4 weeks all students completed homework assignments individually and they had strictly one week to submit their solutions. Since the course uses an objects first approach and concepts such as classes, objects and methods are introduced as early as the first week, providing weekly feedback and appropriate scaffolding at the beginning is crucial to ensure that individual students understand object oriented programming as it applies to Java.
After the midterm exam at the end of the 5 th week, students in VPP section were randomly assigned a partner according to their grades in the 4 first assignments to ensure that each team included members with approximately equal previous knowledge and abilities. In general the literature indicates that small differences in cognitive level are more conductive to effective collaboration and cognitive growth than larger differences [15]. Although other studies support precisely the opposite and suggest the novice expert pairing, we chose the first approach as more suitable in a freshman class.
All VPP students were given a brief introduction to pair programming technique and instructed to switch regularly roles between driver and navigator and respond constructively to feedback, in order to keep an objective view about the direction in which the program is going and look for the strategic implications of the developing code. They were cautioned to avoid the temptation to break projects into smaller parts to be completed independently and tolerate partner's questions or comments as a necessary part of the process, without becoming annoyed or upset. As pair programming requires real time collaborative effort, they had to spend over 70% of the total time on an assignment interacting synchronously with their partner. All VPP students attended an orientation lesson in the laboratory rooms and collaborated in pairs on short projects using NetMeeting in order to become familiarized with its desktop sharing (running NetBeans IDE), chat and video conference features.
Students in both sections, VPP and solo, had to record the time needed to complete each programming assign-ment (running successfully all the accompanied test cases), the lines of code per hour (only executable lines and data declarations) and the number of defects/bugs. Instructor made clear to students that all the data that would be gathered was to be used only for evaluating the programming technique of pair programming and the only thing that they had to worry about was as always the quality and functionality of their programs. Although all students in pairing section were assigned partners from their class in order to avoid schedule conflicts, they were allowed to re-pair if scheduling difficulties arose in meetings or even to work alone if insurmountable problems occurred.

VI. RESEARCH HYPOTHESES AND METRICS
This experiment was aimed at examining the effects of pair programming in introductory programming courses. The hypotheses that were tested are listed in Table I.
To decide on H1 we examined two factors: code productivity and software quality. Generating more code faster and of high reliability is a real challenge especially for novice programmers and pair programming according to previous research motivates students to succeed in this difficult effort. Data from midterm and final examination were used to compare the performance of VPP and solo students and conclude on H2. A survey about students' perceptions of pair programming administered in class before the final test and gave VPP students the opportunity to evaluate the new technique. Statistical analysis of the survey responses gave us data to test H3.

A. Code productivity
Over time, there have been many attempts to define metrics that effectively measure software development productivity. Most of them are amazingly complicated and very difficult to apply. In this experiment project productivity was calculated as the amount of work, i.e., lines of code, divided by the effort used, i.e., the development time for each assignment. Measuring project development time means just summing up all hours spent on design, programming, testing and bug fixing. In Table II are presented the numbers of lines of code (loc) and the time elapsed for the development of each programming assignment.
The number of loc written by pairs was approximately 7% less than the number of loc produced by solo students but the difference was not significant at 0,05 level. On the contrary, significant difference was found between the completion times of each assignment. Although VPP teams had in all cases better development times, comparing pair effort (doubling the time of the team) with that of individuals, an average of 57% more effort was needed from pairs for writing the same amount of loc. This increment in pairs' effort lies near the findings of Nosek, who reported that the pairs spent on the average 42% more effort than individuals (the difference was not statistically significant) [16]. This result implies that VPP is rather a time consuming technique at least as implemented by novice programmers.

H1
Students who use VPP on programming assignments will produce better programs faster than solo programming students.

H2
Students who work virtually in pairs will earn exam scores equal to or higher than solo programming students.

H3
Students in pairs enjoy pair programming and will have a positive attitude towards collaborative programming settings. In Table III, results of a t-test analysis for productivity measured in loc/hr, show significant differences (p < 0,05) between VPP and solo sections, with pair students to be more productive than solo programmers. This result is in agreement with the findings of the experiments made by Baheti et al, who found that collocated pair teams performed better than solo programmers but did not achieved statistically significantly better results than VPP teams, and Lui and Chan, who concluded that pair programming achieves higher productivity when a pair writes a more challenging program that demands more time spent on design [11] [17]. Although a single dimensional measure of productivity, such as loc/hr, gives a good picture of individual or pair productivity, it is evidence that productivity is a poor measure if desired level of quality is not taken into account.

B. Code quality
Software quality measurements are related to the absence of defects that would cause a program to behave unpredictably or stop successful execution. As each line of code is a potential point of failure and takes time to plan, type, review, and debug, fewer lines of code reduce failures and increase coding speed. On the other hand, clever code takes longer to plan, review, and debug and can also have more points of failure per line of code. During the course, students instructed that variables, operators and statements are the real points of failure and reducing those is what truly reduces defects.
From Table IV is apparent that pairs produced code of higher quality with about half fewer defects (p < 0,05). Although students were instructed to count only logical/design-type defects or syntax-like defects that were not flagged by the compiler, it was difficult to conclude the seriousness of those defects or if they were discovered before or after testing. In any case, given that all students had the same previous programming background, it seems that VPP members through collaboration and continuous code reviewing improved code quality. This result agrees with the findings of many previous studies which have reported smaller defect counts for pair programming [3][4][5] [6].

C. Students' performance
Students' grades on their programming assignments were used as a direct measure of their ability to program. In Table V, mean scores and standard deviations in each of the four assignments are given for VPP and solo students, while a t-test analysis indicates the differences between them.
Although VPP students achieved better scores, there were no statistically significant differences in any of these assignments between them and solo students at 0,05 level. The better degrees that achieved on average the students that worked in pairs reflect the higher code quality (more compact code for the same functionality and fewer postdelivery defects), maintainability (program's ability to stay the same or to adapt to change), performance, and documentation quality. Both sections showed a progressive improvement in their scores partly due to the nature of the assignments that integrated classes used in previous ones and had been corrected and partly due to continuous code exchanging and discussion through the class forum.  A comparison of midterm and final examination scores for both sections, using t-test analysis, revealed no significant differences between pairs and solo students (see Table VI). Both midterm and final examinations were written individually in-class tests designed to assess students' comprehension and problem solving ability on short programs, methods, or classes. It is surprising that although both sections had almost identical performance in midterm examination, VPP students performed better in final examination. This is a strong indication that on average pair members benefited from collaboration in gaining better understanding of fundamental programming concepts and object oriented techniques.

D. Students' attitude and perceptions
As a whole, the 64 students in VPP section had a highly positive attitude toward pair programming. Students ranked the following statements with strongly disagree = 1, disagree = 2, neutral = 3, agree = 4, strongly agree = 5: Q1: I enjoyed programming with a partner more than programming alone. Q2: Pair programming motivated me to stay on task.  In Table VII, students' rankings in the survey questions are shown in detail. Mean values ranged from 4,05 to 4,48 (all in the area of agree and strongly agree) and positive attitude (summing answers of strongly agree SA and agree A) was over 80% in every question. Students enjoyed pair programming (92%) instead of programming alone and they felt (95%) that their partners' pressure had significant impact on motivating them stay on task. These two findings are very encouraging taking into consideration the unavoidable differences in time schedules or in personali-ties or even in skill levels. One reason that justifies this result is that students in pairs had similar abilities. Research has shown that pairing students of similar abilities motivates them work most effectively and compatibly with their partners [18]. Another reason may be the flexibility of the distributed programming they experienced using tools that permitted them to work anytime anywhere.
Students agree (93%) that real time interaction helped them to look deeper in programming concepts and gain knowledge from the continuous reviewing process. Defect removal was also much more efficient (84%) resulting in higher confidence in the produced code (81%). Students' perceptions are in accordance with previous studies reported that students like pair programming, believe that this programming style improves software quality and feel more confidence in their solutions [4][5][6] [11].
VIII. CONCLUSIONS The findings of this study confirm the three hypotheses we set up at the beginning. Testing hypothesis H1 we conclude that students who used VPP on their assignments produced code of better quality, with about half fewer defects and were more productive in Loc/hr. Comparing students' performance based on the grades they achieved in each programming assignment and on midterm and final test, we found no difference between VPP and solo programming students. This result confirms the H2 hypothesis that VPP students would have at least equal scores on programming projects and exams as their solo counterparts. Examining students' satisfaction towards pair programming through their responses in the survey questionnaire, we confirm hypothesis H3 that students in pairs perceive pair programming as a positive learning experience.
Students in pairs used on line technologies that provide desktop sharing and real time communication (at least audio and chat) and collaborated when it was convenient for both partners. In conversations with VPP students, the instructor was being informed frequently about the tools they were using and the difficulties they were experiencing. In most cases simple solutions like NetMeeting and the Remote Desktop Sharing feature of Windows accompanied with free VoIP applications like Skype, worked perfectly.
The results of this study suggest that VPP was an effective pedagogical tool for flexible collaboration and an acceptable alternative to individual programming experience. Our intention is to conduct more experiments like this so that we can draw conclusions about pair programming technique in general and its virtual/ distributed form particularly.