An efficient two-heuristic algorithm for the student-project allocation with preferences over projects

Abstract

The Student-Project Allocation with preferences over Projects problem is a many-to-one stable matching problem that aims to assign students to projects in project-based courses so that students and lecturers meet their preference and capacity constraints. In this paper, we propose an efficient two-heuristic algorithm to solve this problem. Our algorithm starts from an empty matching and iteratively constructs a maximum stable matching of students to projects. At each iteration, our algorithm finds an unassigned student and assigns her/his most preferred project to her/him to form a student-project pair in the matching. If the project or the lecturer who offered the project is over-subscribed, our algorithm uses two heuristic functions, one for the over-subscribed project and the other for the over-subscribed lecturer, to remove a student-project pair in the matching. To reach a stable matching of a maximum size, our two heuristics are designed such that the removed student has the most opportunities to be assigned to some project in the next iterations. Experimental results show that our algorithm is efficient in execution time and solution quality for solving the problem.

Keywords

Approximation algorithm; heuristic search; matching problem; student-project allocation problem

1 Introduction

In project-based courses, students must be assigned to projects on which they work together. To do this, either lecturers can propose a list of students for their projects or departments can assign students to lecturers. In either case, conflicts arise not only among lecturers but also among students, since lecturers usually choose academically strong students for their projects, while students typically choose projects based on their own preferences. The question is how to allocate students to projects so that the preferences of both students and lecturers are respected. To address this problem, Abraham et al. introduced the Student-Project Allocation problem (SPA) in 2003 [1]. An instance of SPA involves three non-empty sets of students, projects, and lecturers. Each lecturer offers a set of projects and ranks, in strict order of preference, a subset of the students whom she/he is willing to supervise, while each student ranks, in strict order of preference, a subset of the available projects that she/he finds acceptable. Moreover, each project is offered by a unique lecturer, and both projects and lecturers have capacity constraints that bound the number of students assigned to them. In this context, a stable matching is an assignment of students to projects that admits no unstable student-project pair, i.e., a student and a project that are not matched together but prefer each other to their current partners in the matching. Abraham et al. [1] proposed a student-oriented algorithm that runs in linear time and finds a student-optimal stable matching for a given instance of SPA, in which each student is assigned to the most preferred project she/he could get in any stable matching. In 2007, Abraham et al. [2] extended their work and introduced two algorithms for SPA. The first is the student-oriented algorithm described in their previous work [1], while the second is a lecturer-oriented algorithm that runs in linear time and finds a lecturer-optimal stable matching for a given instance of SPA, in which each lecturer is assigned the most preferred students she/he could get in any stable matching.

In practical applications, requiring each lecturer to rank a subset of students in a strict order as in SPA is unfair for both lecturers and students. For example, lecturers often strongly prefer to supervise students with good academic results rather than students with poor academic results. This sometimes leads to conflicts among lecturers and students. Moreover, if there are many students in SPA, it is difficult for lecturers to rank students in their lists. For these reasons, in 2008, Manlove and O’Malley [12] proposed a variant of SPA, called SPA with preferences over Projects (SPA-P), where lecturers rank a subset of projects in strict order rather than a subset of students in their lists. Given an instance of SPA-P, Manlove and O’Malley [12] showed that stable matchings could have different sizes, and the problem of finding a maximum stable matching is NP-hard even if each project and lecturer has a capacity 1.

In the last few years, several researchers have proposed efficient approximation algorithms and provided lower and upper bounds for SPA-P. An algorithm is said to be an r-approximation algorithm for SPA-P if it returns a stable matching M with |M_opt|/|M| ≤ r for all SPA-P instances, where M_opt is a stable matching of maximum size. In 2008, Manlove and O’Malley [12] extended the well-known Gale-Shapley algorithm [5] to propose a 2-approximation algorithm with linear time complexity, namely SPA-P-approx. This algorithm consists of a sequence of proposals, in which an unassigned student with a non-empty list proposes to the most preferred project on her/his list to form student-project pairs of a matching such that the lecturers and projects satisfy their capacity constraints. The algorithm returns a stable matching in a finite number of iterations. In 2012, Iwama et al. [6] extended SPA-P-approx [12] using Király’s idea [8] and proposed a $\frac{3}{2}$-approximation algorithm with linear time complexity, namely SPA-P-approx-promotion, to find a stable matching for instances of SPA-P. In 2020, Viet et al. [16] considered SPA-P as a constraint satisfaction problem and proposed a local search approach based on the min-conflicts algorithm [13, 15] to solve it. Recently, Manlove et al. [10, 11] investigated an integer programming approach to SPA-P and proposed a $\frac{3}{2}$-approximation algorithm that finds stable matchings very close to maximum cardinality over their tested instances.

So far, both SPA and SPA-P have received significant attention from the research community for their roles in developing automated systems for student project allocation. Several examples can be found at various institutions, such as the School of Computing Science at the University of Glasgow [9], the Faculty of Science at the University of Southern Denmark [3], and the Department of Computing Science at the University of York [7].

Our contribution: In this paper, we first analyze the weaknesses of the SPA-P-approx [12] and SPA-P-approx-promotion [6] algorithms for solving the SPA-P problem. Accordingly, we propose two heuristic functions to address the drawbacks of these two algorithms. We then propose an efficient two-heuristic algorithm, namely SPA-P-heuristic, for solving the SPA-P problem. We also show that our algorithm returns a stable matching after a finite number of iterations. To conduct our experiments, we propose a method to generate random SPA-P instances. Our experimental results over all the tested scenarios show that our proposed algorithm is much more efficient than the SPA-P-approx [12] and SPA-P-approx-promotion [6] algorithms in terms of execution time and solution quality for SPA-P instances of large sizes. Therefore, our algorithm can be applied to develop intelligent systems for student-project allocation.

The remainder of this paper is structured as follows. Section 2 gives a formal definition of SPA-P, Section 3 presents our proposed algorithm, Section 4 discusses the experiments, and the final section concludes our work.

2 SPA-P problem

In this section, we recall the SPA-P problem given in [6, 12]. An instance I of the SPA-P problem comprises a set S = {s₁, s₂, ⋯ , s_n} of students, a set P = {p₁, p₂, ⋯ , p_q} of projects, and a set L = {l₁, l₂, ⋯ , l_m} of lecturers, where (i) each lecturer l_k ∈ L offers a non-empty set P_k of projects, ranked in strict order of preference in her/his list, subject to P₁ ∪ P₂ ∪ ⋯ ∪ P_m = P and P_{k₁} ∩ P_{k₂} = ∅, ∀ k₁ ≠ k₂; (ii) each student s_i ∈ S ranks a non-empty set A_i ⊆ P of projects in strict order of preference in her/his list; (iii) each lecturer l_k ∈ L has a capacity $d_{k} \in ℤ^{+}$ indicating the maximum number of students she/he can supervise; and (iv) each project p_j ∈ P has a capacity $c_{j} \in ℤ^{+}$ indicating the maximum number of students who can work on p_j together. Hereafter, we use the notations listed in Table 1 for reader convenience.

For ∀s_i ∈ S, ∀l_k ∈ L, and ∀p_j ∈ P, we denote by rank (s_i, p_j) and rank (l_k, p_j) the rank of p_j in s_i’s and l_k’s lists, respectively. If s_i and l_k rank p_j as the α^th and β^th choices in their lists (1 ≤ α, β ≤ q), then rank (s_i, p_j) = α and rank (l_k, p_j) = β, respectively. For ∀p_j ∈ A_i and ∀p_t ∈ A_i, s_i prefers p_j to p_t if and only if rank (s_i, p_j) < rank (s_i, p_t). For ∀p_j ∈ P_k and ∀p_t ∈ P_k, l_k prefers p_j to p_t if and only if rank (l_k, p_j) < rank (l_k, p_t). Moreover, we set rank (s_i, p_j) = 0 for ∀p_j ∈ P \ A_i and rank (l_k, p_j) = 0 for ∀p_j ∈ P \ P_k.
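
The rank convention above maps directly onto a row of a rank matrix. The following Python sketch is only an illustration and not part of the original formulation: the function name `rank_row`, the 1-based integer project identifiers, and the sample preference list are assumptions made here.

```python
def rank_row(pref_list, q):
    """Return [rank(x, p_1), ..., rank(x, p_q)] for one student or lecturer x.

    pref_list holds project indices (1-based) from most to least preferred;
    projects that do not appear in the list get rank 0.
    """
    row = [0] * q
    for position, project in enumerate(pref_list, start=1):
        row[project - 1] = position
    return row


# Hypothetical agent who ranks p2 first, p1 second and p4 third among q = 5 projects.
example_row = rank_row([2, 1, 4], 5)
print(example_row)
```

A rank of 0 thus encodes "not acceptable" (for a student) or "not offered" (for a lecturer), so membership and preference can both be read from the same row.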

An assignment M in I is a subset of S × P such that if (s_i, p_j) ∈ M, then p_j ∈ A_i. If l_k offers p_j and (s_i, p_j) ∈ M, then we say that s_i is assigned to p_j and l_k in M, p_j is assigned to s_i in M, and l_k is assigned to s_i in M.

For ∀s_i ∈ S, we denote by M (s_i) the set of projects assigned to s_i in M and by |M (s_i)| the number of projects in M (s_i). If M (s_i) = ∅, then we say that s_i is unassigned in M. For ∀p_j ∈ P, we denote by M (p_j) the set of students assigned to p_j in M and by |M (p_j)| the number of students in M (p_j). If |M (p_j)| > c_j, |M (p_j)| < c_j, or |M (p_j)| = c_j, then we say that p_j is over-subscribed, under-subscribed, or full, respectively. For ∀l_k ∈ L, we denote by M (l_k) the set of students assigned to l_k in M and by |M (l_k)| the number of students in M (l_k). If |M (l_k)| > d_k, |M (l_k)| < d_k, or |M (l_k)| = d_k, then we say that l_k is over-subscribed, under-subscribed, or full, respectively.

Table 1

List of some notations

Notation	Meaning
I	Instance of the SPA-P problem
S	Set of students
L	Set of lecturers
P	Set of projects
A_i	Set of projects ranked by student s_i ∈ S
P_k	Set of projects offered by lecturer l_k ∈ L
M	Matching
M (s_i)	Set of projects assigned to student s_i in M
M (p_j)	Set of students assigned to project p_j in M
M (l_k)	Set of students assigned to lecturer l_k in M
s_i	Student
l_k	Lecturer
p_j	Project
c_j	Capacity of project p_j ∈ P
d_k	Capacity of lecturer l_k ∈ L
n	Number of students
m	Number of lecturers
q	Number of projects

A matching M in I is an assignment such that |M (s_i)| ≤ 1, |M (p_j)| ≤ c_j, and |M (l_k)| ≤ d_k for ∀s_i ∈ S, ∀p_j ∈ P, and ∀l_k ∈ L. If |M (s_i)| = 1, we also use M (s_i) to denote the project assigned to s_i in M.

A pair (s_i, p_j) ∈ (S × P) \ M is a blocking pair for a matching M if all the following conditions are met, where p_t = M (s_i) and l_k is the lecturer who offers p_j:

1. p_j ∈ A_i, i.e., s_i finds p_j acceptable;

2. either p_t = ∅ or rank (s_i, p_j) < rank (s_i, p_t), i.e., s_i is unassigned or prefers p_j to p_t;

3. |M (p_j)| < c_j and either
(a) s_i ∈ M (l_k) and rank (l_k, p_j) < rank (l_k, p_t), or
(b) s_i ∉ M (l_k) and |M (l_k)| < d_k, or
(c) s_i ∉ M (l_k), |M (l_k)| = d_k, and rank (l_k, p_j) < rank (l_k, p_z), where p_z is l_k’s worst non-empty project, i.e., the project that l_k ranks lowest among the projects p_z offered by l_k with M (p_z) ≠ ∅.

The concept of blocking pair refers to a situation where a student s_i finds a project p_j acceptable and s_i prefers p_j to her/his currently assigned project, then s_i and p_j have the potential to form a better matching than the current matching by replacing their current assigned partners.

Table 2

An instance of the SPA-P problem

Students’ lists	Lecturers’ lists	Students’ ranks	Lecturers’ ranks
s₁: p₁ p₂ p₅	l₁: p₃ p₁ p₂	s₁: 1 2 0 0 3	l₁: 2 3 1 0 0
s₂: p₁ p₄	l₂: p₄ p₅	s₂: 1 0 0 2 0	l₂: 0 0 0 1 2
s₃: p₂ p₁ p₄		s₃: 2 1 0 3 0
s₄: p₃		s₄: 0 0 1 0 0
s₅: p₃ p₄		s₅: 0 0 1 2 0
s₆: p₅ p₃ p₄		s₆: 0 0 2 3 1
Projects’ capacities: c₁ = 2, c₂ = 2, c₃ = 1, c₄ = 1, c₅ = 2.
Lecturers’ capacities: d₁ = 3, d₂ = 3.

A matching M in I is called stable if no blocking pair exists for M; otherwise, M is called unstable. Given a stable matching M, we denote by |M| the number of students assigned in M, i.e., the size of M. If |M| = n, then M is called perfect; otherwise, M is called non-perfect. The SPA-P problem aims to find a stable matching of maximum size; this problem is known as MAX-SPA-P [12].

An instance I of the SPA-P problem is given in Table 2, where S = {s₁, s₂, s₃, s₄, s₅, s₆}, P = {p₁, p₂, p₃, p₄, p₅}, L = {l₁, l₂}. The students’ and lecturers’ lists columns show the preference lists of students and lecturers over projects, respectively, i.e., A₁ = {p₁, p₂, p₅}, A₂ = {p₁, p₄}, A₃ = {p₂, p₁, p₄}, A₄ = {p₃}, A₅ = {p₃, p₄}, A₆ = {p₅, p₃, p₄}, P₁ = {p₃, p₁, p₂}, and P₂ = {p₄, p₅}. The students’ and lecturers’ ranks columns show the rank of projects in the students’ and lecturers’ lists, respectively, where if s_i and l_k rank p_j as the α^th and β^th choices in their lists, then rank (s_i, p_j) = α and rank (l_k, p_j) = β. For example, in the students’ lists, we write “s₁: p₁ p₂ p₅”, meaning that s₁ ranks p₁ as the first choice, p₂ as the second choice, and p₅ as the third choice; therefore, rank (s₁, p₁) = 1, rank (s₁, p₂) = 2, and rank (s₁, p₅) = 3 in the students’ ranks. We use similar notations in the lecturers’ lists. The matching M = {(s₁, p₅), (s₂, p₁), (s₃, p₂), (s₄, p₃), (s₅, p₄)} is unstable because there exist two blocking pairs for M, namely (s₁, p₁) and (s₆, p₅). Specifically, (s₁, p₁) ∉ M, s₁ prefers p₁ to p₅, p₁ is under-subscribed, and the full lecturer l₁ prefers p₁ to her/his worst non-empty project p₂, so s₁ should be assigned to p₁. Meanwhile, s₆ is unassigned in M, p₅ is s₆’s most preferred project, |M (p₅)| < c₅, and l₂ is under-subscribed, so s₆ should be assigned to p₅. The matching M = {(s₁, p₁), (s₂, p₁), (s₃, p₄), (s₄, p₃), (s₆, p₅)} is a stable matching of size |M| = 5. By contrast, the matching M = {(s₁, p₅), (s₂, p₁), (s₃, p₁), (s₄, p₃), (s₅, p₄), (s₆, p₅)} is a maximum stable matching, and it is also perfect since its size is |M| = 6.
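
The stability claims in this example can be checked mechanically against conditions 1 to 3 of the blocking-pair definition. The Python sketch below is an illustration only: the function name `blocking_pairs`, the dictionary encoding of the instance, and the use of integers for s_i, p_j, and l_k are assumptions made here, and the input is assumed to be a valid matching (no capacity is exceeded).

```python
def blocking_pairs(match, student_prefs, lecturer_prefs, proj_cap, lect_cap):
    """List all blocking pairs (s, p) of a matching given as {student: project}."""
    offered_by = {p: l for l, ps in lecturer_prefs.items() for p in ps}
    lrank = {p: lecturer_prefs[l].index(p) + 1 for p, l in offered_by.items()}
    proj_load = {p: 0 for p in proj_cap}
    lect_load = {l: 0 for l in lect_cap}
    for p in match.values():
        proj_load[p] += 1
        lect_load[offered_by[p]] += 1
    found = []
    for s, prefs in student_prefs.items():
        current = match.get(s)
        # Conditions 1 and 2: acceptable projects that s strictly prefers to M(s).
        better = prefs if current is None else prefs[:prefs.index(current)]
        for p in better:
            if proj_load[p] >= proj_cap[p]:
                continue  # condition 3 requires p to be under-subscribed
            l = offered_by[p]
            if current is not None and offered_by[current] == l:
                blocks = lrank[p] < lrank[current]           # condition 3(a)
            elif lect_load[l] < lect_cap[l]:
                blocks = True                                # condition 3(b)
            else:
                worst = max(lrank[x] for x in match.values() if offered_by[x] == l)
                blocks = lrank[p] < worst                    # condition 3(c)
            if blocks:
                found.append((s, p))
    return found


# Instance of Table 2, with s_i, p_j, l_k encoded as the integers i, j, k.
student_prefs = {1: [1, 2, 5], 2: [1, 4], 3: [2, 1, 4], 4: [3], 5: [3, 4], 6: [5, 3, 4]}
lecturer_prefs = {1: [3, 1, 2], 2: [4, 5]}
proj_cap = {1: 2, 2: 2, 3: 1, 4: 1, 5: 2}
lect_cap = {1: 3, 2: 3}
instance = (student_prefs, lecturer_prefs, proj_cap, lect_cap)

unstable = blocking_pairs({1: 5, 2: 1, 3: 2, 4: 3, 5: 4}, *instance)
stable_5 = blocking_pairs({1: 1, 2: 1, 3: 4, 4: 3, 6: 5}, *instance)
stable_6 = blocking_pairs({1: 5, 2: 1, 3: 1, 4: 3, 5: 4, 6: 5}, *instance)
print(unstable, stable_5, stable_6)
```

Under these assumptions, the first matching yields the two blocking pairs named in the text, while the other two matchings yield none.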

3 Proposed algorithm

In this section, we first propose two heuristic functions used in our algorithm. Then, we propose an algorithm to find stable matchings of maximum size for the SPA-P problem. Finally, we trace an execution of our algorithm on the SPA-P instance given in Table 2.

3.1 Heuristic definitions

We consider the SPA-P-approx [12] and SPA-P-approx-promotion [6] algorithms for finding maximum stable matchings of SPA-P instances. The main principle of SPA-P-approx is as follows. At the beginning, the algorithm initializes the matching M to be empty, meaning that no student is assigned to any project. At each iteration, the algorithm takes the first project p_j on the list of an unassigned student s_i with a non-empty list. If p_j is full, meaning that p_j has no slot for s_i, then the algorithm deletes p_j from s_i’s list so that it does not choose p_j for s_i in later iterations. Otherwise, the algorithm provisionally assigns p_j to s_i to form a pair (s_i, p_j) ∈ M. When p_j is assigned to s_i, the lecturer l_k who offers p_j is also assigned to s_i. If l_k is over-subscribed, the algorithm removes an arbitrary student s_r from M (p_z), where p_z is l_k’s worst non-empty project, and deletes p_z from s_r’s list. The SPA-P-approx-promotion algorithm [6] runs as follows. At the beginning, the algorithm marks all students as unpromoted. At each iteration, the algorithm chooses the first project p_j on the list of an unassigned student s_i with a non-empty list. If p_j is full, the algorithm removes an arbitrary student s_r from M (p_j) and adds (s_i, p_j) to M. If l_k is over-subscribed, the algorithm removes an arbitrary student s_r from M (p_z) so that l_k becomes full, where l_k is the lecturer who offers p_j and p_z is l_k’s worst non-empty project in M (l_k).

We found that removing an arbitrary student s_r in M (p_j) or M (p_z) is a weak point of both SPA-P-approx and SPA-P-approx-promotion algorithms. For example, we consider a specific case where the three following conditions are met: (i) M (p_z) consists of at least two students s_r and s_w; (ii) s_r ranks only one project; and (iii) s_w ranks more than one project. If we remove s_r from M, then s_r is unassigned in M forever, while if we remove s_w from M, then s_w can be assigned to some project in her/his list at the next iterations.

In the general case, we find in each iteration of the above algorithms that when a project p_j ∈ A_i is assigned to a student s_i ∈ S, if p_j is over-subscribed, we need to remove a student from M (p_j) so that p_j is full. We recognize that the student removed from M (p_j) should have the most remaining projects in her/his list since she/he has the most opportunity to be assigned to some project at the next iterations. Moreover, when a project p_j is assigned to a student s_i, the lecturer l_k who offered p_j can be over-subscribed. If l_k is over-subscribed, we need to remove a student from M (l_k) so that l_k is full. To keep M stable, the student removed from M (l_k) must be a student assigned to a project p_z, which is l_k’s worst non-empty project, so that condition (3c) in the definition of a blocking pair is not violated. Similar to the case where p_j is over-subscribed, the student removed from M (p_z) should have the most remaining projects in her/his list since she/he has the most opportunity to be assigned to some project at the next iterations. To solve these two issues, we propose two heuristic functions as follows:

Case 1: When a project p_j is over-subscribed, we propose a heuristic function for every student s_r ∈ M (p_j) as follows:

$$f(s_{r}) = \mathrm{rank}(l_{k}, p_{j}) + \frac{|A_{r}|}{q + 1}, \quad \forall s_{r} \in M(p_{j}), \tag{1}$$

where l_k is the lecturer who offers p_j, |A_r| is the number of projects currently remaining in s_r’s list A_r, and q is the number of projects in P. Then, the student chosen to be removed from M is determined as follows:

$$s_{w} = \underset{s_{r} \in M(p_{j})}{\arg\max}\; f(s_{r}). \tag{2}$$

Since rank (l_k, p_j) is the same for every student in M (p_j), the student s_w determined by Equation (2) is a student in M (p_j) for whom |A_w| is maximum. If we remove s_w from M (p_j), then we keep the students in M (p_j) who have the least opportunity to be reassigned to projects in their lists and remove the student s_w in M (p_j) who has the most opportunity to be reassigned to some project in her/his list at the next iterations, since s_w ranks the most projects in her/his list.

Case 2: When a lecturer l_k is over-subscribed, we propose a heuristic function for every student s_r ∈ M (l_k) as follows:

$$g(s_{r}) = \mathrm{rank}(l_{k}, p_{z}) + \frac{|A_{r}|}{q + 1}, \quad \forall s_{r} \in M(l_{k}), \tag{3}$$

where p_z = M (s_r) is a project offered by l_k. Then, the student is chosen to be removed from M as follows:

$$s_{w} = \underset{s_{r} \in M(l_{k})}{\arg\max}\; g(s_{r}). \tag{4}$$

Similarly, the student s_w determined by Equation (4) maximizes rank (l_k, p_z) first and |A_w| second, where p_z = M (s_w). Indeed, since rank (l_k, p_z) is a positive integer and 0 < |A_w|/(q + 1) < 1, the fractional term can only break ties between students whose assigned projects have the same rank. Removing such a student s_w therefore means that p_z is l_k’s worst non-empty project and that s_w ranks the most projects in her/his list among the students assigned to p_z. By doing so, we not only keep M stable but also keep in M (l_k) the students who have the least opportunity to be reassigned to projects in their lists and remove the student s_w in M (l_k) who has the most opportunity to be reassigned to some project in her/his list at the next iterations.

Since the student s_w removed from M (p_j) or M (l_k) corresponds to the maximum value of f (s_w) or g (s_w) given in Equations (2) and (4), respectively, we call such a student s_w the worst student removed from either M (p_j) or M (l_k).
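
Because f and g share the same form (the lecturer’s rank of the student’s assigned project plus |A_r|/(q + 1)), both can be evaluated by one small function. The Python sketch below is illustrative only; the name `removal_score` and the dictionary of candidates are assumptions made here. The input values are read from Table 2 for the situation in which s₁ and s₂ are assigned to p₁, s₃ to p₂, and s₄ to p₃, so that l₁ is over-subscribed.

```python
def removal_score(lecturer_rank, list_length, q):
    """Common form of f and g: rank(l_k, M(s_r)) + |A_r| / (q + 1)."""
    return lecturer_rank + list_length / (q + 1)


q = 5
# For each s_r in M(l1): (rank(l1, M(s_r)), |A_r|), read from Table 2.
candidates = {"s1": (2, 3), "s2": (2, 2), "s3": (3, 3), "s4": (1, 1)}
scores = {s: removal_score(rank, size, q) for s, (rank, size) in candidates.items()}
worst = max(scores, key=scores.get)  # argmax of Equation (4)
print({s: round(v, 2) for s, v in scores.items()}, worst)
```

The integer part selects the students on l₁’s worst non-empty project, and the fractional part then prefers the student with the longest list, so s₃ is the student removed here.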

3.2 Algorithm for SPA-P

Our heuristic algorithm for finding maximum stable matchings of SPA-P instances, namely SPA-P-heuristic, is shown in Algorithm 1. During the execution of the algorithm, each student is marked active (i.e., a (s_i) = 1) or inactive (i.e., a (s_i) = 0). At the beginning, every student s_i ∈ S is active and unassigned in M. At each iteration, the algorithm takes an active student s_i. If s_i’s list is empty, then she/he becomes inactive forever (i.e., a (s_i) = 0); she/he therefore remains unassigned in M and the algorithm proceeds to the next iteration (lines 5–8). Otherwise, she/he is assigned to her/his most preferred project p_j to form a pair (s_i, p_j) in M, and she/he becomes inactive (lines 9–12). If p_j is over-subscribed, then the worst student s_w in M (p_j) determined by Equation (2) is removed from M (lines 14–15), s_w deletes p_j from her/his list (line 16), and she/he becomes active again (line 17). If l_k is over-subscribed, then the worst student s_w in M (l_k) determined by Equation (4) is removed from M (lines 20–22). In that case, s_w deletes p_z from her/his list, where p_z is the project offered by l_k and assigned to s_w (line 23), and she/he becomes active again (line 24). The algorithm repeats until all students are inactive and returns a stable matching.
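
The description above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function name `spa_p_heuristic` and the dictionary encoding are chosen here, active students are assumed to be processed in first-in first-out order (the text does not fix an order), and |A_r| is taken as the current length of s_r’s list. Since f and g have the same form, a single `score` function serves both removal rules.

```python
from collections import deque


def spa_p_heuristic(student_prefs, lecturer_prefs, proj_cap, lect_cap):
    """Sketch of SPA-P-heuristic; returns a matching as {student: project}."""
    q = len(proj_cap)
    offered_by = {p: l for l, ps in lecturer_prefs.items() for p in ps}
    lrank = {p: lecturer_prefs[l].index(p) + 1 for p, l in offered_by.items()}
    prefs = {s: list(ps) for s, ps in student_prefs.items()}  # lists shrink over time
    match = {}
    active = deque(student_prefs)  # assumption: FIFO order of active students

    def score(s):
        # f(s) and g(s): lecturer's rank of M(s) plus |A_s| / (q + 1).
        return lrank[match[s]] + len(prefs[s]) / (q + 1)

    while active:
        s = active.popleft()
        if not prefs[s]:
            continue  # empty list: s stays unassigned and inactive forever
        p = prefs[s][0]
        l = offered_by[p]
        match[s] = p
        on_project = [r for r in match if match[r] == p]
        on_lecturer = [r for r in match if offered_by[match[r]] == l]
        if len(on_project) > proj_cap[p]:
            worst = max(on_project, key=score)   # Equation (2)
        elif len(on_lecturer) > lect_cap[l]:
            worst = max(on_lecturer, key=score)  # Equation (4)
        else:
            continue
        prefs[worst].remove(match.pop(worst))    # delete M(s_w) from s_w's list
        active.append(worst)                     # s_w becomes active again
    return match


# Instance of Table 2, with s_i, p_j, l_k encoded as the integers i, j, k.
result = spa_p_heuristic(
    {1: [1, 2, 5], 2: [1, 4], 3: [2, 1, 4], 4: [3], 5: [3, 4], 6: [5, 3, 4]},
    {1: [3, 1, 2], 2: [4, 5]},
    {1: 2, 2: 2, 3: 1, 4: 1, 5: 2},
    {1: 3, 2: 3},
)
print(sorted(result.items()))
```

Removing a student from an over-subscribed project also frees one slot of the lecturer, which is why the lecturer check is placed in the `elif` branch. For clarity, this sketch recomputes M (p_j) and M (l_k) by scanning the matching at every iteration, so it does not aim to match the running time stated for the algorithm.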

Lemma 3.1. SPA-P-heuristic terminates after a finite number of iterations.

Proof. Given an instance I of SPA-P, each student s_i ∈ S ranks |A_i| projects. Initially, every student s_i ∈ S is marked active. At each iteration, a student s_i is assigned to her/his most preferred project p_j and she/he becomes inactive. When s_i is assigned to p_j, the lecturer l_k who offered p_j is assigned to s_i. If neither p_j nor l_k is ever over-subscribed, then the algorithm terminates after n iterations. If p_j or l_k is over-subscribed, then some student s_w ∈ M (p_j) or s_w ∈ M (l_k), respectively, is removed from M. In that case, s_w deletes M (s_w) from her/his list and becomes active again. Although s_w becomes active, since s_w has deleted M (s_w) from her/his list, s_w is never reassigned to M (s_w). Thus, if some student s_r is not assigned to any project, then s_r deletes every project p_t ∈ A_r, making s_r’s list empty and s_r inactive forever. We let S = S₁ ∪ S₂, where S₁ is the set of students assigned to projects and S₂ is the set of students not assigned to any project. Then we have (i) S₁ ∩ S₂ = ∅; (ii) every s_i ∈ S₁ is inactive since s_i is assigned to some project in her/his list; and (iii) every s_r ∈ S₂ is inactive since s_r’s list is empty after deleting |A_r| projects from her/his list. This shows that all students are inactive after a finite number of iterations; therefore, the algorithm terminates, since it runs only while there exists an active student s_i ∈ S. ∎

Lemma 3.2. SPA-P-heuristic finds a solution of SPA-P in O (n × q²) time.

Proof. It follows from the proof of Lemma 3.1 that in the best case, where every student is assigned to the first project in her/his list to form a stable matching, our algorithm takes O (n) time. In the worst case, where every student ranks all the projects in P and proposes to the last project in her/his list, our algorithm runs O (n × q) iterations. When a project p_j is over-subscribed, our algorithm finds the worst student s_w in M (p_j) to remove from M, which takes O (q) time in the worst case. When a lecturer l_k is over-subscribed, our algorithm finds the worst student s_w in M (l_k) to remove from M, which takes O (m) time in the worst case. Therefore, our algorithm takes O (n × q × (q + m)) = O (n × q × max (q, m)) time to find a solution of SPA-P. Since each lecturer must offer at least one project, we have q ≥ m, i.e., max (q, m) = q. This shows that our algorithm takes O (n × q × max (q, m)) = O (n × q²) time to find a solution of SPA-P. ∎

Lemma 3.3. SPA-P-heuristic returns a stable matching.

Proof. We assume that the algorithm returns a matching M and there exists a blocking pair (s_r, p_t) ∈ (S × P) \ M for M. During the execution of the algorithm, we consider two cases:

Case 1: If p_t is not deleted from s_r’s list, then s_r’s list is not empty. If s_r were unassigned in M, then s_r would be marked active, so the while loop would not have terminated, which contradicts Lemma 3.1. Hence, s_r is assigned in M. Since we assume that (s_r, p_t) blocks M, s_r prefers p_t to p_z = M (s_r). However, when s_r proposed to p_z, p_z was the first project on s_r’s list, and we get a contradiction. Hence, M is stable.

Case 2: If p_t is deleted from s_r’s list, then this occurs when either (i) p_t is over-subscribed, in which case p_t becomes full and (s_r, p_t) cannot block M; or (ii) l_k is over-subscribed, where l_k is the lecturer who offered p_t. In the latter case, g (s_r) is maximum as shown in Equation (4), i.e., rank (l_k, p_t) is maximum and |A_r| is maximum. Since rank (l_k, p_t) is a positive integer and 0 < |A_r|/(q + 1) < 1, this means that p_t is l_k’s worst non-empty project in M (l_k). Now l_k becomes full in M and l_k’s worst non-empty project in M (l_k) is better than p_t. Hence, (s_r, p_t) cannot block M. Since we assume that (s_r, p_t) blocks M, we get a contradiction. Hence, M is stable. ∎

3.3 Example

Table 3

An execution of SPA-P-heuristic for the instance given in Table 2

Iter.	s_i	p_j	l_k	M ∪ (s_i, p_j)	Over-subscribed	M \ (s_r, p_t)	M
1	s₁	p₁	l₁	(s₁, p₁)	-	-	{(s₁, p₁)}
2	s₂	p₁	l₁	(s₂, p₁)	-	-	{(s₁, p₁), (s₂, p₁)}
3	s₃	p₂	l₁	(s₃, p₂)	-	-	{(s₁, p₁), (s₂, p₁), (s₃, p₂)}
4	s₄	p₃	l₁	(s₄, p₃)	l₁	(s₃, p₂)	{(s₁, p₁), (s₂, p₁), (s₄, p₃)}
5	s₅	p₃	l₁	(s₅, p₃)	p₃	(s₅, p₃)	{(s₁, p₁), (s₂, p₁), (s₄, p₃)}
6	s₆	p₅	l₂	(s₆, p₅)	-	-	{(s₁, p₁), (s₂, p₁), (s₄, p₃), (s₆, p₅)}
7	s₃	p₁	l₁	(s₃, p₁)	p₁	(s₁, p₁)	{(s₂, p₁), (s₃, p₁), (s₄, p₃), (s₆, p₅)}
8	s₅	p₄	l₂	(s₅, p₄)	-	-	{(s₂, p₁), (s₃, p₁), (s₄, p₃), (s₅, p₄), (s₆, p₅)}
9	s₁	p₂	l₁	(s₁, p₂)	l₁	(s₁, p₂)	{(s₂, p₁), (s₃, p₁), (s₄, p₃), (s₅, p₄), (s₆, p₅)}
10	s₁	p₅	l₂	(s₁, p₅)	-	-	{(s₁, p₅), (s₂, p₁), (s₃, p₁), (s₄, p₃), (s₅, p₄), (s₆, p₅)}

In this section, we consider the execution of our algorithm for the SPA-P instance given in Table 2. First, our algorithm sets all students to be active and unassigned in a matching M, i.e., M = {}. Then, it runs the iterations shown in Table 3, where p_t = M (s_r). Specifically, the iterations are as follows:

At the 1st, 2nd, and 3rd iterations, s₁, s₂, and s₃ are assigned to their most preferred project p₁, p₁, and p₂, respectively, offered by l₁. Therefore, we have M = {(s₁, p₁), (s₂, p₁), (s₃, p₂)} and s₁, s₂, and s₃ are marked inactive.

At the 4th iteration, s₄ is assigned to her/his most preferred project p₃ offered by l₁. Therefore, we have M = {(s₁, p₁), (s₂, p₁), (s₃, p₂), (s₄, p₃)} and s₄ is marked inactive. Since l₁ is over-subscribed, from Equation (3), we have M (l₁) = {s₁, s₂, s₃, s₄}, g (s₁) = 2.50, g (s₂) = 2.33, g (s₃) = 3.50, and g (s₄) = 1.17, i.e., g (s₃) is maximum. From Equation (4), (s₃, p₂) is removed from M. So, we have M = {(s₁, p₁), (s₂, p₁), (s₄, p₃)}, s₃ deletes p₂ from her/his list, and she/he is active again.

At the 5th iteration, s₅ is assigned to her/his most preferred project p₃ offered by l₁. Therefore, we have M = {(s₁, p₁), (s₂, p₁), (s₄, p₃), (s₅, p₃)} and s₅ is marked inactive. Since p₃ is over-subscribed, from Equation (1), we have M (p₃) = {s₄, s₅}, f (s₄) = 1.17, and f (s₅) = 1.33, i.e., f (s₅) is maximum. From Equation (2), (s₅, p₃) is removed from M. So, we have M = {(s₁, p₁), (s₂, p₁), (s₄, p₃)}, s₅ deletes p₃ from her/his list, and she/he is active again.

At the 6th iteration, s₆ is assigned to her/his most preferred project p₅ offered by l₂. Therefore, we have M = {(s₁, p₁), (s₂, p₁), (s₄, p₃), (s₆, p₅)} and s₆ is marked inactive.

At the 7th iteration, s₃ is assigned to her/his most preferred remaining project p₁ offered by l₁. Therefore, we have M = {(s₁, p₁), (s₂, p₁), (s₃, p₁), (s₄, p₃), (s₆, p₅)} and s₃ is marked inactive. Since p₁ is over-subscribed, from Equation (1), we have M (p₁) = {s₁, s₂, s₃}, f (s₁) = 2.50, f (s₂) = 2.33, and f (s₃) = 2.33, i.e., f (s₁) is maximum. From Equation (2), (s₁, p₁) is removed from M. So, we have M = {(s₂, p₁), (s₃, p₁), (s₄, p₃), (s₆, p₅)}, s₁ deletes p₁ from her/his list, and she/he is active again.

At the 8th iteration, s₅ is assigned to her/his most preferred project p₄ offered by l₂. Therefore, we have M = {(s₂, p₁), (s₃, p₁), (s₄, p₃), (s₅, p₄), (s₆, p₅)} and s₅ is marked inactive.

At the 9th iteration, s₁ is assigned to her/his most preferred remaining project p₂ offered by l₁. Therefore, we have M = {(s₁, p₂), (s₂, p₁), (s₃, p₁), (s₄, p₃), (s₅, p₄), (s₆, p₅)} and s₁ is marked inactive. Since l₁ is over-subscribed, from Equation (3), we have M (l₁) = {s₁, s₂, s₃, s₄}, g (s₁) = 3.33, g (s₂) = 2.33, g (s₃) = 2.33, and g (s₄) = 1.17, i.e., g (s₁) is maximum. From Equation (4), (s₁, p₂) is removed from M. So, we have M = {(s₂, p₁), (s₃, p₁), (s₄, p₃), (s₅, p₄), (s₆, p₅)}, s₁ deletes p₂ from her/his list, and she/he is active again.

At the 10th iteration, s₁ is assigned to her/his most preferred project p₅ offered by l₂. Therefore, we have M = {(s₁, p₅), (s₂, p₁), (s₃, p₁), (s₄, p₃), (s₅, p₄), (s₆, p₅)} and s₁ is marked inactive.

Since all students are inactive, the algorithm returns a stable matching M = {(s₁, p₅), (s₂, p₁), (s₃, p₁), (s₄, p₃), (s₅, p₄), (s₆, p₅)} of size |M|=6, which is a perfect matching.

4 Experiments

In this section, we present experiments to evaluate the performance of our SPA-P-heuristic algorithm. We chose the SPA-P-approx [12] and SPA-P-approx-promotion [6] (SPA-P-promotion for short) algorithms and compared their solution quality and execution time with those of our SPA-P-heuristic algorithm, since both SPA-P-approx and SPA-P-promotion are approximation algorithms with linear time complexity. We implemented the three algorithms in Matlab R2019a on a laptop computer with a Core i7-8550U CPU at 1.8 GHz and 16 GB of RAM, running Windows 11.

Datasets. To conduct our experiments, we generated random SPA-P instances with four parameters (n, m, q, σ), where n is the number of students, m is the number of lecturers, q is the number of projects, and σ is the total capacity of the q projects offered by all the lecturers, i.e., $σ = \sum_{j = 1}^{q} c_{j}$. Given the four parameters (n, m, q, σ), our method to generate a random SPA-P instance is as follows:

1. Generate a set S = {1, 2, ⋯ , n} of students, a set P = {1, 2, ⋯ , q} of projects, and a set L = {1, 2, ⋯ , m} of lecturers.
2. Randomly generate non-empty sets P₁, P₂, ⋯ , P_m of projects such that P₁ ∪ P₂ ∪ ⋯ ∪ P_m = P and P_i ∩ P_j = ∅ for i = 1, 2, ⋯ , m, j = 1, 2, ⋯ , m, and i ≠ j. We consider P_k as the set of projects offered by lecturer l_k ∈ L (k = 1, 2, ⋯ , m).
3. For each l_k ∈ L and each p_j ∈ P, if p_j is at the β^th position in P_k, then we set rank (l_k, p_j) = β; otherwise, we set rank (l_k, p_j) = 0. By doing so, we obtain a rank matrix of all the lecturers.
4. Distribute the total capacity σ of all the projects randomly to the capacity c_j of each project p_j ∈ P (j = 1, 2, ⋯ , q) such that 0 < c_j < σ and $\sum_{j = 1}^{q} c_{j} = σ$.
5. Calculate the total capacity ρ_k of all the projects p_j ∈ P_k and generate the capacity d_k of each lecturer l_k ∈ L by setting d_k to some percentage of ρ_k (e.g., d_k = 100 % ρ_k, or d_k is a random integer such that 80 % ρ_k ≤ d_k ≤ 100 % ρ_k).
6. Randomly generate non-empty sets A₁, A₂, ⋯ , A_n of projects such that A_i ⊆ P. We consider A_i as the set of projects ranked by student s_i ∈ S (i = 1, 2, ⋯ , n).
7. For each s_i ∈ S and each p_j ∈ P, if p_j is at the α^th position in A_i, then we set rank (s_i, p_j) = α; otherwise, we set rank (s_i, p_j) = 0. By doing so, we obtain a rank matrix of all the students.
As a result, our method represents an instance of SPA-P by a rank matrix of students, a rank matrix of lecturers, a capacity list of projects, and a capacity list of lecturers, which are inputs for our algorithm.
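The seven steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' Matlab generator: the function name `generate_instance`, the parameters `max_list_len` and `d_frac`, and the use of 0-based identifiers are all hypothetical choices, and the upper bound on c_j used in the experiments is not enforced here.

```python
import math
import random

def generate_instance(n, m, q, sigma, max_list_len=20, d_frac=(1.0, 1.0), seed=None):
    """Return (student_rank, lecturer_rank, c, d) for one random SPA-P instance.

    Students, projects and lecturers are numbered from 0 (assumption);
    ranks start at 1 and a rank of 0 means "not ranked / not offered".
    """
    rng = random.Random(seed)
    if not (1 <= m <= q <= sigma):
        raise ValueError("need m <= q (non-empty P_k) and q <= sigma (c_j >= 1)")

    # Step 2: partition the projects into m non-empty sets P_k.
    projects = list(range(q))
    rng.shuffle(projects)
    offered = [[p] for p in projects[:m]]        # one project per lecturer first
    for p in projects[m:]:
        offered[rng.randrange(m)].append(p)      # the rest go to random lecturers

    # Step 3: lecturer rank matrix (position of p_j inside P_k, else 0).
    lecturer_rank = [[0] * q for _ in range(m)]
    for k, P_k in enumerate(offered):
        for beta, p in enumerate(P_k, start=1):
            lecturer_rank[k][p] = beta

    # Step 4: every project gets one slot, the remaining slots are spread randomly.
    c = [1] * q
    for _ in range(sigma - q):
        c[rng.randrange(q)] += 1

    # Step 5: d_k is a random integer between d_frac[0]*rho_k and d_frac[1]*rho_k.
    lo, hi = d_frac
    d = []
    for P_k in offered:
        rho_k = sum(c[p] for p in P_k)
        high = max(1, math.floor(hi * rho_k))
        low = min(high, max(1, math.ceil(lo * rho_k)))
        d.append(rng.randint(low, high))

    # Steps 6-7: each student ranks a random non-empty subset A_i of the projects.
    student_rank = [[0] * q for _ in range(n)]
    for i in range(n):
        length = rng.randint(1, min(max_list_len, q))
        for alpha, p in enumerate(rng.sample(range(q), length), start=1):
            student_rank[i][p] = alpha

    return student_rank, lecturer_rank, c, d
```

For example, `generate_instance(500, 25, 100, 500)` would correspond to a setting with σ = n and d_k = ρ_k, while `d_frac=(0.9, 1.0)` would draw each d_k from [0.9ρ_k, ρ_k].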

In our experiments, we generated 100 instances of SPA-P for each value of n. In each instance, we chose the values of m and q so that the student-to-lecturer and student-to-project ratios are suitable for real applications, and we chose σ based on the value of n. To compare the performance of our SPA-P-heuristic algorithm with that of the SPA-P-approx and SPA-P-promotion algorithms, we ran all three algorithms on each instance and recorded their solutions and execution times. Then, over the 100 instances, we determined for each algorithm the percentage of perfect matchings, the average number of unassigned students, and the average execution time.
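The three reported measures can be computed from per-instance results as in the short sketch below. The function name `summarise` and the input layout (one `(matching_size, seconds)` pair per instance) are assumptions made for illustration; a matching is counted as perfect when all n students are assigned.

```python
from statistics import mean

def summarise(results, n):
    """Aggregate per-instance results for one algorithm.

    results: list of (matching_size, seconds) pairs, one per instance.
    n: number of students in every instance.
    """
    perfect = sum(1 for size, _ in results if size == n)
    return {
        "perfect_pct": 100.0 * perfect / len(results),          # % of perfect matchings
        "avg_unassigned": mean(n - size for size, _ in results),
        "avg_seconds": mean(sec for _, sec in results),
    }
```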
4.1 Experiment 1

Table 4
Parameter values for datasets in Experiments 1 and 2

ID	n	Lecturers m (0.02n ≤ m ≤ 0.1n): Min	Max	Projects q (0.1n ≤ q ≤ 0.4n): Min	Max
1	500	10	50	50	200
2	1000	20	100	100	400
3	1500	30	150	150	600
4	2000	40	200	200	800
5	2500	50	250	250	1000
6	3000	60	300	300	1200
7	3500	70	350	350	1400
8	4000	80	400	400	1600
9	4500	90	450	450	1800
10	5000	100	500	500	2000

Fig. 1

Comparing solution quality and execution time of SPA-P-heuristic, SPA-P-approx, and SPA-P-promotion algorithms.

In this experiment, we chose the values of parameters n, m, and q as shown in Table 4. For each value of n from 500 to 5000 in steps of 500, we generated 100 instances of SPA-P with parameters n, m, and q, where m and q are random numbers constrained by 0.02n ≤ m ≤ 0.1n and 0.1n ≤ q ≤ 0.4n, respectively. These constraints mean that the student-to-lecturer ratio is between 10 and 50 and that each lecturer offers between 1 and 20 projects on average. In each instance, we let each student randomly rank from 1 to 20 projects from the set of projects offered by all lecturers. We set the total capacity σ of the projects offered by all lecturers to n, i.e., σ = n. Then, we distributed σ randomly over the capacities c_j of the projects p_j ∈ P such that $\sum_{j = 1}^{q} c_{j} = σ$ and 1 ≤ c_j ≤ 100. Besides, we set the capacity d_k of each lecturer l_k to the total capacity of the projects offered by l_k, i.e., d_k = ∑c_t over all projects p_t ∈ P_k. This scenario is challenging for the algorithms because the total number of project slots equals the number of students, so a perfect matching must fill every slot.
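The two ratio ranges quoted above follow directly from the bounds on m and q. As a short worked derivation (the projects-per-lecturer figure is read here as the average q/m, which is an interpretation, since the individual sets P_k are random):

```latex
\[
0.02n \le m \le 0.1n
\;\Longrightarrow\;
\frac{n}{0.1n} \le \frac{n}{m} \le \frac{n}{0.02n}
\;\Longrightarrow\;
10 \le \frac{n}{m} \le 50,
\]
\[
0.1n \le q \le 0.4n
\;\Longrightarrow\;
\frac{0.1n}{0.1n} \le \frac{q}{m} \le \frac{0.4n}{0.02n}
\;\Longrightarrow\;
1 \le \frac{q}{m} \le 20 .
\]
```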

Figure 1(a) shows the percentage of perfect matchings found by SPA-P-heuristic, SPA-P-approx, and SPA-P-promotion algorithms. When n increases from 500 to 5000 in steps of 500, SPA-P-heuristic finds a much higher percentage of perfect matchings than SPA-P-promotion and SPA-P-approx. Specifically, SPA-P-heuristic finds perfect matchings in 73% to 88% of the instances and SPA-P-promotion in 51% to 73%, while SPA-P-approx fails to find a perfect matching for any instance.

Figure 1(b) shows the average number of unassigned students found by SPA-P-heuristic, SPA-P-approx, and SPA-P-promotion algorithms. When n increases from 500 to 5000 with steps 500, SPA-P-approx finds stable matchings with more than 22 unassigned students. Meanwhile, SPA-P-heuristic finds fewer unassigned students in stable matchings than SPA-P-promotion. This means that the stable matchings found by SPA-P-heuristic are larger than those found by SPA-P-promotion in terms of size.

Fig. 2

Comparing solution quality and execution time of SPA-P-heuristic, SPA-P-approx, and SPA-P-promotion algorithms.

Figure 1(c) shows the average execution time of SPA-P-heuristic, SPA-P-approx, and SPA-P-promotion algorithms. When n increases from 500 to 5000 in steps of 500, the average execution time of SPA-P-approx increases from 0.0097 seconds to 5.2027 seconds, that of SPA-P-promotion increases from 0.0327 seconds to 3.9276 seconds, and that of SPA-P-heuristic increases from 0.0150 seconds to 2.5655 seconds. We see that when n ≥ 4000, SPA-P-heuristic runs up to about two times faster than SPA-P-promotion and SPA-P-approx; at n = 5000, the reported averages give ratios of 5.2027/2.5655 ≈ 2.0 against SPA-P-approx and 3.9276/2.5655 ≈ 1.5 against SPA-P-promotion.

Figure 1(d) shows the average number of iterations used by SPA-P-heuristic, SPA-P-approx, and SPA-P-promotion algorithms. We see that SPA-P-heuristic uses slightly fewer iterations on average than SPA-P-promotion, but more than SPA-P-approx. However, for large n the average execution time of SPA-P-heuristic is much smaller than that of SPA-P-promotion and SPA-P-approx, meaning that SPA-P-heuristic needs less computation per iteration than the other two algorithms.

4.2 Experiment 2

In this experiment, we chose the values of parameters n, m, and q under the same constraints as in Experiment 1. In each randomly generated instance of SPA-P, each student randomly ranks from 1 to 20 projects from the set of projects offered by all lecturers. Moreover, we set the total capacity σ of the projects offered by all lecturers to 1.1n, i.e., σ = 1.1n, and distributed σ randomly over the capacities c_j of the projects p_j ∈ P such that $\sum_{j = 1}^{q} c_{j} = σ$ and 1 ≤ c_j ≤ 100. Besides, we set the capacity d_k of each lecturer l_k ∈ L to a random integer in [0.9ρ_k, ρ_k], where ρ_k is the total capacity of the projects offered by l_k. This means that $σ = \sum_{k = 1}^{m} ρ_{k} = \sum_{j = 1}^{q} c_{j} = 1.1 n$ . Since 0.9ρ_k ≤ d_k ≤ ρ_k, we have $0.9 \sum_{k = 1}^{m} ρ_{k} \leq \sum_{k = 1}^{m} d_{k} \leq \sum_{k = 1}^{m} ρ_{k}$ , i.e., $0.99 n \leq \sum_{k = 1}^{m} d_{k} \leq 1.1 n$ . Therefore, any generated instance with $0.99 n \leq \sum_{k = 1}^{m} d_{k} < n$ has no perfect matching.
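The capacity argument above amounts to a necessary condition that can be checked before running any algorithm. The sketch below is illustrative only: the function names and the `offered` layout (a list of project lists, one per lecturer, 0-based) are assumptions. It bounds the size of any matching by the number of students each lecturer can actually host, min(d_k, ρ_k), which also covers settings where d_k may exceed ρ_k.

```python
def matching_size_bound(n, c, d, offered):
    """Upper bound on |M|: lecturer l_k can host at most min(d_k, rho_k) students."""
    slots = 0
    for k, P_k in enumerate(offered):
        rho_k = sum(c[p] for p in P_k)      # total capacity of l_k's projects
        slots += min(d[k], rho_k)
    return min(n, slots)

def perfect_matching_possible(n, c, d, offered):
    """Necessary (not sufficient) condition for all n students to be assigned."""
    return matching_size_bound(n, c, d, offered) >= n
```

The condition is not sufficient, because students' preference lists may still rule out a perfect matching even when enough slots exist.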

Figure 2(a) shows the percentage of perfect matchings found by SPA-P-heuristic, SPA-P-approx, and SPA-P-promotion algorithms. When n varies from 500 to 5000 in steps of 500, SPA-P-heuristic finds perfect matchings in 74% to 89% of the instances, SPA-P-promotion in 69% to 85%, and SPA-P-approx in only 0% to 20%. SPA-P-heuristic thus finds a higher percentage of perfect matchings than SPA-P-promotion and SPA-P-approx. Compared to Experiment 1, when the total capacity of projects increases, i.e., the average capacity of each project increases, it becomes easier for these algorithms to find perfect matchings in SPA-P instances.

Figure 2(b) shows the average number of unassigned students found by SPA-P-heuristic, SPA-P-approx, and SPA-P-promotion algorithms. When n increases from 500 to 5000 with steps 500, SPA-P-approx results in stable matchings with more than 15 unassigned students. In contrast, SPA-P-heuristic yields stable matchings with fewer unassigned students than SPA-P-promotion. This means that the stable matchings found by SPA-P-heuristic are larger than those generated by SPA-P-promotion in terms of size.

Figure 2(c) shows the average execution time of SPA-P-heuristic, SPA-P-approx, and SPA-P-promotion algorithms. When n varies from 500 to 5000 in increments of 500, the average execution time of SPA-P-approx increases from 0.0083 seconds to 3.1844 seconds, the average execution time of SPA-P-promotion increases from 0.0220 seconds to 3.0629 seconds, and the average execution time of SPA-P-heuristic increases from 0.0083 seconds to 1.0946 seconds. This shows that SPA-P-promotion and SPA-P-approx exhibit similar execution time, while SPA-P-heuristic runs approximately three times faster than both SPA-P-promotion and SPA-P-approx.

Figure 2(d) shows the average number of iterations used by SPA-P-heuristic, SPA-P-approx, and SPA-P-promotion algorithms. As in Experiment 1, SPA-P-heuristic uses fewer iterations than SPA-P-promotion, but more iterations than SPA-P-approx. Moreover, when the total capacity of projects increases, all three algorithms not only find stable matchings faster but also use fewer iterations than in Experiment 1.

4.3 Experiment 3

Fig. 3

Comparing solution quality and execution time of SPA-P-heuristic, SPA-P-approx, and SPA-P-promotion algorithms.

In this experiment, we chose n = 5000 and varied the total capacity σ of the projects offered by all lecturers from 0.8n to 1.5n in steps of 0.1n, i.e., σ varied from 4000 to 7500 in steps of 500. For each combination of parameter values n and σ, we generated 100 instances of SPA-P, in which the other parameters were set as follows: (i) m and q were random integers constrained by 0.02n ≤ m ≤ 0.1n and 0.1n ≤ q ≤ 0.4n, i.e., 100 ≤ m ≤ 500 and 500 ≤ q ≤ 2000; (ii) σ was distributed randomly over the capacities c_j of the projects p_j ∈ P such that $\sum_{j = 1}^{q} c_{j} = σ$ and 1 ≤ c_j ≤ 120; and (iii) d_k of each lecturer l_k ∈ L was a random integer such that 0.8ρ_k ≤ d_k ≤ 1.2ρ_k, where ρ_k is the total capacity of the projects offered by l_k. As mentioned in Experiment 1, the constraints on m and q mean that the student-to-lecturer ratio was between 10 and 50 and each lecturer offered between 1 and 20 projects on average.

Figure 3(a) shows the percentage of perfect matchings found by SPA-P-heuristic, SPA-P-approx, and SPA-P-promotion algorithms. When the total capacity σ ∈ {4000, 4500}, none of the three algorithms can find a perfect matching since σ < n, i.e., the projects do not offer enough slots for the n students. Even when σ = 5000, i.e., the projects offer exactly one slot per student, none of the three algorithms finds a perfect matching. As σ increases further, the percentage of perfect matchings found by these algorithms increases since the capacities of projects and lecturers increase. Among them, SPA-P-heuristic finds the highest percentage of perfect matchings.

Figure 3(b) shows the average number of unassigned students found by SPA-P-heuristic, SPA-P-approx, and SPA-P-promotion algorithms. When σ increases, the average number of unassigned students decreases for all three algorithms, meaning that the sizes of the stable matchings increase. Moreover, SPA-P-heuristic finds stable matchings whose sizes are close to those of SPA-P-promotion (i.e., the green line overlaps the blue line) but larger than those of SPA-P-approx.

Figure 3(c) shows the average execution time of SPA-P-heuristic, SPA-P-approx, and SPA-P-promotion algorithms. When σ increases, the average execution time of these algorithms decreases since the capacities of projects and lecturers increase, which makes stable matchings easier to find. When σ increases from 4000 to 7500, the average execution time of SPA-P-heuristic decreases from 7.66 seconds to 0.36 seconds, that of SPA-P-approx decreases from 53.28 seconds to 0.75 seconds, and that of SPA-P-promotion decreases from 113.82 seconds to 0.78 seconds. When σ = 4000, SPA-P-heuristic runs about 15 times faster than SPA-P-promotion and about 7 times faster than SPA-P-approx. When σ ≥ 5500, SPA-P-heuristic runs about 2 times faster than SPA-P-promotion and SPA-P-approx. In particular, when σ decreases from 5000 to 4000, the execution times of SPA-P-approx and SPA-P-promotion increase significantly, while that of SPA-P-heuristic remains almost unchanged.
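The speed-up factors quoted above can be rechecked from the reported averages with a few lines of Python. Only the two end points given in the text (σ = 4000 and σ = 7500) are used; the dictionary layout and the helper name `speedups` are illustrative assumptions.

```python
# Average execution times in seconds, copied from the text above.
times = {
    4000: {"heuristic": 7.66, "approx": 53.28, "promotion": 113.82},
    7500: {"heuristic": 0.36, "approx": 0.75, "promotion": 0.78},
}

def speedups(row, baseline="heuristic"):
    """Ratio of each algorithm's time to the baseline's time."""
    return {name: t / row[baseline] for name, t in row.items() if name != baseline}

for sigma, row in times.items():
    print(sigma, {name: round(r, 1) for name, r in speedups(row).items()})
```

The ratios come out at roughly 7 and 15 for σ = 4000 and at roughly 2 for σ = 7500, in line with the factors stated in the text.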

Figure 3(d) shows the average number of iterations used by SPA-P-heuristic, SPA-P-approx, and SPA-P-promotion algorithms. As in Experiments 1 and 2, we see that SPA-P-heuristic used a smaller number of iterations than SPA-P-promotion, but a larger number of iterations than SPA-P-approx.

4.4 Remarks

In summary, we see from the three experiments above that SPA-P-heuristic outperforms SPA-P-approx and SPA-P-promotion in solution quality and execution time. This can be explained as follows:

1.
In SPA-P-approx, there are two main reasons why the algorithm performs poorly in finding maximum stable matchings for SPA-P instances. Firstly, when an unassigned student s_i with a non-empty list chooses the first project p_j in her/his list and p_j is full, p_j is not assigned to s_i. If p_j is the only project in s_i’s list, then s_i remains unassigned. Since M (p_j) is the set of students assigned to p_j, s_i would not remain unassigned if the algorithm instead removed some student from M (p_j) and assigned p_j to s_i. Secondly, when a student s_i is assigned to a project p_j in her/his list, s_i is also assigned to the lecturer l_k who offers p_j. If l_k is over-subscribed, the algorithm removes an arbitrary student s_r from M (p_z), where p_z is l_k’s worst non-empty project, and deletes p_z from s_r’s list. If p_z was the only project remaining in s_r’s list, then s_r becomes unassigned. In this case, the algorithm should remove another student from M (p_z) rather than s_r. Moreover, removing an arbitrary student s_r from M (p_z) makes it harder for the algorithm to reach a stable matching, which increases its execution time.
2.
In SPA-P-approx-promotion, if a project p_j is full, the algorithm removes an arbitrary student s_r from M (p_j) and adds (s_i, p_j) to M. If the lecturer l_k who offers p_j is over-subscribed, the algorithm removes an arbitrary student s_r from M (p_z), where p_z is l_k’s worst non-empty project in M (l_k). As in SPA-P-approx, removing an arbitrary student s_r from M (p_j) or M (p_z) is a weak point of SPA-P-approx-promotion. Moreover, an unpromoted student s_i with a non-empty list is allowed to recover her/his original list and search it once more for a project, which makes the algorithm inefficient in execution time.
3.
In SPA-P-heuristic, our heuristic functions f (x) and g (x), defined earlier (one for the over-subscribed project and one for the over-subscribed lecturer), are used to keep in M the students who have the least opportunity to be reassigned to projects in their lists and to remove from M the students who have the most opportunity to be reassigned. Therefore, SPA-P-heuristic addresses the weaknesses of the SPA-P-approx and SPA-P-promotion algorithms.
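The contrast drawn in the three remarks, arbitrary removal versus removal of the student with the most remaining opportunities, can be illustrated with the sketch below. It does not reproduce f (x) or g (x): the `opportunities` score used here (the number of other projects on a student's list that still have a free slot at both the project and its lecturer) is a hypothetical stand-in, and all function names and data layouts are assumptions.

```python
import random

def opportunities(student, current, pref, M_p, M_l, c, d, lecturer_of):
    """Hypothetical score: other listed projects with a free project and lecturer slot."""
    count = 0
    for p in pref[student]:
        k = lecturer_of[p]
        if p != current and len(M_p[p]) < c[p] and len(M_l[k]) < d[k]:
            count += 1
    return count

def pick_arbitrary(candidates, rng=random):
    """Arbitrary removal, as described for SPA-P-approx and SPA-P-promotion."""
    return rng.choice(sorted(candidates))

def pick_by_opportunity(candidates, project, **state):
    """Remove the student with the most opportunities; ties go to the lowest id."""
    return max(sorted(candidates), key=lambda s: opportunities(s, project, **state))
```

In a case where project 0 holds students 0 and 1 but has capacity 1, student 0 lists only project 0, and student 1 also lists a free project 1, `pick_by_opportunity` removes student 1, who can still be placed, whereas an arbitrary choice may remove student 0 and leave her/him unassigned.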

Finally, in the scenarios of our experiments with n = 5000, where m ranges from 100 to 500 and q ranges from 500 to 2000, SPA-P-heuristic returns stable matchings in about 0.4 to 7.7 seconds on average. This underscores the efficiency of SPA-P-heuristic on large SPA-P instances.
5 Conclusions

In this paper, we proposed the SPA-P-heuristic algorithm to find maximum stable matchings of SPA-P instances. At the beginning, our algorithm initializes the matching to be empty and sets all the students to be active. At each iteration, our algorithm finds an active student with a non-empty list. If such a student exists, our algorithm assigns her/him to the most preferred project in her/his list to form a student-project pair in the matching. If the assigned project exceeds its capacity, our algorithm uses a heuristic function to remove the worst student among the students assigned to the project in the matching. If the lecturer who offers the project exceeds her/his capacity, our algorithm uses another heuristic function to remove the worst student among the students assigned to the lecturer in the matching. When a student is assigned to a project, she/he becomes inactive. When a student is removed from her/his assigned project, she/he deletes the project from her/his list and becomes active again. Our algorithm repeats until all the students are inactive. We showed that our algorithm returns a stable matching after a finite number of iterations. Our experimental results over all the tested scenarios show that SPA-P-heuristic outperforms the SPA-P-approx and SPA-P-promotion algorithms in solution quality and execution time for large SPA-P instances.
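The loop summarised above can be sketched as follows. This is a simplified illustration, not the paper's algorithm: the two removal rules are passed in as parameters, the example rule `longest_list` is a hypothetical placeholder for the heuristic functions, students with an exhausted list are simply dropped from the active queue, and the sketch only guarantees that capacities are respected. The function names and the 0-based data layout are assumptions.

```python
from collections import deque

def allocate(pref, lecturer_of, c, d, evict_from_project, evict_from_lecturer):
    """pref[s]: projects of student s in decreasing preference.
    Returns a dict {student: project} that respects all capacities."""
    lists = {s: deque(ps) for s, ps in enumerate(pref)}   # working copies of the lists
    M = {}                                                # student -> project
    active = deque(lists)                                 # all students start active
    while active:
        s = active.popleft()
        if not lists[s]:
            continue                                      # list exhausted: stays unassigned
        p = lists[s][0]                                   # most preferred remaining project
        M[s] = p                                          # s becomes inactive
        k = lecturer_of[p]
        on_project = [t for t, x in M.items() if x == p]
        on_lecturer = [t for t, x in M.items() if lecturer_of[x] == k]
        removed = None
        if len(on_project) > c[p]:                        # project over-subscribed
            removed = evict_from_project(on_project, lists)
        elif len(on_lecturer) > d[k]:                     # lecturer over-subscribed
            removed = evict_from_lecturer(on_lecturer, lists)
        if removed is not None:
            lost = M.pop(removed)
            lists[removed].remove(lost)                   # delete the project from the list
            active.append(removed)                        # removed student is active again
    return M

def longest_list(candidates, lists):
    """Hypothetical rule: remove the student with the most remaining choices."""
    return max(sorted(candidates), key=lambda t: len(lists[t]))
```

Each iteration either places a student without a removal or deletes one project from some list, so the loop stops after a finite number of iterations. The linear scans of M keep the sketch short; an efficient implementation would maintain M (p_j) and M (l_k) incrementally.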

SPA-P has related variants such as the Student-Project Allocation problem with preferences over Projects with Ties (SPA-PT), the Student-Project Allocation problem with lecturer preferences over Students (SPA-S) [2], and the Student-Project Allocation problem with lecturer preferences over Students with Ties (SPA-ST) [4, 14]. Our approach could therefore be extended to these problems by defining suitable heuristic functions.

References

1. Abraham D.J., Irving R.W., Manlove D.F., The student-project allocation problem. In Proceedings of the 14th International Symposium on Algorithms and Computation, (2003), pp. 474–484, Kyoto, Japan.

2. Abraham D.J., Irving R.W., Manlove D.F., Two algorithms for the student-project allocation problem, Journal of Discrete Algorithms 5(1) (2007), 73–90.

3. Chiarandini, Fagerberg, Gualandi, Handling preferences in student-project allocation, Annals of Operations Research 257(1) (2019), 39–78.

4. Cooper, Manlove, A 3/2-approximation algorithm for the student-project allocation problem. In Proceedings of the 17th International Symposium on Experimental Algorithms, (2018), pp. 8:1–8:13, L’Aquila, Italy.

5. Gale, Shapley L.S., College admissions and the stability of marriage, The American Mathematical Monthly 69(1) (1962), 9–15.

6. Iwama, Miyazaki, Yanagisawa, Improved approximation bounds for the student-project allocation problem with preferences over projects, Journal of Discrete Algorithms 13(1) (2012), 59–66.

7. Kazakov, Co-ordination of student-project allocation. Manuscript, University of York, Department of Computer Science, http://www-users.cs.york.ac.uk/kazakov/papers/proj.pdf, 2001.

8. Király, Better and simpler approximation algorithms for the stable marriage problem, Algorithmica 60(1) (2011), 3–20.

9. Kwanashie, Irving R.W., Manlove D.F., Sng C.T.S., Profile-based optimal matchings in the student/project allocation problem. In Proceedings of the 25th International Workshop on Combinatorial Algorithms, (2014), pp. 213–225, Duluth, USA.

10. Manlove, Milne, Olaosebikan, An integer programming approach to the student-project allocation problem with preferences over projects. In Proceedings of the 5th International Symposium on Combinatorial Optimization, (2018), pp. 313–325, Morocco.

11. Manlove, Milne, Olaosebikan, Student-project allocation with preferences over projects: Algorithmic and experimental results, Discrete Applied Mathematics 308(1) (2022), 220–234.

12. Manlove D.F., O’Malley, Student-project allocation with preferences over projects, Journal of Discrete Algorithms 6(4) (2008), 553–560.

13. Minton, Johnston M.D., Philips A.B., Laird, Minimizing conflicts: A heuristic repair method for constraint satisfaction and scheduling problems, Artificial Intelligence 58(1-3) (1992), 161–205.

14. Olaosebikan, Manlove, Super-stability in the student-project allocation problem with ties, Journal of Combinatorial Optimization 43(1) (2022), 1203–1239.

15. Russell, Norvig, Artificial Intelligence: A Modern Approach. Prentice Hall Press, Upper Saddle River, NJ, USA, 3rd edition, 2009.

16. Viet H.H., Van Tan, Thanh Cao, Finding maximum stable matchings for the student-project allocation problem with preferences over projects. In Proceedings of the 7th International Conference on Future Data and Security Engineering, (2020), pp. 411–422, Quy Nhon, Vietnam.

I	Instance of the SPA-P problem
S	Set of students
L	Set of lecturers
P	Set of projects
A_i	Set of projects ranked by student s_i ∈ S
P_k	Set of projects offered by lecturer l_k ∈ L
M	Matching
M (s_i)	Set of projects assigned to student s_i in M
M (p_j)	Set of students assigned to project p_j in M
M (l_k)	Set of students assigned to lecturer l_k in M
s_i	Student
l_k	Lecturer
p_j	Project
c_j	Capacity of project p_j ∈ P
d_k	Capacity of lecturer l_k ∈ L
n	Number of students
m	Number of lecturers
q	Number of projects

Iter.	s_i	p_j	l_k	M ∪ (s_i, p_j)	Over-subscribed	M \ (s_r, p_t)	M
1	s₁	p₁	l₁	(s₁, p₁)			{(s₁, p₁)}
2	s₂	p₁	l₁	(s₂, p₁)			{(s₁, p₁), (s₂, p₁)}
3	s₃	p₂	l₁	(s₃, p₂)			{(s₁, p₁), (s₂, p₁), (s₃, p₂)}
4	s₄	p₃	l₁	(s₄, p₃)	l₁	(s₃, p₂)	{(s₁, p₁), (s₂, p₁), (s₄, p₃)}
5	s₅	p₃	l₁	(s₅, p₃)	p₃	(s₅, p₃)	{(s₁, p₁), (s₂, p₁), (s₄, p₃)}
6	s₆	p₅	l₂	(s₆, p₅)			{(s₁, p₁), (s₂, p₁), (s₄, p₃), (s₆, p₅)}
7	s₃	p₁	l₁	(s₃, p₁)	p₁	(s₁, p₁)	{(s₂, p₁), (s₃, p₁), (s₄, p₃), (s₆, p₅)}
8	s₅	p₄	l₂	(s₅, p₄)			{(s₂, p₁), (s₃, p₁), (s₄, p₃), (s₅, p₄), (s₆, p₅)}
9	s₁	p₂	l₁	(s₁, p₂)	l₁	(s₁, p₂)	{(s₂, p₁), (s₃, p₁), (s₄, p₃), (s₅, p₄), (s₆, p₅)}
10	s₁	p₅	l₂	(s₁, p₅)			{(s₁, p₅), (s₂, p₁), (s₃, p₁), (s₄, p₃), (s₅, p₄), (s₆, p₅)}
