Abstract
Job and career matching has become an increasingly important social issue in every country. Career matching means assigning the right candidate to the right company. At present the matching is a passive process carried out by employment counselors, who cannot go through all the profiles of candidates and companies. Human error can also assign the wrong candidate to the wrong company. To minimize mismatching cases, this paper investigates several statistical analysis methods and proposes an artificial intelligence based method. The method, named Artificial Intelligence based Design platform (AID), has been developed for career matching between university students and companies. The efficiency of AID is demonstrated by comparing the matching rate (the share of perfect matches) obtained by AID with that of statistical methods such as least squares, Pearson correlation, and Manhattan distance. The paper shows that AID produces zero mismatches between students’ skills and companies’ needs, while the statistical methods produce more than 30% mismatches. In other words, AID can assign the right student to the right company.
Introduction
In hiring, recommending a suitable job to a job seeker is not an easy task. Traditional online recruiting applications normally use only simple Boolean operations to compare the basic requirements of jobs offered by employers with the basic qualifications of job seekers in order to generate matched job results. In addition, the limited number of job counselors at public services cannot handle the huge number of profiles from job seekers and companies. As a result, irrelevant jobs or irrelevant candidates are often matched. Preventing such mismatching requires more accurate, systematic solutions.
The best matching and assignment problem is a kind of multi-objective combinatorial optimization problem. Several studies have addressed matching and assignment problems. Sonia and Puri [1] solve a resource assignment problem by giving jobs primary and secondary priorities, with a linear programming model for the amount of resources. Specific models have been developed for assignment problems, such as the Hungarian Method [2], the Stable Marriage Problem [3], the Weighted Matching Problem [4], the Stable Roommates Problem [5] and the Hospital/Residents problem [6]. Altay et al. applied a heuristic method, a Genetic Algorithm with statistical models, to assign the right student with the skill required by the company [7]. In this paper, the problem proposed by Altay et al. is considered and solved using an in-house solver named Artificial Intelligence based Design platform (AID). The results obtained by AID are compared with the methods proposed in [7] in terms of the perfect matching and mismatching rates.
In this paper, several statistical and heuristic methods are investigated. In addition, an alternative method based on artificial intelligence is developed to maximize the number of perfect matches, that is, assignments of the right candidate to the right company.
The rest of the paper is organized as follows. Section 2 presents the method applied to solve the matching problem between students and companies. The rationale of the problem is defined in Section 3. In Section 4, the trends in the preferences of students and companies are analyzed. The numerical results obtained by the statistical methods and AID are described and compared in Section 5. Finally, the conclusion and discussion are given in Section 6.
Method
In this paper, in-house software named Artificial Intelligence based Design platform (AID) is used to optimize career matching between final-year university students and companies.
Artificial intelligence based design platform (AID)
AID has been developed to solve complex real-world design problems, as shown in Fig. 1. It is capable of performing cluster analysis on big data from private, Social Networking Service (SNS) and government sources.

Artificial Intelligence based Design platform (AID).
AID builds a mathematical model from a data pattern and then conducts an optimization using a Genetic Algorithm (GA) [8, 9] or Mixed Integer Linear Programming (MILP) [10] to minimize the objective value(s) in a single-objective or multi-objective manner. Finally, AID provides an appropriate and rational decision for complex decision-making problems.
In this paper, AID is used to analyze the statistical correlation between companies and students with different characteristics. It performs a cluster analysis that shows the trend of each group’s characteristics, so that the strongest preferences of both companies and students can be seen. It then conducts an optimization using MILP to assign the right student to the right company. The objective of this application is for every company to receive the right student based on its preferences. In other words, AID acts as a decision support system that maximizes perfect matches and second perfect matches while minimizing mismatches, as described in the following section.
MILP is a modification to a linear program (LP) in which some variables are constrained to take only integer values. Constraints on such variables enable the inclusion of discrete decisions in the optimization [10, 11]. MILP can be expressed in mathematical terms as shown in Equation (1);
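For reference, an MILP in standard textbook form can be written as follows. The symbols c, A, b and the index set I are generic notation assumed here; they are not necessarily the notation of Equation (1).

```latex
\begin{aligned}
\min_{x}\quad & c^{\top} x \\
\text{subject to}\quad & A x \le b, \\
& x_j \in \mathbb{Z} \quad \text{for } j \in I, \\
& x_j \in \mathbb{R} \quad \text{for } j \notin I .
\end{aligned}
```

In an assignment setting, the integer variables would typically be binary indicators of whether a given student is assigned to a given company.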
In this paper, MILP is used to maximize perfect matching as a single-objective design problem that considers the preferences of two different groups.
The problem considered in this paper is to match final-year students’ skills with companies’ needs based on industrial engineering subjects, in particular the students’ graduation theses, as described in [7]. The Istanbul Technical University Industrial Engineering Department and the Istanbul Chamber of Industry decided to collaborate on solving industrial problems with the help of students. Each year, senior students of the university are assigned to industrial companies, which gives the engineering candidates practice and provides support for the companies. Further background can be found in [7].
An appropriate match benefits both companies and students. The subjects of interest to both parties are selected by the Department of Industrial Engineering, and then companies and students fill in their preferences. For this matching problem, 54 companies and 84 students are considered.
The goal of this research is to assign the right student to the right company; in other words, each of the 54 companies should receive one of 54 students with the skills it needs. To this end, the paper evaluates and compares the efficiency of three statistical fitness functions: least squares, Pearson correlation, and Manhattan distance. The aim is to find an alternative method that yields appropriate matches.
Preference vector
The subjects are chosen from the list of subjects declared by the professors for that year. In the 2007–2008 academic year, preference vectors are derived from the global list of nine subjects shown below:
Inventory and Materials Management (IM)
Process Management (PM)
Quality Management (QM)
Human Resources Management (HR)
Ergonomics (Er)
Finance/Investments Analysis (FI)
Supply Chain Management/Customer Relationship Management (SCRM)
Knowledge Management (K)
Strategic Management (S)
There are two kinds of vector, filled in by companies and by students, as shown in Table 1. A preferred subject is marked with a value of 1 or higher; otherwise it is marked 0.
Preference vectors by companies and students
where Pre C and Pre S represent the preferences of the companies and the university students, respectively. The companies’ and students’ preferences are shown in Tables 2 and 3.
Companies’ preferences
Students’ preferences
Three example types of company preference vector are shown in Table 4.
Company-A (C-A) looks for a student with skill in Finance/Investments Analysis (FI). Company-B (C-B) needs a student who can handle mainly Quality Management (QM) and partially Knowledge Management (K). Company-C (C-C) needs a student who is capable in any of the nine subjects.
Example of companies’ preference vector
Three example types of student preference vector are shown in Table 5.
Student-A (S-A) has skill in Knowledge Management (K). Student-B (S-B) has preferences on two subjects, Inventory and Materials Management (IM) and Process Management (PM). Student-C (S-C) has two preferences: the major subject is Human Resources Management (HR) and the second subject is Supply Chain Management/Customer Relationship Management (SCRM).
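The example vectors described above can be written down explicitly. The following is a minimal sketch, assuming that a value of 1 marks the strongest preference, larger values mark weaker preferences, and 0 means the subject was not selected. The helper name `make_vector` and the exact numbers are illustrative assumptions reconstructed from the prose, not values copied from Tables 4 and 5.

```python
# Subject order follows the nine-subject list given in the text.
SUBJECTS = ["IM", "PM", "QM", "HR", "Er", "FI", "SCRM", "K", "S"]


def make_vector(ranked):
    """Build a 9-entry preference vector from a {subject: rank} mapping.

    ASSUMED encoding: rank 1 is the strongest preference, larger ranks are
    weaker preferences, and 0 means the subject was not selected.
    """
    unknown = set(ranked) - set(SUBJECTS)
    if unknown:
        raise ValueError(f"unknown subjects: {sorted(unknown)}")
    return [ranked.get(subject, 0) for subject in SUBJECTS]


# Hypothetical vectors reconstructed from the prose descriptions.
company_a = make_vector({"FI": 1})                    # FI only
company_b = make_vector({"QM": 1, "K": 2})            # mainly QM, partially K
company_c = make_vector({s: 1 for s in SUBJECTS})     # any of the nine subjects
student_a = make_vector({"K": 1})                     # K only
student_b = make_vector({"IM": 1, "PM": 1})           # IM and PM
student_c = make_vector({"HR": 1, "SCRM": 2})         # HR first, SCRM second

for name, vector in [("C-A", company_a), ("C-B", company_b), ("C-C", company_c),
                     ("S-A", student_a), ("S-B", student_b), ("S-C", student_c)]:
    print(name, vector)
```

Keeping the subject order fixed means that company and student vectors can be compared entry by entry.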
Example of students’ preference vector
Each of the 54 companies wishes to hire a student who can fulfill its requirements. A match is called a Perfect Match when the company (C) and the student (S) have the same preference, as shown in Table 6. Perfect matching means that the student’s preference fulfills at least one of the company’s requirements.
Perfect matching
where C and S represent company and student.
Consider the company C-B: no student can fulfill its first preference, QM; however, the student S-A can fulfill its second preference, K, as shown in Table 7.
Second perfect matching
The matching system should assign the student S-A to the company C-B since this creates a win-win situation. For C-B, having S-A is better than having no candidate; likewise, for S-A, working at C-B is better than being jobless. This matching is called a Second Perfect Match. A match that fits neither the perfect nor the second perfect matching case is regarded as a Mismatch.
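The three match classes can be sketched as a small classifier. This is one possible reading of the definitions, not the rule implemented in AID: it assumes that rank 1 is the strongest preference, that a perfect match means the student’s top subject coincides with one of the company’s top subjects, and that a second perfect match means the student’s top subject coincides with one of the company’s second-ranked subjects. The function names are hypothetical.

```python
SUBJECTS = ["IM", "PM", "QM", "HR", "Er", "FI", "SCRM", "K", "S"]


def subjects_at_rank(vector, rank):
    """Return the set of subjects that carry the given rank in a vector."""
    return {subject for subject, value in zip(SUBJECTS, vector) if value == rank}


def classify_match(company, student):
    """Classify a company/student pair under the ASSUMED reading above."""
    student_top = subjects_at_rank(student, 1)
    if student_top & subjects_at_rank(company, 1):
        return "perfect"
    if student_top & subjects_at_rank(company, 2):
        return "second perfect"
    return "mismatch"


# Hypothetical vectors for the C-A, C-B, C-C, S-A and S-B examples in the text.
company_a = [0, 0, 0, 0, 0, 1, 0, 0, 0]   # FI only
company_b = [0, 0, 1, 0, 0, 0, 0, 2, 0]   # QM first, K second
company_c = [1, 1, 1, 1, 1, 1, 1, 1, 1]   # any subject
student_a = [0, 0, 0, 0, 0, 0, 0, 1, 0]   # K only
student_b = [1, 1, 0, 0, 0, 0, 0, 0, 0]   # IM and PM

print(classify_match(company_b, student_a))  # second perfect
print(classify_match(company_c, student_b))  # perfect
print(classify_match(company_a, student_a))  # mismatch
```

Under a looser reading, in which any shared subject counts as a perfect match, the first branch would compare all non-zero entries instead of only the rank-1 entries.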
The preferences of companies and students are reviewed. As shown in Fig. 2, there is a good balance on the subjects Quality Mgmt. (QM) and Human Resource Mgmt. (HR), while there is a lack of students on Inventory and Material Mgmt. (IM) and Ergonomics (Er). There are slightly more than enough students on Process Mgmt. (PM), SCM & CRM (SCRM) and Knowledge Mgmt. (K). However, there are too many students on the subject Strategic Mgmt. (S) relative to the demand from companies, which means that some of those students will not be assigned to companies on that subject. They will be selected only if they have other preferences.

Preference status of companies and students.
Before the matching analysis, the preferences of companies and students are investigated using AID. The matching simulation is conducted in the following environment: Intel(R) Core(TM) i5-6500 CPU @ 3.20 GHz with 16 GB RAM.
Cluster analysis of company group’s preference
Cluster analysis is conducted to identify trends in preference among the 54 companies. There are three types of company preference (red: Cluster-1, green: Cluster-2, blue: Cluster-3), as shown in Fig. 3. In Cluster-1, the preferences of companies 6 and 8 are distributed from 1 to 3 on the subjects IM, PM, and S, as shown in Table 8. Tables 9 and 10 show that the preferences in Cluster-2 (companies 26 and 48) are distributed from 1 to 8, while those in Cluster-3 have values under 5.

Cluster analysis for students’ preference.
Cluster-1 preference trend
where C_ID represents the identification of company.
Cluster-2 preference trend
Cluster-3 preference trend
The preferences of students can be grouped into 9 clusters, as shown in Fig. 3. In Clusters 1 and 2, students 17 and 71 have various preferences from 1 to 5 on multiple subjects, while Cluster 5 (students 8 and 16) has preferences on two subjects, Inventory and Material Mgmt. (IM) and Strategic Mgmt. (S), as shown in Tables 11 and 12.
Cluster-1 and 2 preference trends
where S_ID represents the identification of student.
Cluster-5 preference trend
In this section, the career matching optimization is conducted using three statistical fitness models. In addition, the drawbacks of the statistical models are identified from the numerical results, and a new alternative model is developed. During the optimization, the value obtained by the fitness model is minimized as shown in Equation (2);
Statistical functions are mostly used to analyze the correlation between groups. In this paper, three statistical functions, Least Squares (LS), Pearson Correlation (PC) and Manhattan Distance (MD), are used for the matching analysis between 54 companies and 84 students.
Least Squares (LS);
The LS function is a standard approach in regression analysis; it calculates the sum of the squared residuals over every subject. A lower LS value indicates a better match between company and student.
Inverse Pearson Correlation (IPC);
The Pearson Correlation function is inverted so that the fitness value can be minimized during the optimization. The Pearson Correlation measures the correlation between two variables or vectors. A higher Pearson Correlation value produces a lower IPC value, so a lower IPC value indicates a higher correlation between company and student.
Manhattan Distance (MD);
The MD, also called the taxicab metric, snake distance or city block distance, measures the distance between two points. In this sense, the value of MD represents the preference distance between company and student.
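The three fitness functions described above can be sketched in a few lines. LS and MD follow their usual definitions. The exact form of the inversion used for IPC is not specified in the text, so the sketch assumes IPC = 1 - r, where r is the Pearson correlation; any decreasing function of r would satisfy the description equally well. The two example vectors are hypothetical.

```python
import math


def least_squares(company, student):
    """Sum of squared differences over all subjects (lower is better)."""
    return sum((c - s) ** 2 for c, s in zip(company, student))


def manhattan_distance(company, student):
    """Sum of absolute differences over all subjects (lower is better)."""
    return sum(abs(c - s) for c, s in zip(company, student))


def pearson_correlation(x, y):
    """Pearson correlation coefficient r of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # r is undefined for a constant vector; treated as 0 in this sketch.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def inverse_pearson(company, student):
    """ASSUMPTION: the 'inverse' is taken as 1 - r so that lower is better."""
    return 1.0 - pearson_correlation(company, student)


# Hypothetical pair: company prefers QM first and K second; student prefers K.
company = [0, 0, 1, 0, 0, 0, 0, 2, 0]
student = [0, 0, 0, 0, 0, 0, 0, 1, 0]

print("LS :", least_squares(company, student))
print("MD :", manhattan_distance(company, student))
print("IPC:", round(inverse_pearson(company, student), 4))
```

All three functions return smaller values for closer preference vectors, which is why each can be used directly as a fitness value to be minimized.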
Tables 13–15 show the matching results obtained by the statistical fitness models LS, IPC and MD. LS and MD produce the same matching results, since evaluating the preference difference with squared or absolute values makes no difference here. Perfect matches, second perfect matches and mismatches are marked in white, yellow and red, respectively.
Matching results obtained by LS
where C_ID and S_ID represent the identification of company and student.
Matching results obtained by IPC
Matching results obtained by MD
LS and MD produce 34 perfect matches and 3 second perfect matches, while IPC produces 20 perfect matches and 10 second perfect matches among the 54 companies. In other words, the perfect matching rate of LS and MD is 26 percentage points higher than that of the IPC model. The problem is that mismatched students (17 cases by LS & MD, 24 cases by IPC) are likely to quit their jobs, since the position at the company is not based on their expertise, and to become unemployed again later on.
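How a fitness model turns into an assignment can be illustrated on a toy instance. The sketch below is not the MILP solver used in AID: it swaps in an exhaustive search over all ways of giving each company a distinct student, which is only feasible for very small inputs. The four-entry vectors and the function name `best_assignment` are hypothetical.

```python
from itertools import permutations


def least_squares(company, student):
    """Sum of squared differences between two preference vectors."""
    return sum((c - s) ** 2 for c, s in zip(company, student))


def best_assignment(companies, students, fitness):
    """Return (total, choice) minimizing the summed fitness.

    choice[i] is the index of the student assigned to company i. Every
    company receives exactly one student and no student is used twice.
    Exhaustive search: practical only for toy-sized inputs.
    """
    best_total, best_choice = None, None
    for choice in permutations(range(len(students)), len(companies)):
        total = sum(fitness(companies[c], students[s])
                    for c, s in enumerate(choice))
        if best_total is None or total < best_total:
            best_total, best_choice = total, choice
    return best_total, list(best_choice)


# Toy instance with four hypothetical subjects, 2 companies and 3 students.
companies = [[1, 0, 0, 0],
             [0, 1, 2, 0]]
students = [[0, 1, 0, 0],
            [1, 0, 0, 0],
            [0, 0, 0, 1]]

total, choice = best_assignment(companies, students, least_squares)
print(total, choice)  # 4 [1, 0]
```

With more students than companies, as in the 54-company, 84-student problem, some students necessarily remain unassigned; the search above simply leaves them out of the chosen tuple.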
In addition, the reason why the statistical fitness models produce mismatches even though good candidates are available is investigated. Table 16 shows a sample mismatching case for C49 obtained by LS, IPC and MD.
A sample mismatching case by LS & MD and IPC
The company C49 has preferences in the order IM, K and HR. LS and MD assign the student S32 (skill preferences in the order PM, K), and IPC assigns the student S18 (skill preferences in the order Er & S, HR), even though a good candidate, S9, has not yet been assigned. The student S9 has preferences on the subjects IM and S, which would make a perfect match. The statistical fitness models LS & MD and IPC assign the students S32 and S18 because doing so lowers the fitness value by minimizing the difference on the second and third priorities in multiple-preference cases. In other words, these statistical models cannot distinguish between perfect matches, second perfect matches and mismatches.
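The effect described for C49 can be reproduced with a small worked example. The vectors below are assumptions reconstructed from the prose (rank 1 for the first preference, 2 for the second, 3 for the third, 0 for unselected subjects, with S9’s two subjects both taken as rank 1); they are not copied from Table 16.

```python
# Subject order follows the nine-subject list given in the text.
SUBJECTS = ["IM", "PM", "QM", "HR", "Er", "FI", "SCRM", "K", "S"]


def make_vector(ranked):
    """Build a preference vector from {subject: rank}; 0 means not selected."""
    return [ranked.get(subject, 0) for subject in SUBJECTS]


def least_squares(company, student):
    return sum((c - s) ** 2 for c, s in zip(company, student))


def manhattan_distance(company, student):
    return sum(abs(c - s) for c, s in zip(company, student))


# ASSUMED encodings of the preferences described in the text.
c49 = make_vector({"IM": 1, "K": 2, "HR": 3})
s32 = make_vector({"PM": 1, "K": 2})   # student chosen by LS and MD
s9 = make_vector({"IM": 1, "S": 1})    # shares C49's first preference, IM

for label, student in [("S32", s32), ("S9", s9)]:
    print(label,
          "LS =", least_squares(c49, student),
          "MD =", manhattan_distance(c49, student))
# S32 LS = 11 MD = 5
# S9 LS = 14 MD = 6
```

Under these assumed values, S32 scores lower than S9 on both LS and MD because S32 agrees with C49 on the second-ranked subject K, even though only S9 shares the company’s first preference. A distance over the whole vector therefore does not know which entries carry the match class.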
The number of mismatches should be minimized to reduce the unemployment rate. This requires a smarter system, such as an artificial intelligence system, that can recognize the perfect and second perfect matching conditions while minimizing mismatching cases. To this end, an alternative model named Multiple-Preference Matching Algorithm (MPMA) is developed, and its computational efficiency and matching quality are compared with the results obtained by LS, MD and IPC.
The results above show that the statistical fitness models LS, MD and IPC are not effective enough to produce perfect and second perfect matches when a company has preferences on more than two subjects. The Multiple Preference Matching Algorithm (MPMA) is developed to overcome this drawback of LS, MD and IPC.
Multiple Preference Matching Algorithm (MPMA);
MPMA consists of MP csk WV and Least Squares (LS), and it maximizes the perfect and second perfect matches while minimizing the difference over all preferences. Table 17 shows the matching results by MPMA, which produces zero mismatches: MPMA assigns students for 49 perfect matches and 5 second perfect matches. Table 18 revisits the sample mismatching case of Table 16, for which LS & MD and IPC cannot produce either a perfect or a second perfect match. MPMA assigns the student S75 to the company C49 to produce a second perfect match. A check of the remaining students confirms that there is no other good candidate besides S75. In other words, MPMA can distinguish perfect matches, second perfect matches and mismatches.
Matching results obtained by MPMA
where C_ID and S_ID represent the identification of company and student.
A sample mismatching case by LS & MD and IPC
Figure 4 compares the matching accuracy of the statistical fitness functions LS, MD and IPC with that of the new matching model MPMA. MPMA produces 91% perfect matches, an improvement of 28 and 54 percentage points over LS & MD and IPC, respectively. Most importantly, MPMA produces zero mismatches, while LS & MD and IPC produce 31% and 44%, respectively.
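The percentages quoted above follow directly from the match counts reported for the 54 companies. The short check below recomputes them; the dictionary layout and the helper name `rate` are illustrative choices, and the counts are those stated in the text.

```python
TOTAL_COMPANIES = 54

# Match counts as reported in the text for each model.
counts = {
    "LS & MD": {"perfect": 34, "second": 3, "mismatch": 17},
    "IPC": {"perfect": 20, "second": 10, "mismatch": 24},
    "MPMA": {"perfect": 49, "second": 5, "mismatch": 0},
}


def rate(n):
    """Share of the 54 companies, as a rounded percentage."""
    return round(100 * n / TOTAL_COMPANIES)


for model, c in counts.items():
    # Every company falls into exactly one of the three classes.
    assert sum(c.values()) == TOTAL_COMPANIES
    print(f"{model}: perfect {rate(c['perfect'])}%, "
          f"second perfect {rate(c['second'])}%, "
          f"mismatch {rate(c['mismatch'])}%")

gain_over_ls_md = rate(counts["MPMA"]["perfect"]) - rate(counts["LS & MD"]["perfect"])
gain_over_ipc = rate(counts["MPMA"]["perfect"]) - rate(counts["IPC"]["perfect"])
print("MPMA gain in perfect matches (percentage points):",
      gain_over_ls_md, gain_over_ipc)
```

The gains are differences between percentages, so they are expressed in percentage points rather than as relative improvements.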

Matching quality comparison.
The computational cost of the matching models is compared in Fig. 5. LS & MD take less than one second, while IPC takes 2.7 seconds. MPMA is intermediate at 1.6 seconds. Overall, MPMA is the most appropriate method for the matching problem, combining intermediate computational cost with higher accuracy.

Computational cost for matching simulation.
Three statistical fitness models and an alternative method named Multiple Preference Matching Algorithm (MPMA) are implemented and demonstrated on the matching problem. The numerical results obtained by AID with MPMA and with the statistical models are compared in terms of matching quality and computational efficiency. The paper shows the benefit of using AID with MPMA, which produces zero mismatches with the maximum number of perfect matches.
Current research focuses on career matching problems; single-objective and multi-objective training/education assignment and scheduling optimization is also under investigation.
