Soft clustering and interval type-2 fuzzy set based inference strategy for I.T. personnel selection

Abstract

Imprecision and uncertainty are part of the fabric that makes life interesting. For decades, human beings have developed strategies to cope with uncertainty and to automate them. In personnel selection for the I.T. field, selectors often find it very difficult to choose candidates from a set of resumes listing similar kinds of skills. The selection task thus becomes a fuzzy decision-making problem under uncertainty. A combination of fuzzy clustering and Interval Type-2 fuzzy sets (IT2FS) is proposed for such scenarios. An experiment is conducted on a dataset of fifteen hundred resumes for a particular job description. First, Fuzzy C-means clustering (FCM) is applied for selective clustering. The candidates in the selected cluster are then given a score for ranking as per the skillset criteria, and the final decision for shortlisting the resumes is made through IT2FS. The model shows an average accuracy of 88.2% with an F1-score of 0.76, compared to an F1-score of 0.72 for the (K-means + IT2FS) model. Thus, the proposed model performs better when making decisions under uncertainty.

Keywords

Personnel selection, fuzzy clustering, interval Type-2 fuzzy sets, decision making, resume shortlisting

1 Introduction

Personnel selection is the process of selecting candidates who have the skill, knowledge, and experience needed for the desired performance in a specified job [32]. In the hiring process, the selector searches for candidates who have the required professional skills and are a good fit for the organisation. The process involves resume shortlisting followed by a technical interview round. Often a candidate’s skills do not precisely match the skills needed for the job, but the candidate may possess other skills that are indirectly crucial for organisational growth. The selector therefore gives weightage both to the required professional skills and to other skills. This gives candidates some scope to be shortlisted even when they are missing some required professional skills. Such candidates constitute the boundary cases and are hence regarded as uncertain cases. It is up to the selector to decide how many uncertain cases to admit. For example, assume that a selector is looking for resumes with the required professional skills “Java programmer” and “Android”, while the skills mentioned in a candidate’s resume are “Object-oriented programming” and “Team player”. The required skills do not exactly match those in the resume, yet the resume is not irrelevant, because the candidate knows the object-oriented concepts on which Java is based. In addition, the candidate has experience of working in a team. These skills are therefore also given some weightage for selection, as the candidate appears to have skills that are helpful in the organisation’s working culture.

A selector may also receive hundreds of resumes for the interview process. Going through all of them manually is challenging, so the selector may simply pick the first few candidates who satisfy most of the requirements and interview them. The major drawback of this approach is that the work is tedious and many suitable candidates for the post may be left out.

The paper provides a solution that semi-automates the selection of the most appropriate resumes, including uncertain cases, for the interview process. It reduces the burden of segregating resumes and helps the selector focus more on the interview. A methodology such as selective clustering segregates the eligible resumes from a large set of applications. A fuzziness factor is a must in such models in order to incorporate the imprecision involved in the job requirement. Standard information retrieval methods such as TF-IDF (Term Frequency-Inverse Document Frequency) [29], LDA (Latent Dirichlet Allocation) [37] and the K-means algorithm are often used to segregate documents. These methods either measure the similarity of a resume to the skill requirement or form crisp clusters. Overall, these techniques do not capture the imprecision and uncertainty involved in such human decision making.

The dataset¹ contains a collection of approximately 1500 resumes for personnel selection, downloaded from the Kaggle website². The following is the list of attributes in an applicant’s resume:

Title: [“Seeking innovative and challenging career assignment in an organisation”, “Looking for Job as a Java developer”, etc.]

City: [“Pune”, “Hyderabad”, etc.]

State: [“Haryana”, “Telangana”, etc.]

Description: [“Looking for a challenging career demanding best of my professional skills”, etc.]

Work experiences: [{0: [{‘wtitle’: ‘Java Developer’}, {‘wcompany’: ‘Divergent software labs indore’}, {‘wduration’: ‘January 2018 to Present’}]}, etc.]

Education: [{0: [{‘e_title’: ‘B.Tech.’}, {‘e_school’: ‘Calicut University’}, etc.]}, etc.]

Skills: [‘Git’, ‘Maven’, ‘Java’, ‘Jenkins’,etc.]

Links: [‘https://github.com/...’, etc]

Certificates: [{0: [{‘c_title’: ‘Java Developer’}, {‘c_duration’: ‘June 2019 to Present’}]}, etc.]

Additional information: [“Operating Systems”, “Version Control”, etc.]

All of these ten attributes contain the textual biodata of an applicant. However, some attributes (such as links, city and state) are not useful in the current process, because the selector mostly wants resumes to be segregated based on skill, knowledge, and experience and not on city or state. Further preprocessing includes removing noise such as special characters (for example hashtags and punctuation), dates, links, etc.

In order to deal with these boundary cases (uncertain cases), FCM (Fuzzy C-means) and IT2FS (Interval Type-2 fuzzy sets) are used in the proposed methodology. FCM is applied over the preprocessed dataset to create two clusters: one containing the resumes eligible for the requirement and the other containing the non-eligible resumes. IT2FS gives the selector control over how many boundary cases (uncertain cases) pass from the eligible resumes. A selector who is very strict on the requirement can choose the lower end of the IT2FS output range, while a selector who is more flexible can opt for the upper end. In the default settings, the mid-value of the IT2FS output range is chosen for filtering. The inclusion of IT2FS is therefore an advantage when dealing with uncertain cases, as it gives candidates a chance to be shortlisted for the interview even if they are missing some required skills.

2 Literature review

Decision-making is involved in every aspect of our life. The decision-making process gets complicated when processing the adjective words like “good”, “cheap”, etc. These words are not only imprecise [40] but also carry two aspects of uncertainties with them: Intra-level uncertainty and inter-level uncertainty [21]. These uncertainties cannot be represented using Type-1 fuzzy sets, as they can only represent imprecision. Hence higher-order fuzzy sets (such as IT2FS) are used to capture the uncertainty involved with imprecision [22].

Personnel selection is a decision-making problem involving uncertainty. The literature review is therefore divided into two parts: the first presents applications involving clustering and classification in which uncertainty is tackled through IT2FS, while the second presents decision-making approaches applied to personnel selection.

The concept of Computing with words (CWW) was introduced by Zadeh [41]. It is a methodology in which computation is performed using words drawn from human language. The idea comes from observing how humans perform many mental tasks without involving measurements. In daily conversation, humans prefer to use words rather than numbers. These words are imprecise and also carry uncertainty. IT2FS models can represent linguistic words by capturing both imprecision and uncertainty [21, 24]. Mishra et al. [25] used the IT2FS model to perform CWW for decision-making, where the system suggests the suitability of a restaurant based on the subjective importance given to selection criteria (such as cost, price range and time). Similarly, IT2FS finds application in the classification tasks described below, where it inherently incorporates a tolerance range when deciding [30, 34].

Sharma et al. [33] used IT2FS for the classification of vehicles. Attributes such as vehicle body length, chassis height, wheelbase and ground clearance were used. The type-2 fuzzy system outperformed the adaptive neuro-fuzzy inference system on this task. Chumklin et al. [8] detected micro-calcification in mammograms using an interval type-2 fuzzy system. Similarly, Phong et al. [26] used type-2 Takagi Sugeno Kang (TSK) fuzzy systems for electrocardiogram (ECG) arrhythmia classification. The parameters of the type-2 TSK fuzzy classifier are computed using fuzzy C-means clustering and back-propagation. An improved clustering method using IT2FS was proposed by Zhang et al. [42], where FCM algorithms extract fuzzy rules from IT2FS systems. The algorithm is based on the distance between interval data. Mendoza [23] presented IT2FS logic along with a modular neural network for face recognition. The response was integrated into multi-net neural systems for improvement. IT2FS, in combination with SVM, has also been used for pattern recognition, as shown by Herman et al. [15] in the recognition of motor imagery related EEG. Choi et al. [7] performed pattern recognition using IT2FS and applied heuristic and histogram methods on the image. Similarly, Melin et al. [20] presented a brief review of classification and pattern recognition applications using IT2FS. Hwang et al. [16] used the IT2FS approach to C-means clustering for text data, while Tunali et al. [35] extended the implementation to various fields. IT2FS combined with TOPSIS has been used to deal with uncertain situations in evaluating the real-time risk status of a metro station [28]. This approach has also been used for the operational risk assessment of subway stations [43].

Dursun et al. [10] provide a decision-making framework for personnel selection involving fuzzy techniques [36]. They used a fuzzy multi-criteria decision making (f-MCDM) algorithm in which multiple fuzzy selection criteria are applied. The result is presented in an order based on the similarity to the ideal solution. The method performed well in capturing both numerical and linguistic information in decision-making. Deliktas et al. [9] experimented with MCDM using a fuzzy TOPSIS approach for selecting industrial engineering candidates in a manufacturing environment. Because there were few candidates, the consistency of the pairwise comparison matrices could be controlled, and a crisp value was obtained for each candidate. Both the weights of the decision-makers and the weights of the selection criteria are taken into account to analyse the performance values.

Bogdanovic et al. [4] used MCDM to evaluate and select the most suitable employees in an organization. The objective is to minimize the subjectivism of decision-makers due to their dominance in the field. As a result, they reduced employers from five to three and reorganized the department by obtaining the final ranking. Their results showed that managerial problems can also be solved successfully by this integrated method. Golec et al. [12] presented a general hierarchical framework for the selection and evaluation of an employee, which uses a competency-based fuzzy model to match a particular job to an employee. The linguistic evaluation was computed on features such as communication skills, interpersonal skills, technical skills, management skills, decision-making ability and self-motivation [24].

Boran et al. [5] carried out personnel selection as a group decision-making problem [17] using the TOPSIS method extended to the intuitionistic fuzzy set (IFS), which is characterized by a membership function, a non-membership function, and a hesitation margin to deal with vagueness. Efe et al. [11] also used the TOPSIS method for personnel selection, extended by the use of interval type-2 trapezoidal fuzzy numbers. IT2F numbers represented the linguistic terms given by the decision-makers. This evaluation is applied in an assembly line of a textile firm after determining the closeness coefficient of each personnel. Balezentis et al. [2] extended fuzzy MULTIMOORA (Multi-Objective Optimization by Ratio Analysis) for linguistic reasoning under group decision-making. It considers eight qualitative attributes in linguistic form and aggregates the decision-makers’ subjective evaluations, giving a robust method of personnel selection. Hesitant fuzzy linguistic terms were proposed to capture richness in linguistic variation. They provide control in situations where a set of values is possible from the fuzzy membership function [31]. Chen et al. [6] represented 366 linguistic terms using hesitant fuzzy sets in the MCDM approach, while Liu et al. [19] incorporated T2FS with hesitant fuzzy linguistic terms. Afshari et al. [1] introduced a hybrid approach to personnel selection using an expert system involving fuzzy linguistic variables combined with operational research. Dahooie et al. [14] proposed a model specifically for personnel selection in the IT industry. The SWARA and ARAS-G decision-making methods were used to select the best candidate for an IT department.

In the proposed method for personnel selection, FCM clustering is performed after pre-processing and vectorising the data. Following the model of Bezdek et al. [3], the clustering uses Euclidean distance as the distance measure. The attributes of the model are chosen based on the generalised view presented in Wu et al. [39]. The next phase of the proposed method uses the IT2FS model for final decision-making in personnel selection. The model uses a Gaussian membership function with uncertain means to represent the uncertainty of each linguistic term. An expert was consulted to construct both the membership functions and the decision table (i.e. the rule base) for the model. The theoretical design of IT2FS follows Liang et al. [18], using the Python toolkit made available by Haghrah et al. [13].

The paper is organised as follows: Section 3 presents a step-by-step methodology for the personnel selection, starting from preprocessing resumes, then followed by clustering and shortlisting candidates for the interview round. Section 4 shows the performance of the model corresponding to evaluation metrics like average accuracy and F1 score. Finally, the paper concludes in Section 5.

3 Methodology

The I.T. resume dataset contains 1500 resumes. Each row presents the resume information of a candidate, whereas the columns represent the attributes of the resume. The candidate’s data attributes are merged and then passed to the preprocessing step³. Fig. 1 presents the dataflow diagram of the model.

Fig. 1

Dataflow diagram of the model.

Step-1: Preprocessing Module: The preprocessing of each row of the dataset involves:

Remove the hyperlinks, symbols, punctuation, brackets, etc.

Remove the stop words (a, an, the, etc.), as they are common to all text documents and carry no significance for processing the documents.

Convert the resume to lowercase. Lemmatization is then applied to convert each word to its root form.
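
The cleaning steps listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `clean_resume` and the short stop-word list are assumptions, and lemmatization is omitted because it needs an external language resource.

```python
import re

# Illustrative stop-word list (an assumption); a real run would use a fuller list.
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "in", "to", "for", "as", "is"}


def clean_resume(text):
    """Lowercase the text, strip links and symbols, and drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # hyperlinks
    text = re.sub(r"[^a-z\s]", " ", text)               # symbols, punctuation, digits
    return [tok for tok in text.split() if tok not in STOP_WORDS]


tokens = clean_resume("Looking for a Job as a Java developer! See https://github.com/...")
print(tokens)
```

Note that stripping every non-letter character also damages skill names such as "C++"; a production version would need to whitelist such terms before cleaning.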

The cleaned data is represented using a Term-Document matrix of vector space model, where each word is a dimension. The vector corresponding to a resume is an array containing the frequency of each word in the resume.

The set of unique words in the resumes forms the vocabulary of the resume dataset. There are approximately thirteen thousand unique words in the vocabulary, making it too large to assign each word a separate column. Therefore, columns containing very few entries are dropped. For that, a threshold value is chosen from the mean and median of the word frequencies. Words with a frequency below the threshold are removed, as dropping them does not lose significant information.
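
A small sketch of the term-document matrix with frequency-based pruning is given below. The text does not state exactly how the threshold is derived from the mean and median, so taking the smaller of the two is an assumption made here for illustration; `term_document_matrix` is a hypothetical helper name.

```python
from collections import Counter
from statistics import mean, median


def term_document_matrix(docs, threshold=None):
    """Build per-resume word-count rows, keeping only sufficiently frequent words.

    docs is a list of token lists, one per resume.
    """
    totals = Counter(tok for doc in docs for tok in doc)
    if threshold is None:
        # Assumption: use the smaller of the mean and median corpus frequency.
        threshold = min(mean(totals.values()), median(totals.values()))
    vocab = sorted(word for word, count in totals.items() if count >= threshold)
    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append([counts[word] for word in vocab])
    return vocab, rows


docs = [["java", "git", "java"], ["java", "maven"], ["python", "git"]]
vocab, rows = term_document_matrix(docs)
print(vocab)
print(rows)
```

Each row is the frequency vector of one resume over the pruned vocabulary, which is the representation the later modules consume.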

Step-2: Vectorisation Module: The selector provides the required skill set, which contains the domain-specific words to be considered for selecting a resume (i.e., the desired skill words). A data frame is maintained to capture the frequencies of the desired skill words and undesired skill words in each candidate’s resume. These frequencies are stored in two separate columns representing desired and undesired word frequency. The data frame tuples (vectors) are then plotted as shown in Fig. 2, where the Y-axis presents the frequency of desired skill words and the X-axis presents the frequency of undesired words.

Fig. 2

Resumes data frame representation after vectorisation.

Each resume is a data point in the graph showing a combination of desired and undesired skill sets. It is apparent from Fig. 2 that the number of desired skill words will be much less than other skill words in the resume. A weight is applied to skill sets in the resume for scaling. The scaling weight factor is computed as the percentage of desired skill words in the resume, giving more weightage to the resume with a higher percentage of these words. The scaled version of the graph is shown in Fig. 3.

Fig. 3

Resumes data frame representation after applying weights to desired and undesired skill words.
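
The vectorisation and scaling described in Step-2 might look like the sketch below. The text defines the weight as the percentage of desired skill words but does not say how it is applied to the axes, so multiplying only the desired-word count by the weight is an assumption; the function names are hypothetical, and multi-word skills are treated as single tokens for simplicity.

```python
def skill_vector(tokens, desired_words):
    """Return (undesired, desired) word counts, i.e. the (x, y) point of Fig. 2."""
    desired = sum(1 for tok in tokens if tok in desired_words)
    undesired = len(tokens) - desired
    return undesired, desired


def scaled_vector(tokens, desired_words):
    """Scale the point by the fraction of desired skill words in the resume."""
    undesired, desired = skill_vector(tokens, desired_words)
    total = undesired + desired
    weight = desired / total if total else 0.0
    # Assumption: the weight multiplies the desired-word axis only.
    return undesired, desired * weight


resume = ["java", "git", "android", "team"]
print(skill_vector(resume, {"java", "android"}))
print(scaled_vector(resume, {"java", "android"}))
```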

Step-3: Clustering Module: The next phase is to cluster the plotted data points into two clusters: eligible and non-eligible candidates for the interview process. A hard clustering model is not preferred, because it could lose candidates whose resumes have a similar, but not identical, skill set to the requirement. Selectors sometimes also give weightage to other skills that may not be directly related to the professional work. Therefore, a soft-clustering method is preferred in order to retain similar candidates lying in the boundary cases [3]. In the experiment, the Fuzzy C-means clustering technique addresses this issue and deals with the vagueness. It divides the data points into two clusters, and the fuzzifier parameter (m) is set to 2 as in [39]. Fig. 4 presents the output clusters.

Fig. 4

Resumes data frame representation after FCM where value of fuzzifier parameter=2.
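
For reference, a compact pure-Python sketch of standard Fuzzy C-means with Euclidean distance, two clusters and m = 2 is shown below. It is not the authors' code; the toy points, the random initialisation and the rule that the eligible cluster is the one whose centre has the larger desired-word coordinate are assumptions made for illustration.

```python
import math
import random


def fcm(points, c=2, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard FCM: returns (centres, membership matrix u[i][j])."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    u = []
    for _ in range(n):  # random memberships, each row normalised to sum to 1
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centres = []
    for _ in range(iters):
        centres = []
        for j in range(c):  # centre = membership^m weighted mean of the points
            w = [u[i][j] ** m for i in range(n)]
            centres.append([sum(w[i] * points[i][d] for i in range(n)) / sum(w)
                            for d in range(dim)])
        new_u = []
        for p in points:  # membership update from relative distances
            dists = [math.dist(p, ctr) for ctr in centres]
            if any(d == 0.0 for d in dists):
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                new_u.append([h / sum(hits) for h in hits])
                continue
            e = 2.0 / (m - 1.0)
            new_u.append([1.0 / sum((dists[j] / dists[k]) ** e for k in range(c))
                          for j in range(c)])
        shift = max(abs(new_u[i][j] - u[i][j]) for i in range(n) for j in range(c))
        u = new_u
        if shift < tol:
            break
    return centres, u


# Toy (undesired, desired) points; the first three resemble eligible resumes.
points = [(1, 8), (2, 9), (1.5, 8.5), (8, 1), (9, 2), (8.5, 1.5)]
centres, u = fcm(points)
eligible = max(range(2), key=lambda j: centres[j][1])  # assumed labelling rule
print([round(row[eligible], 3) for row in u])
```

Unlike a hard assignment, each resume keeps a graded membership in the eligible cluster, which is what allows boundary cases to be retained.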

All the non-eligible resume data points are discarded, while the eligible resume data points flow to the next module, the ‘Filtering and selection module’, as shown in Fig. 1. Two new columns are computed and added to the data frame of the eligible candidates’ resumes: (i) the percentage of the desired words and (ii) the percentage of the undesired words in the filtered data. In the filtering and selection module, IT2FS comes into the picture.

Step-4: Filtering and selection module: The idea is first to rank the filtered resumes based on the sum of weighted scores obtained using the percentages of desired and undesired skill words, after which the cutoff is applied to obtain the shortlisted resumes. The weights corresponding to the percentage of desired skill words, the percentage of undesired skill words and the cutoff are given by the selector.

In order to handle the uncertainty factor, a decision table (Table 1) consisting of two input linguistic variables (IT2FS membership function) and an output linguistic variable (IT2FS membership function) is used. The input linguistic variables have three possible linguistic values (i.e. Low, Medium and High), while the output linguistic variable supports five linguistic values (i.e. Very Low, Low, Moderate, High and Very High). Each linguistic variable is represented as an IT2FS membership function, particularly Gaussian MF with uncertain means, as shown in Fig. 5.

Table 1

Decision Table: Input-Output fuzzy inference rules

	input₁	input₂	output₁
Rule1	LOW	LOW	VERY LOW
Rule2	LOW	MEDIUM	LOW
Rule3	LOW	HIGH	MODERATE
Rule4	MEDIUM	LOW	LOW
Rule5	MEDIUM	MEDIUM	MODERATE
Rule6	MEDIUM	HIGH	HIGH
Rule7	HIGH	LOW	MODERATE
Rule8	HIGH	MEDIUM	HIGH
Rule9	HIGH	HIGH	VERY HIGH
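
Table 1 can be encoded directly as a lookup structure, as in the sketch below. The dictionary mirrors the nine rules of the table; the names `RULES`, `IN_LEVELS`, `OUT_LEVELS` and `consequent` are illustrative. The final check only highlights a regularity visible in the table: the output level index equals the sum of the two input level indices.

```python
IN_LEVELS = ["LOW", "MEDIUM", "HIGH"]
OUT_LEVELS = ["VERY LOW", "LOW", "MODERATE", "HIGH", "VERY HIGH"]

# (input1 level, input2 level) -> output level, copied from Table 1.
RULES = {
    ("LOW", "LOW"): "VERY LOW",
    ("LOW", "MEDIUM"): "LOW",
    ("LOW", "HIGH"): "MODERATE",
    ("MEDIUM", "LOW"): "LOW",
    ("MEDIUM", "MEDIUM"): "MODERATE",
    ("MEDIUM", "HIGH"): "HIGH",
    ("HIGH", "LOW"): "MODERATE",
    ("HIGH", "MEDIUM"): "HIGH",
    ("HIGH", "HIGH"): "VERY HIGH",
}


def consequent(level1, level2):
    """Look up the output linguistic value for a pair of input values."""
    return RULES[(level1, level2)]


# The table is additive in the level indices.
is_additive = all(
    OUT_LEVELS.index(out) == IN_LEVELS.index(a) + IN_LEVELS.index(b)
    for (a, b), out in RULES.items()
)
print(consequent("MEDIUM", "HIGH"), is_additive)
```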

Fig. 5

Gaussian MF with uncertain mean used as input and output membership function.
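
A Gaussian membership function with uncertain mean is commonly defined by a fixed standard deviation and a mean that ranges over an interval [m1, m2]; the upper and lower membership functions then bound the footprint of uncertainty. The sketch below follows that standard definition. The parameter values are hypothetical, since the expert-chosen parameters are not listed in the text.

```python
import math


def gauss(x, mean, sigma):
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def it2_gaussian_uncertain_mean(x, m1, m2, sigma):
    """Return (lower, upper) membership grades of x, for means in [m1, m2]."""
    if x < m1:
        upper = gauss(x, m1, sigma)
    elif x > m2:
        upper = gauss(x, m2, sigma)
    else:
        upper = 1.0  # some mean in [m1, m2] coincides with x
    # The lower bound uses the mean farthest from x.
    lower = gauss(x, m2, sigma) if x <= (m1 + m2) / 2 else gauss(x, m1, sigma)
    return lower, upper


# Hypothetical "Medium" term on a 0..1 scale.
print(it2_gaussian_uncertain_mean(0.5, m1=0.4, m2=0.6, sigma=0.1))
```

The gap between the two grades is what lets the model express uncertainty about the linguistic term itself, which a type-1 set with a single grade cannot do.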

The decision table is looked up three times, each time with a different pair of inputs, as follows:

input1: percentage of the desired skill words;

input2: selector’s decided weightage for the desired skill words in the range of 0 to 1; to get

output1: an output interval range corresponding to skill words (as shown in Fig. 6).

input1: percentage of the undesired skill words;

input2: selector’s decided weightage for the undesired skilled words in the range of 0 to 1; to get

output2: an output interval range corresponding to undesired skill words.

input1: cutoff ratio (i.e. ratio of filtered candidates to the total candidate during clustering);

input2: selector’s decided weightage for the cutoff ratio in the range of 0 to 1; to get

output3: an output interval range corresponding to cutoff ratio.

Fig. 6

Data flow of the selector weightage for desired words in the IT2FS model.
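
For each lookup, every rule is fired with the interval membership grades of its two inputs. The sketch below shows one common way to do this in a Mamdani-style interval type-2 system; the choice of the minimum t-norm and the helper names are assumptions, as the text does not specify them.

```python
def firing_interval(mu1, mu2):
    """Combine two (lower, upper) membership grades with the minimum t-norm."""
    (l1, u1), (l2, u2) = mu1, mu2
    return min(l1, l2), min(u1, u2)


def clip(consequent_mu, firing):
    """Clip a consequent's (lower, upper) grade by the rule's firing interval."""
    lo, up = consequent_mu
    f_lo, f_up = firing
    return min(lo, f_lo), min(up, f_up)


# Hypothetical grades: input1 is "HIGH" to degree [0.2, 0.6], input2 to [0.5, 0.9].
firing = firing_interval((0.2, 0.6), (0.5, 0.9))
print(firing)
print(clip((0.4, 1.0), firing))
```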

Following the Mamdani method [27], the inputs and their corresponding weights are passed to the decision table to obtain the corresponding IT2FS outputs (output1, output2, output3). Applying the KM algorithm [38] to each output gives a range for that output (i.e., [min. value, max. value]).
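
The type-reduction step can be illustrated with a small sketch of the iterative Karnik-Mendel (KM) procedure on a discretised output set. The function names and the sample grades are assumptions; the sketch expects the sample points in ascending order and at least one non-zero grade.

```python
from bisect import bisect_right


def _weighted_mean(xs, weights):
    return sum(x * w for x, w in zip(xs, weights)) / sum(weights)


def km_bound(xs, lower, upper, side, tol=1e-9, max_iter=100):
    """Left or right end of the centroid interval via KM iterations."""
    n = len(xs)
    weights = [(lo + up) / 2 for lo, up in zip(lower, upper)]
    c = _weighted_mean(xs, weights)
    for _ in range(max_iter):
        # Switch point k: the last sample at or below the current centroid.
        k = max(0, min(bisect_right(xs, c) - 1, n - 2))
        if side == "left":   # upper grades left of the switch, lower grades right
            weights = [upper[i] if i <= k else lower[i] for i in range(n)]
        else:                # lower grades left of the switch, upper grades right
            weights = [lower[i] if i <= k else upper[i] for i in range(n)]
        c_new = _weighted_mean(xs, weights)
        if abs(c_new - c) < tol:
            return c_new
        c = c_new
    return c


def km_type_reduce(xs, lower, upper):
    """Return the [min. value, max. value] range of an IT2FS output."""
    return km_bound(xs, lower, upper, "left"), km_bound(xs, lower, upper, "right")


xs = [0.0, 1.0, 2.0]
print(km_type_reduce(xs, lower=[0.5, 0.5, 0.5], upper=[1.0, 1.0, 1.0]))
```

The returned pair plays the role of the [min. value, max. value] range used in the scoring step.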

Output1 (i.e. IT2FS output corresponding to desired skill words) and output2 (i.e. IT2FS output corresponding to undesired skill words) are added to provide a score to each candidate resume (i.e., [min. score, max. score]), resulting in the candidates’ score list. This list is arranged in decreasing order of the maximum scores. Output3 (cutoff interval) is now applied on this list to finally obtain the percentage of the candidates shortlisted for the interview round from the candidates’ score list (i.e., [min. fraction of qualified candidates, max. fraction of qualified candidates]).
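
The scoring and cutoff step can be sketched as below. The candidate identifiers and interval values are made up for illustration, and rounding the number of shortlisted candidates upwards is an assumption, since the rounding rule is not stated.

```python
import math


def rank_candidates(outputs):
    """outputs maps candidate id -> (output1 interval, output2 interval).

    Each score interval is the element-wise sum of the two output intervals;
    candidates are sorted by decreasing maximum score.
    """
    scored = [(cid, (o1[0] + o2[0], o1[1] + o2[1]))
              for cid, (o1, o2) in outputs.items()]
    return sorted(scored, key=lambda item: item[1][1], reverse=True)


def shortlist(ranked, fraction):
    """Keep the top `fraction` of the ranked list (rounded up, by assumption)."""
    k = math.ceil(fraction * len(ranked))
    return [cid for cid, _ in ranked[:k]]


outputs = {
    "r1": ((0.6, 0.8), (0.1, 0.2)),
    "r2": ((0.3, 0.5), (0.2, 0.3)),
    "r3": ((0.7, 0.9), (0.1, 0.3)),
}
ranked = rank_candidates(outputs)
print(shortlist(ranked, 0.08))  # strict end of a cutoff interval
print(shortlist(ranked, 0.62))  # flexible end of a cutoff interval
```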

4 Result

In this section, the experimental results are presented to illustrate the application of fuzzy text clustering with IT2FS to personnel selection. Three cases are evaluated, depending on the number of skills that a selector has given for shortlisting:

Case-1 (a small number of skills): Frontend Web developer, HTML, confident.

Case-2 (a moderate number of skills): Java developer, object-oriented programmer, hardworking, adaptable, flexible.

Case-3 (a large number of skills): Software developer, Java, C, CPP, HTML, CSS, DBMS, MySQL, MongoDB, PHP, Operating System, Spring, Hibernate, MVC, team player, positive attitude, hardworking, adaptable.

Firstly, the expert labelled all the 1500 resumes as qualified or not qualified for each of the three cases. The result of the model for each case is presented in the confusion matrix, as shown in Table 2.

Table 2
Confusion matrix of the model, where TP=True-positive, TN=True-negative, FP=False-positive, FN=False-negative

	TP	TN	FP	FN
Case 1	46	499	14	41
Case 2	212	283	26	79
Case 3	161	387	16	36

The results show that the model performs less well when the required skill set is small and highly restricted (Case 1 has the lowest recall and F1 score). Conversely, when the required skill set is extensive and somewhat diversified, the results are considerably better. The proposed model shows an average accuracy of 88.2% and an average F1 score of 0.76 (Table 3).

Table 3

Result analysis of the model

	Accuracy	Recall	Precision	F1
Case 1	0.908	0.53	0.77	0.63
Case 2	0.825	0.728	0.890	0.80
Case 3	0.913	0.817	0.90	0.856
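
The entries of Table 3 follow from the confusion matrix in Table 2 through the usual definitions, which the sketch below recomputes. The counts are copied from Table 2; the helper name `metrics` is illustrative. Small differences in the last reported digit can arise depending on whether intermediate values are rounded before the F1 score is computed.

```python
# (TP, TN, FP, FN) per case, copied from Table 2.
CONFUSION = {
    "Case 1": (46, 499, 14, 41),
    "Case 2": (212, 283, 26, 79),
    "Case 3": (161, 387, 16, 36),
}


def metrics(tp, tn, fp, fn):
    """Return (accuracy, recall, precision, F1) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, recall, precision, f1


results = {case: metrics(*counts) for case, counts in CONFUSION.items()}
avg_accuracy = sum(r[0] for r in results.values()) / len(results)
avg_f1 = sum(r[3] for r in results.values()) / len(results)
print(round(avg_accuracy, 3), round(avg_f1, 2))
```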

The result of the proposed model (FCM + IT2FS) is compared with that of (K-means + IT2FS). The proposed method achieves an F1 score of 0.76, against 0.72 for the latter model. (FCM + IT2FS) performs better than (K-means + IT2FS) because FCM allows a larger number of boundary cases (uncertain cases) to pass, while K-means treats the boundary cases more strictly.

Finally, the results of IT2FS are compared to those of T1FS for each case in Table 4. In Case-1 of the experiment, the cutoff ratio obtained after clustering is 0.26, while the weightage given by the selector to the cutoff ratio is 0.2. Table 4 shows that IT2FS returns an interval of [0.08, 0.62], indicating the minimum and maximum fraction of candidates that can be shortlisted for the interview from the eligible candidates’ score list, whereas T1FS yields a single crisp value of 0.23 (23%). Thus, IT2FS provides a range for the fraction of qualified candidates from the score list (for Case-1, [8%, 62%]). Hence, it gives the selector control over the number of resumes to choose within the given range of shortlisted resumes.

Table 4

Interval type-2 and type-1 result comparison where CR = cutoff ratio and wCR=selector weightage for cutoff ratio

	CR	wCR	IT2FS	T1FS
Case 1	0.26	0.2	[0.08-0.62]	0.23
Case 2	0.34	0.2	[0.15-0.64]	0.29
Case 3	0.38	0.2	[0.21-0.68]	0.34
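
How a selector might turn an IT2FS interval from Table 4 into a concrete cutoff can be sketched as follows. The mode names and the eligible-list size of 100 are hypothetical; the strict, default and flexible choices correspond to the lower end, mid-value and upper end of the interval described in the Introduction.

```python
import math


def pick_fraction(interval, mode="default"):
    """Choose a cutoff fraction from an IT2FS output interval."""
    low, high = interval
    if mode == "strict":
        return low
    if mode == "flexible":
        return high
    return (low + high) / 2  # default: mid-value of the interval


case1 = (0.08, 0.62)   # IT2FS interval for Case 1 in Table 4
n_eligible = 100       # hypothetical size of the eligible score list
for mode in ("strict", "default", "flexible"):
    fraction = pick_fraction(case1, mode)
    # Round to 6 places before ceil to avoid floating-point overshoot.
    print(mode, math.ceil(round(fraction * n_eligible, 6)))
```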

5 Conclusion

In personnel selection, selectors are interested in candidates with the required professional skills and also give some weightage to other skills. This allows a candidate to be shortlisted for an interview even if some skills are missing. Such candidates constitute the boundary cases and are hence regarded as cases with uncertainty. In order to deal with these boundary cases (uncertain cases), FCM and IT2FS are used in the proposed methodology. The FCM algorithm is applied over the preprocessed dataset to create two clusters: resumes that are eligible for the requirement and resumes that are not. IT2FS gives the selector control over how many boundary cases (uncertain cases) pass from the eligible resumes. The experimental results give an accuracy of 88.2% and an F1 score of 0.76.

The proposed model (FCM + IT2FS) is compared with (K-means + IT2FS). The proposed method achieves an F1 score of 0.76, against 0.72 for the latter model, which indicates that our model is better at dealing with uncertain cases. It thus gives candidates some scope to be shortlisted even when they are missing some required professional skills. The experiment also showed that IT2FS provides a range for the number of selected resumes, in contrast to the single fixed value obtained using T1FS. This gives the selector control over the number of resumes to choose within the given range of shortlisted resumes. Hence, the model provides flexibility to selectors when dealing with uncertainty. Semi-automating the shortlisting of resumes also helps the selector to focus more on the interview task. From the application point of view, this approach can be extended to personnel selection in other domains. In colleges, the technique could be used to select among the SOPs (Statements of Purpose) received for various academic programmes.

Footnotes

Dataset-link:

Kaggle:

Source-Code:

References

Afshari

R.A.

, Nikolić

and Ćoćkalo

, Applications offuzzy decision making for personnel selection problem: A review, Journal of Engineering Management and Competitiveness (JEMC) 4(2) (2014), 68–77.

Baležentis

, Baležentis

and Brauers

W.K.

, Personnelselection based on computing with words and fuzzy multimoora, Expert Systems with Applications 39(9) (2012), 7961–7967.

Bezdek

J.C.

, Ehrlich

and Full

, Fcm: The fuzzy c-meansclustering algorithm, Computers & Geosciences 10(2-3) (1984), 191–203.

Bogdanovic

and Miletic

, Personnel evaluation and selection bymulticriteria decision making method, Economic computation andEconomic Cybernetics Studies and Research 48(3) (2014), 179–196.

Boran

F.E.

, Genç

and Akay

, Personnel selection based onintuitionistic fuzzy sets, Human Factors and Ergonomics inManufacturing & Service Industries 21(5) (2011), 493–503.

Chen

S.-M.

and Hong

J.-A.

, Multicriteria linguistic decision makingbased on hesitant fuzzy linguistic term sets and the aggregation offuzzy sets, Information Sciences 286 (2014), 63–74.

Choi

B.-I.

and Rhee

F.C.-H.

, Interval type-2 fuzzy membershipfunction generation methods for pattern recognition, Information Sciences 179(13) (2009), 2102–2122.

Chumklin

, Auephanwiriyakul

and Theera-Umpon

, Microcalcification detection in mammograms using interval type-2 fuzzy logic system with automatic membership function generation, In International Conference on Fuzzy Systems, pages 1–7. IEEE, (2010).

Deliktaş

and Üstün

Ö.

, Multiple criteria decisionmaking approach for industrial engineer selection using fuzzyahpfuzzy topsis, Anadolu University Journal of Science andTechnology A-Applied Sciences and Engineering 19(1) (2018), 58–82.

10.

Dursun

and Karsak

E.E.

, A fuzzy mcdm approach for personnelselection, Expert Systems with Applications 37(6) (2010), 4324–4330.

11.

Efe

and Kurt

, A systematic approach for an application ofpersonnel selection in assembly line balancing problem, International Transactions in Operational Research 25(3) (2018), 1001–1025.

12.

Golec

and Kahya

, A fuzzy model for competency-based employeeevaluation and selection, Computers & Industrial Engineering 52(1) (2007), 143–161.

13.

Haghrah

A.A.

and Ghaemi

, Pyit2fls:Anewpython toolkit for interval type 2 fuzzy logic systems, (2019).

14.

Heidary Dahooie

, Beheshti Jazan Abadi

, Vanaki

A.S.

and Firoozfar

H.R

, Competency-based it personnel selection using a hybridswara and aras-g methodology, Human Factors and Ergonomics inManufacturing & Service Industries 28(1) (2018), 5–16.

15.

Herman

, Prasad

and McGinnity

T.M.

, Support vector enhanced design of a t2fl approach to motor imagery-relatedeeg pattern recognition, In IEEE International Fuzzy Systems Conference, pages 1–6. IEEE, (2007).

16.

Hwang

and Rhee

F.C.-H.

, Uncertain fuzzy clustering: Intervaltype-2 fuzzy approach to-means, IEEE Transactions onFuzzy Systems 15(1) (2007), 107–120.

17.

Keršulienė

and Turskis

, A hybrid linguistic fuzzymultiple criteria group selection of a chief accounting officer, Journal of Business Economics and Management 15(2) (2014), 232–252.

18.

Liang

and Mendel

J.M.

, Interval type-2 fuzzy logic systems:theory and design, IEEE Transactions on Fuzzy systems 8(5) (2000), 535–550.

19.

Liu

, Rodriguez

R.M.

, Hagras

, Liu

, Qin

and Martínez

, Type-2 fuzzy envelope of hesitant fuzzy linguisticterm set: a new representation model of comparative linguisticexpression, IEEE Transactions on Fuzzy Systems 27(12) (2019), 2312–2326.

20.

Melin

and Castillo

, A review on the applications of type-2fuzzy logic in classification and pattern recognition, ExpertSystems with Applications 40(13) (2013), 5413–5423.

21.

Mendel

and Wu

, Perceptual computing: aiding people in making subjective judgments, volume 13. John Wiley & Sons, (2010).

22.

Mendel

J.M.

, John

R.I.

and Liu

, Interval type-2 fuzzy logicsystems made simple, IEEE transactions on fuzzy systems 14(6) (2006), 808–821.

23.

Mendoza

, Melín

and Castillo

, Interval type-2 fuzzylogic and modular neural networks for face recognition applications, Applied Soft Computing 9(4) (2009), 1377–1387.

24.

Mishra

, Barnwal

S.K.

, Malviya

, Mishra

and Tiwary

U.S.

, Prosodic feature selection of personality traits for job interview performance, In International Conference on Intelligent Systems Design and Applications, pages 673–682. Springer, (2018).

25.

Mishra

, Barnwal

S.K.

, Malviya

, Singh

, Singhand

and Tiwary

U.S.

, Computing with words through interval type-2 fuzzy sets for decision making environment, In International Conference on Intelligent Human Computer Interaction, pages 112–123. Springer, (2019).

26. Phong P.A. and Thien K.Q., Classification of cardiac arrhythmias using interval type-2 TSK fuzzy system, In 2009 International Conference on Knowledge and Systems Engineering, pages 1–6. IEEE, (2009).

27. Pourjavad and Mayorga R.V., A comparative study and measuring performance of manufacturing systems with Mamdani fuzzy inference system, Journal of Intelligent Manufacturing 30(3) (2019), 1085–1097.

28. Qin, Zhang, Liu, Li and Kou, Dynamic risk assessment of metro station with interval type-2 fuzzy set and TOPSIS method, Journal of Intelligent & Fuzzy Systems 29(1) (2015), 93–106.

29. Ramos, et al., Using TF-IDF to determine word relevance in document queries, In Proceedings of the first instructional conference on machine learning 242 (2003), 29–48.

30. Rodríguez R.M., Labella and Martínez, An overview on fuzzy modelling of complex linguistic preferences in decision making, International Journal of Computational Intelligence Systems 9(sup1) (2016), 81–94.

31. Rodriguez R.M., Martinez and Herrera, Hesitant fuzzy linguistic term sets for decision making, IEEE Transactions on Fuzzy Systems 20(1) (2011), 109–119.

32. Samanlioglu, Taskaya Y.E., Gulen U.C. and Cokcan, A fuzzy AHP–TOPSIS-based group decision-making approach to IT personnel selection, International Journal of Fuzzy Systems 20(5) (2018), 1576–1591.

33. Sharma and Bajaj, Accuracy comparison of vehicle classification system using interval type-2 fuzzy inference system, In 2010 3rd International Conference on Emerging Trends in Engineering and Technology, pages 85–90. IEEE, (2010).

34. Slonim, Tishby, et al., The power of word clusters for text classification, In 23rd European Colloquium on Information Retrieval Research, volume 1, page 200, (2001).

35. Tunali, Bilgin and Camurcu, An improved clustering algorithm for text mining: Multi-cluster spherical k-means, International Arab Journal of Information Technology (IAJIT) 13(1) (2016).

36. Wang, Xu and Zeng X.-J., Modeling complex linguistic expressions in qualitative decision making: An overview, Knowledge-Based Systems 144 (2018), 174–187.

37. Wei and Croft W.B., LDA-based document models for ad-hoc retrieval, In Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, (2006), 178–185.

38. Wu and Mendel J.M., Enhanced Karnik–Mendel algorithms, IEEE Transactions on Fuzzy Systems 17(4) (2008), 923–934.

39. Wu K.-L., Analysis of parameter selections for fuzzy c-means, Pattern Recognition 45(1) (2012), 407–415.

40. Zadeh L.A., From computing with numbers to computing with words. From manipulation of measurements to manipulation of perceptions, IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications 46(1) (1999), 105–119.

41. Zadeh L.A., Fuzzy logic = computing with words, In Computing with Words in Information/Intelligent Systems 1, pages 3–23. Springer, (1999).

42. Zhang W.-B., Hu H.-z. and Liu W.-j., Rules extraction of interval type-2 fuzzy logic system based on fuzzy c-means clustering, In Fourth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2007), volume 2, pages 256–260. IEEE, (2007).

43. Zhang, Zhao, Qin, Si and Zhou, Interval type-2 fuzzy TOPSIS approach with utility theory for subway station operational risk evaluation, Journal of Ambient Intelligence and Humanized Computing (2021), 1–15.

44. Zheng, Wang, Zhou and Zhang, A similarity measure between interval type-2 fuzzy sets, In 2010 IEEE International Conference on Mechatronics and Automation, pages 191–195. IEEE, (2010).

Table 2
Confusion matrix of the model, where TP = true positive, TN = true negative, FP = false positive, FN = false negative

Case      TP    TN    FP    FN
Case 1    46    499   14    41
Case 2    212   283   26    79
Case 3    161   387   16    36
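
The counts in Table 2 can be converted into the usual evaluation metrics. The following is a minimal Python sketch, assuming the standard definitions of accuracy, precision, recall and F1-score, and assuming that the reported averages are unweighted means over the three cases. The class and variable names (`Confusion`, `CASES`, `mean`) are illustrative and not taken from the paper. Under these assumptions the averages come out at roughly 88.2% accuracy and an F1-score of 0.76, in line with the figures quoted in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for one experimental case."""
    tp: int
    tn: int
    fp: int
    fn: int

    def accuracy(self) -> float:
        # Fraction of all resumes that were classified correctly.
        return (self.tp + self.tn) / (self.tp + self.tn + self.fp + self.fn)

    def precision(self) -> float:
        # Fraction of shortlisted resumes that were truly relevant.
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        # Fraction of relevant resumes that were shortlisted.
        return self.tp / (self.tp + self.fn)

    def f1(self) -> float:
        # Harmonic mean of precision and recall.
        return 2 * self.tp / (2 * self.tp + self.fp + self.fn)


# Counts copied from Table 2.
CASES = {
    "Case 1": Confusion(tp=46, tn=499, fp=14, fn=41),
    "Case 2": Confusion(tp=212, tn=283, fp=26, fn=79),
    "Case 3": Confusion(tp=161, tn=387, fp=16, fn=36),
}


def mean(values):
    """Unweighted arithmetic mean (assumed averaging scheme)."""
    values = list(values)
    return sum(values) / len(values)


avg_accuracy = mean(c.accuracy() for c in CASES.values())
avg_f1 = mean(c.f1() for c in CASES.values())

if __name__ == "__main__":
    for name, c in CASES.items():
        print(f"{name}: accuracy={c.accuracy():.3f} "
              f"precision={c.precision():.3f} "
              f"recall={c.recall():.3f} F1={c.f1():.3f}")
    print(f"Average accuracy={avg_accuracy:.3f}, average F1={avg_f1:.2f}")
```

Each case sums to 600 resumes, so weighting the cases by size would give the same averages here.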