Two Novel Distances for Neutrosophic Fuzzy Sets with Applications in Decision-Making and Clustering

Abstract

Neutrosophic fuzzy (NF) sets provide a powerful mathematical framework for modeling uncertainty, indeterminacy, and inconsistency features that are inherent in many real-world decision-making and data analysis problems. This study introduces two novel, axiomatically validated distance measures designed to improve analysis within NF environments. Existing distance measures often struggle with high indeterminacy and lack critical axiomatic properties, leading to unreliable outcomes in decision-making and pattern recognition. We benchmark their performance against state-of-the-art alternatives using a new NF pattern recognition algorithm. To demonstrate practical utility, the measures were integrated into the NF-TOPSIS method for multi-criteria decision-making and the NF-CLUSTER algorithm for data clustering, tested on both synthetic and real-world datasets. Experimental results confirm our measures consistently outperform existing ones, producing more discernible rankings and more cohesive clusters with an improved validity index. These findings establish a robust and effective framework for quantifying dissimilarity between NF sets, significantly advancing applications under high uncertainty and indeterminacy.

Keywords

neutrosophic fuzzy sets; distance measure; pattern recognition; multi-criteria decision-making; clustering analysis

1. Introduction

Effectively modeling the pervasive uncertainty inherent in real-world systems presents a significant challenge, frequently surpassing the descriptive power of classical set theory's binary logic. Zadeh introduced the concept of fuzzy sets in a foundational effort to address this limitation. These sets utilize graded membership values to quantify degrees of partial belonging, thereby accommodating vagueness and imprecision within system representations (Zadeh, 1965). Building upon this paradigm, Atanassov (1999) proposed intuitionistic fuzzy sets, which incorporate both a degree of membership and a distinct degree of non-membership. The sum of these two degrees is constrained not to exceed unity, implicitly defining a margin for hesitation or indeterminacy (Atanassov, 1999). Subsequently, to explicitly manage situations involving indeterminate and inconsistent information, Smarandache developed neutrosophic sets. These are characterized by three independent components representing degrees of truth, indeterminacy, and falsity (Smarandache, 2005). For enhanced practical applicability and computational manageability, Wang et al. later introduced single-valued neutrosophic sets. This formulation restricts the domains of the truth, indeterminacy, and falsity components to a standard numerical interval, thereby providing a more tractable tool for many applications grappling with complex uncertainties (Wang et al., 2010). This evolutionary trajectory underscores the continuous refinement of mathematical frameworks designed to represent increasingly sophisticated manifestations of uncertainty encountered across diverse scientific and engineering disciplines. Further exemplifying this trend, practical implementations of single-valued neutrosophic formalisms, such as those discussed by Das et al. under the term neutrosophic fuzzy (NF) sets and single-valued neutrosophic fuzzy sets, also adopt this standardized numerical constraint for their truth, indeterminacy, and falsity components, facilitating their application (Das et al., 2020).

The distinctive capacity of NF sets to address vagueness, inconsistency, and indeterminacy simultaneously has fueled their adoption across diverse scientific and engineering disciplines. Their enhanced representational power is particularly valuable in decision-making, notably within multi-criteria (MCDM) and multi-attribute group decision-making (MAGDM) (Das et al., 2020; Nafei et al., 2024). Exemplary applications include optimizing supplier selection under ambiguous criteria (Saeed & Rahman, 2021), selecting industrial machinery with imprecise specifications (Nafei et al., 2024), and analyzing efficiencies in complex systems like water treatment (Majumder et al., 2023). In medicine, they aid diagnosis, with tools like Hausdorff distance facilitating comparisons of uncertain patient data (Mathews et al., 2024). Their descriptive power also benefits pattern recognition and data clustering (DalKılıç & Demirtaş, 2025), and in finance for modeling asset returns and portfolio optimization, incorporating market indeterminacy through concepts like neutrosophic covariance (Boloș et al., 2023). Moreover, NF sets integrate with other theoretical frameworks, including soft set theory (Khalil et al., 2020) and hypersoft set structures for scenarios involving attributes with sub-attributes, as seen in IoT healthcare monitoring (Khalaf et al., 2025; Saeed & Rahman, 2021). These varied applications underscore NF sets’ growing importance and adaptability for tackling multifaceted real-world problems characterized by complex uncertainties. A vast array of tools has been developed, including adaptations of classical MCDM methods like TOPSIS and VIKOR, various aggregation operators to synthesize information, and numerous distance, similarity, and cross-entropy measures to quantify relationships between NF sets (DalKılıç & Demirtaş, 2025; Das et al., 2020). These tools provide a robust foundation for structured decision-making under complex neutrosophic uncertainty.

However, despite the proliferation of these methods, a critical analysis reveals several persistent challenges. While foundational tools like aggregation operators and decision-making frameworks have their own limitations, such as sensitivity to unreliable data, a particularly pressing issue lies within the core task of quantifying the difference between NF sets. Many existing distance and similarity measures, while valuable, suffer from significant weaknesses. They often produce counter-intuitive results in specific edge cases, such as when sets are highly dissimilar or contain conflicting information. Furthermore, they can be insensitive to subtle variations across the truth, indeterminacy, and falsity components, and may fail to satisfy all desirable mathematical properties under rigorous scrutiny. These deficiencies compromise the reliability, accuracy, and robustness of any NF set-based methodology, leading to ambiguous scores and potentially flawed outcomes in applications like medical diagnosis and decision-making (Abed et al., 2023). These shortcomings highlight a clear and urgent need for more theoretically sound, flexible, and intuitive measures.

Motivated by these challenges, this paper aims to address these limitations by introducing a novel approach to NF information processing. The main contributions of this study are as follows:

(1)
Proposing two novel distance measures for NF sets, rigorously validated through axiomatic analysis and comprehensively compared with state-of-the-art (SOTA) measures.
(2)
Developing a new pattern recognition algorithm in the NF context to evaluate the reliability of the proposed distances, with comparative analysis against SOTA distances.
(3)
Demonstrating the practical effectiveness of the proposed distances by integrating them into the NF-TOPSIS method for MCDM.
(4)
Validating the applicability of the proposed distances through their implementation in the NF-CLUSTER algorithm, with benchmarking against SOTA alternatives.

The rest of this paper is structured as follows. Section 2 reviews the fundamental concepts of NF sets and their application in decision-making and clustering. Section 3 introduces the two novel distance measures, along with a rigorous demonstration of their metric properties. Section 4 evaluates the proposed measures against SOTA distances through pattern recognition, TOPSIS, and clustering methodologies. Section 5 concludes with a summary of our contributions and a discussion of future research directions.
2. Background

Definition 2.1.
(Das et al., 2020) An NF set K over X is defined by $K = {x, \frac{μ_{K} (x)}{T_{K} (x), I_{K} (x), F_{K} (x)} : x \in X}$ where $μ_{K} (x)$ is the membership grade of x in X and $T_{K} (x), I_{K} (x), F_{K} (x)$ are real standard or non-standard subsets of $]^{-} 0; 1^{+} [$ , i.e., the real-function triples $T_{K}, I_{K}, F_{K} : X \to]^{-} 0; 1^{+} [$ indicate truth, indeterminacy, and falsity degrees of elements, respectively, with no particular limitation on their sum, i.e., $0^{-} ⩽ sup T_{K} + sup I_{K} + sup F_{K} ⩽ 3^{+}$ .

In this paper, let $N F S (X)$ be the set of all NF sets over $X = {x_{1}; x_{2}; \dots; x_{n}}$ with $n \in N^{*}$ , and let $K, L, M$ be NF sets over X.
Definition 2.2.
(Bui et al., 2023)
$K \cup L = {⟨ x, \frac{max {μ_{K} (x), μ_{L} (x)}}{max {T_{K} (x), T_{L} (x)}, min {I_{K} (x), I_{L} (x)}, min {F_{K} (x), F_{L} (x)}} ⟩ | \; x \in X}$ ,

$K \cap L = {⟨ x, \frac{min {μ_{K} (x), μ_{L} (x)}}{min {T_{K} (x), T_{L} (x)}, max {I_{K} (x), I_{L} (x)}, max {F_{K} (x), F_{L} (x)}} ⟩ | \; x \in X}$ ,

$K^{c} = {⟨ x, \frac{1 - μ_{K} (x)}{F_{K} (x), 1 - I_{K} (x), T_{K} (x)} ⟩ | \; x \in X}$ ,

$K \subseteq L \Leftrightarrow \forall x \in X, \frac{μ_{K} (x) \leq μ_{L} (x)}{T_{K} (x) \leq T_{L} (x), I_{K} (x) \geq I_{L} (x), F_{K} (x) \geq F_{L} (x)}$ ,

$K + L = {⟨ x, \frac{μ_{K} (x) + μ_{L} (x) - μ_{K} (x) μ_{L} (x)}{T_{K} (x) + T_{L} (x) - T_{K} (x) T_{L} (x), I_{K} (x) + I_{L} (x) - I_{K} (x) I_{L} (x), F_{K} (x) + F_{L} (x) - F_{K} (x) F_{L} (x)} ⟩ | \; x \in X}$ ,

$K L = {⟨ x, \frac{μ_{K} (x) μ_{L} (x)}{T_{K} (x) T_{L} (x), I_{K} (x) I_{L} (x), F_{K} (x) F_{L} (x)} ⟩ | \; x \in X}$ ,

$λ \cdot K = {⟨ x, \frac{1 - {(1 - μ_{K} (x))}^{λ}}{(1 - {(1 - T_{K} (x))}^{λ}), (1 - {(1 - I_{K} (x))}^{λ}), (1 - {(1 - F_{K} (x))}^{λ})} ⟩ | \; x \in X}, λ > 0$ .
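The set operations of Definition 2.2 act element by element, so they can be sketched on a single NF number. The following is a minimal illustration, assuming all four grades lie in $[0, 1]$; the class name `NFNumber`, the function names, and the sample values are hypothetical and not taken from the cited works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NFNumber:
    """Grades of one element x: membership mu and the triple (T, I, F)."""
    mu: float
    T: float
    I: float
    F: float


def union(a, b):
    # max on mu and T, min on I and F
    return NFNumber(max(a.mu, b.mu), max(a.T, b.T), min(a.I, b.I), min(a.F, b.F))


def intersection(a, b):
    # min on mu and T, max on I and F
    return NFNumber(min(a.mu, b.mu), min(a.T, b.T), max(a.I, b.I), max(a.F, b.F))


def complement(a):
    # mu -> 1 - mu, T and F swap places, I -> 1 - I
    return NFNumber(1 - a.mu, a.F, 1 - a.I, a.T)


def is_subset(a, b):
    # a is contained in b: smaller mu and T, larger I and F
    return a.mu <= b.mu and a.T <= b.T and a.I >= b.I and a.F >= b.F


# Illustrative values only
k = NFNumber(0.2, 0.1, 0.5, 0.4)
l = NFNumber(0.4, 0.2, 0.0, 0.3)
print(union(k, l), intersection(k, l), complement(k), is_subset(k, l))
```

Because `k` is contained in `l` here, the union returns `l` and the intersection returns `k`, which matches the usual lattice reading of the inclusion relation.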

Definition 2.3.
(Nafei et al., 2024) Let $ρ = (\frac{μ}{T, I, F})$ be a single-value NF number. The score function S of $ρ$ is defined by $S (ρ) = 0.25 [4 μ + (2 - I) (T - 0.5 I - F + 3) (2 - F)]$ .
Definition 2.4.
(Bui et al., 2023) A distance measure between K and L is a function $d : N F S (X) \times N F S (X) \to R$ satisfying
$0 \leq d (K, L) \leq 1$ ,

$d (K, L) = d (L, K)$ ,

$d (K, L) = 0 \Rightarrow K = L$ ,

$d (K, L) \leq d (K, M)$ and $d (L, M) \leq d (K, M)$ if $K \subset L \subset M$ .

Definition 2.5.
(Bui et al., 2023) For all $x \in X$ , let
$\begin{aligned} Δ_{1} & = | μ_{K} (x) - μ_{L} (x) |, \\ Δ_{2} & = | T_{K} (x) - T_{L} (x) |, \\ Δ_{3} & = | I_{K} (x) - I_{L} (x) |, \\ Δ_{4} & = | F_{K} (x) - F_{L} (x) | . \end{aligned}$

The SOTA measures between NF sets, expressed in terms of $Δ_{1}, \dots, Δ_{4}$ , are listed in Table 1.

Table 1.
The SOTA Measures in NF Context.

Measures	Calculation formula	Ref
Hamming	$d_{1} (K, L) = \frac{1}{4} [\sum_{x \in X} \sum_{i = 1}^{4} Δ_{i} (x)]$	(Das et al., 2020)
Normalised Hamming	$d_{2} (K, L) = \frac{1}{4 | X |} [\sum_{x \in X} \sum_{i = 1}^{4} Δ_{i} (x)]$	(Das et al., 2020)
Euclidean	$d_{3} (K, L) = {[\frac{1}{4} \sum_{x \in X} [\sum_{i = 1}^{4} Δ_{i}^{2} (x)]]}^{\frac{1}{2}}$	(Das et al., 2020)
Normalised Euclidean	$d_{4} (K, L) = {[\frac{1}{4 | X |} \sum_{x \in X} [\sum_{i = 1}^{4} Δ_{i}^{2} (x)]]}^{\frac{1}{2}}$	(AlShaqsi et al., 2024)
Manthan	$d_{5} (K, L) = {[\frac{1}{4} \sum_{x \in X} [\sum_{i = 1}^{4} Δ_{i} (x)]]}^{\frac{1}{2}}$	(AlShaqsi et al., 2024)
Hausdorff	$d_{6} (K, L) = \sum_{x \in X} max_{i = 1, 2, 3, 4} (Δ_{i} (x))$	(Mathews & Sebastian, 2023)
Normalised Hausdorff	$d_{7} (K, L) = \frac{1}{| X |} \sum_{x \in X} max_{i = 1, 2, 3, 4} (Δ_{i} (x))$	(Mathews & Sebastian, 2023)
Extended Hausdorff	$d_{8} (K, L) = \frac{1}{| X |} \sum_{x \in X} [\frac{\sum_{i = 1}^{4} Δ_{i} (x)}{8} + \frac{max_{i = 2, 3, 4} (Δ_{i} (x))}{3}]$	(Nafei et al., 2024)
Cosine	$d_{9} (K, L) = 1 - \frac{\sum_{x \in X} (μ_{K} (x) μ_{L} (x) + T_{K} (x) T_{L} (x) + I_{K} (x) I_{L} (x) + F_{K} (x) F_{L} (x))}{\sqrt{\sum_{x \in X} (μ_{K}^{2} (x) + T_{K}^{2} (x) + I_{K}^{2} (x) + F_{K}^{2} (x))} \cdot \sqrt{\sum_{x \in X} (μ_{L}^{2} (x) + T_{L}^{2} (x) + I_{L}^{2} (x) + F_{L}^{2} (x))}}$	(Biswas et al., 2015)
Similarity	$s_{k} = 1 - d_{k}$ where $k \in \overline{1, 8}$	(Bui et al., 2023)
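The formulas in Table 1 translate directly into code. The sketch below is one possible reading, assuming an NF set over $n$ elements is stored as a list of $n$ tuples `(mu, T, I, F)` in a fixed element order; this representation, the function names, and the sample values are illustrative choices rather than part of the cited definitions.

```python
import math


def deltas(K, L):
    """Per-element absolute differences (Delta_1, ..., Delta_4) of Definition 2.5."""
    return [[abs(a - b) for a, b in zip(k, l)] for k, l in zip(K, L)]


def hamming(K, L):                 # d_1
    return sum(sum(d) for d in deltas(K, L)) / 4


def normalised_hamming(K, L):      # d_2
    return hamming(K, L) / len(K)


def euclidean(K, L):               # d_3
    return math.sqrt(sum(v * v for d in deltas(K, L) for v in d) / 4)


def normalised_euclidean(K, L):    # d_4
    return euclidean(K, L) / math.sqrt(len(K))


def manthan(K, L):                 # d_5
    return math.sqrt(hamming(K, L))


def hausdorff(K, L):               # d_6
    return sum(max(d) for d in deltas(K, L))


def normalised_hausdorff(K, L):    # d_7
    return hausdorff(K, L) / len(K)


def extended_hausdorff(K, L):      # d_8 (the max runs over T, I, F only)
    return sum(sum(d) / 8 + max(d[1:]) / 3 for d in deltas(K, L)) / len(K)


def cosine(K, L):                  # d_9
    dot = sum(a * b for k, l in zip(K, L) for a, b in zip(k, l))
    norm_k = math.sqrt(sum(a * a for k in K for a in k))
    norm_l = math.sqrt(sum(b * b for l in L for b in l))
    return 1 - dot / (norm_k * norm_l)


# Illustrative two-element NF sets
K = [(0.2, 0.1, 0.5, 0.4), (0.5, 0.6, 0.4, 0.3)]
L = [(0.4, 0.2, 0.0, 0.3), (0.3, 0.5, 0.2, 0.1)]
for f in (hamming, normalised_hamming, euclidean, normalised_euclidean, manthan,
          hausdorff, normalised_hausdorff, extended_hausdorff, cosine):
    print(f.__name__, round(f(K, L), 4))
```

Note that the unnormalised variants ($d_{1}$, $d_{3}$, $d_{5}$, $d_{6}$) sum over all elements of $X$ without dividing by $| X |$, so their values can grow with the size of the universe.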

Many real-world problems are complex and involve multiple factors. Addressing them requires proposing candidate solutions and selecting the best one, which is difficult when the available information is uncertain. MCDM in an NF environment is an approach to solving problems characterized by such complexity and uncertainty. The TOPSIS method is widely applied to MCDM problems in an NF environment in fields including supply chain management, financial management, engineering and production, and human resource management. TOPSIS selects the alternative that is closest to the positive ideal solution and farthest from the negative ideal solution. It offers high accuracy in decision-making, processes multiple criteria simultaneously, and is flexible, easy to apply, and time-efficient. MCDM in an NF environment using TOPSIS (NF-TOPSIS) is therefore well suited to complex decision problems (Nafei et al., 2024). The NF-TOPSIS method is presented in detail in Algorithm 1.
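The core ranking step of TOPSIS can be sketched in a few lines. This is a generic illustration and not a reproduction of Algorithm 1. It assumes that each alternative is an NF set over the criteria, that all criteria are benefit criteria with equal weights, and that the positive (negative) ideal solution takes the largest (smallest) $μ$ and $T$ and the smallest (largest) $I$ and $F$ per criterion. The function names and the data are hypothetical, and the normalised Hamming distance is used only as a placeholder for whichever distance is being evaluated.

```python
def normalised_hamming(K, L):
    """d_2 from Table 1 for NF sets stored as lists of (mu, T, I, F) tuples."""
    total = sum(abs(a - b) for k, l in zip(K, L) for a, b in zip(k, l))
    return total / (4 * len(K))


def ideal_solutions(alternatives):
    """Assumed construction of the positive/negative ideal NF sets, criterion by criterion."""
    columns = list(zip(*alternatives.values()))  # one tuple of NF numbers per criterion
    pis = [(max(c[0] for c in col), max(c[1] for c in col),
            min(c[2] for c in col), min(c[3] for c in col)) for col in columns]
    nis = [(min(c[0] for c in col), min(c[1] for c in col),
            max(c[2] for c in col), max(c[3] for c in col)) for col in columns]
    return pis, nis


def closeness(alternatives, dist=normalised_hamming):
    """Relative closeness d^- / (d^+ + d^-); larger means closer to the positive ideal."""
    pis, nis = ideal_solutions(alternatives)
    scores = {}
    for name, a in alternatives.items():
        d_pos, d_neg = dist(a, pis), dist(a, nis)
        total = d_pos + d_neg
        # If every alternative is identical, both distances vanish; use a neutral score.
        scores[name] = d_neg / total if total > 0 else 0.5
    return scores


# Hypothetical decision matrix: three alternatives rated on two criteria
alternatives = {
    "A1": [(0.8, 0.7, 0.2, 0.1), (0.6, 0.5, 0.3, 0.2)],
    "A2": [(0.4, 0.3, 0.5, 0.6), (0.5, 0.4, 0.4, 0.3)],
    "A3": [(0.8, 0.7, 0.2, 0.1), (0.5, 0.4, 0.4, 0.3)],
}
scores = closeness(alternatives)
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

Passing a different function as `dist` is all that is needed to compare how the choice of distance changes the ranking.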

Clustering analysis is a foundational data mining method to group similar observations, thereby uncovering hidden patterns and underlying structures within data (Bui et al., 2021; Ejegwa et al., 2024; Guo & Sengur, 2015; Xu et al., 2008). Its diverse applications span pattern recognition, image analysis, bioinformatics, and machine learning (Hu et al., 2022; Khan et al., 2020). However, traditional algorithms like k-means struggle significantly with uncertain or imprecise datasets. To overcome these limitations, fuzzy clustering techniques were developed (Ciaramella et al., 2020; Zhang & Cai, 2021). These methods generally fall into two categories. Objective function-based approaches, such as the widely used Fuzzy C-means, aim to optimize a mathematical function reflecting clustering criteria, like minimizing intra-cluster distances (Bezdek, 1981; Ciaramella et al., 2020; Dunn, 1973; Zhang & Cai, 2021). Alternatively, relational matrix-based methods evaluate pairwise similarities or dissimilarities between data points (using measures like correlation coefficients Saeed & Rahman, 2021 or similarity metrics Ruspini et al., 2019; Ye, 2017), creating a matrix that is then partitioned into fuzzy clusters (Ciaramella et al., 2020; Zhang & Cai, 2021). While enhancing clustering in uncertain environments, these fuzzy approaches still face challenges when dealing with data containing inconsistencies and indeterminacy.

The NF clustering (NFC) problem is an important research area in the NF set theory, which focuses on clustering objects that are fuzzy and uncertain. Prominent methods for this task include neutrosophic C-means, neutrosophic K-means, and Hierarchical neutrosophic clustering. These methods extend traditional clustering techniques by incorporating the four components of the NF sets, allowing for a more comprehensive and accurate representation of complex and uncertain data. The importance of these methods lies in their ability to improve the accuracy and efficiency of clustering, especially in applications such as pattern recognition, image processing, and data analysis, where uncertainty and ambiguity play an essential role. Bui et al. (Bui et al., 2025) introduced a distance-measure-based algorithm using a relational matrix approach for clustering in the NF environment, which is presented in detail in Algorithm 2.

3. Novel Distances for Neutrosophic Fuzzy Sets

This section introduces two novel distances for NF sets. These new measures are demonstrated to be well-defined, and their properties are validated through mathematical reasoning. Let $K, L, M$ be three NF sets over $X = {x_{1}, x_{2}, \dots, x_{n}}$ .

Definition 3.1.
The real functions $d_{k}^{*}, k = 1, 2$ are distance measures where
$\begin{aligned} d_{1}^{*} (K, L) & = \frac{1}{| X |} \sum_{x \in X} (\frac{\sqrt{\sum_{i = 1}^{4} Δ_{i}^{2} (x)}}{4} + \frac{max {Δ_{i} (x) : i = 1, \dots, 4}}{2}) \end{aligned}$
(1)

$\begin{aligned} d_{2}^{*} (K, L) & = \frac{1}{| X |} \sum_{x \in X} sin (\frac{π \sqrt{\sum_{i = 1}^{4} Δ_{i}^{2} (x)}}{4}) \end{aligned}$
(2)
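Equations (1) and (2) can be evaluated with a few lines of code. The sketch below assumes an NF set is stored as a list of `(mu, T, I, F)` tuples in a fixed element order; the names `d1_star` and `d2_star` are illustrative labels for the two proposed distances.

```python
import math


def _deltas(k, l):
    """Absolute differences (Delta_1, ..., Delta_4) for one element x."""
    return [abs(a - b) for a, b in zip(k, l)]


def d1_star(K, L):
    """Equation (1): mean over x of sqrt(sum Delta_i^2) / 4 + max_i Delta_i / 2."""
    total = 0.0
    for k, l in zip(K, L):
        d = _deltas(k, l)
        total += math.sqrt(sum(v * v for v in d)) / 4 + max(d) / 2
    return total / len(K)


def d2_star(K, L):
    """Equation (2): mean over x of sin(pi * sqrt(sum Delta_i^2) / 4)."""
    total = 0.0
    for k, l in zip(K, L):
        d = _deltas(k, l)
        total += math.sin(math.pi * math.sqrt(sum(v * v for v in d)) / 4)
    return total / len(K)


# Two single-element NF sets that differ by 1 in every component
A = [(1.0, 1.0, 0.0, 0.0)]
B = [(0.0, 0.0, 1.0, 1.0)]
print(d1_star(A, B), d2_star(A, B))  # both evaluate to 1.0, the upper bound
```

Since each $Δ_{i} (x)$ is at most 1, the Euclidean term is at most 2, which is why the per-element contributions $\frac{2}{4} + \frac{1}{2}$ and $sin (\frac{π}{2})$ cannot exceed 1.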
Proposition 3.2.
For $k = 1, 2$ ,
$0 \leq d_{k}^{*} (K, L) \leq 1$ ,

$d_{k}^{*} (K, L) = d_{k}^{*} (L, K)$ ,

$d_{k}^{*} (K, L) = 0$ iff $K = L$ ,

$d_{k}^{*} (K, L) \leq d_{k}^{*} (K, M)$ and $d_{k}^{*} (L, M) \leq d_{k}^{*} (K, M)$ if $K \subset L \subset M$ .

Proof.
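For $k = 1, 2$ and for all $x \in X$ :
$0 \leq d_{k}^{*} (K, L) \leq 1$ as $0 \leq Δ_{i} (x) \leq 1$ where $i = 1, 2, 3, 4$ , so that $\sqrt{\sum_{i = 1}^{4} Δ_{i}^{2} (x)} \leq 2$ .

Obviously $d_{k}^{*} (K, L) = d_{k}^{*} (L, K)$ based on Definition 3.1.

If $d_{k}^{*} (K, L) = 0$ then $Δ_{i} (x) = 0$ where $i = 1, 2, 3, 4$ . Thus, $K = L$ . Conversely, if $K = L$ then every $Δ_{i} (x) = 0$ and $d_{k}^{*} (K, L) = 0$ .

If $K \subset L \subset M$ then $μ_{K} (x) \leq μ_{L} (x) \leq μ_{M} (x)$ , $T_{K} (x) \leq T_{L} (x) \leq T_{M} (x)$ , $I_{K} (x) \geq I_{L} (x) \geq I_{M} (x)$ and $F_{K} (x) \geq F_{L} (x) \geq F_{M} (x)$ , so each difference between K and L is bounded by the corresponding difference between K and M. Based on Definition 3.1: $d_{1}^{*} (K, L) \leq \frac{1}{| X |} \sum_{x \in X} [\frac{\sqrt{{| μ_{K} (x) - μ_{M} (x) |}^{2} + {| T_{K} (x) - T_{M} (x) |}^{2} + {| I_{K} (x) - I_{M} (x) |}^{2} + {| F_{K} (x) - F_{M} (x) |}^{2}}}{4} + \frac{max_{i} {Δ_{i} (x)_{K, M}}}{2}] = d_{1}^{*} (K, M)$ . So, $\; d_{1}^{*} (K, L) \leq d_{1}^{*} (K, M) \;$ and similarly, $\; d_{1}^{*} (L, M) \leq d_{1}^{*} (K, M)$ .
Moreover, since $sin$ is increasing on $[0, \frac{π}{2}]$ and its argument in Definition 3.1 never exceeds $\frac{π}{2}$ , $d_{2}^{*} (K, L)$

$\leq \frac{1}{| X |} \sum_{x \in X} sin [\frac{π \sqrt{{| μ_{K} (x) - μ_{M} (x) |}^{2} + {| T_{K} (x) - T_{M} (x) |}^{2} + {| I_{K} (x) - I_{M} (x) |}^{2} + {| F_{K} (x) - F_{M} (x) |}^{2}}}{4}] = d_{2}^{*} (K, M)$

So, $d_{2}^{*} (K, L) \leq d_{2}^{*} (K, M)$ and in the same way $d_{2}^{*} (L, M) \leq d_{2}^{*} (K, M)$ .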

Example 3.3.
Let $X = {x_{1}; x_{2}; x_{3}}$ and
$\begin{aligned} K & = {⟨ x_{1}, \frac{0.35}{0.46, 0.55, 0.6} ⟩; ⟨ x_{2}, \frac{0.1}{0.3, 0.6, 0.57} ⟩; ⟨ x_{3}, \frac{0.72}{0.67, 0.3, 0.4} ⟩}, \\ L & = {⟨ x_{1}, \frac{0.36}{0.51, 0.31, 0.4} ⟩; ⟨ x_{2}, \frac{0.25}{0.3, 0.48, 0.4} ⟩; ⟨ x_{3}, \frac{0.31}{0.6, 0.33, 0.46} ⟩} . \end{aligned}$

The measures between K and L are presented in Table 2.

Table 2.
The Measures Between K and L in Example 3.3.

$d_{1}^{*} (K, L)$	$d_{2}^{*} (K, L)$	$d_{1} (K, L)$	$d_{2} (K, L)$	$d_{3} (K, L)$
0.2195	0.257	0.3775	0.1258	0.293
$d_{4} (K, L)$	$d_{5} (K, L)$	$d_{6} (K, L)$	$d_{7} (K, L)$	$d_{8} (K, L)$	$d_{9} (K, L)$
0.1692	0.6144	0.82	0.2733	0.1163	0.0483

Example 3.4.
Let $K, L, M$ be NF sets over $X = {x_{1}, x_{2}}$ where
$K = {⟨ x_{1}, \frac{0}{0, 1, 1} ⟩; ⟨ x_{2}, \frac{0.2}{0.1, 0.5, 0.4} ⟩},$

$L = {⟨ x_{1}, \frac{1}{1, 1, 0} ⟩; ⟨ x_{2}, \frac{0.4}{0.2, 0, 0.3} ⟩},$

$M = {⟨ x_{1}, \frac{1}{1, 0, 0} ⟩; ⟨ x_{2}, \frac{0.6}{0.3, 0, 0.2} ⟩} .$
Table 3 summarizes a comparison of the distances calculated between these sets, contrasting our proposed measures with established SOTA methods.

Based on Table 3, the measures $d_{1}, d_{3}, d_{5}, d_{6}$ exhibit values exceeding 1. Consequently, these measures violate the boundedness condition $0 \leq d (K, L) \leq 1$ of Definition 2.4 and therefore cannot be considered valid distance measures in that sense. In contrast, both proposed distances satisfy the necessary axioms, confirming their validity.
Table 3.
The Measures Between $K, L, M$ in Example 3.4.

$d_{k}$	$d_{1}$	$d_{2}$	$d_{3}$	$d_{4}$	$d_{5}$	$d_{6}$
$d_{k} (K, L)$	0.9750	0.4875	0.9097	0.6432	0.9874	1.5000
$d_{k} (L, M)$	0.3500	0.1750	0.5148	0.3640	0.5916	1.2000
$d_{k} (K, M)$	1.3250	0.6625	1.0595	0.7492	1.1511	1.5000
$d_{k}$	$d_{1}^{*}$	$d_{2}^{*}$	$d_{7}$	$d_{8}$	$d_{9}$
$d_{k} (K, L)$	0.6611	0.7007	0.7500	0.4938	0.5712
$d_{k} (L, M)$	0.4556	0.4492	0.6000	0.2708	0.1755
$d_{k} (K, M)$	0.7125	0.7612	0.7500	0.5813	0.9071

Example 3.5.
Let $A, B, C \in N F S (X)$ where

$A = {⟨ x, \frac{1.0}{1.0, 0.0, 0.0} ⟩}$ , $B = {⟨ x, \frac{0.0}{0.0, 1.0, 1.0} ⟩}$ , and $C = {⟨ x, \frac{0.0}{1.0, 1.0, 1.0} ⟩}$ .
We have:

$d_{1}^{*} (A, B) = 1$ , $\; d_{1}^{*} (A, C) = 0.933$ , $d_{1}^{*} (B, C) = 0.75$ ;

$d_{2}^{*} (A, B) = 1$ , $\; d_{2}^{*} (A, C) = 0.9779$ , $d_{2}^{*} (B, C) = 0.7071$ .
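As a check on these values, the arithmetic for the pairs $(A, C)$ and $(B, C)$ can be written out from Equations (1) and (2). Since $X$ has a single element, the average over $X$ is trivial; the component differences are $(Δ_{1}, Δ_{2}, Δ_{3}, Δ_{4}) = (1, 0, 1, 1)$ for $(A, C)$ and $(0, 1, 0, 0)$ for $(B, C)$ .

$\begin{aligned} (A, C) : \quad & \sqrt{1 + 0 + 1 + 1} = \sqrt{3}, \quad \frac{\sqrt{3}}{4} + \frac{1}{2} \approx 0.933, \quad sin (\frac{π \sqrt{3}}{4}) \approx 0.9779, \\ (B, C) : \quad & \sqrt{0 + 1 + 0 + 0} = 1, \quad \frac{1}{4} + \frac{1}{2} = 0.75, \quad sin (\frac{π}{4}) \approx 0.7071 . \end{aligned}$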

A key advantage of the proposed measures is their robustness under extreme conditions. Specifically, they maintain their validity by ensuring the distance value remains within the $[0, 1]$ interval, even when the truth (T), indeterminacy (I), or falsity (F) components are maximized. This characteristic is vital as it guarantees stable and meaningful calculations in practical scenarios, particularly those characterized by complete certainty (e.g., T = 1 or F = 1) or complete ambiguity (I = 1).

In scenarios of maximal dissimilarity, where one entity possesses optimal attributes while another possesses the worst, the calculated distance correctly approaches 1. This behavior is not only intuitive but also confirms that the measure is well-behaved, consistently adhering to the $[0, 1]$ boundary. From a practical standpoint, this ensures that two entirely dissimilar alternatives are correctly identified as maximally distant, thereby validating the robustness and reliability of the proposed measures.
Example 3.6.
Let $A, B, C, D, E, F, G, H, I, K$ be NF sets over $X = {x_{1}, x_{2}, x_{3}}$ where
$A = {⟨ x_{1}, \frac{0.5}{0.6, 0.4, 0.3} ⟩; ⟨ x_{2}, \frac{0.3}{0.5, 0.2, 0.1} ⟩; ⟨ x_{3}, \frac{0.5}{0.3, 0.4, 0.3} ⟩},$

$B = {⟨ x_{1}, \frac{0.4}{0.5, 0.31, 0.38} ⟩; ⟨ x_{2}, \frac{0.1}{0.2, 0.41, 0.3} ⟩; ⟨ x_{3}, \frac{0.24}{0.36, 0.33, 0.44} ⟩},$

$C = {⟨ x_{1}, \frac{0.31}{0.52, 0.31, 0.55} ⟩; ⟨ x_{2}, \frac{0.1}{0.4, 0.4, 0.3} ⟩; ⟨ x_{3}, \frac{0.55}{0.65, 0.3, 0.3} ⟩},$

$D = {⟨ x_{1}, \frac{0.36}{0.51, 0.31, 0.438} ⟩; ⟨ x_{2}, \frac{0.25}{0.3, 0.48, 0.4} ⟩; ⟨ x_{3}, \frac{0.4}{0.4, 0.32, 0.44} ⟩},$

$E = {⟨ x_{1}, \frac{0.26}{0.39, 0.18, 0.42} ⟩; ⟨ x_{2}, \frac{0.6}{0.5, 0.32, 0.2} ⟩; ⟨ x_{3}, \frac{0.1}{0.4, 0.4, 0.3} ⟩},$

$F = {⟨ x_{1}, \frac{0.4}{0.4, 0.55, 0.5} ⟩; ⟨ x_{2}, \frac{0.12}{0.49, 0.4, 0.341} ⟩; ⟨ x_{3}, \frac{0.25}{0.5, 0.49, 0.5} ⟩},$

$G = {⟨ x_{1}, \frac{0.2}{0.35, 0.51, 0.44} ⟩; ⟨ x_{2}, \frac{0.5}{0.35, 0.42, 0.255} ⟩; ⟨ x_{3}, \frac{0.4}{0.4, 0.3, 0.2} ⟩},$

$H = {⟨ x_{1}, \frac{0.15}{0.4, 0.6, 0.33} ⟩; ⟨ x_{2}, \frac{0.24}{0.34, 0.33, 0.44} ⟩; ⟨ x_{3}, \frac{0.3}{0.4, 0.5, 0.39} ⟩},$

$I = {⟨ x_{1}, \frac{0.4}{0.4, 0.53, 0.45} ⟩; ⟨ x_{2}, \frac{0.5}{0.69, 0.3, 0.3} ⟩; ⟨ x_{3}, \frac{0.1}{0.4, 0.49, 0.3} ⟩},$

$K = {⟨ x_{1}, \frac{0.59}{0.5, 0.32, 0.2} ⟩; ⟨ x_{2}, \frac{0.8}{0.6, 0.5, 0.4} ⟩; ⟨ x_{3}, \frac{0.59}{0.5, 0.32, 0.2} ⟩} .$

Table 4 provides a comparison of the distances between set A and the other sets, contrasting the results from our proposed measures with those from established SOTA techniques.

Table 4.
The Measures Between A and the Other Sets in Example 3.6.

$d_{k}$	$d_{k} (A, B)$	$d_{k} (A, C)$	$d_{k} (A, D)$	$d_{k} (A, E)$	$d_{k} (A, F)$	$d_{k} (A, G)$	$d_{k} (A, H)$	$d_{k} (A, I)$	$d_{k} (A, K)$
$d_{1}$	0.4525	0.4525	0.4270	0.4525	0.5053	0.4813	0.4900	0.4650	0.5100
$d_{2}$	0.1508	0.1508	0.1432	0.1508	0.1684	0.1604	0.1633	0.1550	0.1700
$d_{3}$	0.2934	0.3074	0.2792	0.3350	0.3137	0.2996	0.3291	0.3137	0.3671
$d_{4}$	0.1694	0.1775	0.1612	0.1934	0.1811	0.1730	0.1900	0.1811	0.2119
$d_{5}$	0.6727	0.6727	0.6535	0.6727	0.7108	0.6937	0.7000	0.6819	0.7141
$d_{6}$	0.6600	0.8000	0.5800	0.9400	0.6910	0.6200	0.8900	0.8000	0.8000
$d_{7}$	0.2200	0.2667	0.1933	0.3133	0.2303	0.2067	0.2967	0.2667	0.2667
$d_{8}$	0.1354	0.1643	0.1354	0.1243	0.1554	0.1435	0.1528	0.1331	0.1517
$d_{9}$	0.0983	0.0944	0.0848	0.1284	0.0936	0.1011	0.1196	0.0934	0.0912
$d_{k}^{*}$	$d_{k}^{*} (A, B)$	$d_{k}^{*} (A, C)$	$d_{k}^{*} (A, D)$	$d_{k}^{*} (A, E)$	$d_{k}^{*} (A, F)$	$d_{k}^{*} (A, G)$	$d_{k}^{*} (A, H)$	$d_{k}^{*} (A, I)$	$d_{k}^{*} (A, K)$
$d_{1}^{*}$	0.1898	0.2200	0.1723	0.2530	0.2056	0.1864	0.2412	0.2230	0.2253
$d_{2}^{*}$	0.2471	0.2750	0.2345	0.2980	0.2802	0.2572	0.2870	0.2778	0.2804

In Example 3.6, set A is considered a fixed reference, against which sets $B, C, D, E, F, G, H, I$ and K are compared. Table 4 shows that several SOTA measures cannot separate some of these sets. The Hamming-type measures $d_{1}$ , $d_{2}$ and $d_{5}$ return the same value for B, C and E (for example, $d_{1} = 0.4525$ ), the Euclidean measures $d_{3}$ and $d_{4}$ return the same value for F and I, the Hausdorff measures $d_{6}$ and $d_{7}$ return the same value for C, I and K, and $d_{8}$ returns the same value for B and D. Such ties make the resulting rankings ambiguous and complicate decision-making. Conversely, each of the two proposed distances assigns nine distinct values to the nine sets, so they discriminate between alternatives that these SOTA measures treat as equidistant from A.

The identical results produced by our two proposed measures in Example 3.4 are not coincidental but rather an intentional demonstration of their behavior under the specific symmetric conditions of the T, I, and F components. This scenario was deliberately constructed to illustrate this theoretical property. Crucially, this response to symmetry distinguishes our measures from other SOTA distances; under the same conditions, the Hamming ( $d_{1}$ , $d_{2}$ ) and Euclidean ( $d_{3}$ , $d_{4}$ ) measures do not yield equal values. Therefore, while our formulas produce distinct results in general, non-symmetric cases, they share a unique property of converging under symmetry that is not present in other established metrics.
Proposition 3.7.
For $k = 1, 2$

$d_{k}^{*} (K, M) \leq d_{k}^{*} (K, L) + d_{k}^{*} (L, M)$ .
Proof.
For all $x \in X,$ let $a_{i} (x) = Δ_{i} (x)_{K, L}$ ; $b_{i} (x) = Δ_{i} (x)_{L, M}$ ; $c_{i} (x) = Δ_{i} (x)_{K, M}$ . Since $c_{i} (x) \leq a_{i} (x) + b_{i} (x)$ for each i by the triangle inequality for absolute values, we get
$max_{i} {c_{i} (x)} \leq max_{i} {a_{i} (x)} + max_{i} {b_{i} (x)}$
and, by Minkowski's inequality,
$\sqrt{\sum_{i = 1}^{4} c_{i}^{2} (x)} \leq \sqrt{\sum_{i = 1}^{4} a_{i}^{2} (x)} + \sqrt{\sum_{i = 1}^{4} b_{i}^{2} (x)} .$
Thus, for all $x \in X,$
$\begin{array}{l} \nabla_{1} (K, M; x) = \frac{\sqrt{\sum c_{i}^{2} (x)}}{4} + \frac{max_{i} {c_{i} (x)}}{2} \\ \leq \frac{\sqrt{\sum a_{i}^{2} (x)} + \sqrt{\sum b_{i}^{2} (x)}}{4} + \frac{max_{i} {a_{i} (x)} + max_{i} {b_{i} (x)}}{2} \\ = (\frac{\sqrt{\sum a_{i}^{2} (x)}}{4} + \frac{max_{i} {a_{i} (x)}}{2}) + (\frac{\sqrt{\sum b_{i}^{2} (x)}}{4} + \frac{max_{i} {b_{i} (x)}}{2}) \\ = \nabla_{1} (K, L; x) + \nabla_{1} (L, M; x) . \end{array}$
in which $\nabla_{1} (K, L; x) = \frac{\sqrt{\sum_{i = 1}^{4} Δ_{i}^{2} {(x)}_{K, L}}}{4} + \frac{max_{i} {Δ_{i} {(x)}_{K, L}}}{2}$ .

Hence, $d_{1}^{*} (K, M) \leq d_{1}^{*} (K, L) + d_{1}^{*} (L, M)$ .
$\begin{array}{l} \nabla_{2} (K, M; x) = sin (\frac{\sqrt{\sum c_{i}^{2} (x)} π}{4}) \\ \leq sin (\frac{\sqrt{\sum a_{i}^{2} (x)} π}{4}) + sin (\frac{\sqrt{\sum b_{i}^{2} (x)} π}{4}) \\ = \nabla_{2} (K, L; x) + \nabla_{2} (L, M; x) . \end{array}$

In which, $\nabla_{2} (K, L; x) = sin (\frac{π \sqrt{\sum_{i = 1}^{4} Δ_{i}^{2} {(x)}_{K, L}}}{4})$ . The inequality holds because every argument lies in $[0, \frac{π}{2}]$ , where $sin$ is increasing and concave with $sin 0 = 0$ and hence subadditive; if the two arguments on the right sum to more than $\frac{π}{2}$ , the bound $sin t \geq \frac{2}{π} t$ makes the right-hand side exceed 1.

Hence, $d_{2}^{*} (K, M) \leq d_{2}^{*} (K, L) + d_{2}^{*} (L, M)$ .
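A randomized check is not a proof, but it is a cheap way to guard an implementation of the two proposed distances against regressions. The sketch below draws random NF sets with all grades in $[0, 1]$ and counts violations of the triangle inequality. The tuple representation, the function names, the number of trials, and the tolerance are illustrative choices.

```python
import math
import random


def d1_star(K, L):
    """Equation (1) for NF sets stored as lists of (mu, T, I, F) tuples."""
    total = 0.0
    for k, l in zip(K, L):
        d = [abs(a - b) for a, b in zip(k, l)]
        total += math.sqrt(sum(v * v for v in d)) / 4 + max(d) / 2
    return total / len(K)


def d2_star(K, L):
    """Equation (2) for NF sets stored as lists of (mu, T, I, F) tuples."""
    total = 0.0
    for k, l in zip(K, L):
        d = [abs(a - b) for a, b in zip(k, l)]
        total += math.sin(math.pi * math.sqrt(sum(v * v for v in d)) / 4)
    return total / len(K)


def random_nf_set(n, rng):
    """n elements, each with four independent grades drawn uniformly from [0, 1)."""
    return [tuple(rng.random() for _ in range(4)) for _ in range(n)]


def count_triangle_violations(dist, trials=2000, n=3, seed=0, tol=1e-12):
    """Count triples (K, L, M) with dist(K, M) > dist(K, L) + dist(L, M)."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(trials):
        K, L, M = (random_nf_set(n, rng) for _ in range(3))
        if dist(K, M) > dist(K, L) + dist(L, M) + tol:
            violations += 1
    return violations


print(count_triangle_violations(d1_star), count_triangle_violations(d2_star))
```

The small tolerance only absorbs floating-point rounding; a count above zero would point to an implementation error rather than to a failure of the proposition.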
Proposition 3.8.
For $k = 1, 2$ and $K, L, M \in N F S (X)$
1.
$d_{k}^{*} (K, L) = d_{k}^{*} (K^{c}, L^{c})$ ,
2.
$d_{k}^{*} (K \cap M, L \cap M) \leq d_{k}^{*} (K, L)$ ,
3.
$d_{k}^{*} (K \cup M, L \cup M) \leq d_{k}^{*} (K, L)$ .

Proof.
1.
The result is directly deduced from Definitions 2.2, 2.4 and 3.1: complementation swaps the roles of T and F and replaces $μ$ and I by $1 - μ$ and $1 - I$ , which leaves the set of differences $Δ_{i} (x)$ unchanged.
2.
The result is directly deduced from Definitions 2.2, 2.4, 3.1 and the following lemma: for any real numbers $a, b, c$ ,
$| min (a, c) - min (b, c) | \leq | a - b |$ and $| max (a, c) - max (b, c) | \leq | a - b |$ .
3.
The proof of (3) is similar to that of (2).
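The three properties of Proposition 3.8 can be spot-checked numerically. The sketch below assumes NF sets stored as lists of `(mu, T, I, F)` tuples and implements complement, intersection and union as in Definition 2.2; the function names, the sampling scheme, and the tolerances are illustrative.

```python
import math
import random


def d1_star(K, L):
    """Equation (1)."""
    total = 0.0
    for k, l in zip(K, L):
        d = [abs(a - b) for a, b in zip(k, l)]
        total += math.sqrt(sum(v * v for v in d)) / 4 + max(d) / 2
    return total / len(K)


def d2_star(K, L):
    """Equation (2)."""
    total = 0.0
    for k, l in zip(K, L):
        d = [abs(a - b) for a, b in zip(k, l)]
        total += math.sin(math.pi * math.sqrt(sum(v * v for v in d)) / 4)
    return total / len(K)


def complement(K):
    return [(1 - mu, F, 1 - I, T) for mu, T, I, F in K]


def intersection(K, M):
    return [(min(k[0], m[0]), min(k[1], m[1]), max(k[2], m[2]), max(k[3], m[3]))
            for k, m in zip(K, M)]


def union(K, M):
    return [(max(k[0], m[0]), max(k[1], m[1]), min(k[2], m[2]), min(k[3], m[3]))
            for k, m in zip(K, M)]


def proposition_holds(trials=1000, n=3, seed=1, tol=1e-12):
    """Return True if no sampled triple violates any of the three properties."""
    rng = random.Random(seed)
    for _ in range(trials):
        K, L, M = ([tuple(rng.random() for _ in range(4)) for _ in range(n)]
                   for _ in range(3))
        for dist in (d1_star, d2_star):
            base = dist(K, L)
            if not math.isclose(base, dist(complement(K), complement(L)), abs_tol=1e-9):
                return False
            if dist(intersection(K, M), intersection(L, M)) > base + tol:
                return False
            if dist(union(K, M), union(L, M)) > base + tol:
                return False
    return True


print(proposition_holds())
```

The complement check uses `math.isclose` because $1 - μ$ introduces rounding of the order of machine precision, whereas the min/max operations only select existing values.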

Definition 3.9
Two measures $D_{1}$ and $D_{2}$ on $N F S (X)$ are topologically equivalent if there exist positive constants $c_{1}, c_{2}$ such that for all $K, L \in N F S (X) :$

$c_{1} D_{1} (K, L) \leq D_{2} (K, L) \leq c_{2} D_{1} (K, L)$ .
Proposition 3.10.
With the SOTA measures in Table 1 (Definition 2.5),
1.
$d_{1}^{*}$ is equivalent to $d_{2}^{*}$ .
2.
$d_{1}^{*}$ is equivalent to $d_{p}$ for $p = 1, 2, 3, 4, 6, 7, 8.$
3.
$d_{2}^{*}$ is equivalent to $d_{p}$ for $p = 1, 2, 3, 4, 6, 7, 8.$
4.
$d_{5}$ is not equivalent to $d_{1}^{*}$ , $d_{2}^{*}$ .

Proof.
For $K, L \in N F S (X)$ ,
1.
For all $x \in X$ , define: $A (x) = max_{i} Δ_{i} (x), B (x) = \sqrt{\sum_{i = 1}^{4} Δ_{i}^{2} (x)}$ . As $Δ_{i} (x) \leq A (x)$ , then $\sum_{i = 1}^{4} Δ_{i}^{2} (x) \leq 4 A {(x)}^{2}$ . Thus, $A (x) \leq B (x) \leq 2 A (x)$ . Setting $t = \frac{π B (x)}{4}$ with $B (x) \in [0; 2]$ , so that $t \in [0; \frac{π}{2}]$ , and using the inequalities $sin t \leq t$ and $sin t \geq \frac{2}{π} t$ , we can deduce that: $\frac{B (x)}{2} \leq sin (\frac{π B (x)}{4}) \leq \frac{π B (x)}{4}$ .
We have: $B (x) \leq 2 A (x)$

$\begin{aligned} \Rightarrow \frac{B (x)}{4} & \leq \frac{A (x)}{2} \end{aligned}$

$\begin{aligned} \Rightarrow \frac{B (x)}{2} & \leq \frac{B (x)}{4} + \frac{A (x)}{2} \end{aligned}$

$\begin{aligned} \Rightarrow \frac{π B (x)}{4} & \leq \frac{π}{2} (\frac{B (x)}{4} + \frac{A (x)}{2}) \end{aligned}$

Thus, $d_{2}^{*} (K, L) \leq \frac{π}{2} d_{1}^{*} (K, L)$ .

On the other hand, since $A (x) \leq B (x)$ , $\frac{2}{3} (\frac{B (x)}{4} + \frac{A (x)}{2}) \leq \frac{B (x)}{2}$ . Thus, $\frac{2}{3} d_{1}^{*} (K, L) \leq d_{2}^{*} (K, L)$ .

Combining the two bounds, we get $\frac{2}{3} d_{1}^{*} (K, L) \leq d_{2}^{*} (K, L) \leq \frac{π}{2} d_{1}^{*} (K, L)$ .
2.
Through numerical calculations, we get:
$\frac{1}{| X |} d_{1} (K, L) \leq d_{1}^{*} (K, L) \leq \frac{3}{| X |} d_{1} (K, L)$

$d_{2} (K, L) \leq d_{1}^{*} (K, L) \leq 3 d_{2} (K, L)$

$\frac{1}{| X |} d_{3} (K, L) \leq d_{1}^{*} (K, L) \leq \frac{3}{2 \sqrt{| X |}} d_{3} (K, L)$

$\frac{1}{\sqrt{| X |}} d_{4} (K, L) \leq d_{1}^{*} (K, L) \leq \frac{3}{2} d_{4} (K, L)$

$\frac{1}{4 | X |} d_{6} (K, L) \leq d_{1}^{*} (K, L) \leq \frac{3}{4 | X |} d_{6} (K, L)$

$\frac{1}{4} d_{7} (K, L) \leq d_{1}^{*} (K, L) \leq \frac{3}{4} d_{7} (K, L)$

$\frac{6}{11} d_{8} (K, L) \leq d_{1}^{*} (K, L) \leq 6 d_{8} (K, L)$

$\frac{1}{2} d_{9} (K, L) \leq d_{1}^{*} (K, L) \leq \frac{3}{4} d_{9} (K, L)$

3.
The results are directly deduced from items (1) and (2).
4.
Let $X = {x_{0}}$ and $L = ⟨ x_{0}, \frac{0.5}{0.5, 0.5, 0.5} ⟩$ . Consider a sequence of NF sets $K_{n}$ defined as $K_{n} = ⟨ x_{0}, \frac{0.5 + 1 / n}{0.5, 0.5, 0.5} ⟩$ where $n \in Z^{+}$ such that $n \geq 2$ . Then $d_{1}^{*} (K_{n}, L) = \frac{3}{4 n}$ and $d_{5} (K_{n}, L) = \frac{1}{2 \sqrt{n}}$ . As $n \to \infty, 1 / n \to 0$ . This implies $\lim_{n \to \infty} \frac{d_{5} (K_{n}, L)}{d_{1}^{*} (K_{n}, L)} = \lim_{n \to \infty} \frac{2 \sqrt{n}}{3} = + \infty$ . Hence no constant $c_{2}$ with $d_{5} \leq c_{2} d_{1}^{*}$ exists, and by item (1) the same holds for $d_{2}^{*}$ .
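Both the two-sided bound of item (1) and the unbounded ratio of item (4) are easy to observe numerically. The sketch below assumes NF sets stored as lists of `(mu, T, I, F)` tuples; the function names, sample sizes, and tolerance are illustrative, and `manthan` follows the formula for $d_{5}$ in Table 1.

```python
import math
import random


def d1_star(K, L):
    """Equation (1)."""
    total = 0.0
    for k, l in zip(K, L):
        d = [abs(a - b) for a, b in zip(k, l)]
        total += math.sqrt(sum(v * v for v in d)) / 4 + max(d) / 2
    return total / len(K)


def d2_star(K, L):
    """Equation (2)."""
    total = 0.0
    for k, l in zip(K, L):
        d = [abs(a - b) for a, b in zip(k, l)]
        total += math.sin(math.pi * math.sqrt(sum(v * v for v in d)) / 4)
    return total / len(K)


def manthan(K, L):
    """d_5 from Table 1: square root of the (unnormalised) Hamming distance."""
    return math.sqrt(sum(abs(a - b) for k, l in zip(K, L) for a, b in zip(k, l)) / 4)


def bounds_hold(trials=2000, n=3, seed=0, tol=1e-12):
    """Check (2/3) d1* <= d2* <= (pi/2) d1* on random NF sets."""
    rng = random.Random(seed)
    for _ in range(trials):
        K, L = ([tuple(rng.random() for _ in range(4)) for _ in range(n)]
                for _ in range(2))
        d1, d2 = d1_star(K, L), d2_star(K, L)
        if not (2 / 3 * d1 - tol <= d2 <= math.pi / 2 * d1 + tol):
            return False
    return True


def ratio(n):
    """d_5(K_n, L) / d1*(K_n, L) for the sequence used in item (4)."""
    L = [(0.5, 0.5, 0.5, 0.5)]
    K_n = [(0.5 + 1 / n, 0.5, 0.5, 0.5)]
    return manthan(K_n, L) / d1_star(K_n, L)


print(bounds_hold())
for n in (10, 100, 10_000, 1_000_000):
    print(n, ratio(n), 2 * math.sqrt(n) / 3)  # the two columns agree up to rounding
```

The ratio grows like $\frac{2 \sqrt{n}}{3}$ , so no single constant can bound $d_{5}$ by a multiple of the first proposed distance.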

Corollary 3.11.
Let $K \in NFS (X),$
If $K \neq \emptyset_{N F}$ , then $d_{1}^{*} (K, \emptyset_{N F}) > 0$ and $d_{2}^{*} (K, \emptyset_{N F}) > 0$ .

$d_{1}^{*} (K, \emptyset_{N F}) = d_{1}^{*} (K^{c}, X_{N F})$

$d_{2}^{*} (K, \emptyset_{N F}) = d_{2}^{*} (K^{c}, X_{N F})$
in which $\emptyset_{NF} = {⟨ x, \frac{0}{0, 1, 1} ⟩ : x \in X}$ and $X_{NF} = {⟨ x, \frac{1}{1, 0, 0} ⟩ : x \in X}$ .
Proof.
The results are directly deduced from Definitions 2.2 (3) and 3.1 and Proposition 3.2.
Corollary 3.12.
For $p = 1, 2, \dots, 8$ and $K, L, M \in N F S (X)$ ,

$d_{p} (K, M) \leq d_{p} (K, L) + d_{p} (L, M)$ .
Proof.
The result is directly deduced from Definition 2.5 and Proposition 3.1.
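As a numerical spot-check of Corollary 3.12, the following sketch tests the triangle inequality on the distance values reported in this paper for the NF sets K, L, and M. It assumes that the unlabeled row of that table holds $d_{k} (K, L)$; the helper name `satisfies_triangle` and the tolerance are illustrative choices and not part of the original method.

```python
# Distance values copied from the paper's table for the NF sets K, L, M.
# Assumption: the unlabeled row of that table is d_k(K, L).
# Each entry is (d_k(K, L), d_k(L, M), d_k(K, M)).
TABLE = {
    "d1": (0.9750, 0.3500, 1.3250),
    "d2": (0.4875, 0.1750, 0.6625),
    "d3": (0.9097, 0.5148, 1.0595),
    "d4": (0.6432, 0.3640, 0.7492),
    "d5": (0.9874, 0.5916, 1.1511),
    "d6": (1.5000, 1.2000, 1.5000),
    "d7": (0.7500, 0.6000, 0.7500),
    "d8": (0.4938, 0.2708, 0.5813),
    "d9": (0.5712, 0.1755, 0.9071),
    "d1*": (0.6611, 0.4556, 0.7125),
    "d2*": (0.7007, 0.4492, 0.7612),
}


def satisfies_triangle(d_kl, d_lm, d_km, tol=1e-9):
    """Return True if d(K, M) <= d(K, L) + d(L, M), up to a rounding tolerance."""
    return d_km <= d_kl + d_lm + tol


# Collect every measure whose tabulated values break the inequality.
violations = [
    name
    for name, (d_kl, d_lm, d_km) in TABLE.items()
    if not satisfies_triangle(d_kl, d_lm, d_km)
]
print(violations)  # ['d9']
```

On these values only $d_{9}$ fails the check, since $0.5712 + 0.1755 < 0.9071$, which is consistent with the corollary being stated for $p = 1, \dots, 8$.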
4. Applications of the Proposed Distances in Pattern Recognition, Decision-Making and Data Clustering

$d_{1}^{*} (K, L)$	$d_{2}^{*} (K, L)$	$d_{1} (K, L)$	$d_{2} (K, L)$	$d_{3} (K, L)$
0.2195	0.257	0.3775	0.1258	0.293
$d_{4} (K, L)$	$d_{5} (K, L)$	$d_{6} (K, L)$	$d_{7} (K, L)$	$d_{8} (K, L)$	$d_{9} (K, L)$
0.1692	0.6144	0.82	0.2733	0.1163	0.0483

$d_{k}$	$d_{1}$	$d_{2}$	$d_{3}$	$d_{4}$	$d_{5}$	$d_{6}$
$d_{k} (K, L)$	0.9750	0.4875	0.9097	0.6432	0.9874	1.5000
$d_{k} (L, M)$	0.3500	0.1750	0.5148	0.3640	0.5916	1.2000
$d_{k} (K, M)$	1.3250	0.6625	1.0595	0.7492	1.1511	1.5000
$d_{k}$	$d_{1}^{*}$	$d_{2}^{*}$	$d_{7}$	$d_{8}$	$d_{9}$
$d_{k} (K, L)$	0.6611	0.7007	0.7500	0.4938	0.5712
$d_{k} (L, M)$	0.4556	0.4492	0.6000	0.2708	0.1755
$d_{k} (K, M)$	0.7125	0.7612	0.7500	0.5813	0.9071

$d_{k}$	$d_{k} (A, B)$	$d_{k} (A, C)$	$d_{k} (A, D)$	$d_{k} (A, E)$	$d_{k} (A, F)$	$d_{k} (A, G)$	$d_{k} (A, H)$	$d_{k} (A, I)$	$d_{k} (A, K)$
$d_{1}$	0.4525	0.4525	0.4270	0.4525	0.5053	0.4813	0.4900	0.4650	0.5100
$d_{2}$	0.1508	0.1508	0.1432	0.1508	0.1684	0.1604	0.1633	0.1550	0.1700
$d_{3}$	0.2934	0.3074	0.2792	0.3350	0.3137	0.2996	0.3291	0.3137	0.3671
$d_{4}$	0.1694	0.1775	0.1612	0.1934	0.1811	0.1730	0.1900	0.1811	0.2119
$d_{5}$	0.6727	0.6727	0.6535	0.6727	0.7108	0.6937	0.7000	0.6819	0.7141
$d_{6}$	0.6600	0.8000	0.5800	0.9400	0.6910	0.6200	0.8900	0.8000	0.8000
$d_{7}$	0.2200	0.2667	0.1933	0.3133	0.2303	0.2067	0.2967	0.2667	0.2667
$d_{8}$	0.1354	0.1643	0.1354	0.1243	0.1554	0.1435	0.1528	0.1331	0.1517
$d_{9}$	0.0983	0.0944	0.0848	0.1284	0.0936	0.1011	0.1196	0.0934	0.0912
$d_{k}^{*}$	$d_{k}^{*} (A, B)$	$d_{k}^{*} (A, C)$	$d_{k}^{*} (A, D)$	$d_{k}^{*} (A, E)$	$d_{k}^{*} (A, F)$	$d_{k}^{*} (A, G)$	$d_{k}^{*} (A, H)$	$d_{k}^{*} (A, I)$	$d_{k}^{*} (A, K)$
$d_{1}^{*}$	0.1898	0.2200	0.1723	0.2530	0.2056	0.1864	0.2412	0.2230	0.2253
$d_{2}^{*}$	0.2471	0.2750	0.2345	0.2980	0.2802	0.2572	0.2870	0.2778	0.2804

To empirically validate the effectiveness and applicability of our proposed distance measures, we conduct a comprehensive evaluation across three key tasks: pattern recognition, MCDM, and clustering. The evaluation employs a new custom-designed pattern recognition algorithm. Furthermore, we also apply the NF-TOPSIS method for the MCDM problem, and the NF-CLUSTER algorithm for the clustering task. These experiments are benchmarked on two well-established datasets from the literature: Machine Selection in (Nafei et al., 2024) and Educational Support in (Bui et al., 2023).

4.1. Pattern Recognition with NF-PATTERN

Pattern recognition stands as a cornerstone in numerous information processing systems, particularly those grappling with ambiguity, vagueness, and incomplete information inherent in real-world data. In such contexts, the ability to accurately classify an unknown pattern by comparing it against a set of predefined prototypes is paramount. Distance or similarity measures serve as the fundamental mathematical tools for quantifying the degree of resemblance or dissimilarity between patterns. The advent of neutrosophic sets and subsequently NF-sets as an extension has provided a sophisticated framework for representing complex patterns. NF-sets are particularly adept at capturing nuanced information by explicitly modeling truth, indeterminacy, and falsity components associated with pattern characteristics, along with a membership grade in the universe of discourse. This enriched representation necessitates the development and application of robust distance measures tailored for the NF-set domain to enhance the precision and reliability of pattern recognition tasks.

4.1.1. Proposed Algorithm

The proposed algorithm builds upon well-established pattern recognition techniques from fuzzy (Yawei et al., 2005), intuitionistic fuzzy (Hatzimichailidis A et al., 2012) and NF domains (Luo et al., 2022). This algorithm provides a structured approach to classify an unknown pattern by evaluating its proximity to a collection of known patterns. The detailed procedural steps of this algorithm are delineated in Algorithm 3.

The proposed NF-PATTERN algorithm consists of the following three stages:

- Distance Computation in Step 1: At first, a vector of distances is computed by applying a distance measure between the input dataset and each dataset in the reference set. Each element in this vector represents the dissimilarity of the input to a specific reference class.

- Classification via Minimum Distance in Step 2: The classification is performed by identifying the reference dataset corresponding to the minimum distance, $D_{min}$ . A key condition for a successful classification is the uniqueness of this minimum. If two or more reference datasets are equidistant to the input (i.e., a non-unique minimum), the classification is deemed ambiguous and fails.

- Reliability Assessment in Step 3: For each successful classification, we quantify its confidence using a Degree of Confidence (DOC) index. The DOC is calculated as the sum of absolute differences between all computed distances and the uniquely identified $D_{min}$ . This value reflects the margin of confidence in the decision; a larger DOC indicates a more robust classification, as the closest match is significantly more distinct from the other candidates. Consequently, the DOC is only defined for unambiguous classifications.
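The three stages above can be sketched as follows. This is a minimal illustration, not the authors' implementation of Algorithm 3: it assumes the distances of Step 1 have already been computed with some NF distance measure, and the function name `nf_pattern` and the tie tolerance `tol` are hypothetical.

```python
def nf_pattern(distances, tol=1e-12):
    """Classify an input pattern from precomputed distances.

    distances: dict mapping each reference pattern name to its distance
    from the input pattern (the output of Step 1).
    Returns (selected_name, doc), or (None, None) if the minimum is not unique.
    """
    # Step 2: find the minimum distance and every reference that attains it.
    d_min = min(distances.values())
    winners = [name for name, d in distances.items() if abs(d - d_min) <= tol]
    if len(winners) != 1:
        return None, None  # ambiguous classification, DOC undefined

    # Step 3: DOC is the sum of absolute differences to the unique minimum.
    doc = sum(abs(d - d_min) for d in distances.values())
    return winners[0], doc


# Distances taken from the d_1 row of Table 5 (Example 4.1).
selected, doc = nf_pattern({"K1": 0.3125, "K2": 0.3150, "K3": 0.3175})
print(selected, round(doc, 4))

# Distances taken from the d_1 row of Table 6 (Example 4.2): K1 and K6 tie.
tied = nf_pattern({"K1": 0.5225, "K2": 0.7850, "K3": 0.5450,
                   "K4": 0.5473, "K5": 0.5715, "K6": 0.5225})
print(tied)
```

For the $d_{1}$ row of Table 5 the sketch selects $K_{1}$ with a DOC of 0.0075, matching the tabulated entry, while the tied $d_{1}$ row of Table 6 is reported as ambiguous.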

4.1.2. Illustration and Comparison

The NF-PATTERN algorithm is versatile and can be implemented with any suitable distance measure defined for NF-sets. In the subsequent sections, we will illustrate its application using the newly proposed distances ( $d_{1}^{*}$ , $d_{2}^{*}$ ), and provide a comparative analysis against a selection of existing distance measures to highlight their performance characteristics.

The primary objective of this section is to empirically demonstrate the operational behavior of the NF-PATTERN algorithm and, more significantly, to undertake a comparative assessment of the proposed distances ( $d_{1}^{*}$ , $d_{2}^{*}$ ), against a suite of SOTA measures. This evaluation will be conducted through meticulously constructed numerical examples. In this context, superior performance or effectiveness of a distance is not solely judged by its capacity to correctly identify the closest prototype pattern but, crucially, by its ability to yield a higher DOC. A higher DOC intrinsically implies a more decisive and reliable classification, which is a desirable attribute in practical pattern recognition systems.

Examples 4.1 and 4.2 are designed to create scenarios where the nuanced characteristics of $d_{1}^{*}$ and $d_{2}^{*}$ might offer advantages in discerning the correct pattern with greater confidence.

Example 4.1.
Let $C, K_{1}, K_{2}, K_{3}$ be NF sets over $X = \{x_{1}, x_{2}, x_{3}\}$ where
$C = \{x_{1}, \frac{0.1}{0.2, 0.3, 0.4}; x_{2}, \frac{0.2}{0.25, 0.4, 0.15}; x_{3}, \frac{0.28}{0.2, 0.32, 0.44}\},$

$K_{1} = \{x_{1}, \frac{0.27}{0.24, 0.4, 0.38}; x_{2}, \frac{0.34}{0.1, 0.47, 0.3}; x_{3}, \frac{0.3}{0.46, 0.33, 0.56}\},$

$K_{2} = \{x_{1}, \frac{0.24}{0.21, 0.21, 0.45}; x_{2}, \frac{0.5}{0.3, 0.5, 0.3}; x_{3}, \frac{0.4}{0.29, 0.3, 0.3}\},$

$K_{3} = \{x_{1}, \frac{0.25}{0.22, 0.26, 0.45}; x_{2}, \frac{0.15}{0.1, 0.3, 0.3}; x_{3}, \frac{0.1}{0.4, 0.5, 0.44}\} .$
The pattern results based on Algorithm 3 are presented in Table 5.
Table 5.
The Pattern Results for C in Example 4.1.

$d_{k}$	$d_{k} (C, K_{1})$	$d_{k} (C, K_{2})$	$d_{k} (C, K_{3})$	Selection	$DOC^{i}$
$d_{1}$	0.3125	0.3150	0.3175	$K_{1}$	0.0075
$d_{2}$	0.1042	0.1050	0.1058	$K_{1}$	0.0025
$d_{3}$	0.2196	0.2224	0.2175	$K_{3}$	0.0069
$d_{4}$	0.1268	0.1284	0.1256	$K_{3}$	0.0040
$d_{5}$	0.5590	0.5612	0.5635	$K_{1}$	0.0067
$d_{8}$	0.1088	0.0947	0.0974	$K_{2}$	0.0167
$d_{1}^{*}$	0.1594	0.1578	0.1440	$K_{3}$	0.0293
$d_{2}^{*}$	0.1958	0.1905	0.1891	$K_{3}$	0.0081

Example 4.2.
Let $C, K_{1}, K_{2}, K_{3}, K_{4}, K_{5}, K_{6}$ be NF sets over $X = \{x_{1}, x_{2}, x_{3}\}$ where
$C = \{x_{1}, \frac{0.1}{0.4, 0.5, 0.7}; x_{2}, \frac{0.4}{0.6, 0.7, 0.8}; x_{3}, \frac{0.35}{0.27, 0.22, 0.21}\},$

$K_{1} = \{x_{1}, \frac{0.2}{0.3, 0.6, 0.6}; x_{2}, \frac{0.7}{0.1, 0.5, 0.7}; x_{3}, \frac{0.3}{0.2, 0.4, 0.5}\},$

$K_{2} = \{x_{1}, \frac{0.56}{0.23, 0.35, 0.22}; x_{2}, \frac{0.31}{0.37, 0.38, 0.45}; x_{3}, \frac{0.22}{0.45, 0.21, 0.78}\},$

$K_{3} = \{x_{1}, \frac{0.21}{0.2, 0.4, 0.68}; x_{2}, \frac{0.1}{0.5, 0.5, 0.6}; x_{3}, \frac{0.5}{0.21, 0.56, 0.61}\},$

$K_{4} = \{x_{1}, \frac{0.22}{0.3, 0.2, 0.6}; x_{2}, \frac{0.479}{0.6, 0.4, 0.3}; x_{3}, \frac{0.2}{0.6, 0.2, 0.4}\},$

$K_{5} = \{x_{1}, \frac{0.25}{0.257, 0.33, 0.5}; x_{2}, \frac{0.167}{0.43, 0.35, 0.5}; x_{3}, \frac{0.45}{0.3, 0.32, 0.55}\},$

$K_{6} = \{x_{1}, \frac{0.4}{0.3, 0.6, 0.63}; x_{2}, \frac{0.65}{0.8, 0.73, 0.36}; x_{3}, \frac{0.14}{0.28, 0.32, 0.49}\} .$
The pattern results based on Algorithm 3 are presented in Table 6.
Table 6.
The Pattern Results for C in Example 4.2.

$d_{k}$	$d_{k} (C, K_{1})$	$d_{k} (C, K_{2})$	$d_{k} (C, K_{3})$	$d_{k} (C, K_{4})$	$d_{k} (C, K_{5})$	$d_{k} (C, K_{6})$	Selection	$DOC^{i}$
$d_{1}$	0.5225	0.7850	0.5450	0.5473	0.5715	0.5225	N/A	N/A
$d_{2}$	0.1742	0.2617	0.1817	0.1824	0.1905	0.1742	N/A	N/A
$d_{3}$	0.3721	0.5370	0.3688	0.3995	0.3688	0.3690	N/A	N/A
$d_{4}$	0.2148	0.3101	0.2130	0.2307	0.2129	0.2130	N/A	N/A
$d_{5}$	0.7228	0.886	0.7382	0.7398	0.756	0.7228	N/A	N/A
$d_{6}$	0.8900	1.4000	0.9000	1.1300	0.8900	1.0200	N/A	N/A
$d_{7}$	0.2967	0.4667	0.3000	0.3767	0.2967	0.3400	N/A	N/A
$d_{8}$	0.1860	0.2864	0.1797	0.2168	0.1941	0.1782	$K_{6}$	0.1175
$d_{9}$	0.1004	0.2257	0.0994	0.1179	0.0928	0.0891	$K_{6}$	0.1581
$d_{1}^{*}$	0.2464	0.3874	0.2520	0.3009	0.2523	0.2740	$K_{1}$	0.1466
$d_{2}^{*}$	0.3001	0.4647	0.3134	0.3452	0.3200	0.3201	$K_{1}$	0.1778

The evidence presented in Tables 5 and 6 shows that the proposed measures attain the highest DOC in each example ($d_{1}^{*}$ with 0.0293 in Table 5 and $d_{2}^{*}$ with 0.1778 in Table 6), indicating their reliability and effectiveness. This advantage is further corroborated by Examples 3.3, 4.1, and 4.2, which illustrate the capacity of $d_{1}^{*}$ to surpass current SOTA measures by addressing their limitations. These improvements directly translate to enhanced accuracy in multi-criteria decision-making and more robust clustering in neutrosophic fuzzy environments, especially under significant complexity and uncertainty conditions.

A key contribution of our proposed measures, $d_{1}^{*}$ and $d_{2}^{*}$ , lies in their enhanced robustness for pattern recognition under uncertainty. Our experiments reveal that both measures consistently outperform traditional methods, yielding higher classification accuracy in two critical situations: (1) separating classes with closely related characteristics, and (2) analyzing datasets with significant fuzziness or incomplete information. This enhanced performance is crucial because it demonstrates the measures' practical viability for real-world applications where data is rarely perfect. Ultimately, these results establish that $d_{1}^{*}$ and $d_{2}^{*}$ not only advance the SOTA in terms of accuracy but also make neutrosophic fuzzy models more reliable and applicable to complex, uncertain environments.

4.2. Decision-Making with NF-TOPSIS

Our experimental methodology adopts the standard NF-TOPSIS framework. To specifically isolate and evaluate the impact of the distance measure, the central focus of this study, we introduce only one modification to the original procedure. The core change occurs in Step 4, where the standard distance calculation is replaced by our proposed formulas and, for benchmarking purposes, by several SOTA measures. All other components of the algorithm, including the procedural logic of Steps 1–3 and 5 and the parameter settings, are maintained in strict accordance with the original source publications. This approach ensures that any observed differences in performance can be directly attributed to the choice of distance measure.

4.2.1. Machine Selection Scenario

The strategic selection and prioritization of manufacturing equipment constitute a pivotal decision-making process for optimizing operational performance and resource allocation within industrial enterprises. This investigation addresses such a scenario, focusing on a manufacturing facility equipped with four distinct production units: Machine 1 ($M_{1}$), Machine 2 ($M_{2}$), Machine 3 ($M_{3}$), and Machine 4 ($M_{4}$). The core of this study involves a systematic and comprehensive performance evaluation of these machines, which is predicated upon three key operational attributes: efficiency (E), flexibility (F), and reliability (R). These criteria are delineated as follows:

Efficiency: Measures the machine's overall productivity and resource utilization. Key factors include processing speed, output capacity, product yield, material waste, and its direct contribution to profitability.

Flexibility: Evaluates the machine's capacity to adapt to operational changes. This encompasses its ability to switch between different products, workflows, or batch sizes, a critical factor for responding to dynamic market demands and maintaining a competitive edge.

Reliability: Quantifies the consistency and dependability of the machine's performance. It is technically defined as the probability of failure-free operation for a specified duration under given conditions, directly impacting maintenance costs and production uptime.

The decision-making process involves a panel of three experts, EX1, EX2, and EX3, who are tasked with assessing the machines to determine the most preferable option. The input data for this MCDM problem includes individual evaluation matrices provided by each expert and a set of weights assigned to the aforementioned performance attributes. The relative influence of the decision-makers in the aggregation of judgments is defined by weights of 0.5, 0.2, and 0.3 for EX1, EX2, and EX3, respectively. The evaluation data, including the scores assigned by each expert to the machines against each criterion, are presented in Table 7. Using this consolidated dataset, the primary objective for management is to identify the machine demonstrating superior operational effectiveness, which should then be prioritized for deployment to optimize production outcomes.

Table 7.
Decision Values from Expert 1, 2, 3 and Weight for Attributes (Nafei et al., 2024).

EX1	E	F	R	EX3	E	F	R
M1	$\frac{0.4}{0.9, 0.9, 0.1}$	$\frac{0.7}{0.0, 0.2, 0.9}$	$\frac{0.0}{0.1, 0.4, 1.0}$	M1	$\frac{0.6}{0.5, 0.8, 0.3}$	$\frac{0.7}{0.2, 0.4, 0.9}$	$\frac{0.3}{1.0, 0.7, 0.9}$
M2	$\frac{0.3}{0.5, 0.7, 0.3}$	$\frac{0.1}{0.7, 0.8, 0.0}$	$\frac{0.1}{0.8, 1.0, 0.7}$	M2	$\frac{0.5}{0.8, 0.4, 0.1}$	$\frac{0.9}{0.7, 0.6, 0.4}$	$\frac{0.9}{0.2, 0.4, 0.5}$
M3	$\frac{0.2}{0.3, 0.8, 0.1}$	$\frac{0.3}{0.4, 0.9, 0.3}$	$\frac{0.4}{0.3, 0.1, 0.0}$	M3	$\frac{0.1}{0.3, 0.3, 0.5}$	$\frac{0.2}{0.9, 0.6, 0.3}$	$\frac{0.1}{0.3, 0.5, 0.2}$
M4	$\frac{0.5}{0.7, 0.2, 0.3}$	$\frac{0.4}{0.5, 0.1, 0.6}$	$\frac{0.7}{0.1, 0.6, 0.7}$	M4	$\frac{0.4}{0.2, 0.5, 0.1}$	$\frac{0.0}{0.5, 0.4, 0.7}$	$\frac{0.5}{0.7, 0.7, 0.7}$
EX2	E	F	R	Weights	E	F	R
M1	$\frac{0.4}{0.1, 0.6, 0.2}$	$\frac{0.0}{0.1, 0.2, 0.3}$	$\frac{0.5}{0.5, 0.2, 0.6}$	EX1	$\frac{0.2}{0.9, 0.0, 0.7}$	$\frac{0.4}{0.6, 0.1, 0.2}$	$\frac{0.8}{1.0, 0.3, 0.7}$
M2	$\frac{0.4}{0.5, 0.5, 0.4}$	$\frac{0.4}{0.8, 0.6, 0.2}$	$\frac{0.3}{0.7, 1.0, 0.5}$	EX2	$\frac{0.1}{0.5, 0.2, 0.8}$	$\frac{0.9}{0.5, 0.8, 0.1}$	$\frac{0.5}{0.1, 0.4, 0.1}$
M3	$\frac{0.2}{0.9, 0.3, 0.5}$	$\frac{0.6}{0.4, 0.4, 0.8}$	$\frac{0.3}{0.5, 1.0, 0.5}$	EX3	$\frac{0.6}{0.3, 0.7, 0.6}$	$\frac{0.0}{0.6, 0.3, 0.4}$	$\frac{0.3}{0.7, 0.4, 0.2}$
M4	$\frac{0.3}{0.1, 0.4, 0.9}$	$\frac{0.6}{0.4, 0.0, 0.2}$	$\frac{0.5}{0.1, 1.0, 1.0}$

The evaluation data from the three experts (EX1, EX2, and EX3) are compiled into the decision matrices shown in Table 7. Within each matrix, the four machine alternatives (M₁–M₄) are evaluated against the three key criteria. Each assessment is captured as a neutrosophic fuzzy number to effectively model the inherent uncertainty and imprecision in the experts’ judgments. To account for the subjective nature of expert evaluations, the table also incorporates the distinct attribute weights assigned by each expert (EX1, EX2, and EX3). These weights explicitly capture each expert's individual priorities, reflecting their differing views on the relative importance of the criteria.

The empirical dataset detailed in Table 7 served as the primary input for applying NF-TOPSIS. The procedural framework for this application is formally described in Algorithm 1. In the initial analytical phase, NF-TOPSIS was executed with the proposed distance $d_{1}^{*}$ . The quantitative outcomes of this implementation are compiled in Table 8; Steps 1 and 2 are omitted there, as their detailed results are given in Nafei et al. (2024).

Table 8.

The Steps of NF-TOPSIS 1 by Applying Proposed Distance $d_{1}^{*}$ .

Step 1.	–*
Step 2.	–
Step 3.	$P I S = [\frac{0.24}{0.38, 0.06, 0.07}, \frac{0.06}{0.03, 0.07, 0.05}, \frac{0.18}{0.25, 0.19, 0.1}]$
	$N I S = [\frac{0.19}{0.59, 0.07, 0.52}, \frac{0.2}{0.06, 0.11, 0.24}, \frac{0.15}{0.36, 0.32, 0.22}]$
Step 4.	$d^{+} (G, P I S) = [0.4723, 0.4426, 0.2386, 0.4816]$
	$d^{-} (G, N I S) = [0.1555, 0.4030, 0.5703, 0.2650]$
Step 5.	$R_{M_{1}} = 0.2477; R_{M_{2}} = 0.4766; R_{M_{3}} = 0.705; R_{M_{4}} = 0.3549$ .
	Ranking: $M_{1} < M_{4} < M_{2} < M_{3}$

*Symbol “–” indicates the corresponding step/result was skipped in this situation.
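The Step 5 scores in Table 8 are consistent with the standard TOPSIS relative closeness coefficient $R_{i} = d_{i}^{-} / (d_{i}^{+} + d_{i}^{-})$. The sketch below assumes that formula (the exact Step 5 definition belongs to Algorithm 1, which is not reproduced here) and recomputes the scores and ranking from the Step 4 distances; the function and variable names are illustrative.

```python
def closeness(d_plus, d_minus):
    """Relative closeness d^- / (d^+ + d^-) for each alternative (assumed Step 5)."""
    return [dm / (dp + dm) for dp, dm in zip(d_plus, d_minus)]


machines = ["M1", "M2", "M3", "M4"]

# Step 4 values from Table 8, computed by the authors with d_1^*.
d_plus = [0.4723, 0.4426, 0.2386, 0.4816]   # distances to PIS
d_minus = [0.1555, 0.4030, 0.5703, 0.2650]  # distances to NIS

scores = closeness(d_plus, d_minus)

# Sort alternatives from the lowest to the highest closeness score.
ranking = [m for _, m in sorted(zip(scores, machines))]

for m, s in zip(machines, scores):
    print(m, round(s, 4))
print(" < ".join(ranking))  # M1 < M4 < M2 < M3
```

Under this assumption the recomputed scores agree with the tabulated values to four decimal places, and the highest score again identifies $M_{3}$ as the preferred machine.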

The NF-TOPSIS framework was subsequently applied using the SOTA and proposed distance measures to further investigate the model's behavior and for comparative assessment. This extension allowed for a broader examination of the solution space under varied computational assumptions. The results of these supplementary applications are documented in Tables 9 and 10, with salient trends and comparative performance visualized in Figure 1.

Figure 1.

Classify Groups Based on Table 7 Using NF-TOPSIS With Measures.

Table 9.

Results with NF-TOPSIS When Applying SOTA and Proposed Distances.

	$d_{1}$	$d_{2}$	$d_{3}$	$d_{4}$	$d_{5}$	$d_{6}$
$M_{1}$	0.2557	0.2557	0.3025	0.3025	0.3695	0.2471
$M_{2}$	0.5093	0.5093	0.4274	0.4274	0.5047	0.4614
$M_{3}$	0.6793	0.6793	0.6756	0.6756	0.5927	0.7110
$M_{4}$	0.3612	0.3612	0.3894	0.3894	0.4292	0.3525
	$d_{7}$	$d_{8}$	$d_{9}$		$d_{1}^{*}$	$d_{2}^{*}$
$M_{1}$	0.2471	0.2507	0.3003		0.2477	0.2514
$M_{2}$	0.4614	0.4860	0.6317		0.4766	0.5006
$M_{3}$	0.7110	0.7413	0.7097		0.7050	0.6930
$M_{4}$	0.3525	0.3076	0.2322		0.3549	0.3601

Table 10.

Rank with SOTA and Proposed Distances Based on Table 7.

Measures	Ranks by NF-TOPSIS	Output
$d_{1}, d_{2}, d_{3}, d_{4}, d_{5}, d_{6}, d_{7}, d_{8}$	$M_{1} < M_{4} < M_{2} < M_{3}$	$M_{3}$
$d_{9}$	$M_{4} < M_{1} < M_{2} < M_{3}$	$M_{3}$
$d_{1}^{*}, d_{2}^{*}$ (proposed)	$M_{1} < M_{4} < M_{2} < M_{3}$	$M_{3}$

4.2.2. Education Decision-Making Scenario

The 2021 Vietnamese National High School Graduation Examination (NHSGE) required students to take three compulsory subjects (Mathematics, Literature, and English) and select one of two elective streams: Natural Sciences (Physics, Chemistry, Biology) or Social Sciences (History, Geography, Civic Education). Passing this examination was a prerequisite for high school graduation. Subsequently, for university admission, candidates select specific subject combinations, known as admission groups, which are aligned with their intended fields of study. These groups include Group A (Math, Physics, Chemistry), Group A1 (Math, Physics, English), Group B (Biology, Math, Chemistry), Group C (Literature, History, Geography), and Group D (Math, Literature, English).

Upper secondary education typically spans three years comprising six semesters, and each semester serves as an assessment criterion. End-of-semester evaluations reflect student cognitive development, effort, and diligence. These assessments are used to identify the optimal university admission subject combination (exam group) for students using an MCDM approach based on performance across all six semesters. In this context, the set of potential university admission subject combinations is defined as the alternatives, denoted by $X = \{A, A_{1}, B, C, D\}$ . Concurrently, the six academic semesters, serving as the evaluative periods, constitute the set of attributes (criteria), denoted by $S = \{S_{1}, S_{2}, S_{3}, S_{4}, S_{5}, S_{6}\}$ .

This case study, adapted from (Bui et al., 2023), addresses the MCDM problem of selecting the optimal high school graduation exam subject group for an illustrative student, P. The relevant performance data is structured into a decision matrix in Table 11. In this matrix, the alternatives are the five available subject groups (A, A1, B, C, D), represented by the rows. The evaluation is based on six criteria (S₁–S₆), shown in the columns, which are formulated based on the student's historical academic scores. Each cell value quantifies the student's aptitude for an alternative against a specific criterion, represented as an NF number to model the ambiguities and inconsistencies inherent in academic performance.

Table 11.

Student P's Learning Outcome Assessment Data Over 6 Semesters (Bui et al., 2023).

	$S_{1}$	$S_{2}$	$S_{3}$	$S_{4}$	$S_{5}$	$S_{6}$	Actual score
$A$	$\frac{0.94}{0.7, 0.4, 0.6}$	$\frac{0.94}{0.8, 0.4, 0.5}$	$\frac{0.94}{0.8, 0.3, 0.4}$	$\frac{0.98}{0.9, 0.2, 0.4}$	$\frac{0.99}{0.8, 0.3, 0.4}$	$\frac{0.98}{0.9, 0.2, 0.3}$	26.60
$A_{1}$	$\frac{0.94}{0.8, 0.3, 0.5}$	$\frac{0.96}{0.9, 0.3, 0.4}$	$\frac{0.95}{0.8, 0.4, 0.5}$	$\frac{0.99}{0.9, 0.3, 0.4}$	$\frac{0.98}{1.0, 0.2, 0.2}$	$\frac{0.98}{0.9, 0.2, 0.3}$	27.90
$B$	$\frac{0.92}{0.7, 0.6, 0.7}$	$\frac{0.93}{0.7, 0.6, 0.7}$	$\frac{0.94}{0.8, 0.4, 0.5}$	$\frac{0.97}{0.8, 0.4, 0.6}$	$\frac{1.0}{0.6, 0.6, 0.6}$	$\frac{0.97}{0.8, 0.5, 0.5}$	24.85
$C$	$\frac{0.91}{0.8, 0.5, 0.5}$	$\frac{0.93}{0.6, 0.5, 0.7}$	$\frac{0.92}{0.7, 0.4, 0.5}$	$\frac{0.9}{0.7, 0.6, 0.6}$	$\frac{0.99}{0.6, 0.6, 0.6}$	$\frac{0.94}{0.8, 0.5, 0.5}$	N/A
$D$	$\frac{0.89}{0.7, 0.4, 0.6}$	$\frac{0.93}{0.8, 0.4, 0.5}$	$\frac{0.91}{0.8, 0.4, 0.5}$	$\frac{0.95}{0.9, 0.4, 0.5}$	$\frac{0.99}{0.8, 0.5, 0.5}$	$\frac{0.94}{0.9, 0.3, 0.4}$	27.40

Based on data in Table 11, the NF-TOPSIS methodology utilized the SOTA and proposed distances. The results are presented in Tables 12, 13 and Figure 2. Significantly, evaluations conducted using the NF-TOPSIS across SOTA and proposed distances yielded results that were largely consistent with student P's actual performance in the 2021 NHSGE (Table 11).

Figure 2.

Classify Groups Based on Table 11 Using NF-TOPSIS With Measures.

Table 12.

Experimental Results with NF-TOPSIS Using SOTA and Proposed Distances.

	$d_{1}$	$d_{2}$	$d_{3}$	$d_{4}$	$d_{5}$	$d_{6}$
$A$	0.7489	0.7489	0.6876	0.6876	0.6333	0.7895
$A_{1}$	0.9295	0.9295	0.8617	0.8617	0.7841	0.8947
$B$	0.1173	0.1173	0.1977	0.1977	0.2671	0.1753
$C$	0.1292	0.1292	0.2081	0.2081	0.2781	0.1500
$D$	0.5000	0.5000	0.4955	0.4955	0.5000	0.5500
	$d_{7}$	$d_{8}$	$d_{9}$		$d_{1}^{*}$	$d_{2}^{*}$
$A$	0.7895	0.7704	0.8246		0.7749	0.7522
$A_{1}$	0.8947	0.9112	0.9770		0.9009	0.9077
$B$	0.1753	0.1387	0.4474		0.5126	0.5088
$C$	0.1500	0.1403	0.0532		0.5012	0.5030
$D$	0.5500	0.5269	0.4682		0.6502	0.6415

Table 13.

Rank Results with SOTA and Proposed Distances Based on Table 11.

Measures	Ranks by NF-TOPSIS	Output
$d_{1}, d_{2}, d_{3}, d_{4}, d_{5}, d_{8}$	$B < C < D < A < A_{1}$	$A_{1}$
$d_{6}, d_{7}, d_{9}$	$C < B < D < A < A_{1}$	$A_{1}$
$d_{1}^{*}, d_{2}^{*}$ (proposed)	$C < B < D < A < A_{1}$	$A_{1}$

4.2.3. Discussions

Based on the experimental results in Tables 9, 10, 12, and 13, the proposed distances, when applied in NF-TOPSIS, maintain a ranking order of the alternatives that is also obtained with SOTA measures, and they select the same best alternative ($M_{3}$ in Table 10 and $A_{1}$ in Table 13). This indicates that the proposed distance measures do not distort the evaluation results compared to SOTA measures, while ensuring stability and reliability in decision-making. However, when comparing the numerical values of the measures (Tables 9 and 12), the new ones show a more precise separation between the alternatives, especially between choices such as D, B, and C, which tend to be close to each other in traditional measures. This increase in resolution improves the evaluation capability in scenarios with high competition or closely ranked priorities.

Therefore, the proposed distances not only preserve the correctness of the ranking order but also improve the analytical efficiency and practical applicability in multicriteria decision-making models under the neutrosophic fuzzy set environment.

The results affirm that the proposed new distance formulas have achieved two key objectives. First, these measures preserve consistency with existing methods, ensuring that the priority order of the alternatives is not arbitrarily altered. Second, the new measures enhance the ability to distinguish between alternatives, particularly in situations where the distances between them are very small. This improvement is significant in practical contexts, where alternatives have nearly equivalent priority levels, making the identification of the optimal alternative clearer and more convincing. The development of these new measures aims to address the limitations of existing SOTA measures, which sometimes lack sensitivity in distinguishing closely related alternatives while maintaining reliability and stability in ranking results. By combining both accuracy and stability, the proposed distance formulas provide high application value in MCDM problems under a neutrosophic fuzzy environment, assisting analysts and managers in making more informed choices in conditions of uncertainty and complexity.

4.3. Data Clustering with NF-CLUSTER

The next experiment uses a real dataset of 20 randomly selected high school students who graduated in the 2020–2021 academic year in My Tho City, Tien Giang Province, Vietnam (Bui et al., 2023). Before the NHSGE, an MCDM method was required to rapidly identify appropriate subject groups for these 20 students from the collected data. Applying NF-TOPSIS to all 20 students would likely require considerable computation time to process the data and derive conclusions. Consequently, in this section we employ NF-CLUSTER to generate results more rapidly. This methodological choice can also be extended to process substantially larger datasets efficiently.

The dataset consists of 20 students, with academic performance recorded over 3 school years. In each year, the performance is assessed across 8 subjects, namely Mathematics (M), Physics (P), Chemistry (C), English (E), Biology (B), Literature (L), History (H), and Geography (G). Each cell in the table is represented in the form of a neutrosophic fuzzy number, reflecting the student's evaluation outcome in the corresponding subject.

Our experimental methodology for clustering utilizes the standard NF-CLUSTER framework. To specifically isolate and evaluate the performance of different distance measures, we introduce a single, targeted modification to the original algorithm. The core change occurs in Step 1, where the standard distance calculation is substituted with our proposed formulas and, for benchmarking, several SOTA measures. All subsequent stages of the algorithm (Steps 2–4), as well as the parameter settings, are maintained in strict accordance with the source publication. This controlled approach ensures that any variations in clustering outcomes can be directly attributed to the specific distance measure being tested.

4.3.1. Results

Based on the data collected, the NF-CLUSTER methodology was executed utilizing the specifically proposed distance $d_{2}^{*}$ . The quantitative outcomes stemming from this implementation are systematically compiled and presented in Table 14.

Table 14.

Experimental Results with NF-CLUSTER When Applying Distance $d_{2}^{*}$ .

Range of $λ$	Clustering Partition C	Case
[0; 0.158)	{1}; {2}; {3}; {4}; {5}; {6}; {7}; {8}; {9}; {10}; {11}; {12}; {13}; {14}; {15}; {16}; {17}; {18}; {19}; {20}	1
[0.158; 0.171)	{6,18}; {1}; {2}; {3}; {4}; {5}; {7}; {8}; {9}; {10}; {11}; {12}; {13}; {14}; {15}; {16}; {17}; {19}; {20}	2
[0.171; 0.221)	{3, 12}; {6, 18}; {1}; {2}; {5}; {4}; {15}; {7}; {8}; {9}; {10}; {11}; {13}; {14}; {16}; {17}; {19}; {20}	3
[0.221; 0.229)	{3, 12}; {4, 15}; {6, 18}; {1}; {2}; {5}; {7}; {8}; {9}; {10}; {11}; {13}; {14}; {16}; {17}; {19}; {20}	4
[0.229; 0.235)	{3, 12, 14}; {4, 15}; {6, 18}; {1}; {2}; {5}; {7}; {8}; {9}; {10}; {11}; {13}; {16}; {17}; {19}; {20}	5
[0.235; 0.253)	{3, 10, 12, 14}; {4, 15}; {6, 18}; {1}; {2}; {5}; {7}; {8}; {9}; {11}; {13}; {16}; {17}; {19}; {20}	6
[0.253; 0.261)	{3, 10, 12, 14, 20}; {4, 15}; {6, 18}; {1}; {2}; {5}; {7}; {8}; {9}; {11}; {13}; {16}; {17}; {19}	7
[0.261; 0.269)	{3, 10, 12, 14, 20}; {4, 13, 15}; {6, 18}; {1}; {2}; {5}; {7}; {8}; {9}; {11}; {16}; {17}; {19}	8
[0.269; 0.287)	{3, 10, 12, 14, 20}; {6, 18}; {17, 19}; {8, 11}; {4, 13, 15}; {1}; {2}; {5}; {7}; {9}; {16}	9
[0.287; 0.288)	{3, 6, 10, 12, 14, 18, 20}; {17, 19}; {8, 11}; {4, 13, 15}; {1}; {2}; {5}; {7}; {9}; {16}	10
[0.288; 0.295)	{3, 6, 10, 12, 14, 17, 18, 19, 20}; {8, 11}; {4, 13, 15}; {1}; {2}; {5}; {7}; {9}; {16}	11
[0.295; 0.297)	{3, 6, 7, 10, 12, 14, 17, 18, 19, 20}; {8, 11}; {4, 13, 15}; {1}; {2}; {5}; {9}; {16}	12
[0.297; 0.301)	{3, 6, 7, 10, 12, 14, 16, 17, 18, 19, 20}; {8, 11}; {4, 13, 15}; {1}; {2}; {5}; {9}	13
[0.301; 0.308)	{1, 3, 6, 7, 10, 12, 14, 16, 17, 18, 19, 20}; {8, 11}; {4, 13, 15}; {2}; {5}; {9}	14
[0.308; 0.31)	{1, 3, 6, 7, 10, 12, 14, 16, 17, 18, 19, 20}; {4, 8, 11, 13, 15}; {2}; {5}; {9}	15
[0.31; 0.314)	{1, 3, 6, 7, 9, 10, 12, 14, 16, 17, 18, 19, 20}; {4, 8, 11, 13, 15}; {2}; {5}	16
[0.314; 0.326)	{1, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}; {2}; {5}	17
[0.326; 0.405)	{1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}; {5}	18
[0.405; 1]	{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20}	19

For comparative assessment, the NF-CLUSTER framework was then applied with both proposed distances, $d_{1}^{*}$ and $d_{2}^{*}$, to examine how the partition changes with the choice of measure. The results of these runs are reported in Tables 15 and 16. With either proposed measure, the clustering result obtained through NF-CLUSTER is similar to the original result.

Table 15.

Results with NF-CLUSTER When Applying Proposed Distances $d_{1}^{*}$ and $d_{2}^{*}$.

Number of Clusters	Range of Values $λ$		Number of Clusters	Range of Values $λ$
	$d_{1}^{*}$	$d_{2}^{*}$		$d_{1}^{*}$	$d_{2}^{*}$
20	[0;0.136)	[0;0.158)	10	[0.243;0.244)	[0.287;0.288)
19	[0.136;0.143)	[0.158;0.171)	9	[0.244;0.252)	[0.288;0.295)
18	[0.143;0.180)	[0.171;0.221)	8	N/A	[0.295;0.297)
17	[0.180;0.183)	[0.221;0.229)	7	[0.252;0.256)	[0.297;0.301)
16	[0.183;0.192)	[0.229;0.235)	6	[0.256;0.259)	[0.301;0.308)
15	[0.192;0.207)	[0.235;0.253)	5	[0.259;0.267)	[0.308;0.31)
14	[0.207;0.211)	[0.253;0.261)	4	[0.267;0.278)	[0.31;0.314)
13	[0.211;0.227)	[0.261;0.269)	3	[0.278;0.399)	[0.314;0.326)
12	N/A	N/A	2	[0.399;0.423)	[0.326;0.405)
11	[0.227;0.243)	[0.269;0.287)	1	[0.423;1]	[0.405;1]

Table 16.

Confidence Level $λ$ and Number of Clusters with NF-CLUSTER When Applying Distances $d_{1}^{*}$ and $d_{2}^{*}$.

Range of Values $λ$	Number of Clusters		Range of Values $λ$	Number of Clusters		Range of Values $λ$	Number of Clusters		Range of Values $λ$	Number of Clusters
	$d_{1}^{*}$	$d_{2}^{*}$		$d_{1}^{*}$	$d_{2}^{*}$		$d_{1}^{*}$	$d_{2}^{*}$		$d_{1}^{*}$	$d_{2}^{*}$
[0; 0.136)	20	20	[0.211; 0.221)	13	18	[0.256; 0.259)	6	14	[0.297; 0.301)	3	7
[0.136; 0.143)	19	20	[0.221; 0.227)	13	17	[0.259; 0.261)	5	14	[0.301; 0.308)	3	6
[0.143; 0.158)	18	20	[0.227; 0.229)	11	17	[0.261; 0.267)	5	13	[0.308; 0.31)	3	5
[0.158; 0.171)	18	19	[0.229; 0.235)	11	16	[0.267; 0.269)	4	13	[0.31; 0.314)	3	4
[0.171; 0.18)	18	18	[0.235; 0.243)	11	15	[0.269; 0.278)	4	11	[0.314; 0.326)	3	3
[0.18; 0.183)	17	18	[0.243; 0.244)	10	15	[0.278; 0.287)	3	11	[0.326; 0.399)	3	2
[0.183; 0.192)	16	18	[0.244; 0.252)	9	15	[0.287; 0.288)	3	10	[0.399; 0.405)	2	2
[0.192; 0.207)	15	18	[0.252; 0.253)	7	15	[0.288; 0.295)	3	9	[0.405; 0.423)	2	1
[0.207; 0.211)	14	18	[0.253; 0.256)	7	14	[0.295; 0.297)	3	8	[0.423; 1]	1	1
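
Table 16 can be derived mechanically from Table 15 by merging the two sets of breakpoints. The sketch below illustrates this under the assumption that every range is closed on the left, as printed. The lower bounds and cluster counts are transcribed from Table 15, and the helper names `n_clusters` and `merged_rows` are hypothetical.

```python
from bisect import bisect_right

# Lower bounds of the lambda ranges and the cluster counts, transcribed from Table 15.
D1_STARTS = [0, 0.136, 0.143, 0.180, 0.183, 0.192, 0.207, 0.211, 0.227,
             0.243, 0.244, 0.252, 0.256, 0.259, 0.267, 0.278, 0.399, 0.423]
D1_COUNTS = [20, 19, 18, 17, 16, 15, 14, 13, 11, 10, 9, 7, 6, 5, 4, 3, 2, 1]
D2_STARTS = [0, 0.158, 0.171, 0.221, 0.229, 0.235, 0.253, 0.261, 0.269, 0.287,
             0.288, 0.295, 0.297, 0.301, 0.308, 0.31, 0.314, 0.326, 0.405]
D2_COUNTS = [20, 19, 18, 17, 16, 15, 14, 13, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]


def n_clusters(lam, starts, counts):
    """Number of clusters at confidence level lam (ranges closed on the left)."""
    if not 0 <= lam <= 1:
        raise ValueError("lambda must lie in [0, 1]")
    return counts[bisect_right(starts, lam) - 1]


def merged_rows():
    """One (low, high, d1* count, d2* count) row per merged interval, as in Table 16."""
    cuts = sorted(set(D1_STARTS) | set(D2_STARTS)) + [1.0]
    return [(lo, hi,
             n_clusters(lo, D1_STARTS, D1_COUNTS),
             n_clusters(lo, D2_STARTS, D2_COUNTS))
            for lo, hi in zip(cuts, cuts[1:])]


rows = merged_rows()
for lo, hi, c1, c2 in rows[:3]:
    print(f"[{lo}; {hi})  d1*: {c1}  d2*: {c2}")
```

The merge yields one row per interval on which neither count changes, which makes it easy to read off, for a given $λ$, how much more coarsely one distance partitions the students than the other.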

4.3.2. Discussions

From Table 14, it is observed that in Case 19 only a single cluster is formed. Case 18, however, yields two clusters, with student 5 isolated in a separate cluster. This process of cluster identification was extended through the remaining cases in order of decreasing confidence level. For instance, Case 9 reveals a more detailed partitioning into several student groups: students 3, 10, 12, 14, and 20; students 6 and 18; students 17 and 19; students 8 and 11; and students 4, 13, and 15. Case 8 retains only three of these multi-member groups: students 3, 10, 12, 14, and 20; students 6 and 18; and students 4, 13, and 15. The analysis was continued until a configuration of 20 distinct clusters was reached, in which each student constitutes their own cluster. Throughout this analysis, specific student cohorts consistently emerged: students 3, 10, 12, 14, and 20; students 6 and 18; students 17 and 19; students 8 and 11; and students 4, 13, and 15.
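
One way to make the notion of a consistently emerging cohort concrete is to check that each cohort stays inside a single cluster as $λ$ grows. The sketch below does this for Cases 9 to 12 only, with the multi-member clusters transcribed from Table 14; the helper name `stays_together` is hypothetical.

```python
# Multi-member clusters for Cases 9-12, transcribed from Table 14 (singletons omitted).
CASES = {
    9: [{3, 10, 12, 14, 20}, {6, 18}, {17, 19}, {8, 11}, {4, 13, 15}],
    10: [{3, 6, 10, 12, 14, 18, 20}, {17, 19}, {8, 11}, {4, 13, 15}],
    11: [{3, 6, 10, 12, 14, 17, 18, 19, 20}, {8, 11}, {4, 13, 15}],
    12: [{3, 6, 7, 10, 12, 14, 17, 18, 19, 20}, {8, 11}, {4, 13, 15}],
}

# Cohorts named in the discussion.
COHORTS = [{3, 10, 12, 14, 20}, {6, 18}, {17, 19}, {8, 11}, {4, 13, 15}]


def stays_together(cohort, partition):
    """True if every member of the cohort lies in one cluster of the partition."""
    return any(cohort <= cluster for cluster in partition)


# For each cohort, check that it is never split across the listed cases.
persistent = {
    frozenset(cohort): all(stays_together(cohort, part) for part in CASES.values())
    for cohort in COHORTS
}

for cohort, ok in persistent.items():
    print(sorted(cohort), "persistent" if ok else "split")
```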

Analysis of actual scores from Table 17:

Student 1's highest score, 26.6, is tied between Group B and Group D, a pattern not seen in any other student. Student 2's highest score was 22.65 in Group D, which is comparatively lower than the top scores of other students.

The score differences between the primary and secondary choices for several students are as follows:

- Student 3 (Group B): Highest score of 27.05, with the next highest of 26.00 in Group D (a difference of 1.05).

- Student 4 (Group A1): Highest score of 27.9, with the next highest of 27.4 in Group D (a difference of 0.5).

- Student 5 (Group A): Highest score of 25.55, with the next highest of 25.05 in Group A1 (a difference of 0.5).

- Student 6 (Group D): Highest score of 23.2, with the next highest of 22.7 in Group A1 (a difference of 0.5).

Similarly, the score differences for other students include:

- Student 9 (Group D): Highest score of 26.8, with the next highest of 26.05 in Group A1 (a difference of 0.75).

- Student 11 (Group B): Highest score of 27.6, with the next highest of 25.85 in Group A1 (a difference of 1.75).

A notable pattern emerges for students whose secondary choice was Group C:

- Student 17 (Group D): Highest score of 26.85, followed by 23.5 in Group C (a difference of 3.35).

- Student 18 (Group D): Highest score of 26.06, followed by 21.5 in Group C (a difference of 4.56).

- Student 19 (Group D): Highest score of 26.5, followed by 23.25 in Group C (a difference of 3.25).

Based on these observations, several student cohorts emerge: Students 5, 6, 7, 8, and 10; students 3, 4, 12, 13, and 15; students 3, 9, and 11; and students 14, 17, 18, 19, and 20.

Table 17.

Actual Scores of 20 Students from NHSGE in 2021 (Bui et al., 2023).

Students	Group $A$	Group $A_{1}$	Group $B$	Group D
1	25.35	25.85	26.6	26.6
2	19.8	22.15	20.55	22.65
3	24.05	23.50	27.05	26.00
4	26.60	27.90	24.85	27.40
5	25.55	25.05	23.55	23.80
6	20.8	22.70	20.80	23.20
7	23.80	24.60	22.30	23.35
8	22.90	23.90	21.40	23.40
9	24.15	26.05	22.40	26.80
10	21.50	21.45	24.00	22.95
11	25.60	25.85	27.60	25.60
12	23.95	27.05	22.20	26.55
13	23.30	26.50	20.55	26.75
14	20.30	22.80	19.55	26.30
15	24.35	26.25	21.10	26.50
16	23.20	23.75	26.20	26.50
Students	Group C		Group D
17	23.50		26.85
18	21.50		26.06
19	23.25		26.50
20	20.25		23.55
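
The margins quoted in the discussion can be recomputed directly from Table 17. The following sketch covers only the students whose margins are listed in the text; the scores are transcribed from the table, and the helper name `top_two` is hypothetical.

```python
# Scores transcribed from Table 17 for the students whose margins are discussed.
SCORES = {
    3: {"A": 24.05, "A1": 23.50, "B": 27.05, "D": 26.00},
    4: {"A": 26.60, "A1": 27.90, "B": 24.85, "D": 27.40},
    5: {"A": 25.55, "A1": 25.05, "B": 23.55, "D": 23.80},
    6: {"A": 20.80, "A1": 22.70, "B": 20.80, "D": 23.20},
    9: {"A": 24.15, "A1": 26.05, "B": 22.40, "D": 26.80},
    11: {"A": 25.60, "A1": 25.85, "B": 27.60, "D": 25.60},
    17: {"C": 23.50, "D": 26.85},
    18: {"C": 21.50, "D": 26.06},
    19: {"C": 23.25, "D": 26.50},
}


def top_two(scores):
    """Return (best group, runner-up group, margin rounded to 2 decimals)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    best, second = ranked[0], ranked[1]
    return best, second, round(scores[best] - scores[second], 2)


for student, scores in SCORES.items():
    best, second, margin = top_two(scores)
    print(f"Student {student}: {best} ({scores[best]}) over "
          f"{second} ({scores[second]}), margin {margin}")
```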

Based on Tables 14 and 17, we observe that Students 3 and 12 are grouped together. Similarly, Students 17 and 19 form a distinct pair, and Students 4, 13, and 15 constitute another group.

The two measures, $d_{1}^{*}$ and $d_{2}^{*}$, represent the difference or separation between clusters in the data clustering process. When the confidence level is low, the data clusters are separated, making it easier for the model to distinguish between different groups. Conversely, as the confidence level increases, the boundaries between clusters become more blurred, making classification and evaluation more challenging. In Tables 15 and 16, the confidence level corresponding to each number of clusters shows that the separation between clusters increases as the number of clusters decreases, a trend that reflects the model's generalizability.

It is worth noting that although both $d_{1}^{*}$ and $d_{2}^{*}$ measure the separation between clusters, they may reflect different criteria: $d_{1}^{*}$ typically measures the internal distance within a cluster, while $d_{2}^{*}$ relates to the distance between clusters. When both confidence levels are low, we can conclude that the data exhibits a good clustering structure; the clusters are highly homogeneous and well separated. However, if only one of them is low while the other is high, the clustering model may be biased, potentially leading to inaccurate or ineffective evaluation.

In addition, the experimental data show that as the number of clusters decreases, the degree of separation between the clusters increases. This indicates that the remaining clusters become clearer and more distinct, which can improve the ability to classify and identify groups within the data while minimizing overlap between clusters. From the above analysis, it can be concluded that the pair $d_{1}^{*}$, $d_{2}^{*}$ is effective in evaluating the quality of clustering. The two measures not only reflect the state of the clusters quantitatively but also help identify weaknesses of the model, so that the algorithm can be improved or the number of clusters adjusted.

4.4. Discussions of the Proposed Study

Despite the promising results, this study is not without its limitations. Firstly, the validation was conducted on two specific, albeit established, datasets. While this ensures comparability, the performance of the proposed measures and algorithm on larger, more heterogeneous, or real-time streaming data has not yet been explored. Secondly, our work concentrates on single-valued neutrosophic sets. The direct applicability and performance of these distance measures on other NS variants, such as interval-valued or bipolar neutrosophic sets, remain an open question that requires further investigation. Finally, a formal analysis of the computational complexity of the NF-PATTERN algorithm, particularly in comparison to other pattern recognition techniques, was beyond the scope of this paper.

Overall, Table 18 synthesizes the performance of our measures against established benchmarks, making the results more transparent.

Table 18.
Comparison of the Proposed Measures and SOTA Measures.

				NF-TOPSIS Algorithm: Same Result as the Original
Measures	A Metric on $NFS(X)$	Topologically Equivalent to Other Distances	NF-PATTERN Algorithm (DOC)	Machine Selection Scenario	Education Decision-Making Scenario ($A_{1}$)
$d_{1}$	Unsatisfied	Satisfied	No	Same	Different
$d_{2}$	Satisfied	Satisfied	No	Same	Different
$d_{3}$	Unsatisfied	Satisfied	No	Same	Different
$d_{4}$	Satisfied	Satisfied	No	Same	Different
$d_{5}$	Unsatisfied	Unsatisfied	No	Same	Different
$d_{6}$	Unsatisfied	Satisfied	No	Same	Same
$d_{7}$	Satisfied	Satisfied	No	Same	Same
$d_{8}$	Satisfied	Satisfied	0.1175	Same	Different
$d_{9}$	Satisfied	N/A	0.1581	Different	Same
$d_{1}^{*}$ (proposed)	Satisfied	Satisfied	0.1466	Same	Same
$d_{2}^{*}$ (proposed)	Satisfied	Satisfied	0.1778	Same	Same

5. Conclusion

This research introduces two novel NF distance measures and the NF-PATTERN algorithm, significantly advancing the handling of uncertainty. The effectiveness of these contributions was validated through successful integration into NF-TOPSIS and NF-CLUSTER, with experimental results on benchmark datasets confirming their stability and reliability. Theoretically, this work enriches the mathematical toolkit for NF sets. Practically, it provides decision-makers with more robust analytical tools, improving the quality of outcomes in applications like MCDM and data clustering.

The practical advantages are tangible, enhancing applications such as supplier selection by providing more reliable analysis of indeterminate information. However, the study has clear limitations. Its validation used specific datasets, its scope was limited to single-valued neutrosophic sets, and it lacked a formal computational complexity analysis of the NF-PATTERN algorithm. These factors constrain its proven generalizability and scalability.

Building on these limitations, future research should focus on three key areas. First, the proposed measures should be extended to other NS variants, including interval-valued and bipolar sets, to broaden their applicability. Second, the scalability and performance of the NF-PATTERN algorithm must be tested on large-scale, real-world datasets from domains like finance or medicine. Finally, a promising direction is to integrate these methods into hybrid intelligent systems, such as combining them with deep learning models, to create more powerful and sophisticated predictive tools that can simultaneously manage data uncertainty and learn complex patterns.

Footnotes

Acknowledgements

The authors express their sincere gratitude to Tien Giang University (TGU) for their enthusiastic support and for providing a conducive and inspiring research environment.

ORCID iDs

Thanh Nha Nguyen

Kieu Vy Cao

Quang-Thinh Bui

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

Abed, M. M., Jarwan, D. A., Salih, M. A. (2023). On neutrosophic relations in group theory. International Journal of Mathematics, Statistics, and Computer Science, 1(1), 48–52. https://doi.org/10.59543/ijmscs.v1i.7739

AlShaqsi, Wang, Drogham, et al. (2024). Quantitative and qualitative similarity measure for data clustering analysis. Cluster Comput, 27(10), 14977–15002. https://doi.org/10.1007/s10586-024-04664-4

Atanassov, K. T. (1999). Intuitionistic fuzzy sets. In Atanassov, K. T. (Ed.), Intuitionistic fuzzy sets: Theory and applications (pp. 1–137). Physica-Verlag HD.

Bezdek, J. C. (1981). Pattern Recognition with Fuzzy Objective Function Algorithms. Springer US. https://doi.org/10.1007/978-1-4757-0450-1

Biswas, Pramanik, Giri, B. C. (2015). Cosine similarity measure based multi-attribute decision-making with trapezoidal fuzzy neutrosophic numbers. Neutrosophic Sets and Systems, 8(1), 46–56. https://doi.org/10.5281/zenodo.22446

Boloș, M.-I., Bradea, I.-A., Delcea (2023). Modeling the covariance of financial assets using neutrosophic fuzzy numbers. Symmetry, 15(2), 320. https://doi.org/10.3390/sym15020320

Bui, Q.-T., Ngo, M.-P., Snasel, et al. (2023). Information measures based on similarity under neutrosophic fuzzy environment and multi-criteria decision problems. Engineering Applications of Artificial Intelligence, 122, 106026. https://doi.org/10.1016/j.engappai.2023.106026

Bui, Q.-T., Snasel, Pedrycz, et al. (2025). Novel neutrosophic fuzzy max-min-based similarities and their application in clustering for educational decision-making support. J Appl Math Comput, 1–36. https://doi.org/10.1007/s12190-025-02520-1

Bui, Q.-T., Snasel, et al. (2021). SFCM: A fuzzy clustering algorithm of extracting the shape information of data. IEEE Transactions on Fuzzy Systems, 29(1), 75–89. https://doi.org/10.1109/TFUZZ.2020.3014662

Ciaramella, Nardone, Staiano (2020). Data integration by fuzzy similarity-based hierarchical clustering. BMC Bioinformatics, 21(1), 350. https://doi.org/10.1186/s12859-020-03567-6

DalKılıç, Demirtaş (2025). Similarity measures of neutrosophic fuzzy soft set and its application to decision making. Journal of Experimental & Theoretical Artificial Intelligence, 37(4), 513–529. https://doi.org/10.1080/0952813X.2023.2222720

Das, Roy, B. K., Kar, M. B., et al. (2020). Neutrosophic fuzzy set and its application in decision making. J Ambient Intell Human Comput, 11(11), 5017–5029. https://doi.org/10.1007/s12652-020-01808-3

Dunn, J. C. (1973). A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. Journal of Cybernetics, 3(3), 32–57. https://doi.org/10.1080/01969727308546046

Ejegwa, P. A., Wanzenke, T. D., Ogwuche, I. O., et al. (2024). A robust correlation coefficient for fermatean fuzzy sets based on spearman’s correlation measure with application to clustering and selection process. J Appl Math Comput, 70(1), 1747–1770. https://doi.org/10.1007/s12190-024-02019-1

Guo, Sengur (2015). NCM: Neutrosophic c-means clustering algorithm. Pattern Recognition, 48(10), 2710–2724. https://doi.org/10.1016/j.patcog.2015.02.018

Hatzimichailidis, A., Papakostas, G., Kaburlasos, V. (2012). A novel distance measure of intuitionistic fuzzy sets and its application to pattern recognition problems. International Journal of Intelligent Systems, 27(5), 396–409. https://doi.org/10.1002/int.21529

Jia, et al. (2022). A novel two-stage unsupervised fault recognition framework combining feature extraction and fuzzy clustering for collaborative AIoT. IEEE Transactions on Industrial Informatics, 18(2), 1291–1300. https://doi.org/10.1109/TII.2021.3076077

Khalaf, O. I., Natarajan, Mahadev, et al. (2025). Blinder oaxaca and wilk neutrosophic fuzzy set-based IoT sensor communication for remote healthcare analysis. IEEE Access, 13, 12178–12189. https://doi.org/10.1109/ACCESS.2022.3207751

Khalil, A. M., Cao, Azzam, Smarandache, Alharbi, W. R. (2020). Combination of the Single-Valued Neutrosophic Fuzzy Set and the Soft Set with Applications in Decision-Making. https://www.mdpi.com/2073-8994/12/8/1361 (accessed 1 June 2025).

Khan, Chen, Yan (2020). Co-Clustering to reveal salient facial features for expression recognition. IEEE Transactions on Affective Computing, 11(2), 348–360. https://doi.org/10.1109/TAFFC.2017.2780838

Luo, Zhang (2022). A novel distance between single valued neutrosophic sets and its application in pattern recognition. Soft Comput, 26(22), 11129–11137. https://doi.org/10.1007/s00500-022-07407-y

Majumder, Das, Hezam, I. M., et al. (2023). Integrating trapezoidal fuzzy best–worst method and single-valued neutrosophic fuzzy MARCOS for efficiency analysis of surface water treatment plants. Soft Comput. https://doi.org/10.1007/s00500-023-08532-y. Epub ahead of print 31 May 2023.

Mathews, Sebastian (2023). Hausdorff distance of neutrosophic fuzzy sets for medical diagnosis. AIP Conference Proceedings, 2875, 040001. https://doi.org/10.1063/5.0154022

Mathews, Sebastian, Thankachan (2024). Neutrosophic fuzzy score matrices: A robust framework for advancing medical diagnostics. International Journal of Neutrosophic Science, 23(3), 08–17. https://doi.org/10.54216/IJNS.230301

Nafei, Huang, C.-Y., Javadpour, et al. (2024). Neutrosophic fuzzy decision-making using TOPSIS and autocratic methodology for machine selection in an industrial factory. Int J Fuzzy Syst, 26(4), 860–886. https://doi.org/10.1007/s40815-023-01640-9

Ruspini, E. H., Bezdek, J. C., Keller, J. M. (2019). Fuzzy clustering: A historical perspective. IEEE Computational Intelligence Magazine, 14(1), 45–55. https://doi.org/10.1109/MCI.2018.2881643

Saeed, Rahman, A. U. (2021). Optimal Supplier Selection Via Decision-Making Algorithmic Technique Based on Single-Valued Neutrosophic Fuzzy Hypersoft Set. Infinite Study.

Smarandache (2005). A Unifying Field in Logics: Neutrosophic Logic. Neutrosophy, Neutrosophic Set, Neutrosophic Probability (fourth edition). Infinite Study.

Wang, Smarandache, Zhang, et al. (2010). Single Valued Neutrosophic Sets. Infinite Study.

Chen (2008). Clustering algorithm for intuitionistic fuzzy sets. Information Sciences, 178(17), 3775–3790. https://doi.org/10.1016/j.ins.2008.06.008

Yawei, Shouyu, Xiangtian (2005). Fuzzy pattern recognition approach to construction contractor selection. Fuzzy Optim Decis Making, 4(1), 103–118. https://doi.org/10.1007/s10700-004-5867-4

(2017). Single-Valued Neutrosophic Clustering Algorithms Based on Similarity Measures. Journal of Classification, 34(1), 148–162. https://doi.org/10.1007/s00357-017-9225-y

Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8(3), 338–353. https://doi.org/10.1016/S0019-9958(65)90241-X

Zhang, Cai (2021). Fuzzy clustering based on automated feature pattern-driven similarity matrix reduction. IEEE Transactions on Computational Social Systems, 8(4), 1203–1212. https://doi.org/10.1109/TCSS.2020.3011471
