Abstract
When user authentication technologies are taken into consideration, systems can be divided into two main categories: (i) those that use passwords or keys, such as cards and tags, and (ii) those that use biometric data, such as signature, fingerprint and voice. In this context, it is well understood that biometric systems, which are built on biometric authentication technologies, emerge as safer authentication alternatives. In order to guarantee security, and thus the accuracy of the stored data, protection schemes for biometric authentication, such as Fuzzy Vault and Fuzzy Commitment, have been adopted. In this work, we improve the functioning of two well-known fuzzy protection schemes, Fuzzy Vault and Fuzzy Commitment, using the Papilio encryption algorithm. This combination leads to the proposal of new protection schemes called Papilio Fuzzy Vault and Papilio Fuzzy Commitment. We carried out an exploratory study using a suite of well-known biometric datasets. Our findings indicate gains in terms of security when comparing the proposed schemes to the original ones, while the proposed schemes keep the same accuracy level as the original ones.
Introduction
As an important part of security, software systems are secured with strong authentication mechanisms. According to [1], there are two major categories of system authentication approaches: (i) those where a user accesses the system through passwords, logins, tokens, among others, and (ii) those where users access the system using one or more biometric modalities, such as voice and fingerprint. In this context, it is well understood that the use of biometric-based authentication is associated with better system reliability [2].
However, this reliability comes at a cost: investing in system security without compromising the accuracy of the stored data. According to [5], from the point of view of the user, an accuracy error occurs when the system fails to authenticate the identity of a registered person or when the system erroneously authenticates the identity of an intruder.
There are several protection methods applied to sets of biometric or multibiometric data, such as the Fuzzy Vault [6] and Fuzzy Commitment [7], which are biometric data protection schemes for authentication systems. These protection methods have also been applied to other biometric data, for example in [3, 4], where the authors use fuzzy protection systems to guarantee competitive recognition accuracy while protecting the employed biometric data. However, the data storage and security levels provided by these methods and techniques still leave systems vulnerable. Thus, the current methods for authentication need to be rethought, bearing in mind the risk of compromising system accuracy.
In an initial attempt, reported in [9], we applied some traditional cryptographic algorithms directly to biometric data. These algorithms drastically affected the results, considerably decreasing the accuracy of the biometric systems. Nevertheless, this was an expected outcome, since it is well known that cryptographic algorithms break the relationship between the values of each attribute of the original dataset, making classification a very difficult task. For this reason, we decided that the Papilio algorithm should not be applied directly to biometric data, but inside the functioning of some transformation functions, aiming to maintain the performance and increase the security of the stored data.
Therefore, in this paper, our goal is to propose two new biometric protection schemes, Papilio Fuzzy Vault (PFV) and Papilio Fuzzy Commitment (PFC), that rely on the integration of the Papilio encryption algorithm with the Fuzzy Vault and Fuzzy Commitment protection schemes, respectively. The Papilio cryptosystem was submitted to differential cryptanalysis and, according to [8], the results show that Papilio is as strong as other symmetric algorithms. Thus, the idea is to increase data accuracy and to provide an additional level of security through the integration of Papilio into biometric-based authentication systems by means of the evolution of these two well-known biometric protection schemes.
In order to assess the use of the PFV and PFC schemes, we carried out an exploratory study to analyse the accuracy of the proposed methods in different scenarios of biometric authentication systems (original and transformed spaces, unibiometric and multibiometric contexts). In this study, we selected two biometric modalities to be used as a basis, voice and fingerprint. Our findings indicated that the adoption of PFV and PFC provides similar levels of accuracy when compared to their original versions, while increasing the security of the biometric-based authentication systems.
The rest of this paper is organized as follows. Section 2 introduces the different schemes and techniques for system authentication and data protection. In Section 3, we present the evolved protection schemes. In Section 4, the methodology and complementary techniques used in the study are described. Our findings are discussed in Section 5. Finally, in Section 6, we provide our final remarks.
Background
This section presents the target data protection techniques and how they can be applied to biometric systems. More specifically, we will present the Papilio Cryptosystem and Cancellable Techniques in Sections 2.1 and 2.2 respectively. The fuzzy schemes used as the basis for this work are introduced in Section 2.3.
Papilio cryptosystem
The Papilio algorithm is a symmetric cryptographic cipher [10]. Symmetric algorithms (or cipher) are characterized by using the same key for encryption and decryption process. As other symmetric algorithms, Papilio is based on Feistel structures that define their encryption process round (cycles) and division of the texts in blocks. Each block, called
As a Feistel-based algorithm, Papilio works on the top of a function
The process of Papilio decryption is similar to the encryption process, except for the fact that the sub-keys are employed in reverse order. For both processes, there is a function
The size (number of bits) of the resulting encrypted text is the same as that of the plain text, which is an advantage of the Papilio method, as there is no need to increase the size of the encryption output text. Therefore, each completed round yields a ciphertext (an encrypted text) of the same size.
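As an illustration only, the generic Feistel structure described above can be sketched as follows. This is not the Papilio algorithm: the round function below is a placeholder built from SHA-256 (our assumption), and the sub-keys are arbitrary byte strings. The sketch only shows the two properties mentioned in this section, namely that decryption reuses the same routine with the sub-keys in reverse order and that the ciphertext has the same size as the plain text.

```python
import hashlib


def round_function(half, subkey):
    """Placeholder round function F (NOT the one used by Papilio)."""
    return hashlib.sha256(subkey + half).digest()[:len(half)]


def xor_bytes(a, b):
    """Bytewise XOR of two equally sized byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def feistel(block, subkeys):
    """Run a balanced Feistel network over an even-length block."""
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for subkey in subkeys:
        left, right = right, xor_bytes(left, round_function(right, subkey))
    # Swap the halves at the end so the same routine also decrypts.
    return right + left


def encrypt(block, subkeys):
    return feistel(block, subkeys)


def decrypt(block, subkeys):
    # Same process, sub-keys employed in reverse order.
    return feistel(block, list(reversed(subkeys)))


subkeys = [b"k1", b"k2", b"k3", b"k4"]      # hypothetical sub-keys
plain = b"biometric-key-01"                 # one 16-byte block
cipher = encrypt(plain, subkeys)
recovered = decrypt(cipher, subkeys)
```

In this sketch the half-block must not exceed 32 bytes, the digest size of the placeholder round function.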
Cancellable techniques
In this paper, we use two well-known cancellable techniques: BioConvolving [11] and BioHashing [12]. These techniques are applied to two biometric datasets (FingerPrint and Voice), as presented in Section 4.2.
The BioConvolving transformation method aims at dividing each original biometric dataset into
Randomly select a number ( Convert the values Divide the original sequence Apply the linear convolution of the functions
As it can be seen in these steps, due to the convolution operation in Eq. (1), the length of the transformed functions is equal to
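The BioConvolving steps above can be sketched as follows, as a simplified illustration rather than the implementation used in this work. We assume a single feature sequence, draw the cut positions directly from a seeded generator (instead of converting percentages into positions), and combine the resulting segments by full linear convolution. The names `bioconvolve`, `n_segments` and `seed` are ours. With a full linear convolution, a sequence of length N split into W segments yields N - W + 1 output values.

```python
import random


def convolve(a, b):
    """Full linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def bioconvolve(sequence, n_segments, seed):
    """Split `sequence` at seeded cut points and convolve the segments."""
    rng = random.Random(seed)
    # The cut points play the role of the transformation key.
    cuts = sorted(rng.sample(range(1, len(sequence)), n_segments - 1))
    bounds = [0] + cuts + [len(sequence)]
    segments = [sequence[bounds[i]:bounds[i + 1]] for i in range(n_segments)]
    result = segments[0]
    for segment in segments[1:]:
        result = convolve(result, segment)
    return result


signal = [float(v) for v in range(1, 11)]   # hypothetical feature sequence
transformed = bioconvolve(signal, n_segments=3, seed=42)
```

Changing the seed changes the cut points and therefore produces a different transformed template from the same original sequence, which is what makes the template cancellable.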
Biohashing consists of random bits
Employ the input token to generate a set of pseudo random vectors, Apply the Gram-Schmidt process to Calculate the dot product of Use a threshold
where
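The BioHashing steps listed above can be sketched as follows. This is a minimal illustration under our own assumptions: the user token is reduced to an integer seed, the pseudo-random vectors are drawn from a Gaussian generator, and the threshold defaults to zero. The function names are hypothetical.

```python
import random


def orthonormal_basis(dim, m, seed):
    """Gram-Schmidt over m seeded pseudo-random vectors of length dim."""
    if m > dim:
        raise ValueError("at most `dim` orthonormal vectors exist")
    rng = random.Random(seed)
    basis = []
    while len(basis) < m:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Remove the components along the vectors already in the basis.
        for b in basis:
            proj = sum(x * y for x, y in zip(v, b))
            v = [x - proj * y for x, y in zip(v, b)]
        norm = sum(x * x for x in v) ** 0.5
        if norm > 1e-9:
            basis.append([x / norm for x in v])
    return basis


def biohash(features, token_seed, m, threshold=0.0):
    """Project the features on the token-derived basis and binarise."""
    basis = orthonormal_basis(len(features), m, token_seed)
    return [1 if sum(f * b for f, b in zip(features, vec)) > threshold else 0
            for vec in basis]


features = [0.3, -1.2, 0.8, 0.5, -0.1, 2.0]   # hypothetical feature vector
code = biohash(features, token_seed=1234, m=4)
```

The same features combined with a different token give a different bit string, so a compromised BioHash can be revoked by issuing a new token.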
This section presents Fuzzy Vault (Section 2.3.1) and Fuzzy Commitment (Section 2.3.2) schemes. Both are used for biometric data protection and authentication.
Fuzzy Vault
The Fuzzy Vault scheme is a protection scheme based on the link key method [6]. It can be understood as a scheme that is able to work with variable data in authentication systems: it can authenticate a person whose biometric samples vary from one acquisition to another. For example, with the Fuzzy Vault scheme, it is possible to accept the same iris captured at different times and/or in different ways. In other words, a fuzzy scheme tolerates a small distortion of the user signal in an authentication system. The Fuzzy Vault scheme is conceptually simple and can be implemented using an underlying error-correcting code, as shown in [6]. In this case, the authors use the Reed-Solomon method [14].
In [6], there is a reference about a simple case to represent the Fuzzy Vault Scheme. Basically, the essence of Fuzzy Vault can be hypothetically described in the following way: let us suppose that a user called Alice wants to hide a secret
In the context of authentication process [6], proposes some scenarios, where another user named Bob aims to unlock
The security of this scheme rests on the infeasibility of the polynomial reconstruction problem. In other words, the Fuzzy Vault is well suited to working with sets of biometric data, and it ensures a level of protection for biometric traits. This makes it possible to prevent an attacker with no valid entries, or even an authorized user holding data that belongs to other users, from accessing user authentication systems.
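To make the locking and unlocking idea concrete, the following is a minimal sketch under simplifying assumptions of our own: arithmetic is done modulo the prime 65537, the secret is the coefficient list of a polynomial, and unlocking uses plain Lagrange interpolation instead of the Reed-Solomon decoding used in [6]. As a consequence, this sketch does not tolerate a query value that happens to coincide with a chaff point, which the error-correcting code would handle.

```python
import random

P = 65537  # prime modulus (assumption made for this sketch)


def poly_eval(coeffs, x):
    """Evaluate c0 + c1*x + ... at x modulo P (Horner scheme)."""
    y = 0
    for c in reversed(coeffs):
        y = (y * x + c) % P
    return y


def lock(secret_coeffs, genuine_set, n_chaff, rng):
    """Genuine points lie on the polynomial; chaff points do not."""
    vault = [(x, poly_eval(secret_coeffs, x)) for x in genuine_set]
    used = set(genuine_set)
    while len(vault) < len(genuine_set) + n_chaff:
        x, y = rng.randrange(P), rng.randrange(P)
        if x not in used and y != poly_eval(secret_coeffs, x):
            vault.append((x, y))
            used.add(x)
    rng.shuffle(vault)
    return vault


def interpolate(points):
    """Coefficients of the polynomial through `points`, modulo P."""
    coeffs = [0] * len(points)
    for i, (xi, yi) in enumerate(points):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(points):
            if j == i:
                continue
            # Multiply the basis polynomial by (x - xj).
            new = [0] * (len(basis) + 1)
            for d, c in enumerate(basis):
                new[d] = (new[d] - c * xj) % P
                new[d + 1] = (new[d + 1] + c) % P
            basis = new
            denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P
        for d, c in enumerate(basis):
            coeffs[d] = (coeffs[d] + c * scale) % P
    return coeffs


def unlock(vault, query_set, degree):
    """Keep the vault points selected by the query set and interpolate."""
    query = set(query_set)
    candidates = [pt for pt in vault if pt[0] in query]
    if len(candidates) < degree + 1:
        return None  # not enough overlap with the locking set
    return interpolate(candidates[:degree + 1])


secret = [1234, 42, 7, 99]                  # hypothetical secret (degree 3)
genuine = [11, 23, 35, 47, 59, 71]          # hypothetical locking set
vault = lock(secret, genuine, n_chaff=20, rng=random.Random(7))
```

A query set that shares at least four elements with the locking set recovers the secret, whereas a query with too little overlap returns nothing. The modular inverse via `pow(denom, -1, P)` requires Python 3.8 or later.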
Fuzzy Commitment
Another protection scheme commonly used in biometric systems is Fuzzy Commitment, which was proposed in [15] and has been used to protect biometric data [16]. Fuzzy Commitment is described as a biometric encryption scheme that can be used to protect biometric features represented in the form of binary vectors.
It is possible to propose a biometric template
In the enrolment environment, the value to be stored for each user is extracted from the scheme
Using Fuzzy Commitment scheme, the authentication success is obtained by this scheme
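The enrolment and authentication flow of a Fuzzy Commitment scheme can be sketched as follows. This is a toy illustration under our own assumptions: the error-correcting code is a 5-fold repetition code (standing in for a stronger code such as Reed-Solomon), the hash is SHA-256, and templates and keys are lists of bits. The function names are hypothetical.

```python
import hashlib

REP = 5  # repetition factor of the toy error-correcting code (assumption)


def ecc_encode(key_bits):
    """Repeat every key bit REP times."""
    return [b for b in key_bits for _ in range(REP)]


def ecc_decode(code_bits):
    """Majority vote inside every group of REP bits."""
    return [1 if sum(code_bits[i:i + REP]) > REP // 2 else 0
            for i in range(0, len(code_bits), REP)]


def digest(bits):
    return hashlib.sha256(bytes(bits)).hexdigest()


def enrol(template_bits, key_bits):
    """Store the hash of the key and the template/codeword offset."""
    codeword = ecc_encode(key_bits)
    offset = [t ^ c for t, c in zip(template_bits, codeword)]
    return digest(key_bits), offset


def authenticate(query_bits, stored):
    """Recover a noisy codeword, correct it and compare the hashes."""
    stored_hash, offset = stored
    noisy = [q ^ d for q, d in zip(query_bits, offset)]
    return digest(ecc_decode(noisy)) == stored_hash


key = [1, 0, 1, 1]                                   # hypothetical link key
template = [1, 0, 0, 1, 1, 0, 1, 0, 1, 1,
            1, 1, 0, 0, 0, 0, 1, 1, 0, 1]            # hypothetical template
stored = enrol(template, key)
```

A query that differs from the enrolled template in a few bits is still accepted, because the code corrects the differences before the hashes are compared. In the PFC scheme proposed in this paper, the hash step is the part replaced by the Papilio algorithm.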
Proposed fuzzy schemes
The main goal of our study was to investigate means to increase the accuracy of biometric systems. Therefore, an encryption algorithm called Papilio [10] was integrated into the two most popular protection schemes: Fuzzy Commitment (FC) and Fuzzy Vault (FV). This integration originated the Papilio Fuzzy Vault (PFV) and Papilio Fuzzy Commitment (PFC) schemes presented in Sections 3.2 and 3.3, respectively.
Papilio Fuzzy Vault Scheme – Enrollment: a) A User provides a link key 
In this work, two versions of the Papilio algorithm (Section 2.1) are used: a modified version, called MP, in Fuzzy Vault (FV), and the original version, labelled OP, in Fuzzy Commitment (FC). The main changes included in MP are related to the link key format. These changes were necessary because the Papilio algorithm is used in the polynomial generation process of FV, and the polynomial indices are represented in decimal format. In contrast, Papilio was integrated into Fuzzy Commitment (FC) without requiring any change. In this sense, the Papilio algorithm is used in its original form in FC, where it replaces the hash function during the processing of FC.
The Fuzzy Vault scheme works with decimal values that represent the original data (biometric data) as well as the set of false values (Vaults). These values are points (coordinates) that are obtained by applying polynomials
Let us consider that a user sets a password (considered as a link key) in the FV enrolment step as a numerical sequence, 20102014. This password will generate the following standard Papilio text output: “*@RfeTukeJ2k*!swa”. However, its output will be represented by the decimal
Papilio Fuzzy Vault
In this section, we describe the Papilio Fuzzy Vault (PFV) scheme, one of the methods proposed in this paper. The main change that we propose in the Fuzzy Vault (FV) scheme is the use of the modified Papilio algorithm in its functioning. The modified Papilio algorithm (Section 3.1) was integrated into the FV scheme to enhance the security of biometric systems. We aim to maintain the same level of accuracy as the original FV scheme in a new and safer setting for the use of biometric data.
Papilio Fuzzy Vault Scheme – Authentication: The user candidate biometric traits (Biometric Collection) are transformed into the following format 
Figures 1 and 2 illustrate the PFV implementation, where the black boxes indicate the actions performed during the PFV functioning and the gray ones indicate the output of the corresponding actions. The PFV implementation is described according to the following 1–13 steps and the steps labelled as
Papilio Fuzzy Commitment Scheme – PFC: In the Enrollment step, each user characteristic is represented by binary vectors. Reed-Solomon coding is used on the link key
The Fuzzy Commitment scheme (Section 2.3.2) uses a hash function in its protection procedure. Although hash functions are considered safe techniques, they cannot be compared to cryptographic algorithms in terms of the complexity of their respective protection approaches. These functions can be mapped by attackers (collision attack), and thus an attacker can find a way to obtain the original data from data transformed using hash functions. In this sense, the use of hash functions by authentication systems can also show problems related to collision attacks, making it possible to discover protected keys. In this case, it is possible to recover the original biometric data more easily. In order to address the hash function limitations, a new scheme is proposed, called the Papilio Fuzzy Commitment scheme (PFC). PFC uses the Papilio encryption algorithm to replace the hash function. This modification aims at preventing, or at least reducing, the risk of discovering the link key as well as the risk of obtaining the original biometric data. Figure 3 illustrates the PFC Enrolment and Authentication scheme. In this figure, the black boxes indicate the actions performed during the PFC functioning and the gray ones indicate the output of the corresponding actions.
The PFC implementation may be described as follows:
In this section, we describe how our study was planned. Section 4.1 presents our goal, research question and hypotheses. The datasets used are presented in Section 4.2, the adopted methodology is described in Section 4.3 and, finally, the quantification of system accuracy is discussed in Section 4.4.
Aim, research question and hypotheses
The goal of this study was to evaluate to what extent the integration of traditional protection schemes with cryptographic algorithms is correlated with biometric data storage. In order to achieve this goal, we performed a comparative analysis of how these schemes are correlated with data accuracy (Section 4.4).
Our research aims were twofold. First, we aim at evaluating whether the proposed approach (Section 3) deals with the accuracy of authentication systems in a better way when compared to traditional techniques (Section 2). Second, we also aim at discussing some implementation factors that were detrimental to accuracy.
In this sense, we aim at answering the following research question: Does the integration of different protection techniques promote an increase in the data accuracy of the system? This investigation relies on the analysis of three hypotheses (H1, H2 and H3), whose null (0) and alternative (1) definitions are as follows:
H10: Data accuracy variation does not depend on the protection technique used.
H11: Data accuracy variation depends on the protection technique used.
H20: The use of the Papilio Fuzzy Vault scheme does not promote gains in terms of data accuracy when compared to Fuzzy Vault.
H21: The use of the Papilio Fuzzy Vault scheme promotes gains in terms of data accuracy when compared to Fuzzy Vault.
H30: The use of the Papilio Fuzzy Commitment scheme does not promote gains in terms of data accuracy when compared to Fuzzy Commitment.
H31: The use of the Papilio Fuzzy Commitment scheme promotes gains in terms of data accuracy when compared to Fuzzy Commitment.
In this section, we discuss the nature of the datasets used in this work. We grouped the datasets into two categories: original and transformed. The first category consists of well-known datasets, while the transformed datasets are those that were generated from the original sets (see Sections 4.2.1, 4.2.2 and 4.2.3). In both groups, we consider unibiometric and multibiometric datasets. According to [18], unibiometric datasets rely on the evidence of a single source of information for authentication (e.g., a single fingerprint or face). On the other hand, multibiometric datasets denote the fusion of different types of information (e.g., fingerprint and face of the same person, or fingerprints from two different fingers of a person) [18].
Original datasets
The original biometric datasets include voice and fingerprint data. The voice dataset used in this work was taken from the TIMIT dataset [19]. According to [20], this voice biometric dataset was represented through two energy spectrum representations, computed by two algorithms (MFCC, Mel-Frequency Cepstral Coefficients, and LPCC, Linear Prediction Cepstral Coefficients), thus constituting two separate datasets: MFCC and LPCC. [20] also verified that the MFCC voice dataset is more stable than the LPCC one when ensembles are applied to it, and that it is more widely used in other studies.
Due to the large number of attributes, we applied a pre-processing filter supported by the WEKA tool to the MFCC dataset. In this way, a biometric dataset with only 33 attributes was created. The attribute selection criterion was based on the results obtained with the same tool, using classifiers such as IBK (k-NN) [21].
According to [20], the fingerprint biometric dataset was collected from the Fingerprint Verification Competition held in 2004. The target dataset contains 800 images of fingerprints, divided into 100 classes (users). The fingerprint features were extracted using NFIS2 (NIST Fingerprint Image Software 2). The set of features is formed by coordinate values
In this study, the multibiometric datasets were created using column and line combinations. In the column combination, we put together all attributes of both datasets in one instance. Therefore, the result has the same number of instances and the sum of the attributes of both datasets. On the other hand, in the line combination, we put together all instances of both datasets (in the case of different numbers of attributes, the smallest number of attributes is considered). As all considered attributes were normalized real numbers, we could put together all instances. In this combination, the result has the same number of attributes and the sum of the instances of both datasets.
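The two combination strategies can be sketched as follows, assuming that each dataset is a list of instances and each instance is a list of normalized attribute values. The function names are ours, and class labels are left out for brevity.

```python
def combine_by_column(first, second):
    """Join the attributes of matching instances of both datasets."""
    if len(first) != len(second):
        raise ValueError("column combination needs the same number of instances")
    return [row_a + row_b for row_a, row_b in zip(first, second)]


def combine_by_line(first, second):
    """Stack the instances of both datasets, keeping the smallest attribute count."""
    width = min(len(first[0]), len(second[0]))
    return [row[:width] for row in first] + [row[:width] for row in second]


# Hypothetical toy data: 2 instances each, 3 and 2 attributes respectively.
finger = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
voice = [[0.7, 0.8], [0.9, 1.0]]

by_column = combine_by_column(finger, voice)   # 2 instances, 5 attributes
by_line = combine_by_line(finger, voice)       # 4 instances, 2 attributes
```

Swapping the argument order corresponds to the difference between the Finger-Voice and Voice-Finger variants of each combination.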
Transformed datasets
The original voice dataset (Section 4.2.1) was transformed by applying the cancellable technique BioHashing, described in Section 2.2. The original FingerPrint dataset (Section 4.2.1) was transformed by applying the BioConvolving technique, shown in Section 2.2. Table 1 summarizes the configuration of each voice and fingerprint dataset. Using the Weka tools and applying a pre-processing technique (selecting or filtering attributes), a transformed biometric dataset was generated, which has from 100 to 800 users with 50 selected attributes in the Voice dataset.
The original and transformed fingerprint datasets have 71 and 50 attributes, respectively. The columns # Attributes and # Classes represent the number of attributes and the number of classes of each biometric dataset, respectively.
Setup – Dataset – Voice – FingerPrint
The configuration of the original and transformed multibiometric dataset used in this study is presented in Table 2.
Multibiometrics dataset – original and transformed data
For the original scenario, four datasets were developed: (1) [FVCO – Finger Voice Column Original], (2) [VFCO – Voice and Finger Column Original], (3) [FVLO – Finger Voice Line Original] and (4) [VFLO – Voice and Finger Line Original]. These acronyms represent the data combination conducted to create the multibiometric datasets (see Section 4.2.1). The FVCO dataset, for instance, combines Fingerprint and Voice using the column combination of the original datasets. On the other hand, VFCO combines Voice and Fingerprint using the column combination of the original datasets.
We also generated four transformed datasets: (1) [FVCT – Finger Voice Column Transform], (2) [VFCT – Voice and Finger Column Transform], (3) [FVLT – Finger Voice Line Transform] and (4) [VFLT – Voice and Finger Line Transform]. Once again, each abbreviation refers to the combination technique of each transformed multibiometric dataset. For example, the FVLT dataset is composed by the line combination of the transformed fingerprint and voice datasets.
Figure 4 illustrates the acronym assignment of all multibiometric datasets, for both original and transformed scenarios.

Architecture of our study, which is composed by fuzzy schemes which are applied to biometric and multibiometric datasets.
The analysis structure is illustrated in Fig. 5. The original and modified fuzzy schemes were applied to twelve biometric datasets (6 for the original scenario and 6 for the transformed scenario). Thus, we ran 48 experiments. For each case, we collected the accuracy and standard deviation of each scheme for the respective biometric dataset.
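The number of experiments follows from pairing each of the four schemes with each of the twelve datasets. The short sketch below enumerates these pairs; the dataset labels are taken from the acronyms introduced above, while the labels of the four unibiometric datasets are our own shorthand.

```python
from itertools import product

schemes = ["FV", "FC", "PFV", "PFC"]

original = ["Original Voice", "Original Fingerprint",
            "FVCO", "VFCO", "FVLO", "VFLO"]
transformed = ["Transformed Voice", "Transformed Fingerprint",
               "FVCT", "VFCT", "FVLT", "VFLT"]

# One experiment per (scheme, dataset) pair.
experiments = list(product(schemes, original + transformed))
```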
The methodology of the experiments conducted in this work will be described in the following steps.
1. To modify the Papilio algorithm to be applied in the Papilio Fuzzy Vault scheme, according to Section 3.1;
2. To implement the Papilio Fuzzy Commitment (PFC) and Papilio Fuzzy Vault (PFV) schemes;
3. To use the Voice and Fingerprint biometric datasets together with the cancellable techniques in order to generate the processed datasets;
4. To generate multibiometric datasets from the voice and fingerprint biometric datasets;
5. To apply the FC and FV schemes to all 12 biometric datasets, as described in Section 4.2;
6. To apply the modified schemes PFC and PFV to all 12 biometric datasets;
7. To quantify the accuracy of each scheme applied to all biometric datasets;
8. To apply statistical tests and compare the results of the schemes (FC, FV, PFC, PFV);
9. To analyze the results (accuracy level) as well as the comparisons between the fuzzy schemes and the Papilio Fuzzy schemes proposed in this work.
The data accuracy of each protection scheme (Sections 2 and 3) is calculated by dividing the number of correctly classified patterns by the total of biometric data patterns. Let us consider that a given dataset contains biometric data for
The schemes proposed in Sections 2 and 3 are applied to our datasets (Section 4.2). The accuracy for each user in the dataset is calculated following the equation
Based on this equation, and considering that Fig. 6 contains the stored biometric data for a given user, the accuracy for each user is calculated by dividing the number of hits (a successful user authentication is labelled “Match”) by the total number of instances related to the user (number of user instances). The case where the user is not successfully authenticated is labelled “Not Match”.
Generalizing this reasoning, the dataset accuracy for all users can be calculated by sum from each user accuracy and described by the equation
Accuracy Measurement Illustration: The user authentication step output is labelled by “match” and “not match”.
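The accuracy computation described above can be sketched as follows. The per-user accuracy is the number of “match” outcomes divided by the number of instances of that user. For the dataset accuracy, we assume here that the per-user accuracies are summed and divided by the number of users; this normalization is our reading of the description and is not taken from the original equation. The function names are hypothetical.

```python
def user_accuracy(outcomes):
    """Fraction of authentication attempts labelled 'match'."""
    hits = sum(1 for outcome in outcomes if outcome == "match")
    return hits / len(outcomes)


def dataset_accuracy(outcomes_by_user):
    """Mean of the per-user accuracies (normalization assumed by this sketch)."""
    accuracies = [user_accuracy(o) for o in outcomes_by_user.values()]
    return sum(accuracies) / len(accuracies)


# Hypothetical authentication outcomes, not results from this study.
outcomes_by_user = {
    "user_1": ["match", "match", "not match", "match"],
    "user_2": ["match", "not match"],
}
overall = dataset_accuracy(outcomes_by_user)
```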
In this section, we present and discuss our findings using the modified versions of the template protection schemes presented in Section 3. Given the need to establish a reliable statistical basis for our discussion, we applied a statistical test (see Section 5.1) to the results obtained from the Papilio Fuzzy Vault (see Section 5.2) and Papilio Fuzzy Commitment (see Section 5.3) schemes. In these sections, the accuracy levels of the modified versions of the template protection schemes are compared to the original ones. Finally, in Section 5.4, we summarize our findings.
Statistical tests
For the statistical tests performed in Sections 5.2 and 5.3, we used the BioEstat 5.0 tool [22]. This tool is widely used in the literature. It has numerous statistical test implementations and it provides effective assistance in the results analysis (Sections 5.2 and 5.3). We applied the Wilcoxon test [23] and the goal was to obtain evidence about the superiority. We define
Papilio Fuzzy Vault performance
We applied the PFV scheme to all 12 biometric datasets, as described in Section 4.2. In Table 3, we present the results obtained by PFV in terms of data accuracy (third column), along with the corresponding standard deviation (fourth column). The first and second columns give the biometric dataset name and the number of attributes of each biometric dataset, respectively.
Results – Papilio Fuzzy Vault
Comparing the results of the original datasets with the transformed ones (Table 3), it is possible to observe that the accuracy level was maintained in the transformed scenarios. This happens because the fuzzy schemes achieved a similar performance in both scenarios. On the other hand, the use of multibiometric datasets (the last 8 lines of Table 3) did not promote an accuracy improvement when compared to the best results of the unibiometric datasets. Nevertheless, the voice dataset achieve
As illustrated in Table 3, the accuracy levels of both schemes (PFV and FV) were quite similar. This is an expected result because the PFV and FV schemes apply similar protection techniques during their processing (enrolment and authentication phases). The only difference between these two schemes is related to the encryption of the link key and the formulation of the polynomial used on the stored biometric data. As discussed in Section 3, the link key is encrypted in a parallel process in the PFV scheme (unlike in the Fuzzy Vault scheme). This difference implies that the link key is not an understandable sequence in the inner process of the PFV scheme. In contrast, it is an understandable sequence in the FV scheme. Therefore, if the PFV link key is lost or stolen, the attacker will not understand the information stored in it. Even if the polynomial equation used is discovered, the link key is still protected by the Papilio algorithm, which was used at the initial stage of the PFV. In this sense, the proposed PFV scheme maintains the accuracy results, while increasing the security of the biometric data.
As both fuzzy schemes obtained the same accuracy levels, we did not apply the statistical test, since it is clear that there is no improvement in terms of accuracy from a statistical point of view. The obtained results had an expected pattern of behaviour. As mentioned previously, the Papilio Fuzzy Vault and Fuzzy Vault schemes obtained similar results. This occurs because the changes in PFV are related solely to the scope of protection, and these changes do not affect the PFV level of accuracy. In other words, there is no difference in the matching process. However, we applied the Papilio algorithm in order to strengthen the protection technique used on the biometric dataset through the diffusion and confusion characteristics of the Papilio process. In this sense, the security levels of the stored biometric data are increased.
Similar to PFV, we applied the PFC scheme to the same 12 datasets previously mentioned (see Section 5.2). In Table 4, we present the results of PFC in terms of accuracy (third column), along with the corresponding standard deviation.
Results – Papilio Fuzzy Commitment
When comparing the performance on the original and transformed datasets in Table 4, we can observe that the accuracy of the fuzzy schemes on the Transformed Voice dataset decreased when compared to the Original Voice dataset. However, the opposite behaviour was observed in the Fingerprint dataset results, in which the use of a transformation function caused an increase in accuracy. In the multibiometric context, the original multibiometric datasets had slightly higher accuracy than the corresponding transformed ones. When analysing the effect of the multibiometric datasets, we can also observe that the results had a behaviour similar to that observed for FV, in which their use did not improve the accuracy level when compared with the best individual biometric datasets.
Comparative Fuzzy Commitment and Papilio Fuzzy Commitment – Voice, FingerPrint and Multibiometrics
Table 5 shows the comparison between PFC and FC for all datasets. Based on the obtained results, it is possible to see that FC managed to provide a slightly higher accuracy level than PFC for almost all datasets. Therefore, it is possible to conclude that FC was slightly better than PFC in terms of accuracy.
When we compare these schemes (FC and PFC) on each dataset, we can see that the PFC scheme obtained a slightly higher accuracy on only one transformed biometric dataset (Trans. Finger). For the other datasets, including all 8 multibiometric datasets (the last 8 lines of Table 5), FC provided slightly higher accuracy levels. This shows that the proposed scheme (PFC) causes a slight decrease in the accuracy levels of the original fuzzy scheme. However, this decrease is very small, because the matching process of PFC was not strongly modified when compared to FC. In this way, it is possible to identify the existing relationships in almost the same way as FC. In addition, PFC increased the security of the stored data by replacing the hash algorithm with the Papilio algorithm. Therefore, we can state that the obtained results showed a good performance of the proposed scheme.
Table 5 also illustrates the results of the statistical tests. The best result (highest accuracy level) achieved by the fuzzy schemes is highlighted in the shaded cells. In addition, the bold numbers correspond to the best results that are statistically significant, as a result of the application of the Wilcoxon test with a significance level of 0.05.
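For readers unfamiliar with the test, the sketch below shows one way to compare paired accuracies of two schemes. It assumes the paired (signed-rank) form of the Wilcoxon test and uses a normal approximation without continuity or tie correction, which is rough for small samples; it is not the BioEstat implementation. The accuracy values are hypothetical and are not results from this study.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # tied values share the mean rank
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, p_value


# Hypothetical accuracies (in %) of two schemes on the same 10 datasets.
scheme_b = list(range(10, 30, 2))
scheme_a = [value + gain for value, gain in zip(scheme_b, range(1, 11))]
w_plus, p_value = wilcoxon_signed_rank(scheme_a, scheme_b)
significant = p_value < 0.05  # significance level used in this paper
```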
According to the results in Table 5, we can observe that in only one case the difference in accuracy between the two fuzzy schemes proved to be statistically significant. Interestingly, this single case happened when PFC had a higher accuracy level than FC. Therefore, we can state that, although the use of PFC caused a decrease in the accuracy level of FC, this decrease is not statistically significant (both accuracies are nearly identical). When PFC caused an increase in the accuracy level, this improvement proved to be significant from a statistical point of view.
In summary, the modifications proposed and implemented in the Fuzzy Commitment scheme aimed to increase security in the original systems, while maintaining their respective accuracy. After an analysis of the obtained results, it was possible to state this aim was achieved since PFC maintained the accuracy level of original scheme, while inserting a layer of security in its processing (using the Papilio algorithm).
In general, the obtained results indicate that both proposed fuzzy schemes achieved similar accuracy levels for all analysed biometric datasets. Based on the results, we can state that hypothesis H11 was accepted in relation to the transformed datasets, since it was possible to apply protection schemes directly to biometric data without incurring a significant decrease in the accuracy of these systems. Thus, based on the obtained results, we can state that it is possible to use a protected biometric dataset in an authentication system. In addition, hypotheses H20 and H30 were accepted, because the obtained results showed that the changes made in the proposed PFV and PFC were not able to promote gains in the accuracy of the original schemes. However, they also did not promote a loss in the accuracy levels, since they provided similar accuracy levels when compared to the original ones.
When analysing the multibiometric context, we had to make some adjustments to each dataset, such as matching the number of attributes for the FVCO and VFCO datasets, as well as for FVCT and VFCT. Then, we observed the same level of accuracy presented on the original individual datasets (Original Voice and Original Fingerprint). However, the accuracy level decreased slightly when compared to the best individual datasets.
PFC and PFV Advantages
The added security layer brings some advantages to the proposed methods. First, as with the original methods, it is possible to store biometric data in a protected way. In this case, if the dataset is stolen, malicious individuals cannot interpret the stolen information, which makes the process of creating a new biometric template easier. Second, unlike the original fuzzy schemes, the proposed schemes (PFC and PFV) add complexity to the pre-processing of biometric data, making the stored templates safer and the protection schemes more reliable. With increased protection of the stored data, it becomes more difficult to attack the template protection method successfully. Third, when compared to the original FV, the PFV method adds an extra layer of protection, using the Papilio algorithm to decrypt the user link key. Hence, if the stored data is lost or stolen, the user's link key cannot be recovered, which makes the system more secure. Finally, replacing the hash function with the Papilio algorithm in the Fuzzy Commitment scheme (FC) makes the system harder to attack, since it eliminates the possibility of correlation attacks in FC, which is one known disadvantage of hash functions. Consequently, it reduces the possibility of obtaining the original data from the stored (encrypted) data.
PFC and PFV Disadvantage
Although the changes to both fuzzy schemes improved the security level relative to the original ones, the main disadvantage of using the Papilio algorithm is the increase in processing time. The Papilio algorithm adds one more component to the processing of the fuzzy schemes, which increases the processing time of the proposed schemes.
Conclusion
In this work, we proposed two template protection methods based on the Fuzzy Vault and Fuzzy Commitment schemes. The first one, called PFV (Papilio Fuzzy Vault), uses the Papilio algorithm to protect the user link key and the biometric traits, and to generate the polynomial terms used in the original Fuzzy Vault scheme. Thus, the link key of the protected data is preserved in the enrolment step. Even if an attacker finds the generator polynomial through a brute-force attack, the link key remains unknown to the attacker. The second proposed method, called PFC (Papilio Fuzzy Commitment), also applies the Papilio algorithm to a data protection method (Fuzzy Commitment), but applies it twice. First, the Papilio algorithm is applied to protect the user's link key, which is inserted in the application environment and, subsequently, in the registration and authentication environments.
Second, the Papilio algorithm replaces the hash function used to decode the value returned by the error detection and correction code (Reed-Solomon). This change reduces the risk, present in a hash function, of mapping input values to output values.
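To make the structure of a Fuzzy Commitment scheme with a keyed protection step more concrete, the following is a minimal sketch and not the implementation evaluated in this work. It rests on several assumptions: a simple repetition code stands in for Reed-Solomon, HMAC-SHA256 stands in for the Papilio algorithm (which is not reproduced here), and the names enrol, authenticate, protect and REP are hypothetical.

```python
import hashlib
import hmac
import secrets

REP = 5  # repetition factor (assumption): each key bit is stored REP times


def encode(key_bits):
    """Repetition code, a stand-in for Reed-Solomon: repeat each bit REP times."""
    return [b for b in key_bits for _ in range(REP)]


def decode(code_bits):
    """Majority vote over each group of REP bits."""
    return [int(sum(code_bits[i:i + REP]) > REP // 2)
            for i in range(0, len(code_bits), REP)]


def xor(a, b):
    """Bitwise XOR of two equal-length bit lists."""
    return [x ^ y for x, y in zip(a, b)]


def protect(key_bits, secret):
    """Keyed stand-in for the Papilio step (HMAC-SHA256, NOT Papilio)."""
    return hmac.new(secret, bytes(key_bits), hashlib.sha256).digest()


def enrol(key_bits, biometric_bits, secret):
    """Bind the key to the biometric; store only the helper data and a tag."""
    codeword = encode(key_bits)
    helper = xor(codeword, biometric_bits)
    return helper, protect(key_bits, secret)


def authenticate(helper, tag, query_bits, secret):
    """Recover a candidate key from a noisy query and compare protected values."""
    noisy_codeword = xor(helper, query_bits)
    recovered_key = decode(noisy_codeword)
    return hmac.compare_digest(protect(recovered_key, secret), tag)


# Illustrative data only (not taken from the datasets used in this work).
secret = secrets.token_bytes(16)
key_bits = [1, 0, 1, 1, 0, 0, 1, 0]
biometric = [i % 2 for i in range(len(key_bits) * REP)]
helper, tag = enrol(key_bits, biometric, secret)

genuine = list(biometric)
genuine[0] ^= 1  # one bit error in the first group
genuine[7] ^= 1  # one bit error in the second group
impostor = [b ^ 1 for b in biometric]

print(authenticate(helper, tag, genuine, secret))
print(authenticate(helper, tag, impostor, secret))
```

In this sketch the genuine query differs from the enrolled sample by one bit in each of two groups, which the majority vote corrects, while the impostor query differs in every bit and yields a different key. Without the secret, the stored tag cannot be recomputed from a guessed key, which is the property the keyed step is meant to illustrate.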
In order to assess the use of the PFV and PFC schemes, we carried out an exploratory study to analyse the accuracy obtained on voice and fingerprint datasets. We also considered the multibiometric context, obtained by the parallel use of two or more biometric modalities, which aims to increase the reliability of a biometric authentication system. This exploratory study aimed to evaluate the impact of using the proposed fuzzy schemes in comparison with the original ones.
The obtained results showed that it was possible to add an extra layer of security to the fuzzy schemes while maintaining accuracy at the same level as the original schemes. Similarly, the inclusion of Papilio solved the problem of link key protection in Fuzzy Vault and addressed the problem with the use of existing hash functions in Fuzzy Commitment. Therefore, the proposed schemes increase the security offered to a user of biometric-based authentication systems.
As future work, it is possible to use other cryptographic algorithms and to propose different changes according to the type of biometric trait. In this sense, it is interesting to have a scenario where the protection schemes provide higher levels of protection as well as greater reliability, in terms of accuracy, when validating the biometric characteristics of the users. According to the results presented in this work, we can achieve better rates of (i) security and (ii) accuracy when using more robust cryptographic algorithms to increase the protection of the stored dataset. In order to improve the accuracy of the schemes, it is necessary to improve the treatment (pre-processing) applied to the biometric datasets. Thus, we need to find a better way to numerically represent the biometric characteristics of a user. Another possibility would be to work with multibiometric systems (three or more biometric characteristics).
Therefore, these new scenarios, together with the possibility of future changes in settings and the use of alternative biometric data, among others, show that there is still a long way to go to find an ideal scheme that is secure and efficient regardless of the biometric modality used.
Footnotes
The entropy of a biometric template can be understood as a measure of the number of different identities that are distinguishable by a biometric system.
An attempt to find two inputs that produce the same hash value.
