Abstract
Bayesian network classifiers (BNCs) provide a sound formalism for representing probabilistic knowledge and reasoning with uncertainty. Explicit independence assumptions can effectively and efficiently reduce the size of the search space for solving the NP-complete problem of structure learning. Strong conditional dependencies, when added to the network topology of a BNC, can relax the independence assumptions, whereas weak ones may result in biased estimates of conditional probability and degraded generalization performance. In this paper, we propose an extension to the
Introduction
Classification is one of the key issues in data mining and machine learning [1]. The classifier learned from data can be described as the mapping relationship between predictive attributes
Naive Bayes (NB) [13] is an extremely simple and remarkably effective BNC. Its explicit independence assumption determines the network topology and the form of the factorization of the joint probability. However, this assumption is often violated in practice, and as a result its probability estimates are often suboptimal. A large body of literature explores approaches to improving NB’s classification performance on real-world problems with complex dependencies, e.g., attribute weighting [14, 15], attribute selection [16, 17], instance weighting [18, 19], instance selection [20, 21] and structure extension [1, 22, 23]. Among these approaches, structure extension is the most natural and direct way to alleviate NB’s unrealistic independence assumption, since it adds augmented edges between attributes to the network topology.
Among the BNCs based on structure extension, the
To the best of our knowledge, previous research has focused on (in)dependency relationships learned from training data only. However, if these relationships do not match those implied by the testing instance, the learned network topology may be biased and the generalization performance will degrade. The interplay between these two kinds of relationships has not been fully investigated in previous studies. To achieve a trade-off between data fitting and classification, information-theoretic metrics should be redefined to mine significant instantiated (in)dependency relationships from a specific testing instance. A general learning framework is needed to identify the significant conditional (in)dependencies implied by labeled and unlabeled data, respectively. The main contributions are as follows:
Information-theoretic and probability-theoretic metrics are introduced to measure explicit dependency relationships or to identify implicit independency relationships. By inferring the implicit independence assumption from the approximate expression of conditional probability, we extend KDB to learn conditional dependence conditioned on class variable or predictive attribute. The resulting highly scalable algorithm combines the low variance of instance learning with the low bias of ensemble learning. We compare our proposed
Section 2 reviews the state-of-the-art BNCs and then discusses the issue of implicit independence assumptions based on an analysis of explicit independence assumptions. Our novel techniques for identifying the implicit independence implied by labeled and unlabeled data are described in Section 3, where we also discuss their connection with KDB in terms of information-theoretic metrics. Section 4 presents an experimental comparison of our proposed algorithm with related approaches. Section 5 presents the main conclusions and outlines future work.
Related work
A BNC provides a framework for encoding a joint probability distribution over a set of finite attributes
where
Explicit independence assumption directly defines the independency relationships under certain conditions without any prior domain knowledge. NB takes the independence assumption to the extreme by assuming the attributes are conditionally independent given the class, and as shown in Fig. 1a the network topology of NB can be learned from the independence assumption rather than training data. The estimate of
NB’s unrealistic assumption can be described in the probabilistic form as
An example of (a) NB and (b) SPODE
Superparent-one-dependence estimator (SPODE) assumes that all attributes depend on the same attribute, namely the superparent attribute, in addition to the class [30]. As shown in Fig. 1b, the SPODE with the superparent attribute
An ensemble of SPODEs usually performs much better than a single SPODE. Averaged one-dependence estimators (AODE) [30] combines all qualified SPODE members with uniform weights and estimates the posterior probability by averaging them. The SPODE members in AODE make different independence assumptions because their superparent attributes differ. Thus they cannot fit the training data to the same extent and call for differential treatment. By assigning distinct weights to the SPODE members, model weighting can help calibrate the joint probability of the final weighted AODE. Duan et al. [31] propose an instance-based weighting approach in which the weights are defined by instantiated information-theoretic metrics and may vary from instance to instance. Wang et al. [32] propose to assign each SPODE a discriminative weight by identifying the differences among the SPODEs in terms of log likelihood. Jiang et al. [33] propose applying area under the ROC curve (AUC), classification accuracy, conditional log likelihood and mutual information, respectively, as weighting metrics for AODE.
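The averaging step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in [30]: the function names `fit_counts` and `aode_posterior` are hypothetical, probabilities are Laplace-smoothed, and the frequency threshold that decides which SPODEs are "qualified" is omitted. Passing non-uniform `weights` gives a weighted AODE in the spirit of the weighting schemes above.

```python
from collections import Counter

def fit_counts(rows, labels):
    """Collect the frequency counts that every SPODE in the ensemble needs."""
    n_attrs = len(rows[0])
    pair = Counter()    # (i, x_i, c) -> count
    triple = Counter()  # (i, x_i, j, x_j, c) -> count
    values = [set() for _ in range(n_attrs)]
    for row, c in zip(rows, labels):
        for i, xi in enumerate(row):
            values[i].add(xi)
            pair[(i, xi, c)] += 1
            for j, xj in enumerate(row):
                if j != i:
                    triple[(i, xi, j, xj, c)] += 1
    return {"n": len(rows), "classes": sorted(set(labels)),
            "values": values, "pair": pair, "triple": triple}

def aode_posterior(model, x, weights=None):
    """Average (or weight) the joint estimates of all SPODEs, then normalise."""
    n_attrs = len(x)
    weights = weights or [1.0] * n_attrs  # uniform weights give plain AODE
    n, classes = model["n"], model["classes"]
    values, pair, triple = model["values"], model["pair"], model["triple"]
    scores = {}
    for c in classes:
        total = 0.0
        for i in range(n_attrs):  # attribute i acts as the superparent
            n_ic = pair[(i, x[i], c)]
            # Laplace-smoothed estimate of P(c, x_i)
            p = (n_ic + 1) / (n + len(classes) * len(values[i]))
            for j in range(n_attrs):
                if j != i:
                    # Laplace-smoothed estimate of P(x_j | c, x_i)
                    p *= (triple[(i, x[i], j, x[j], c)] + 1) / (n_ic + len(values[j]))
            total += weights[i] * p
        scores[c] = total
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

# Toy usage with made-up data
rows = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
labels = [0, 0, 1, 1]
model = fit_counts(rows, labels)
print(aode_posterior(model, ("a", "x")))
```

Because the final normalisation divides by the sum of the class scores, only the relative sizes of the weights matter.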
By adding augmented edges to the network topology of NB, its independence assumption can be alleviated to a certain extent. However, no BNC can fully represent the dependency relationships in practice due to the restriction in computational complexity and precision of the probability estimates, and implicit independence assumptions are introduced accordingly. Given the network topology
where
Thus the network topology
where
An example of (a) TAN and (b) KDB with 
TAN can obviously improve the prediction accuracy of NB when its assumption is violated [1], but it ignores the influences from other attributes due to its restriction in topology complexity. Hence, when stronger and more complex attribute dependencies do exist, some dependencies have to be discarded. KDB avoids TAN’s restriction by allowing each attribute to have at most arbitrary
where
KDB can tune the parameter
KDB simply applies
The prior and joint probabilities can be estimated using the
where
KDB assumes that the attributes that are more correlated with the class are preferable, and the attributes are sorted in descending order of
Given attribute order
For simplicity, the criterion
The attributes enter into the network topology of KDB according to a pre-determined attribute order, e.g.,
Starting from
The results of
The learning process of KIBC with 
The learning process of KIBC
Calculate MI for all attributes. // See Eq. (12)
Let
Sort attributes in
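For orientation, the MI-based ranking and parent selection of classical KDB, on which the procedure above builds, can be sketched as follows. This is a hedged illustration of plain KDB only: it does not include the implicit-independence criterion of KIBC, probabilities are raw empirical frequencies without smoothing, and the function names are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical I(X; Y) in nats for two discrete sequences of equal length."""
    n = len(xs)
    cx, cy, cxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((nxy / n) * math.log(nxy * n / (cx[x] * cy[y]))
               for (x, y), nxy in cxy.items())

def conditional_mutual_information(xs, zs, ys):
    """Empirical I(X; Z | Y) in nats."""
    n = len(xs)
    cy = Counter(ys)
    cxy, czy = Counter(zip(xs, ys)), Counter(zip(zs, ys))
    cxzy = Counter(zip(xs, zs, ys))
    return sum((nxzy / n) * math.log(nxzy * cy[y] / (cxy[(x, y)] * czy[(z, y)]))
               for (x, z, y), nxzy in cxzy.items())

def kdb_structure(columns, labels, k):
    """Return the attribute order and, for each attribute, up to k parent attributes.

    columns[i] is the list of values of attribute i over all training instances.
    The class variable is an implicit parent of every attribute.
    """
    mi = [mutual_information(col, labels) for col in columns]
    order = sorted(range(len(columns)), key=lambda i: mi[i], reverse=True)
    parents = {}
    for pos, i in enumerate(order):
        earlier = order[:pos]  # only attributes already in the network may be parents
        ranked = sorted(
            earlier,
            key=lambda j: conditional_mutual_information(columns[i], columns[j], labels),
            reverse=True,
        )
        parents[i] = ranked[:k]
    return order, parents

# Toy usage with made-up data
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]
print(kdb_structure(columns, labels, k=1))
```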
How to improve the generalization performance of BNCs learned from training data is always a challenging issue in data mining [40]. For one specific testing instance, KIBC
The criterion for identifying implicit independence assumption turns to be
The learning process of KIBC
Calculate instance-based MI for all attributes. // See Eq. (13)
Let
Sort attributes in
The architecture of KIBC.
Since the joint probability distributions encoded in base learners are approximations of the true one, it is natural to consider aggregating them together to yield a much more accurate probability distribution estimation [42]. Meanwhile, to achieve a better result, the base learners should be as accurate as possible, and as diverse as possible [43]. As shown in Fig. 4, KIBC
where
When
Descriptions of 40 UCI datasets used in the experiments
The datasets with missing values are denoted with the symbol “*”.
To evaluate the performance of our proposed KIBC, we conduct a group of experiments on 40 UCI datasets in terms of ZOL, RMSE, bias and variance. The Friedman and Nemenyi tests are used to assess the statistical significance of the experimental results. The details of all datasets are shown in Table 2, including the number of instances, attributes and classes. All of these datasets except lymphography, house-votes-84, chess, crx, tic-tac-toe, led, kr-vs-kp, phoneme, mushrooms and connect-4 contain at least one numeric attribute. For each dataset, numeric attributes are discretized using Minimum Description Length (MDL) discretization [44]. Missing values of qualitative attributes are replaced with the most frequent value, and missing values of quantitative attributes are replaced with the mean value. Each algorithm is evaluated with 10 rounds of 10-fold cross-validation. We compare KIBC with seven competitors, two single models and five ensemble models, listed as follows:
SKDB [38], selective KDB with
CFWNB [15], correlation-based feature weighting filter for NB.
WATAN [36], weighted averaged TAN.
IWAODE [31], instance-based weighting AODE.
WAODE-MI [33], weighted AODE by assigning weights using MI.
TAODE [32], targeted AODE.
DWAODE [45], double weighting schema of AODE.
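The missing-value replacement used in this experimental setup (most frequent value for qualitative attributes, mean for quantitative ones) can be sketched as below. This is an illustrative helper only: the name `impute_column` and the use of `None` as the missing-value marker are assumptions, and the text does not state whether replacement is applied before or after discretization.

```python
from collections import Counter
from statistics import mean

def impute_column(values, numeric):
    """Replace None entries with the column mean (numeric) or the mode (qualitative)."""
    observed = [v for v in values if v is not None]
    if not observed:  # nothing to learn a fill value from
        return list(values)
    fill = mean(observed) if numeric else Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]

# Toy usage with made-up data
print(impute_column(["a", None, "b", "a"], numeric=False))
print(impute_column([1.0, None, 3.0], numeric=True))
```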
Note that, to achieve the trade-off between efficiency and classification accuracy, we restrict the structure complexity of KIBC to be two-dependence (i.e.,
Zero-one loss and RMSE
ZOL [47] is a commonly used loss function to validate the prediction accuracy. Table 3 shows the ZOL experimental results of the one-tailed
Comparisons for KIBC and the alternative classifiers in terms of ZOL
To illustrate the advantages of KIBC more intuitively, Fig. 5 shows scatter plots comparing KIBC with the other algorithms in terms of ZOL. Points close to the diagonal line correspond to datasets on which KIBC performs very similarly to the alternative algorithm. Most data points lie under the diagonal line, which indicates that KIBC outperforms the other algorithms on most datasets.
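Reading such a scatter plot amounts to counting on which side of the diagonal each dataset falls. A minimal sketch, assuming that each point pairs KIBC's error on a dataset with a competitor's error on the same dataset and that lower error is better; the function name and the numbers in the usage example are invented for illustration.

```python
def win_draw_loss(ours, theirs, tol=0.0):
    """Count datasets where our error is lower, equal (within tol), or higher."""
    wins = draws = losses = 0
    for a, b in zip(ours, theirs):
        if abs(a - b) <= tol:
            draws += 1   # point lies on the diagonal
        elif a < b:
            wins += 1    # our error is lower than the competitor's
        else:
            losses += 1  # our error is higher than the competitor's
    return wins, draws, losses

# Toy usage with made-up error rates
print(win_draw_loss([0.10, 0.20, 0.30], [0.20, 0.20, 0.10]))
```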
Scatter plot of comparisons in terms of ZOL.
Scatter plot of comparisons in terms of RMSE.
Comparisons for KIBC and the alternative classifiers in terms of RMSE
RMSE is commonly used to measure the deviation between the observed value and the true value [48]. In this section, we use RMSE to measure the calibration of a model's class probability predictions. The comparison results with the 7 other algorithms in terms of RMSE are shown in Table 4. As can be seen, KIBC enjoys clear advantages over SKDB (36 wins and 4 losses), CFWNB (28 wins and 11 losses) and WATAN (35 wins and 5 losses). KIBC also performs slightly better than AODE’s variants, including IWAODE (27 wins and 13 losses), WAODE-MI (26 wins and 14 losses), TAODE (26 wins and 14 losses) and DWAODE (29 wins and 11 losses). The scatter plots in Fig. 6 show the comparison results of KIBC against the other classifiers in terms of RMSE. A diamond symbol indicates that KIBC performs better than the alternative classifier on the corresponding dataset in terms of RMSE, and a cross under the dotted line indicates a worse result for KIBC. Note that some outlier points are removed for significance analysis. As can be seen, most data points are diamond symbols, which indicates that KIBC outperforms the other algorithms on most datasets.
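The two measures can be sketched as follows. The RMSE definition here is an assumption: it uses the probability assigned to the true class, a convention common in the BNC literature, and the paper's exact formula is not reproduced in this section. Function names are illustrative.

```python
import math

def zero_one_loss(true_labels, predicted_labels):
    """Fraction of instances whose predicted label differs from the true label."""
    errors = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return errors / len(true_labels)

def rmse(true_labels, class_probs):
    """Root mean squared error of the probability assigned to the true class.

    class_probs[i] maps each class label to its predicted probability for instance i.
    """
    squared = sum((1.0 - probs.get(t, 0.0)) ** 2
                  for t, probs in zip(true_labels, class_probs))
    return math.sqrt(squared / len(true_labels))

# Toy usage with made-up predictions
print(zero_one_loss([0, 1, 1], [0, 0, 1]))
print(rmse([0, 1], [{0: 1.0, 1: 0.0}, {0: 0.5, 1: 0.5}]))
```

A classifier can have low ZOL and still a high RMSE if it is right for most instances but assigns probabilities close to the decision boundary, which is why the two measures are reported separately.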
In this section, the bias-variance decomposition is used to further analyze the performance of the models. Bias measures how closely the model can describe the decision surfaces, and variance reflects the model’s sensitivity to variations in the training set [49]. The experimental results in terms of bias and variance are shown in Table 5. As can be observed, KIBC performs the best among all algorithms in terms of bias. For example, KIBC beats CFWNB on 32 datasets and loses on 8, and KIBC beats WATAN on 35 datasets and loses on 5. KIBC also performs much better than AODE’s variants, including IWAODE (32 wins and 8 losses), WAODE-MI (29 wins and 10 losses), TAODE (28 wins and 11 losses) and DWAODE (27 wins and 12 losses). These results demonstrate that KIBC fits the datasets better than the other algorithms. In terms of variance, KIBC performs better than SKDB (33 wins and 6 losses) and WATAN (30 wins and 10 losses). Note that CFWNB and AODE’s variants achieve excellent variance because their structures are fixed, that is, they are not sensitive to variations in the training data.
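One common way to obtain these two quantities for zero-one loss is the Kohavi-Wolpert style decomposition, sketched below. Whether [49] uses exactly this form is an assumption, as are the function name and the simplification that the true label of each test instance is noise-free. The input is the label predicted for every test instance by models trained on several resampled training sets.

```python
from collections import Counter

def bias_variance(true_labels, predictions_per_run):
    """Average squared bias and variance over test instances.

    predictions_per_run[r][i] is the label predicted for test instance i
    by the model trained on the r-th resampled training set.
    """
    n_runs = len(predictions_per_run)
    bias = variance = 0.0
    for i, t in enumerate(true_labels):
        freq = Counter(run[i] for run in predictions_per_run)
        p = {y: c / n_runs for y, c in freq.items()}  # distribution of predictions
        classes = set(p) | {t}
        bias += 0.5 * sum(((1.0 if y == t else 0.0) - p.get(y, 0.0)) ** 2
                          for y in classes)
        variance += 0.5 * (1.0 - sum(q * q for q in p.values()))
    n = len(true_labels)
    return bias / n, variance / n

# Toy usage: 4 runs, 2 test instances, made-up predictions
runs = [[0, 1], [0, 0], [0, 1], [0, 0]]
print(bias_variance([0, 1], runs))
```

Under the noise-free assumption the two terms add up to the mean error rate across runs, which is a convenient sanity check.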
Comparisons for KIBC and the alternative classifiers in terms of bias and variance
Average ranks of the algorithms
The comparison results of the Nemenyi test in terms of (a) ZOL, (b) RMSE, (c) bias and (d) variance on 40 datasets. CD 
To explore the statistical significance of the experimental results, we perform the Friedman test [50] in terms of ZOL, RMSE, bias and variance. The null hypothesis of the Friedman test is that there is no difference in average ranks. With 8 classifiers and 40 datasets, the Friedman statistic is distributed according to the
To identify which classifiers differ significantly from the others, we conduct the Nemenyi test [51] and show the results in terms of ZOL, RMSE, bias and variance in Fig. 7. The classifiers are plotted on the left line and their corresponding average ranks are plotted on the right line. If the difference between the average ranks of a pair of classifiers is greater than the Critical Difference (CD) [51], the difference is considered significant. With 8 classifiers and 40 datasets, the CD for
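The standard formulas behind both tests can be sketched as follows. Assumptions are labeled: the Friedman statistic is shown together with its F-distributed (Iman-Davenport) form, which for 8 classifiers and 40 datasets has 7 and 273 degrees of freedom; the critical value `q_alpha` must be taken from a studentized-range table, and the value 3.031 in the usage example is the tabulated value for 8 classifiers at a significance level of 0.05, chosen only for illustration because the level used in the text is not visible here. The average ranks in the usage example are made up.

```python
import math

def friedman_statistics(avg_ranks, n_datasets):
    """Return (chi-square statistic, F statistic) from the classifiers' average ranks."""
    k = len(avg_ranks)
    chi2 = (12.0 * n_datasets / (k * (k + 1))) * (
        sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4.0
    )
    denom = n_datasets * (k - 1) - chi2
    # denom is zero only when all datasets rank the classifiers identically
    f_stat = math.inf if denom == 0 else (n_datasets - 1) * chi2 / denom
    return chi2, f_stat

def nemenyi_cd(q_alpha, k, n_datasets):
    """Critical difference between average ranks for the Nemenyi test."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n_datasets))

# Toy usage with made-up average ranks for 3 classifiers on 10 datasets
print(friedman_statistics([1.5, 2.0, 2.5], n_datasets=10))
# Critical difference for 8 classifiers and 40 datasets, assuming q_alpha = 3.031
print(nemenyi_cd(3.031, k=8, n_datasets=40))
```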
Making independence assumptions is one of the most direct and promising ways to cope with the NP-hard problem of learning an optimal BNC. All BNCs except FBC make independence assumptions, explicitly or implicitly. Exploring the reasonableness of these assumptions is one of the key issues in learning robust BNCs from data. We prove theoretically that the information-theoretic metrics applied by high-dependence BNCs (e.g., KDB) are not strictly appropriate for measuring the extent to which the learned joint probability fits the data. We therefore propose to verify the implicit independence assumptions behind the learned network topology, which can help build a robust BNC and improve generalization performance. By aggregating the predictions of KIBC
Acknowledgments
This work is supported by the National Key Research and Development Program of China (No. 2019YFC1804804), the Open Research Project of the Hubei Key Laboratory of Intelligent Geo-Information Processing (No. KLIGIP-2021A04), the Scientific and Technological Developing Scheme of Jilin Province (No. 20200201281JC), and the High Performance Computing Center of Jilin University, China.
Appendix A
Experimental results of ZOL. The value in boldface indicates the classifier with the best performance.
Dataset
SKDB
CFWNB
WATAN
IWAODE
WAODE-MI
TAODE
DWAODE
KIBC
lymphography
0.2365
0.1486
0.1689
0.1554
0.1554
0.1554
0.1486
iris
0.0867
0.0800
0.0867
0.0867
0.0867
0.0733
0.0667
teaching-ae
0.5364
0.5099
0.5364
0.4570
0.4503
0.4636
0.4702
wine
0.0225
0.0337
0.0169
0.0169
0.0281
0.0225
0.0169
glass-id
0.2196
0.2196
0.2196
0.2570
0.2523
0.2103
0.2009
primary-tumor
0.5723
0.5634
0.5457
0.5752
0.5782
0.5929
0.5575
ionosphere
0.0912
0.0854
0.0684
0.0712
0.0712
0.0741
0.0741
dermatology
0.0628
0.0191
0.0328
0.0191
0.0191
0.0191
0.0191
horse-colic
0.2446
0.2120
0.2011
0.2011
0.2092
0.2092
0.2011
house-votes-84
0.0552
0.0781
0.0529
0.0483
0.0506
0.0529
0.0483
chess
0.0998
0.1379
0.0926
0.1034
0.0944
0.0799
0.0907
credit-a
0.1464
0.1507
0.1391
0.1362
0.1507
0.1507
0.1420
crx
0.1565
0.1478
0.1319
0.1377
0.1391
0.1406
0.1319
vehicle
0.2943
0.3711
0.2943
0.2896
0.2872
0.2766
0.2849
anneal
0.0100
0.0534
0.0100
0.0178
0.0089
0.0089
tic-tac-toe
0.2035
0.3100
0.2265
0.2662
0.2724
0.2630
0.2568
vowel
0.1818
0.3050
0.1263
0.1697
0.1949
0.1323
0.1788
led
0.2620
0.2630
0.2660
0.2700
0.2680
0.2690
0.2690
contraceptive-mc
0.5003
0.4895
0.4942
0.4922
0.4902
0.4915
0.4874
mfeat-mor
0.3085
0.3060
0.3120
0.3130
0.3105
0.3075
0.3030
segment
0.0459
0.0640
0.0394
0.0333
0.0338
0.0346
0.0355
hypothyroid
0.0107
0.0139
0.0104
0.0123
0.0104
0.0111
0.0107
kr-vs-kp
0.0644
0.0776
0.0826
0.0576
0.0773
0.0726
0.0457
dis
0.0138
0.0156
0.0154
0.0127
0.0143
0.0125
0.0278
hypo
0.0114
0.0121
0.0130
0.0114
0.0101
0.0119
0.0148
sick
0.0259
0.0257
0.0260
0.0244
0.0249
0.0294
0.0228
spambase
0.0641
0.0858
0.0669
0.0646
0.0648
0.0602
0.0659
phoneme
0.1916
0.2407
0.2345
0.2104
0.2308
0.2427
0.2444
wall-following
0.0315
0.0720
0.0550
0.0464
0.0367
0.0361
0.0372
page-blocks
0.0391
0.0416
0.0418
0.0325
0.0347
0.0327
0.0327
satellite
0.1726
0.1207
0.1117
0.1148
0.1147
0.1125
0.1124
mushrooms
0.0080
0.0001
0.0002
0.0002
0.0002
thyroid
0.0683
0.0817
0.0723
0.0706
0.0655
0.0629
0.0593
sign
0.2539
0.3700
0.2752
0.2789
0.2768
0.2743
0.2748
magic
0.1637
0.2033
0.1674
0.1744
0.1762
0.1725
0.1721
letter-recog
0.0986
0.2479
0.1300
0.0854
0.0853
0.0838
0.0925
adult
0.1383
0.1499
0.1380
0.1502
0.1445
0.1558
0.1601
shuttle
0.0008
0.0020
0.0014
0.0011
0.0009
0.0008
0.0007
connect-4
0.2283
0.2847
0.2354
0.2409
0.2406
0.2374
0.2357
localization
0.4936
0.3575
0.3593
0.3566
0.3544
0.3721
0.3064
Experimental results of RMSE. The value in boldface indicates the classifier with the best performance.
Dataset
SKDB
CFWNB
WATAN
IWAODE
WAODE-MI
TAODE
DWAODE
KIBC
lymphography
0.3031
0.2419
0.2705
0.2496
0.2501
0.2522
0.2419
iris
0.1973
0.1958
0.2024
0.2091
0.2077
0.2132
0.1919
teaching-ae
0.4804
0.4762
0.4689
0.4668
0.4644
0.4734
0.4728
wine
0.1214
0.1416
0.1001
0.0983
0.1021
0.1038
0.1042
glass-id
0.3387
0.3315
0.3237
0.3422
0.3409
0.3364
0.3263
primary-tumor
0.1851
0.1790
0.1812
0.1855
0.1864
0.1885
0.1810
ionosphere
0.2822
0.2765
0.2613
0.2546
0.2489
0.2464
0.2521
dermatology
0.1207
0.0850
0.0661
0.0688
0.0698
0.0660
0.0794
horse-colic
0.4348
0.4215
0.3990
0.4022
0.4008
0.4020
0.4040
house-votes-84
0.2107
0.2558
0.2181
0.1998
0.1927
0.1968
0.1960
chess
0.2615
0.3208
0.2594
0.2835
0.2603
0.2502
0.2611
credit-a
0.3480
0.3407
0.3271
0.3236
0.3350
0.3389
0.3286
crx
0.3525
0.3415
0.3259
0.3219
0.3322
0.3355
0.3205
vehicle
0.3123
0.3611
0.3103
0.3095
0.3099
0.3083
0.3080
anneal
0.0519
0.1240
0.0538
0.0699
0.0536
0.0529
0.0560
tic-tac-toe
0.3772
0.4334
0.4023
0.3992
0.4085
0.3984
0.3925
vowel
0.1583
0.1982
0.1463
0.1633
0.1324
0.1297
0.1538
led
0.2007
0.2163
0.1991
0.1975
0.1980
0.1996
0.1990
contraceptive-mc
0.4485
0.4392
0.4392
0.4385
0.4394
0.4410
0.4405
mfeat-mor
0.1978
0.1943
0.1979
0.1983
0.1980
0.1989
0.1951
segment
0.1033
0.1195
0.0968
0.0879
0.0881
0.0873
0.0914
hypothyroid
0.0937
0.1065
0.0951
0.0994
0.0967
0.0974
0.0969
kr-vs-kp
0.1867
0.2779
0.2358
0.2635
0.2343
0.2561
0.2506
dis
0.1024
0.1130
0.1098
0.1058
0.1046
0.1047
0.1466
hypo
0.0671
0.0739
0.0723
0.0698
0.0685
0.0751
0.0660
sick
0.1382
0.1498
0.1426
0.1547
0.1452
0.1511
0.1571
spambase
0.2293
0.2657
0.2402
0.2317
0.2301
0.2239
0.2266
phoneme
0.0754
0.0806
0.0844
0.0795
0.0871
0.0891
0.0912
wall-following
0.1097
0.1744
0.1570
0.1433
0.1293
0.1298
0.1299
page-blocks
0.1128
0.1117
0.1187
0.0986
0.1025
0.1013
0.1021
satellite
0.1778
0.2316
0.1849
0.1774
0.1800
0.1799
0.1799
mushrooms
0.0857
0.0081
0.0114
0.0062
0.0121
0.0129
0.0004
thyroid
0.0731
0.0789
0.0742
0.0734
0.0715
0.0706
0.0701
sign
0.3334
0.3929
0.3504
0.3516
0.3519
0.3487
0.3494
magic
0.3470
0.3709
0.3461
0.3534
0.3526
0.3519
0.3501
letter-recog
0.0768
0.1139
0.0859
0.0693
0.0695
0.0737
adult
0.3089
0.3150
0.3076
0.3250
0.3197
0.3297
0.3344
shuttle
0.0140
0.0270
0.0177
0.0159
0.0131
0.0125
0.0135
connect-4
0.3632
0.3315
0.3359
0.3356
0.3339
0.3324
0.3259
localization
0.2402
0.2095
0.2093
0.2087
0.2081
0.2128
0.1971
Experimental results of bias. The value in boldface indicates the classifier with the best performance.
Dataset
SKDB
CFWNB
WATAN
IWAODE
WAODE-MI
TAODE
DWAODE
KIBC
lymphography
0.1041
0.1647
0.0978
0.0857
0.0951
0.0931
0.0959
iris
0.0560
0.0618
0.0664
0.0656
0.0592
0.0776
0.0570
teaching-ae
0.4606
0.3989
0.4990
0.4616
0.3984
0.4198
0.4504
wine
0.0508
0.0531
0.0317
0.0381
0.0376
0.0322
0.0259
glass-id
0.2713
0.2748
0.2818
0.2780
0.2785
0.2969
0.2714
primary-tumor
0.4143
0.4224
0.4188
0.4247
0.4324
0.4323
0.4212
ionosphere
0.0826
0.0813
0.0823
0.0881
0.0751
0.0764
0.0787
dermatology
0.0449
0.0114
0.0263
0.0065
0.0061
0.0058
0.0134
horse-colic
0.1689
0.1816
0.1899
0.2007
0.1897
0.1937
0.1911
house-votes-84
0.0229
0.0575
0.0393
0.0493
0.0406
0.0429
0.0428
chess
0.1119
0.1398
0.1397
0.1286
0.1230
0.1143
0.1110
credit-a
0.1137
0.1301
0.1123
0.0900
0.0940
0.0940
0.0995
crx
0.1197
0.1332
0.1148
0.0953
0.0985
0.0970
0.0991
vehicle
0.2485
0.3016
0.2435
0.2398
0.2412
0.2394
0.2425
anneal
0.0610
0.0194
0.0181
0.0194
0.0214
0.0185
0.0135
tic-tac-toe
0.1367
0.2257
0.1742
0.1994
0.2104
0.2008
0.1901
vowel
0.1755
0.2487
0.1842
0.2249
0.1811
0.1698
0.1803
led
0.2317
0.2387
0.2327
0.2331
0.2325
0.2327
0.2340
contraceptive-mc
0.3702
0.3759
0.3781
0.3766
0.3735
0.3643
0.3497
mfeat-mor
0.2136
0.2455
0.2492
0.2464
0.2431
0.2445
0.2166
segment
0.0452
0.0540
0.0489
0.0436
0.0357
0.0353
0.0427
hypothyroid
0.0096
0.0133
0.0106
0.0093
0.0099
0.0099
0.0093
kr-vs-kp
0.0583
0.0700
0.0763
0.0518
0.0688
0.0613
0.0442
dis
0.0191
0.0194
0.0168
0.0179
0.0178
0.0173
0.0186
hypo
0.0077
0.0114
0.0119
0.0080
0.0078
0.0079
0.0109
sick
0.0198
0.0211
0.0206
0.0220
0.0216
0.0228
0.0257
spambase
0.0750
0.0567
0.0602
0.0574
0.0541
0.0505
0.0533
phoneme
0.1584
0.2003
0.1982
0.1829
0.2172
0.2186
0.2008
wall-following
0.0592
0.0482
0.0360
0.0253
0.0260
0.0256
0.0175
page-blocks
0.0280
0.0331
0.0305
0.0257
0.0248
0.0251
0.0258
satellite
0.1560
0.0945
0.0884
0.0902
0.0897
0.0876
0.0841
mushrooms
0.0103
0.0001
0.0004
0.0002
0.0004
0.0004
thyroid
0.0531
0.0694
0.0584
0.0648
0.0561
0.0550
0.0533
sign
0.2161
0.3435
0.2419
0.2510
0.2461
0.2446
0.2382
magic
0.1898
0.1252
0.1595
0.1541
0.1546
0.1426
0.1320
letter-recog
0.0806
0.2133
0.1033
0.0877
0.0823
0.0814
0.0792
adult
0.1220
0.1461
0.1312
0.1437
0.1387
0.1459
0.1516
shuttle
0.0007
0.0024
0.0009
0.0007
connect-4
0.2740
0.2253
0.2255
0.2237
0.2153
0.2115
0.2042
localization
0.4746
0.3105
0.3126
0.3068
0.3010
0.3062
0.2190
Experimental results of variance. The value in boldface indicates the classifier with the best performance.
Dataset
SKDB
CFWNB
WATAN
IWAODE
WAODE-MI
TAODE
DWAODE
KIBC
lymphography
0.1408
0.0568
0.1084
0.0478
0.0498
0.0412
0.0653
iris
0.0400
0.0522
0.0436
0.0364
0.0388
0.0364
0.0430
teaching-ae
0.1798
0.1770
0.1564
0.1776
0.1622
0.1624
0.1636
wine
0.0644
0.0486
0.0141
0.0246
0.0251
0.0153
0.0385
glass-id
0.1189
0.1069
0.0999
0.1051
0.1004
0.0946
0.1089
primary-tumor
0.2450
0.2117
0.2413
0.1859
0.1880
0.1934
0.2248
ionosphere
0.0584
0.0399
0.0238
0.0368
0.0381
0.0332
0.0361
dermatology
0.0674
0.0240
0.0483
0.0189
0.0242
0.0213
0.0316
horse-colic
0.1384
0.1027
0.0420
0.0464
0.0514
0.0557
0.0682
house-votes-84
0.0157
0.0108
0.0172
0.0083
0.0081
0.0089
0.0086
chess
0.0531
0.0507
0.0504
0.0379
0.0448
0.0463
0.0420
credit-a
0.0768
0.0555
0.0276
0.0321
0.0360
0.0412
0.0418
crx
0.0663
0.0500
0.0240
0.0264
0.0310
0.0361
0.0365
vehicle
0.1288
0.0763
0.1294
0.1276
0.1273
0.1287
0.1263
anneal
0.0173
0.0273
0.0158
0.0161
0.0174
0.0146
0.0142
tic-tac-toe
0.1125
0.0550
0.0819
0.0529
0.0604
0.0642
0.0978
vowel
0.2285
0.2361
0.2463
0.2310
0.2284
0.2257
0.2237
led
0.0565
0.0502
0.0530
0.0398
0.0408
0.0466
0.0483
contraceptive-mc
0.1705
0.1641
0.1086
0.1106
0.1238
0.1437
0.1723
mfeat-mor
0.1047
0.1020
0.0676
0.0686
0.0730
0.0725
0.0952
segment
0.0386
0.0290
0.0204
0.0255
0.0262
0.0248
0.0250
hypothyroid
0.0024
0.0029
0.0026
0.0033
0.0030
0.0033
0.0028
kr-vs-kp
0.0112
0.0169
0.0152
0.0185
0.0119
0.0208
0.0209
dis
0.0011
0.0050
0.0036
0.0021
0.0040
0.0069
0.0010
hypo
0.0069
0.0063
0.0068
0.0056
0.0055
0.0089
0.0058
sick
0.0043
0.0048
0.0037
0.0057
0.0045
0.0068
0.0037
spambase
0.0218
0.0160
0.0094
0.0111
0.0124
0.0136
0.0173
phoneme
0.0961
0.1541
0.1270
0.1311
0.1355
0.1356
0.0898
wall-following
0.0247
0.0285
0.0283
0.0242
0.0245
0.0246
0.0165
page-blocks
0.0177
0.0145
0.0113
0.0130
0.0122
0.0125
0.0121
satellite
0.0479
0.0368
0.0325
0.0364
0.0362
0.0345
0.0416
mushrooms
0.0001
0.0001
0.0002
0.0001
0.0001
0.0002
0.0001
thyroid
0.0273
0.0253
0.0202
0.0239
0.0243
0.0251
0.0237
sign
0.0596
0.0385
0.0380
0.0403
0.0406
0.0445
0.0514
magic
0.0491
0.0490
0.0291
0.0289
0.0313
0.0359
0.0410
letter-recog
0.0709
0.0498
0.0588
0.0455
0.0457
0.0467
0.0637
adult
0.0285
0.0165
0.0109
0.0113
0.0174
0.0184
0.0156
shuttle
0.0006
0.0004
0.0004
0.0004
connect-4
0.0309
0.0149
0.0209
0.0215
0.0301
0.0320
0.0301
localization
0.1099
0.0594
0.0577
0.0632
0.0657
0.0836
0.1124
