Counting the number of words and lines read by fusing eye tracking and character recognition data: A Bayes factor approach

Abstract

Counting the number of words and lines that a user reads is important for many educational purposes – e.g., reading speed is a key factor in improving learning, and intelligent systems can suggest text that must be read to achieve a given learning objective. Eye tracking technology is commonly used to analyze users’ reading habits. Counting the number of words read can be hard when the readings come from imprecise eye tracking data – e.g., because of calibration difficulties. Approaches that find patterns in saccades and fixations usually fail under such conditions. This paper introduces the Cowl approach, which deals with the imprecision problem by associating the eye tracking data with points obtained from character recognition. To detect the text lines truly read, the problem is stated as one of merging two hypothetical lines, and it is solved by a Bayesian approach. Tests show that the proposed approach performs well, reaching an average word precision of 0.976 at a recall of 0.866 in the case of text with different orientations.

Keywords

Eye tracking, multiple lines fitting, human computer interaction, line features

1 Introduction

The significant progress made in designing robust, low-cost, mobile eye-tracking systems makes it possible to analyze what the user is looking at. Two common techniques for eye tracking are electrooculography (EOG) and optical tracking. The technical principle of EOG is based on the fact that the eye acts as an electrical dipole between the cornea (positive) and the retina (negative). This approach is cheap to implement and requires little processing compared to optical tracking; however, it only gives relative eye movements.

Alternatively, some commercial systems – e.g., Eye Tribe or Tobii – estimate the gaze point by reflecting infrared light into the user’s eyes. The precision and accuracy of these systems are not always satisfactory; for example, Ooms et al. [12] reported that measurements obtained by the “Eye Tribe” optical tracking system have an offset of 34 px (25 px standard deviation) for non-border measurements, but high deviations – offset and standard deviation – at the edge of the screen. Some factors – e.g., lighting, set-up and calibration – can render the recordings useless.

The main contribution of this paper is a method to estimate the number of lines and words effectively read by the user. The idea relies on two observations: (i) fixations are usually not located on words due to sensor inaccuracies (Fig. 1a), and (ii) to read a text, the user’s gaze must fall on the same text line several times (Fig. 1e). The outline of the proposed approach is shown in Fig. 1. First, fixations and character centroids are obtained from the eye tracking data and the image, respectively. In order to better determine the number of read words, the proposed approach associates a cluster of the closest characters with each fixation (Fig. 1c). These small point clusters are analyzed to find lines (Fig. 1d). Finally, noisy fixations are corrected by a Bayes factor test for merging lines (Fig. 1e).

Fig.1

Overview of the proposed approach. (a) fixations obtained from tracking data superimposed on the text read, (b) data after preprocessing: fixations (circles) and centroids of characters (dots), (c) each fixation is associated with a set of character centroids in its neighborhood according to the path followed by the gaze, (d) a sequential RANSAC approach is used to find a line within each cluster, and (e) a Bayes factor is used to test for merging two lines, which filters out the false positives.

As shown in the experimental results section, the precision-recall rates are high for the proposed algorithm, even from corrupted eye-tracking data. Furthermore, it can recognize reading for any text direction; i.e., it is not restricted to recognize reading of a left-to-right text.

The rest of the paper is organized as follows: Section 2 reviews the related work, Section 3 reviews the Bayes Factor used to choose the right model between two competitive models (of one and two lines, respectively), Section 4 explains the proposed algorithm, Section 5 describes the empirical study; Section 6 discusses its findings; finally, Section 7 presents the conclusions of this work.

2 Related work

While reading, the eye follows a distinctive pattern – e.g., moving from left to right for English – and many approaches seek to identify this pattern to detect reading activity. For example, Huda et al. [6] propose to use the derivative of the Horizontal EOG (h–EOG) signal. Points that correspond to reading activity are found by applying an empirical threshold between peaks. In their approach, the number of selected points corresponds to the number of lines read, and the separation between them resembles the time taken to read the corresponding line.

For recognizing reading activity in daily life scenarios, Bulling et al. [4] use a head-mounted accelerometer to detect “head down” periods. They report an average precision of 87.7% with a recall of 87.9% for a pattern recognition approach based on strings – saccades are translated into strings of the alphabet Σ ={‘L’: long left, ‘l’: left, ‘R’: long right, and ‘r’: right}. As well, they report a precision of 88.9% with a recall of 72.3% for a Support Vector Machine (SVM) classifier that analyzes saccades, fixations, and blinks.

For reading detection, Kunze et al. [9] use an SVM classifier with a radial basis function on a vector of features from the eye tracking data within a frame sliding window. The vector includes features from fixations (the number, sum of duration and average time), saccades (average and minimum length, horizontal, and vertical components), and average amplitude of wavelets. Unfortunately, some of these features are user-dependent – e.g., fast readers produce fewer fixations, fixations of shorter duration, larger saccades, and fewer regressions [8].

Yamaya et al. [15] consider the problem of fixation-to-text alignment and solve it by a two–step approach: (a) the gaze data is analyzed to find a set of sequential reading segments, which are found by a threshold test on the horizontal backward distance; (b) a minimum cost alignment between the reading segments and the text lines is searched for using dynamic programming. The cost of the alignment uses a similarity based on the segment length. The main drawbacks of this approach are that gaze segmentation fails in some cases (e.g., for small text lines, or regressions in reading) and that it only detects reading for horizontal text.

Biedert et al. [2] propose a linear classifier that uses the relation between consecutive saccades – e.g., angles and lengths – to discern between reading and skimming. Such a classification problem is different from the one of counting words. An interesting strategy implemented by Biedert et al. is the normalization of features by using an estimation of the character size.

For detecting lines, Kunze et al. [10] project each saccade vector on the horizontal axis and create a histogram of distances. By considering the histogram as a mixture of two Gaussians (with means μ₁ and μ₂), they suggest using (μ₁ + μ₂)/2 as the threshold for line-break detection. For word count, they propose to estimate the number of words per line by using statistics obtained from the whole document. The works of [2, 15] are related to the proposed approach in the sense that they associate eye tracking with words.

Table 1 summarizes the approaches that quantify the text read by the user. In summary, to the authors’ knowledge, this paper offers the first effort to detect reading of text in any direction. This is an important advantage because some languages do not follow the left-to-right reading direction, and other documents (e.g., diagrams) do not have complete text lines. Moreover, the proposed approach can better estimate the text read under different reading strategies (e.g., skimming).

Table 1
Comparison of approaches for detection of reading activity

		Reading detection characteristics
Reference	Sensor	# Lines	# Words	Any orient	Small lines	Other
Huda et al. [6]	EOG	✔	✖	✖	✖
Bulling et al. [4], STR	EOG	✔	✖	✖	✖	†
Bulling et al. [4], SVM	EOG	✔	✖	✖	✖	†
Kunze et al. [10]	EOG/ET	✔	‡	✖	✖	‡
Yamaya et al. [15]	ET	✔	✖	✖	✖
Cowl (this paper)	ET	✔	✔	✔	✔

† Can detect reading activity in a mobile setting (e.g. while walking) by detecting head-down position.

‡ Estimated as (average word count of the document read) × (estimated number of lines).

3 Bayes Factor Review

For completeness, this section reviews the Bayes factor for merging lines proposed in [11]. The Bayes factor [13, 14], a quantity for comparing models in the Bayesian framework, has played a major role in assessing the goodness of fit of competing models. Here, the Bayes factor is used to detect whether two point clusters follow the same linear model. An example of this problem is shown in Fig. 2: by using the measurement model of the points, the Bayes factor can decide that clusters D_a and D_c follow the same linear model, but D_a and D_b do not.

Fig.2

Bayes factor for merging lines. Clusters D_a and D_c follow the same linear model, and other combinations (e.g., D_a and D_b) do not.

Errors from point to lines. Given a set of n independent Gaussian data points D = {z_i = 〈x_i, y_i〉 | i = 1, …, n}, the sum of squares of the normalized orthogonal directed distances from the points in D to the line ℓ : 〈r, φ〉 is defined as

$χ^{2} (D, ℓ) = \sum_{i} \frac{d_{⊥}^{2} (z_{i}, ℓ)}{σ_{i}^{2}}$ (1)

where $σ_{i}^{2}$ is the variance of the i–th point z_i.

Bayes Factor Test to Simplify Point Clusters. Suppose there are two point clusters, each of them following a linear model. Let D_a and D_b be the two point clusters of size a_n and b_n, respectively. The model M₀ considers that both data sets were extracted from the same line in $ℝ^{2}$ ; i.e. the set D_ab = D_a ∪ D_b is characterized by the same line with unknown parameters ℓ_ab = 〈r_ab, φ_ab〉. An alternative hypothesis, represented by the model M₁, is that clusters have a perceptible change; therefore, the measurements are characterized by two different lines, ℓ_a and ℓ_b, with unknown parameters.

Lara-Alvarez et al. [11] propose that the Bayes Factor between models M₀ and M₁ is

$\begin{aligned} BF (M_{0}, M_{1}) &= D_{a} ◊ D_{b} \\ &= e^{J} \times \frac{r_{m} φ_{m}}{4 π} \times \sqrt{\frac{\det (S_{D_{a}} ({\hat{ℓ}}_{a})) \det (S_{D_{b}} ({\hat{ℓ}}_{b}))}{\det (S_{D_{ab}} ({\hat{ℓ}}_{ab}))}} \end{aligned}$ (2)

where r_m and φ_m are the maximum length and angle for a given line, respectively; $J = \frac{1}{2} [χ^{2} (D_{a}, {\hat{ℓ}}_{a}) + χ^{2} (D_{b}, {\hat{ℓ}}_{b}) - χ^{2} (D_{ab}, {\hat{ℓ}}_{ab})]$; and $S_{D} (\hat{ℓ})$ is the Hessian of χ² (D, ℓ) at the maximum likelihood line $\hat{ℓ} = 〈 \hat{r}, \hat{φ} 〉$, calculated as

$S_{D} (\hat{ℓ}) = 2 \begin{bmatrix} \sum_{i} \frac{1}{σ_{i}^{2}} & \sum_{i} \frac{a_{i}}{σ_{i}^{2}} \\ \sum_{i} \frac{a_{i}}{σ_{i}^{2}} & \sum_{i} \frac{g_{i}}{σ_{i}^{2}} \end{bmatrix}$ (3)

with $a_{i} = x_{i} \sin \hat{φ} - y_{i} \cos \hat{φ}$ and $g_{i} = (\hat{r} - x_{i} \cos \hat{φ} - y_{i} \sin \hat{φ}) (x_{i} \cos \hat{φ} + y_{i} \sin \hat{φ}) + (x_{i} \sin \hat{φ} - y_{i} \cos \hat{φ})^{2}$.
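As a minimal sketch of how Equations (1)–(3) fit together, the Python code below fits the maximum likelihood line of each cluster by weighted orthogonal regression, evaluates χ² and the Hessian, and combines them into the logarithm of Equation (2). The function names, the use of NumPy, and the values r_max = 1000 px and phi_max = 2π are assumptions made for illustration; they do not come from [11]. The Hessian is coded as the matrix of second derivatives of χ² with respect to 〈r, φ〉 for the directed distance d = r − x cos φ − y sin φ.

```python
import numpy as np

def fit_line(points, sigma):
    """Weighted orthogonal regression; returns the line as (r, phi)."""
    w = 1.0 / sigma**2
    centroid = (w[:, None] * points).sum(axis=0) / w.sum()
    centered = points - centroid
    scatter = (w[:, None] * centered).T @ centered
    _, eigvecs = np.linalg.eigh(scatter)   # eigenvalues in ascending order
    normal = eigvecs[:, 0]                 # direction of least scatter
    r = float(normal @ centroid)
    if r < 0:                              # keep r non-negative
        normal, r = -normal, -r
    return r, float(np.arctan2(normal[1], normal[0]))

def chi2(points, sigma, line):
    """Equation (1): sum of squared normalized orthogonal distances."""
    r, phi = line
    d = r - points[:, 0] * np.cos(phi) - points[:, 1] * np.sin(phi)
    return float(np.sum(d**2 / sigma**2))

def hessian(points, sigma, line):
    """Second derivatives of chi2 with respect to (r, phi)."""
    r, phi = line
    x, y = points[:, 0], points[:, 1]
    w = 1.0 / sigma**2
    proj = x * np.cos(phi) + y * np.sin(phi)
    a = x * np.sin(phi) - y * np.cos(phi)
    g = (r - proj) * proj + a**2
    return 2.0 * np.array([[w.sum(), (w * a).sum()],
                           [(w * a).sum(), (w * g).sum()]])

def log_bayes_factor(pa, sa, pb, sb, r_max=1000.0, phi_max=2 * np.pi):
    """Logarithm of Equation (2); r_max and phi_max are assumed values."""
    pab, sab = np.vstack([pa, pb]), np.concatenate([sa, sb])
    terms = []
    for pts, sig in ((pa, sa), (pb, sb), (pab, sab)):
        line = fit_line(pts, sig)
        terms.append((chi2(pts, sig, line),
                      np.linalg.det(hessian(pts, sig, line))))
    (ca, da), (cb, db), (cab, dab) = terms
    j = 0.5 * (ca + cb - cab)
    return (j + np.log(r_max * phi_max / (4 * np.pi))
            + 0.5 * (np.log(da) + np.log(db) - np.log(dab)))

# Toy clusters: d_a and d_c lie on the same row, d_b on a parallel row.
row = np.arange(5.0)
d_a = np.column_stack([row, np.zeros(5)])
d_c = np.column_stack([row + 10, np.zeros(5)])
d_b = np.column_stack([row + 10, np.full(5, 5.0)])
sig = np.ones(5)
log_bf_ac = log_bayes_factor(d_a, sig, d_c, sig)
log_bf_ab = log_bayes_factor(d_a, sig, d_b, sig)
```

In this toy configuration the collinear pair (d_a, d_c) scores higher than the parallel pair (d_a, d_b), which is the behavior illustrated in Fig. 2. Working with the logarithm avoids overflow of e^J when the clusters are large.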

4 Proposed algorithm

The proposed approach, named Cowl (COunting Words from Lines), is described in Algorithm 1. It assumes that text follows a linear pattern; this assumption allows reading to be detected in any direction. The algorithm is composed of four major steps: preprocessing, detection of linear clusters, consolidation of text lines, and estimation of the number of words. In the preprocessing stage (lines 2-3), it finds fixations and character centroids. In the second stage (lines 4-9), each fixation is associated with a set of centroid points. Then, each cluster is analyzed by a sequential RANSAC approach; i.e., RANSAC [3] is applied sequentially with a linear model, and the inliers are removed from the data set as each model instance is detected. After that, the most probable line among those detected is selected. At the end of this stage, each element of $D$ is a linear point cluster with a minimum consensus t_in. The third stage (lines 10-16) iteratively merges text lines until the Bayes factor is lower than a given threshold (t_bf). At the final stage (lines 17-20), the number of read words is estimated.

The following paragraphs provide some implementation details that are specific to the current development:

Algorithm 1
Proposed Algorithm (cowl)

1: procedure COWL(E, I) ⊳ Eye tracking data E obtained from the image with text I

PREPROCESSING:
2:   F ← Find fixations from E
3:   C ← Estimate centroids from characters detected in I

DETECTION OF LINEAR CLUSTERS:
4:   D ← {}
5:   for f_i ∈ F do
6:     Find S_i ⊂ C of neighbors of f_i with high probability of being a line
7:     Detect the best linear cluster $D_{i}^{'}$ from S_i by a sequential RANSAC approach
8:     $D \leftarrow D \cup \{D_{i}^{'}\}$
9:   end for

TEXT LINE CONSOLIDATION:
10:  repeat
11:    [D_a, D_b] ← Find the pair D_a, D_b in D such that D_a ≠ D_b with the highest D_a ◊ D_b (Equation 2)
12:    if (D_a ◊ D_b) > t_bf then
13:      D_a ← D_a ∪ D_b
14:      D ← D \ {D_b}
15:    end if
16:  until (D_a ◊ D_b) ≤ t_bf

ESTIMATION OF WORDS:
17:  N_w ← 0
18:  for D_i ∈ D do
19:    N_w ← N_w + words(D_i)
20:  end for
21:  return D, N_w
22: end procedure
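The consolidation stage (lines 10-16 of Algorithm 1) can be sketched in Python as a greedy loop. This is only an illustration under stated assumptions: `consolidate`, `score`, `toy_score`, and the representation of a cluster as a list of (x, y) tuples are hypothetical, and `score` stands in for the Bayes factor of Equation 2.

```python
from itertools import combinations

def consolidate(clusters, score, t_bf):
    """Greedily merge the best-scoring pair until no pair exceeds t_bf."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > 1:
        # Line 11: find the pair of distinct clusters with the highest score.
        a, b = max(combinations(range(len(clusters)), 2),
                   key=lambda ij: score(clusters[ij[0]], clusters[ij[1]]))
        # Line 16: stop when even the best pair is not above the threshold.
        if score(clusters[a], clusters[b]) <= t_bf:
            break
        clusters[a] = clusters[a] + clusters[b]   # line 13: merge into D_a
        del clusters[b]                           # line 14: drop D_b (a < b)
    return clusters

def toy_score(ca, cb):
    """Hypothetical stand-in for Equation 2: high when mean heights agree."""
    mean_y = lambda c: sum(p[1] for p in c) / len(c)
    return 1000.0 if abs(mean_y(ca) - mean_y(cb)) < 1.0 else 0.0

lines = consolidate([[(0, 0), (1, 0)], [(5, 0.2)], [(0, 10), (1, 10)]],
                    toy_score, t_bf=500)
```

With the toy score, the first two clusters are merged and the third is left alone, so two text lines remain. The exhaustive pair search is quadratic in the number of clusters per iteration; restricting the candidate pairs to neighbors in the gaze sequence would reduce that cost.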

Preprocessing. A simple position-variance method [1] is used to find fixations F = [f₁, f₂, …] from the gaze data E. It determines whether M of N points lie within a certain distance d of the mean of the signal [5]. The parameters used for this detector were: N = 6, M = 4, and d = 30 px.

Detection of linear clusters. Two steps were performed to find points close to a given fixation f_i ∈ F with high probability of being a text line. In the first step, the k closest characters to f_i are selected; this sample is used to predict the character size of the text, so the median t′ of their character sizes is calculated. Each point f_i is associated with the set of character points $S_{i} = \{c \in C ∣ ‖ f_{i} - c ‖ \leq 6.5 t^{'}\}$, where ‖·‖ is the norm of a vector, and the radius 6.5t′ was selected according to the number of characters that a user can read in every fixational pause [7]; hence, the expected number of points associated with a given cluster of f_i in the reading direction is approximately 13. At the end of this step, each fixation is associated with a set of text points (Fig. 1c).

In the second step, the set S_i is analyzed by a sequential RANSAC approach. This analysis is performed with an inlier threshold of t′/2 to obtain a set of lines L_i such that the number of inliers of every L ∈ L_i is greater than or equal to t_in = 5. Then, the line $L_{f}^{'} = \underset{L \in L_{i}}{argmin} | d (L, f_{i}) |$ and its supporting points are selected to represent the fixation f_i. Here, |d (L, f_i)| is the absolute value of the orthogonal directed distance from the point f_i to the line L. At the end of this step, each fixation is associated with a set of text points that follow a linear model (Fig. 1d).

Consolidation of text lines. The algorithm searches for lines that follow the same linear model by applying (2) iteratively. The threshold for merging lines was set to t_bf = 500. Furthermore, to reduce the search space, the actual implementation only consolidates lines within a sliding window of three neighbors in the gaze sequence. At the end of this step, the set of lines read is known (Fig. 1e).

Estimation of words. For estimating the number of words, a closing operation with a kernel of t′/2 is applied. The number of connected components is used to estimate the number of words.
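The neighborhood selection and the choice of the line closest to a fixation can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the value k = 10 (the text does not state k), and the 〈r, φ〉 line representation with d = r − x cos φ − y sin φ are assumptions.

```python
import numpy as np

def select_neighbors(fixation, centroids, sizes, k=10, factor=6.5):
    """Return S_i (centroids within factor * t') and the median size t'."""
    dist = np.linalg.norm(centroids - fixation, axis=1)
    nearest = np.argsort(dist)[:k]          # the k closest characters
    t_prime = float(np.median(sizes[nearest]))
    return centroids[dist <= factor * t_prime], t_prime

def closest_line(fixation, lines):
    """Pick the (r, phi) line with the smallest |d(L, f_i)|."""
    def abs_dist(line):
        r, phi = line
        return abs(r - fixation[0] * np.cos(phi) - fixation[1] * np.sin(phi))
    return min(lines, key=abs_dist)

# Toy data: one row of 30 characters, 10 px wide, centered at y = 100.
centroids = np.column_stack([np.arange(0, 300, 10.0), np.full(30, 100.0)])
sizes = np.full(30, 10.0)
fixation = np.array([150.0, 108.0])         # gaze lands 8 px below the row

s_i, t_prime = select_neighbors(fixation, centroids, sizes)
best = closest_line(fixation, [(100.0, np.pi / 2), (120.0, np.pi / 2)])
```

For this toy row the radius is 65 px and the neighborhood holds 13 centroids, in line with the expected cluster size mentioned above. The candidate lines passed to `closest_line` would normally come from the sequential RANSAC step, which is not reproduced here.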

5 Experimental evaluation

The goal of the experiment is to evaluate the performance of algorithms that count the number of lines and words read by users using a low-cost eye tracking device. A qualitative comparison of approaches that estimate the number of lines and words is shown in Table 1. These approaches use different settings and therefore cannot be directly compared to the proposed approach; for instance, some use the EOG sensor rather than optical eye tracking. Some approaches require specific preconditions; e.g., Kunze et al. [10] can only be used for text of the same font size. Specifically, we are interested in comparing the proposed COWL approach against another approach from those that count lines/words by aligning eye tracking data to text [10, 15]. The approach proposed by Yamaya et al. [15], hereinafter referred to as DP, was selected because it also deals with eye tracking errors. To ensure a fair comparison, the position-variance method [1] was used to find fixations, and the parameters of the DP algorithm were configured as suggested in [15].

Materials. Two images containing text in Spanish were used for this test. Image $I_{1}$ has a single text column (135 words, 10 lines); and Image $I_{2}$ has a mind map with branches at different orientations (40 words, 11 lines).

These two images were sequentially presented to the user for reading purposes and the gaze data was obtained from each image. Elements (words and lines) truly read by the user – ground truth dataset – were obtained by an expert that carefully analyzed the gaze data; henceforth, these elements are represented by Θ (gt). Analogously, elements obtained by algorithms (COWL, DP) are represented as Θ (alg).

Metrics. The average precision $\bar{P} (Θ)$ and recall $\bar{R} (Θ)$ are defined as:

$\bar{P} (Θ) = \frac{1}{n} \sum_{q = 1}^{n} \frac{| Θ_{q} (alg) \cap Θ_{q} (gt) |}{| Θ_{q} (alg) |}$

$\bar{R} (Θ) = \frac{1}{n} \sum_{q = 1}^{n} \frac{| Θ_{q} (alg) \cap Θ_{q} (gt) |}{| Θ_{q} (gt) |}$

where |·| is the set cardinality, n is the number of participants, the subscript q denotes the elements obtained for participant q, and Θ ∈ {W, L} are functions to obtain the words (W) and lines (L), respectively. A simple measure that combines precision and recall is the balanced F–score,

$F (Θ) = 2 \cdot \frac{\bar{P} (Θ) \cdot \bar{R} (Θ)}{\bar{P} (Θ) + \bar{R} (Θ)}$.
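The metrics above can be sketched in a few lines of Python. The function names and the representation of each participant's elements as a set of identifiers are assumptions for illustration, and the sketch assumes that every set is non-empty.

```python
def average_precision_recall(detected, truth):
    """detected, truth: one set of element ids (words or lines) per participant."""
    n = len(truth)
    precision = sum(len(a & g) / len(a) for a, g in zip(detected, truth)) / n
    recall = sum(len(a & g) / len(g) for a, g in zip(detected, truth)) / n
    return precision, recall

def f_score(precision, recall):
    """Balanced F-score of the averaged precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Two hypothetical participants.
detected = [{1, 2, 3}, {1, 2}]
truth = [{1, 2, 3, 4}, {2, 5}]
p, r = average_precision_recall(detected, truth)
f = f_score(p, r)
```

For the two hypothetical participants this gives an average precision of 0.75 and an average recall of 0.625. As a check against the reported results, `f_score(0.992, 0.947)` rounds to 0.969, the word F-score listed for Cowl on image $I_{1}$ in Table 2.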

Equipment and participants. Experiments took place in a room illuminated by both natural and artificial light. The gaze data were recorded with the low-cost Eye Tribe tracker ET1000 at a sampling frequency of 60 Hz. The eye tracking system was integrated with a 17-inch TFT monitor used to display the reading material at a resolution of 1440 × 920 pixels.

Study participants were men and women (n = 14) aged 25 to 45 years, all of them native Spanish speakers with a bachelor’s degree. At the time the experiment was conducted, only one participant was wearing glasses (two diopters).

For each participant, the eye tracker was calibrated with the OGAMA software tool (using 12 calibration points) and it was set at a distance of about 50 cm from the subject’s head. Participants were instructed to read the images according to their usual reading habits. No time limit was imposed for any trial, and the users were asked to change the image (by clicking) once they had finished reading the current material.

6 Results and discussion

Table 2 shows the results for the number of lines and words. The F–scores ( $\bar{F} (L)$ and $\bar{F} (W)$ ) are better for Cowl than for DP. As expected, a more significant difference was found for image $I_{2}$ because it contains text in different orientations and the DP was designed to detect reading for single-column text in horizontal direction.

The DP approach performs well for image $I_{1}$; it even has a better line recall, $\bar{R} (L) = 0.915$, than Cowl, $\bar{R} (L) = 0.867$, but it shows a lower precision, $\bar{P} (L) = 0.887$. This can be explained by difficulties in processing different reading habits; e.g., skimming and regression.

Table 2
Results for images $I_{1}$ and $I_{2}$ shown in Table 1 (n = 14). Best values are in bold

		$I_{1}$		$I_{2}$
		COWL	DP	COWL	DP
Lines	$\bar{P} (L)$	1.000	0.887	1.000	0.736
	$\bar{R} (L)$	0.867	0.915	0.815	0.276
	$\bar{F} (L)$	0.929	0.901	0.898	0.401
Words	$\bar{P} (W)$	0.992	0.840	0.976	0.657
	$\bar{R} (W)$	0.947	0.875	0.866	0.272
	$\bar{F} (W)$	0.969	0.857	0.918	0.385

Skimming is the process of reading only the main ideas within a passage to get an overall impression of the content of a reading selection – Biedert et al. [2] suggest a reading classification algorithm. Regression is the process of re-reading text already read. These two habits make it difficult to obtain a complete reading path in those approaches that use return sweep detection.

As shown in Table 2, the word recall for Cowl in the second image, $\bar{R} (W)$ = 0.866, is lower than the recall for the first image, $\bar{R} (W)$ = 0.947, but the line recall is almost the same, $\bar{R} (L)$ for $I_{1} \approx \bar{R} (L)$ for $I_{2}$. This effect can be explained by incorrect initial associations between a fixation and points (mainly caused by measurement errors); it seems that the proposed approach can deal only with calibration errors that are orthogonal to the text direction. This is a disadvantage of the proposed approach.

On the other hand, it is evident from Fig. 3 that text lines with few words – e.g., the titles or some branches of the mind map – are hardly detected by Cowl. This effect was expected, and it is another drawback of the proposed approach, because it requires at least two fixations in the same text line to detect reading. In turn, this constraint induces a very high precision by avoiding false positives (the worst precision in this study was $\bar{P} (W) = 0.976$ for $I_{2}$). Furthermore, the constraint of having two consecutive fixations is by far softer than requiring a complete text line. We argue that the recall reduction for $I_{2}$ with respect to $I_{1}$ was mainly driven by the length of the text lines rather than by their direction.

Fig.3

False Negatives of words. Colors indicate the proportion of words read by users but not detected (¬W): green (¬W< 20 %), yellow (20% ≤ ¬ W < 40 %), orange (40% ≤ ¬ W < 60 %), and red (60 % ≤ ¬ W).

7 Conclusions

This paper introduces the Cowl approach, which deals with the imprecision problem of eye tracking data by associating points obtained from character recognition with fixations. To detect the text lines truly read, the problem is stated as one of merging two hypothetical lines, and it is solved by a Bayesian approach.

The strength of the Cowl method is its high word precision (0.976) at a recall of 0.866 in the case of text with different orientations. In contrast to other approaches, the Cowl approach is able to detect reading in any direction. With a few modifications, it could also count regressions; i.e., fixating back on a previous word. The main drawback of this approach is that it requires at least two fixations to detect reading. In general, we argue that the proposed approach can be used for counting lines and words in a real setting.

References

1. Anliker, Eye movements and psychological processes, Eye movements: Online measurement, analysis, and control, 1976.

2. Biedert, Hees, Dengel, Buscher, A robust realtime reading-skimming classifier, Proceedings of the Symposium on Eye Tracking Research and Applications, pp. 123–130. ACM (2012).

3. Bolles R.C., Fischler M.A., A RANSAC-based approach to model fitting and its application to finding cylinders in range data, IJCAI (1981), 637–643.

4. Bulling, Ward J.A. and Gellersen, Multimodal recognition of reading activity in transit using body-worn sensors, ACM Transactions on Applied Perception (TAP) 9(1) (2012), 2.

5. Duchowski, Eye tracking methodology: Theory and practice, Springer Science & Business Media, vol. 373. 2007.

6. Huda, Hossain M.S., Ahmad, Recognition of reading activity from the saccadic samples of electrooculography data, International Conference on Electrical & Electronic Engineering (ICEEE) (2015), 73–76. IEEE.

7. Jordan T.R., McGowan V.A., Kurtev and Paterson K.B., A Further Look at Postview Effects in Reading: An Eye-Movements Study of Influences From the Left of Fixation, Journal of Experimental Psychology-Learning Memory and Cognition 42(2) (2016), 296–307.

8. Krieber, Bartl-Pokorny K.D., Pokorny F.B., Einspieler, Langmann, Körner, Falck-Ytter and Marschik P.B., The relation between reading skills and eye movement patterns in adolescent readers: Evidence from a regular orthography, PLoS One 11(1) (2016), e0145934.

9. Kunze, Kawaichi, Yoshimura and Kise, The Wordometer - Estimating the Number of Words Read Using Document Image Retrieval and Mobile Eye Tracking, Proceedings of the ICDAR, pp. 25–29 (2013).

10. Kunze, Masai, Inami, Sacakli Ö., Liwicki, Dengel, Ishimaru and Kise, Quantifying reading habits: counting how many words you read. In: Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing, pp. 87–96. ACM (2015).

11. Lara-Alvarez, Romero and Gomez, Multiple straightline fitting using a bayes factor, Advances in Data Analysis and Classification (2016), pp. 1–14.

12. Ooms, Dupont, Lapon and Popelka, Accuracy and precision of fixation locations recorded with the low-cost eye tribe tracker in different experimental set-ups, Journal of Eye Movement Research 8 (2015).

13. Robert C.P., The Bayesian choice: A decision-theoretic motivation. Springer-Verlag (1994).

14. Sivia D.S., Data Analysis: A Bayesian Tutorial (Oxford Science Publications). Oxford University Press (July 1996).

15. Yamaya, Topić, Martínez-Gómez, Aizawa, Dynamic-programming-based method for fixation-to-word mapping, In: Intelligent Decision Technologies, pp. 649–659. Springer (2015).
