Legal document recommendation system: A cluster based pairwise similarity computation

Abstract

Legal practitioners analyze relevant previous judgments to prepare favorable and advantageous arguments for an ongoing case. In the Legal domain, recommender systems (RS) identify and recommend referentially and/or semantically relevant judgments. Because an enormous number of judgments is available, an RS needs to compute pairwise similarity scores for all unique judgment pairs in advance in order to minimize the recommendation response time. This practice introduces a scalability issue, as the number of pairs to be computed increases quadratically with the number of judgments, i.e., O(n²). However, only a limited number of pairs exhibit strong relevance between their judgments, so there is little value in computing similarities for pairs with trivial relevance. To address the scalability issue, this research proposes a novel graph clustering based Legal Document Recommendation System (LDRS) that forms clusters of referentially similar judgments and finds semantically relevant judgments within those clusters. Pairwise similarity scores are therefore computed per cluster, restricting the search space to a single cluster instead of the entire corpus. Thus, the proposed LDRS substantially reduces the number of similarity computations, which enables large numbers of judgments to be handled. It exploits the highly scalable Louvain approach to cluster the judgment citation network, and Doc2Vec to capture the semantic relevance among judgments within a cluster. The efficacy and efficiency of the proposed LDRS are evaluated and analyzed using a large set of real-life judgments of the Supreme Court of India. The experimental results demonstrate the encouraging performance of the proposed LDRS in terms of Accuracy, F1-Score, MCC Score, and computational complexity, which validates its applicability for scalable recommender systems.

Keywords

Legal document recommender systems, Pairwise similarity, Graph Clustering, Semantic similarity

1 Introduction

Advances in the information age give Legal institutions and fraternities the opportunity to publish various kinds of Legal documents online. Various Internet-based digital repositories manage different types of Legal documents such as Precedents (Judgments), Constitutions, Codes, Laws, Acts, Rules, and Regulations. Previously delivered judgments are an essential source of law for deriving decisions in the judiciary system (i.e., the Court). The Indian judiciary system follows the doctrine of “stare decisis”, that is, “similar facts and circumstances should be treated in a similar way”. Thus, judicial decision-makers are bound to consider and follow the interpretations of prior judgment(s) if the present case and the prior judgment(s) have comparable arguments, facts, circumstances, and issues [1–3]. These prior judgments are also known as precedents, a term used interchangeably with judgments in this research. A judgment is a textual document discussing the arguments, facts, circumstances, issues, and decisions associated with a particular case. Referential information, such as previous judgments and codified statutes related to particular issues, is also embedded in judgments. There are various types of case matters (case types) for which judge(s) transcribe judgments, such as Criminal, Civil, Constitutional, and Taxes matters. To prepare strong, convincing, strategic, and supportive arguments for an ongoing case, Legal professionals need to investigate and analyze relevant previous judgments and their associated Legal issues [4]. The process of identifying relevant judgments is complex and demands considerable domain knowledge, time, and labor. These challenges call for an efficient and effective Legal Document Recommender System (LDRS) to automate the process.

In recent years, existing frameworks have utilized referential and/or textual information to capture the relevance among Legal documents. Text-based approaches [1, 5] aim to identify semantically relevant Legal documents by computing the Cosine similarity score between the respective document vectors. Thus, Legal documents need to be transformed efficiently into fixed-length, real-valued vectors using vectorization techniques such as TFIDF [4], LDA [6], Word2Vec [7, 8], and Doc2Vec [9]. Reference-based approaches [4, 11], in contrast, construct a citation network from the referential information and apply network-based similarity measures like bibliographic coupling, co-citation analysis, and node embedding. Hybrid approaches [2, 13] enrich similarity measures by integrating text- and reference-based approaches.

The LDRS recommends relevant Legal documents for a given query document by computing pairwise similarity scores between the query document and the rest of the documents in the corpus. However, the major challenge lies in handling the massive number of presently available Legal documents, which is also growing exponentially. Accordingly, the LDRS should implement the recommendation as a simple lookup of pairwise similarity scores computed beforehand (i.e., offline) for all unique pairs [14]. Consequently, it faces a scalability issue, since the number of pairwise similarity scores to be computed increases quadratically with the number of documents. There are in total $\frac{n^{2} - n}{2}$ unique pairs, leading to a computational complexity of O(n²), where n is the number of documents.
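As a worked illustration of this growth, assume a corpus of n = 48,000 documents (a round figure chosen only for illustration, of the same order as the dataset described in Section 4.1):

```latex
\[
\frac{n^{2}-n}{2}
  = \frac{48000^{2}-48000}{2}
  = \frac{2\,303\,952\,000}{2}
  = 1\,151\,976\,000
\]
```

Under the same assumption, doubling n to 96,000 raises the count to 4,607,952,000 pairs, roughly four times as many, which is the quadratic behaviour described above.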

This research focuses on the scalability issue that occurs due to the computation of pairwise similarity scores for a massive number of judgments. It is observed that most judgments hold strong relevance only with a small number of other judgments (e.g., tens, hundreds, or thousands). Thus, there is little value in computing the similarity score for insignificant pairs whose judgments have no or only trivial relevance to each other. However, judgments with comparable case matters tend to hold stronger relevance to one another than to judgments of other case matters. Also, a group of judgments with comparable case matters shares a similar kind of associated citations. So, to overcome the issue of scalability, this research proposes a scalable LDRS that performs graph clustering on the citation network to form clusters (i.e., groups) of referentially relevant judgments. Pairwise similarity scores are computed for each individual cluster instead of the entire corpus. This way, similarity scores are computed only for significant pairs whose judgments are strongly relevant to each other. This practice considerably reduces the number of pairs to be computed, which saves a vast amount of computational resources. Also, the proposed LDRS captures the semantic relevance among judgments (i.e., identifies textually similar judgments) within clusters.

To the best of the authors’ knowledge, the proposed graph clustering based LDRS is an initial effort to reduce the computational complexity of Legal recommendation systems, specifically for the Indian judiciary system. The major contributions of the proposed work are summarized as follows:

• This research work presents a scalable LDRS to effectively and efficiently capture the comprehensive relevance among judgments, using textual and referential information.

• The proposed LDRS applies the fast and scalable Louvain approach [15, 16] to the citation network to form clusters of referentially similar judgments. Existing work [2, 4] constructs a sparse citation network of judgments only. In contrast, the present work strengthens the citation network by incorporating codified statutes and laws along with judgments.

• This research computes pairwise similarity scores for each cluster instead of the entire corpus to make the LDRS scalable. This practice computes the similarity score only for significant pairs, while irrelevant pairs are not considered, which drastically reduces the computational complexity.

• The proposed LDRS employs Doc2Vec [9] to identify the semantically relevant judgments within a cluster. Doc2Vec transforms the textual judgments into real-valued vectors such that semantically relevant judgments are in proximity in the vector space.

• Empirical analysis was performed using a large number of real-life judgments of the Supreme Court of India 1 to evaluate the effectiveness of the proposed LDRS. Experimental results of graph clustering are validated by demonstrating excellent performance in terms of Accuracy, F1-Score, and MCC Score [17]. The proposed LDRS has also demonstrated a significant reduction in the computational complexity of computing pairwise similarity scores as compared to the Brute Force approach.

The rest of the paper is structured as follows: Section 2 reviews the literature and related work. Section 3 presents the methodology of the proposed cluster based LDRS. Section 4 presents the empirical analysis of the proposed LDRS. Finally, Section 5 summarizes the presented research work, including challenges and future goals.

2 Literature review

This Section provides a brief introduction to the existing similarity analysis approaches. The brief discussion is also carried out on the recent document clustering approaches. The studies of two approaches named Doc2Vec and Louvain approach are also briefly described in this Section.

2.1 Similarity analysis

In the Legal domain, existing research is centered on similarity analysis using textual and/or referential information for the recommendation of relevant documents.

• Textual Information

Kumar et al. [4] analyzed lexical similarity, which matches strings among text documents. Here, TFIDF was applied to the textual features to construct the judgment vector space. However, it suffers from challenges like sparsity, high dimensionality, and the inability to preserve semantics [18]. Alternatively, semantic vector space modeling can enable a fruitful similarity analysis, which has motivated existing works to capture semantic relevance [1, 19–21]. LDA based topic modeling was utilized to find relevant Legal documents for a given query [5], but it may fail to capture some of the hidden semantic features of long textual documents [22]. Word2Vec and Doc2Vec utilize contextual information, which can help to preserve semantic relationships among words or documents [23]. In an experimental study by Mandal et al. [1], Doc2Vec demonstrated the best performance, in terms of Correlation and Accuracy with reference to human expert similarity scores, when the whole judgment text was considered as a single document.

• Referential Information

The other predominant source of information is cited references such as Statutes, Acts, and previous judgments. Citations are significant features to capture the qualitative relevance since it implies direct associations with citing documents [24]. Thus, citation network is formed using referential information where each node signifies the Legal document, directed edge from citing node (i.e., A) to citation node (i.e., B) signifies that document A has cited the document B. Existing works have adopted various network-based retrieval techniques from the field of Scholarly Article citation network [25]. Kumar et al. [4] proposed an approach to identify similarity among Indian judgments, applying Bibliographic Coupling [26] and Co-Citation [27] analysis to the judgment citation network. Koniaris et al. [10] utilized the structural and statistical information (i.e., degree) of the network to capture the relevance among European Union Legal documents. However, network-based techniques’ performance and utility are purely dependent on how strongly a network is connected. The majority of judgments have only a few citations forming a sparse citation network [1].

• Hybrid

Kumar et al. [2] improved the performance using hybridization of textual and referential information. Authors formed a citation network using previous judgments and “paragraph links ”. There exists a paragraph link between two judgments, if the similarity between TFIDF vectors of two paragraphs from different judgments is beyond the threshold value. The hybrid approach has demonstrated a significant improvement in performance when compared to standalone approaches [4]. Raghav et al. [3] designed an approach to catch the relevancy using citation network, paragraph links, and both, followed by performing the clustering. Leibon et al. [12] coupled a network-based approach with the textual representation technique like LDA for US Supreme Court opinions. With the advancement in network representation techniques, Sugathadasa et al. [13] designed an approach to find similar legal documents using Node2Vec [28], a neural network-based graph embedding approach to represent nodes as dense feature vectors, and TextRank algorithm [29] for sentence similarity.

2.2 Document clustering

Document clustering is the practice of grouping up similar types (i.e., comparable in some way) of documents [30]. It has been extensively employed in the field of bibliometric [31] and Legal informatics [24, 32], where text and citations are main streams of information. Clustering based on the textual information considers the lexical or semantic feature matrix (i.e., vector space), derived using TFIDF, LSA, LDA, Word2Vec, Doc2Vec, etc. Subsequently, clusters can be prepared by applying traditional clustering methods (i.e., k-mean, hierarchical clustering) to the feature matrix [30]. Citation-based clustering considers document citations to model the citation network, where clustering is the practice of identifying the group of densely linked nodes which are sparsely linked with nodes of other groups. In recent years, network-based clustering techniques are increasingly getting consideration in the variety of research domains where information can be modeled in the network structure, such as social networking [33], bibliometric literature [34], epidemiology [35], and recommendations [36], etc.

2.3 Louvain approach and Doc2Vec

The Louvain method [15] aims to maximize the modularity objective function, which measures the edge density inside clusters compared to the edge density between clusters. A higher modularity value indicates a better partitioning of the network, and a smaller value indicates a poorer one. However, modularity maximization is an NP-Hard problem [37]. The Louvain approach therefore applies a greedy optimization-based heuristic to achieve rapid convergence, which makes it a scalable, fast, and effective solution for large real-life graphs. Zhao et al. [38] performed a comparative analysis of state-of-the-art graph clustering algorithms in terms of efficiency and effectiveness over artificial networks of various sizes, wherein the Louvain approach emerged as the superior candidate, yielding the best accuracy and computation time. It has been widely adopted in the fields of bibliometrics and informetrics [39] and in Legal information systems [24, 40].

Doc2Vec [9] is a document vectorization technique that represents text of arbitrary length (i.e., a sentence, paragraph, or document) as a low-dimensional, real-valued vector. It builds directly on the word embedding technique Word2Vec, which represents a word as a semantically rich vector. Word2Vec is a shallow Neural Network (NN) that utilizes contextual information (i.e., the words surrounding a word) to learn the vector representation. It follows the distributional hypothesis [41], summarized as “semantically similar words share a similar kind of contextual information”. So, the vectors of such words are embedded in proximity in the resultant vector space. Doc2Vec incorporates document vectors along with word vectors to learn semantic document vectors. This way, document vectors capture semantics during the learning of word vectors.

2.4 Research significance

Semantic relevance is captured by text-based approaches using vectorization techniques (TFIDF, LDA, Word2Vec, and Doc2Vec) [1, 5]. Reference-based approaches [4, 11] use direct or indirect citation relationships to capture the relevance among judgments. However, the performance of these approaches depends on the density of the citation network. Specifically for Indian judgments, existing works form the citation network using only previous judgments as citations, which results in a very sparse network. Alternatively, Hybrid approaches [2, 13] have been proposed in the literature to improve performance at the cost of higher computational resources. A review of significant studies in this area found that they are centered on computing a qualitative similarity measure (i.e., how well relevance can be captured) among judgments, aiming to enhance correctness. In a recommender system, however, computing pairwise similarity scores (i.e., for all unique pairs) in advance is also an essential step for achieving acceptable response latency at query time. This step raises scalability issues, as the number of unique pairs grows quadratically with the number of judgments. It is very expensive in terms of computation and memory due to the enormous number of judgments. This problem has received very limited consideration in existing Legal recommender systems. To address the above-mentioned challenges, the next section describes the methodology of the proposed LDRS in depth.

3 Proposed LDRS

Computing pairwise similarities is computationally very expensive due to its quadratic complexity. To overcome this, this research proposes a scalable LDRS that utilizes the Louvain approach to obtain clusters of referentially relevant judgments. Pairwise similarity scores are computed for each individual cluster instead of the entire corpus to prevent unnecessary computation. Within those clusters, Doc2Vec is utilized to identify semantically relevant judgments. The architecture of the proposed framework is illustrated in Fig. 1 and consists of three Phases: 1) Citation Network Clustering, 2) Judgment Vectorization, and 3) Cluster based Pairwise Similarity score Computation (C-PSC). The first Phase, Citation Network Clustering, forms the citation network from citations and applies a graph clustering algorithm (i.e., the Louvain approach) to decompose the network into clusters. The second Phase, Judgment Vectorization, generates the semantic judgment vector space by applying a document embedding technique (i.e., Doc2Vec) to the judgment text. These two phases can be executed in either order, since they use different types of information. The third Phase, C-PSC, computes pairwise semantic similarity scores for each cluster instead of the entire corpus. These phases are described in detail in the following subsections.

3.1 Citation network clustering

There are various types of case matters for which judgments were delivered. Specifically in the Indian judiciary system, these case matters could be categorized as Criminal, Civil, Constitutional, Taxes matters, etc. Judgments discuss the specific case matter(s) and also cite related citations such as previous judgments, Statutes, and Laws. It is observed that judgments hold strong relevancy with judgments having a similar kind of case matters as compared to judgments with diverse case matters. Even, a group of judgments having comparable case matters also share a similar set of associated references. This belief has motivated us to utilize the referential information for clustering judgments.

A judgment may refer to previous judgments and Statutes, along with a specific article number or section number. This research extracts all the referential information to form the citation network. However, some references may be used several times at different places in a judgment, which has no significance in a citation network. Hence, such repeated citations are eliminated. Earlier works [4] have considered only previous judgments as citations, which forms a very sparse network. In contrast, the present work considers previous judgments, Statutes, Laws, specific articles, and specific sections as citations to increase the density of the citation network.

Let G = (V, E) be a citation network (i.e., a directed graph), where V is the set of vertices {v₁, v₂, v₃, . . . , v_n} consisting of n nodes and E is the set of directed edges. In the citation network G, each node corresponds to a judgment or a citation, and a directed edge v_x → v_y between node v_x and node v_y indicates that judgment v_x referred to citation v_y. G does not contain any two-way edge (i.e., judgment v_x referring to citation v_y while citation v_y refers to judgment v_x) or any self-loop (i.e., a judgment referring to itself).
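The construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `build_citation_network`, the input layout, and the identifiers such as "J1" or "Act-302" are all hypothetical. It only shows how repeated citations collapse into a single edge and how self-loops are skipped.

```python
def build_citation_network(references):
    """Build a directed citation network from raw reference lists.

    `references` maps a judgment id to the list of citation ids found in
    its text (possibly with repeats). Returns (nodes, edges), where edges
    is a set of (citing, cited) tuples.
    """
    nodes = set()
    edges = set()
    for judgment, cited_list in references.items():
        nodes.add(judgment)
        for cited in cited_list:
            if cited == judgment:
                continue  # skip self-loops (a judgment referring to itself)
            nodes.add(cited)
            edges.add((judgment, cited))  # a set keeps each edge only once
    return nodes, edges


# Hypothetical input: "Act-302" is cited twice by J1, and J3 cites itself.
raw = {
    "J1": ["Act-302", "J2", "Act-302"],
    "J2": ["Act-302"],
    "J3": ["J3", "J1"],
}
nodes, edges = build_citation_network(raw)
print(sorted(edges))
```

The sketch does not check for two-way edges; it assumes, as the text states, that the input does not contain them.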

The proposed LDRS uses the modularity based Louvain approach to form clusters of relevant judgments in the citation network. However, modularity suffers from a problem called the “resolution limit”, whereby certain small clusters may not be discovered even though they are clearly noticeable [16, 42]. To mitigate the resolution limit, the proposed LDRS utilizes the Louvain approach with an improved modularity objective function that adds a “resolution” parameter, which provides some flexibility to control the cluster size [15, 42]. The input to the Louvain approach is the judgment citation network G = (V, E). It decomposes the citation network into m clusters (C), where each cluster comprises an assigned set of nodes (i.e., judgments), as shown in equations (1) and (2). This research eliminates all citation nodes (i.e., keeps only citing judgments) from each cluster, as they are insignificant for computing the pairwise similarity score.

$C = \{C_{1}, C_{2}, \ldots, C_{r}, \ldots, C_{m}\}$ (1)

$C_{r} = \{J_{1,r}, J_{2,r}, \ldots, J_{i,r}, \ldots, J_{k_{r},r}\}$ and $|C_{r}| = k_{r}$ (2)

Here, C indicates the set of m clusters, C_r indicates the set of k_r judgments placed in the r^th cluster, and k_r indicates the size of the r^th cluster.
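To make the objective concrete, the sketch below evaluates a commonly used form of modularity with a resolution parameter γ, Q = Σ_c [L_c/m − γ (d_c/2m)²], where m is the number of edges, L_c the number of edges inside cluster c, and d_c the sum of degrees in c. Several assumptions apply: the graph is treated as undirected and unweighted, the function name `modularity` is hypothetical, and this particular parameterisation is not confirmed by the text. In this form a larger γ penalises large clusters more strongly, whereas Section 4.3 reports that smaller resolution values yield more clusters, so the tool used in the experiments presumably scales its parameter in the opposite direction.

```python
from collections import defaultdict


def modularity(edges, membership, resolution=1.0):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    membership: dict mapping node -> cluster label.
    resolution: gamma in Q = sum_c [L_c/m - gamma * (d_c / 2m)^2]
                (assumed convention, see the lead-in).
    """
    edges = list(edges)
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)      # L_c: edges with both ends in cluster c
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    degree_sum = defaultdict(int)    # d_c: sum of node degrees in cluster c
    for node, d in degree.items():
        degree_sum[membership[node]] += d
    return sum(
        internal[c] / m - resolution * (d_c / (2 * m)) ** 2
        for c, d_c in degree_sum.items()
    )


# Toy graph: two triangles joined by a single edge (hypothetical data).
toy_edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_clusters = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
one_cluster = {node: "a" for node in range(1, 7)}
q_two = modularity(toy_edges, two_clusters)
q_one = modularity(toy_edges, one_cluster)
print(q_two, q_one)
```

The Louvain heuristic itself (greedy node moves followed by cluster aggregation) is not reproduced here; the sketch only scores a given partition, which is the quantity that heuristic tries to increase.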

3.2 Judgment vectorization

This phase utilizes the textual information to capture the semantic relevance among judgments within each cluster. All raw formatted judgments are available in the form of textual files containing the headnote, temporal information, judge or bench name, publicizing court, judgment text, and other metadata. Generally, judgment text discusses several issues, facts, arguments, and decisions regarding the specific case, which are an essential source of information in relevance capturing. Accordingly, only judgment text is extracted by discarding all other inappropriate data using contextual rules. It is observed that there is a lack of linguistic and structural homogeneity in available judgments, due to the use of natural language and unavailability of predefined structure for transcription. So, the direct utilization of such judgments may result in performance deterioration. Moreover, the judgment text also contains noisy and useless data like stop-words, punctuations, whitespace, etc. The proposed LDRS performs following text pre-processing steps aiming to structure, standardize, and reduce the vocabulary of raw judgment text.

• Text is converted to lower case.

• Text is tokenized by considering line breaks, white spaces, and paragraphs.

• Tokens with a length of one or two characters, white space, new lines, punctuation, numbers, and stop-words are removed.
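The steps above can be sketched as a small function. The stop-word list below is a tiny placeholder chosen for illustration (the text does not name the list that was used), `preprocess` is a hypothetical name, and keeping only the letters a-z is a simplification that also discards accented characters.

```python
import re

# Tiny illustrative stop-word list (assumption; the source does not name one).
STOP_WORDS = {"the", "and", "was", "for", "that", "with", "this", "are"}


def preprocess(text):
    """Lower-case, tokenize on whitespace, and drop noisy tokens."""
    tokens = text.lower().split()   # splits on spaces, tabs, and line breaks
    cleaned = []
    for token in tokens:
        token = re.sub(r"[^a-z]", "", token)  # keep letters only (drops punctuation, digits)
        if len(token) <= 2:                   # drops empty, one- and two-character tokens
            continue
        if token in STOP_WORDS:
            continue
        cleaned.append(token)
    return cleaned


sample = ("The appellant was convicted under Section 302 of the IPC,\n"
          "and the appeal is dismissed.")
print(preprocess(sample))
# ['appellant', 'convicted', 'under', 'section', 'ipc', 'appeal', 'dismissed']
```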

Once the judgment text is pre-processed, the proposed LDRS applies Doc2Vec to learn a semantically rich, low-dimensional, real-valued vector representation for the input judgments. Thus, semantically relevant judgments are mapped to neighboring vectors. The input to Doc2Vec is a set of judgments and their respective pre-processed judgment text. The result is the judgment embedding, consisting of a set of judgments and their respective vectors, as shown in equations (3) and (4):

$J = \{J_{1}, J_{2}, \ldots, J_{i}, \ldots, J_{n}\}$ (3)

$Vector = \{Vector_{1}, Vector_{2}, \ldots, Vector_{i}, \ldots, Vector_{n}\}$ (4)

Here, J indicates the set of n judgments and Vector indicates the set of corresponding vector representations, where Vector_i is the vector of judgment J_i.

3.3 Cluster based pairwise similarity score computation (C-PSC)

This phase computes pairwise semantic similarity scores for each individual cluster using the results of Phase-1 and Phase-2. In Phase-1, the Louvain method decomposes the judgments into several clusters, each consisting of judgments that are densely interconnected with one another and sparsely connected with judgments of other clusters. This characteristic suggests that a judgment is more strongly relevant to judgments within its own cluster than to judgments of other clusters. The proposed LDRS thereby determines the potential set of relevant judgments, namely those with strong intra-cluster associations. Hence, this phase computes pairwise similarity scores for each individual cluster instead of the entire corpus. This practice eliminates a large number of unimportant pairs of irrelevant judgments, which substantially decreases the computational resources required. Strong relevance can be captured by preserving the semantic relations among judgments. So, within each cluster, Cosine similarity scores are computed using the semantic judgment vectors learned by Doc2Vec, as given in equation (5).
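For reference, the standard definition of Cosine similarity between two judgment vectors V_a and V_b of dimension d (notation borrowed from the description of Algorithm 1) is given below. It is assumed, not confirmed by the text here, that equation (5) takes this form.

```latex
\[
\mathrm{Sim}(J_a, J_b)
  = \frac{V_a \cdot V_b}{\lVert V_a \rVert \, \lVert V_b \rVert}
  = \frac{\sum_{t=1}^{d} V_{a,t}\, V_{b,t}}
         {\sqrt{\sum_{t=1}^{d} V_{a,t}^{2}}\;\sqrt{\sum_{t=1}^{d} V_{b,t}^{2}}}
\]
```

The score lies in [-1, 1] and is symmetric in its two arguments, which is why each unordered pair needs to be scored only once.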

The algorithmic representation of the proposed Cluster based Pairwise Similarity Score Computation (C-PSC) is shown in Algorithm 1. Citation based information is used to decompose the judgments into clusters, with the intention of restricting the search space from the entire corpus to a single cluster. Text-based information, in turn, is used to capture semantic relevance and identify the most similar judgments by computing pairwise similarity scores within the cluster only. Hence, the results of Phase-1 and Phase-2 are the inputs for computing pairwise similarity scores. Steps 1 to 8 define the Function that generates all unique judgment pairs for a given set of judgments (i.e., within a cluster). This research does not consider pairs of identical judgments, since the similarity score of a judgment with itself is always the maximum. Additionally, symmetric pairs like (J_i, J_k) and (J_k, J_i) result in identical similarity scores. Such pairs are treated as a single pair to prevent redundant computation. Steps 9 to 16 compute the pairwise similarity scores for each cluster C_i in C (the result of Phase-1). Pairs holds all unique judgment pairs generated from the set of judgments present in each cluster C_i by calling the Function, as performed in Step 10. For each judgment pair (J_a, J_b) in Pairs, the respective vectors Vector_a (i.e., V_a) and Vector_b (i.e., V_b) are looked up using J and Vector (the result of Phase-2), as performed in Steps 11 to 13. Step 14 computes the Cosine similarity score for each unique pair of vectors (V_a, V_b). The algorithm outputs the pairwise similarity scores for each cluster.
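The procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than a transcription of Algorithm 1: the function names, the dictionary-based inputs, and the toy two-dimensional vectors are hypothetical, and returning 0.0 for a zero vector is a convention chosen here.

```python
from itertools import combinations
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for zero vectors (assumption)
    return dot / (norm_u * norm_v)


def unique_pairs(judgments):
    """All unordered pairs of distinct judgments, each emitted once."""
    return combinations(sorted(judgments), 2)


def cluster_pairwise_similarity(clusters, vectors):
    """Score every unique judgment pair inside each cluster.

    clusters: dict cluster id -> iterable of judgment ids (Phase-1 output)
    vectors:  dict judgment id -> list of floats (Phase-2 output)
    """
    scores = {}
    for judgments in clusters.values():
        for j_a, j_b in unique_pairs(judgments):
            scores[(j_a, j_b)] = cosine_similarity(vectors[j_a], vectors[j_b])
    return scores


# Hypothetical toy data: 5 judgments in 2 clusters.
clusters = {"C1": ["J1", "J2", "J3"], "C2": ["J4", "J5"]}
vectors = {
    "J1": [1.0, 0.0], "J2": [1.0, 0.0], "J3": [0.0, 1.0],
    "J4": [1.0, 1.0], "J5": [1.0, 0.0],
}
scores = cluster_pairwise_similarity(clusters, vectors)
print(len(scores))  # 4 within-cluster pairs instead of 10 corpus-wide pairs
```

Because `combinations` never emits (J, J) or both orderings of a pair, the two exclusions described in the text (identical and symmetric pairs) hold by construction.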

4 Experiments

This Section describes the dataset used for the empirical analysis, the experiment settings, and the evaluation measures. The results are also presented and discussed in this Section.

4.1 Dataset

An empirical analysis was performed on a real Legal dataset crawled from a prominent online Legal repository 2 . The dataset comprises more than 48000 judgments decided by the Supreme Court of India during the period from January 1950 to December 2016 (67 years). The raw format of a judgment comprises textual information, such as the “headnote” (optional) and the “judgment text”, and referential information, such as “citations”.

Phase-1 of the proposed LDRS clusters similar types of judgments using the citation network to eliminate irrelevant pairs of judgments, while the textual information is utilized to identify the semantically relevant judgments within each cluster. Hence, two datasets were derived from the raw Legal dataset: 1) the Citation Network Dataset, which consists of judgments and their respective citation(s), where each judgment and citation is assigned a unique numeric identity (UID); and 2) the Legal Text Dataset, which comprises the judgments (assigned the same UIDs as in the Citation Network Dataset) and their respective judgment text.

4.2 Experiment settings and performance measures

During this empirical analysis, two experiments were performed, which are discussed in the following subsections.

4.2.1 Citation network clustering

Experiment-1 is conducted on the Citation Network Dataset to evaluate and validate the performance of the Louvain approach. In the absence of standard labeled results (i.e., pairs marked as having similar or dissimilar judgments), Kumar et al. [4] collected the Relevancy score assigned by Legal Experts (RLE score) for selected SC judgment pairs. These pairs are utilized in this research to evaluate the judgment clustering. The RLE score is assigned on a scale of 0 (minimum relevancy) to 10 (full relevancy), where the judgments of a pair are considered relevant if RLE > 7 and irrelevant if RLE < 4. For the same pairs, if the proposed LDRS assigns both judgments to the same cluster, they are considered relevant (C-Score = 1); otherwise they are considered irrelevant (C-Score = 0). This research evaluates the performance of the proposed approach using binary classification measures, namely Accuracy, F1-Score, and the Matthews Correlation Coefficient (MCC) [17], which are computed from the confusion matrix shown in Fig. 2. Accuracy evaluates the overall utility as the ratio of correctly predicted pairs (TP+TN) (i.e., Relevant and Not Relevant) to all pairs (TP+TN+FP+FN). F1-Score suits the evaluation of the proposed approach because the labels (Relevant and Not Relevant) in the testing pairs are imbalanced. It is the harmonic mean of Precision (P) and Recall (R), where Precision is the ratio of the correctly predicted relevant pairs (TP) to all pairs predicted as relevant (TP+FP), and Recall is the ratio of the correctly predicted relevant pairs (TP) to all judgment pairs declared relevant by the Legal Experts (TP+FN). The MCC Score is a more insightful measure than Accuracy and F1-Score for an imbalanced dataset. It considers all values of the confusion matrix (i.e., TP, TN, FP, and FN) in a balanced proportion [17]. The MCC score ranges from -1 (worst) to 1 (best).
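The measures described above can be computed from the four confusion-matrix counts as sketched below. The function name `clustering_scores` and the example counts are hypothetical and do not correspond to results reported in this paper. The MCC expression is the standard one, (TP·TN − FP·FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN)), and returning 0.0 when a denominator is zero is a convention chosen for this sketch.

```python
from math import sqrt


def clustering_scores(tp, tn, fp, fn):
    """Accuracy, Precision, Recall, F1-Score and MCC from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    else:
        f1 = 0.0
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only.
example = clustering_scores(tp=40, tn=30, fp=10, fn=20)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

With these illustrative counts, Accuracy is 0.70 and F1-Score about 0.73, while MCC is only about 0.41, which shows how MCC can give a more conservative picture on the same confusion matrix.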

Fig. 1 Architecture of the Proposed LDRS.

Fig. 2 Confusion Matrix to Evaluate the Performance of Clustering.

4.2.2 Judgment vectorization

The existing work [1] has already demonstrated and validated Doc2Vec’s superior performance for similarity analysis of the Indian judgments. This research therefore did not perform the empirical analysis to validate the performance of Doc2Vec. Instead, experiments were performed on the Legal Text dataset aiming to fine-tune the Doc2Vec model with hyper-parameters such as window size, vector size, and iterations [9]. So, rigorous experimentation was performed by varying the values of hyper-parameters. The most encouraging performance was determined with the vector size of 200, the window size of 20, and the number of iterations of 5. Consequently, these judgment vectors will be utilized in Experiment-2 to compute Cosine similarity scores.

4.2.3 Pairwise similarity score computation

Experiment-2 assesses the effectiveness of the proposed Cluster based Pairwise Similarity Score Computation (C-PSC) compared to the Brute force approach. The Brute force approach computes pairwise similarity scores for the entire corpus. This research evaluates the performance using time and space complexity. Time complexity signifies the total time 3 required to compute the cosine similarity for all unique judgment pairs. Space complexity signifies the memory 4 required to store the cosine similarity scores of all unique judgment pairs.
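Since both time and memory scale with the number of pairs that are scored, the comparison can be illustrated by counting pairs. The sketch below uses hypothetical cluster sizes, not the clusters obtained in the experiments, and hypothetical function names.

```python
def pair_count(k):
    """Number of unique unordered pairs among k items: k(k-1)/2."""
    return k * (k - 1) // 2


def brute_force_pairs(n):
    """Pairs scored when the whole corpus of n judgments is compared."""
    return pair_count(n)


def clustered_pairs(cluster_sizes):
    """Pairs scored when comparisons are restricted to each cluster."""
    return sum(pair_count(k) for k in cluster_sizes)


sizes = [400, 300, 200, 100]          # hypothetical cluster sizes
n = sum(sizes)                        # 1000 judgments in total
brute = brute_force_pairs(n)          # 499500
clustered = clustered_pairs(sizes)    # 149500
print(brute, clustered, round(clustered / brute, 3))
```

In this toy setting the cluster based count is about 30% of the Brute force count; the saving grows as clusters become smaller and more numerous, and vanishes if all judgments fall into one cluster.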

The citation network was constructed using Gephi⁵ to visualize the knowledge flow. The Louvain implementation from Gephi was used to cluster the network, as it allows controlling the cluster size with the “resolution” parameter. Gensim⁶, a Python-based open-source machine learning library, was used to learn the judgment vectors with Doc2Vec. All experiments were executed on an Ubuntu 14.04 machine with an Intel i7-7700 processor and 16 GB of RAM.

4.3 Experimental results

This subsection analyzes the experimental results of the two aforesaid experiments. Experiment-1 presents the results of the citation based analysis. Experiment-2 compares the proposed C-PSC approach with the Brute force approach.

• Experiment-1

During experimentation, it was observed that the citation network decomposes into several components (i.e., connected graphs): one giant graph containing about 99.86% of all nodes, and several tiny graphs. Such outlying tiny graphs carry no significance for graph clustering. Therefore, only the giant graph was retained, and the tiny graphs were discarded to reduce the processing complexity.
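
The filtering step above can be sketched with a breadth-first search over the citation links. This is a minimal sketch, assuming that citation links are treated as undirected when components are identified; the function names and the toy edge list are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Return the connected components of an undirected edge list."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


def giant_component(edges):
    """Keep only the largest component and the edges that lie inside it."""
    components = connected_components(edges)
    if not components:
        return set(), []
    giant = max(components, key=len)
    # Both endpoints of an edge always share a component, so one check suffices.
    kept_edges = [(a, b) for a, b in edges if a in giant]
    return giant, kept_edges


# Hypothetical toy citation links: one larger graph and one tiny graph.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("X", "Y")]
giant, kept_edges = giant_component(edges)
coverage = len(giant) / len({n for edge in edges for n in edge})
print(sorted(giant), kept_edges, coverage)
```

In this toy network the giant graph keeps four of the six nodes; in the reported experiment the corresponding share is about 99.86%.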

In the Louvain method, the resolution parameter addresses the resolution limit and partially controls the cluster size. This research analyzes the performance of the proposed LDRS by varying the resolution from 0.1 to 2.5 in steps of 0.1, as depicted in Figs. 3, 4 and 5. Accuracy, F1-Score, and MCC Score deteriorate, with minor fluctuations, as the resolution increases. Recall, however, remains consistent, indicating that the judgments of most relevant pairs are correctly assigned to the same cluster irrespective of the resolution. Precision, in contrast, declines, indicating that the judgments of irrelevant pairs are increasingly assigned to the same cluster as the resolution increases. The underlying reason is that a smaller resolution yields more clusters of smaller size, whereas a larger resolution yields fewer clusters of larger size, as illustrated in Fig. 6. Hence, with large clusters (higher resolution), there is a high probability that the judgments of an irrelevant pair are assigned to the same cluster, which causes incorrect classification.

Fig. 3

Accuracy with Various Resolutions.

Fig. 4

F1-Score, Precision and Recall with Various Resolutions.

Fig. 5

MCC Score with Various Resolutions.

Fig. 6

Cluster Size Distribution for Various Resolutions.

It is also observed that the performance stabilizes beyond a resolution of 1.5. The reason is that, at higher resolution values, a few large clusters cover the majority of all judgments, as can be seen in Fig. 6(f). Consequently, the judgments of both relevant and irrelevant testing pairs tend to fall within these large clusters, which results in stable performance. The proposed LDRS achieved its best results at a resolution of 0.1, with an Accuracy of 0.90, an F1-Score of 0.86, and an MCC Score of 0.79. A total of 245 clusters were identified, ranging in size from 30 to 1642 nodes.

• Experiment-2

The efficiency of the proposed C-PSC is compared against the Brute force approach as a baseline. The Brute force approach computes the cosine similarity score of each judgment vector with every other judgment vector (i.e., for all unique judgment pairs in the entire corpus). Table 1 compares the CPU time and memory of the proposed C-PSC and the baseline approach. The CPU times for the baseline and the proposed C-PSC are 28447.41 s and 223.46 s, respectively, which corresponds to a speedup of about 127X. Similarly, the proposed approach requires 0.026 GB of memory to store the similarity scores, far less than the 3.32 GB required by the baseline approach. The reason for the better time and space complexity is that the baseline approach computes and stores similarity scores for $\frac{n^{2} - n}{2}$ judgment pairs, including irrelevant pairs, where $n$ is the number of judgments in the corpus. In contrast, the proposed approach filters out irrelevant pairs by restricting the search space to each cluster instead of the entire corpus. It therefore needs to compute and store only $\sum_{i = 1}^{m} \frac{n_{i}^{2} - n_{i}}{2}$ pairs, which is far fewer than $\frac{n^{2} - n}{2}$. Here, $m$ is the number of clusters and $n_{i}$ is the number of judgments in the $i$-th cluster.
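
The two pair-count expressions and the reported ratios can be checked with a short Python calculation. The cluster sizes below are hypothetical values chosen only to make the arithmetic visible, and the 4-byte score size follows the footnote on memory measurement; the function names are illustrative.

```python
def unique_pairs(n):
    """Number of unique pairs among n judgments: (n^2 - n) / 2."""
    return (n * n - n) // 2


def clustered_pairs(cluster_sizes):
    """Sum of (n_i^2 - n_i) / 2 over all clusters."""
    return sum(unique_pairs(n_i) for n_i in cluster_sizes)


BYTES_PER_SCORE = 4  # 4-byte floating-point similarity score

# Hypothetical cluster sizes (not the sizes found in the experiment).
cluster_sizes = [30, 50, 120]
n = sum(cluster_sizes)

baseline_pairs = unique_pairs(n)
proposed_pairs = clustered_pairs(cluster_sizes)
baseline_bytes = baseline_pairs * BYTES_PER_SCORE
proposed_bytes = proposed_pairs * BYTES_PER_SCORE

# Ratios of the figures reported in Table 1.
time_ratio = 28447.41 / 223.46
memory_ratio = 3.32 / 0.026

print(baseline_pairs, proposed_pairs, baseline_bytes, proposed_bytes)
print(round(time_ratio, 1), round(memory_ratio, 1))
```

With these toy sizes the corpus-wide count is 19900 pairs against 8800 within-cluster pairs. The reported CPU times give a ratio of about 127.3 and the reported memory figures a ratio of about 127.7.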

Table 1

Performance Comparison of the Proposed LDRS and Baseline approach

Measure	Proposed LDRS	Baseline Approach
CPU time (s)	223.46	28447.41
Memory (GB)	0.026	3.32

4.4 Discussion

The above empirical analysis provides evidence that the proposed graph clustering based LDRS performs well. Its effectiveness is demonstrated by an Accuracy, F1-Score, and MCC Score of 0.90, 0.86, and 0.79, respectively. The reason for this result is that relevant judgments tend to refer to similar citations; on that premise, this research groups relevant judgments by clustering the citation network. The results also confirm encouraging computational efficiency in terms of time and space complexity. The proposed LDRS requires about two orders of magnitude less memory than the baseline approach (0.026 GB versus 3.32 GB) and achieves a 127X speedup. It computes pairwise similarity scores within clusters only, since each cluster consists of a relevant set of judgments. This practice avoids excessive computation by skipping irrelevant pairs, which hold no significance.

5 Conclusion

Recommending relevant Legal documents is an essential function of a recommender system. Pairwise similarity scores are computed in advance to minimize the response time during recommendation. Owing to the explosive growth of Legal documents, the quadratic complexity of pairwise similarity score computation becomes a scalability issue. To mitigate this issue, the proposed LDRS clusters the citation network to group referentially relevant judgments. It then computes pairwise similarity scores within each cluster, whose judgments are strongly relevant to one another. In this way, unnecessary computation is avoided by restricting the search space to the cluster instead of the entire corpus. Doc2Vec is used to identify the semantically relevant judgments within each cluster. The experiments evaluated the citation network clustering, which demonstrated good performance in terms of Accuracy, F1-Score, and MCC Score. The results also demonstrate the advantage of the proposed LDRS over the Brute force approach in terms of the computational cost of pairwise similarity scores. This encouraging performance suggests that the proposed LDRS could be employed in large-scale Legal search engines.

• Challenges and Future work

Despite the success demonstrated, the proposed approach is limited in that it considers only judgments with at least one citation. In practice, many judgments have no citation at all. Extending the proposed LDRS to cover all judgments would therefore be an interesting future direction. Further, this research utilizes the Louvain approach, a crisp graph clustering approach that assigns each judgment to exactly one cluster. However, a judgment may discuss multiple Legal concepts and cite references of dissimilar orientation (e.g., criminal and civil). Accordingly, a judgment may belong to more than one cluster. Hence, overlapping clustering is a motivating research direction for the future. In the Legal domain, the judgment citation network is very sparse owing to a lack of cited references, and this sparsity severely affects the performance of Legal information retrieval systems. Network connectivity could therefore be enhanced by incorporating links constructed from textual information. Furthermore, the present work does not exploit a parallel or distributed environment, which can be investigated in the future.

Footnotes

Indian Kanoon Legal Document Repository:

Python’s time.clock() method is used to measure the processor time.

Python’s numpy.ndarray.nbytes attribute is used to measure the space requirement (size) of an array of floating-point similarity scores (i.e., 4 bytes each).

References

1. Mandal, Chaki, Saha, Ghosh, Pal and Ghosh, Measuring similarity among legal court case documents, in: Proceedings of the 10th Annual ACM India Compute Conference, ACM, (2017), 1–9.

2. Kumar, Reddy P.K., Reddy V.B. and Suri, Finding similar legal judgements under common law system, in: International Workshop on Databases in Networked Information Systems, Springer, (2013), 103–116.

3. Raghav, Reddy P.B., Reddy V.B. and Reddy P.K., Text and citations based analysis of legal judgments, in: International Conference on Mining Intelligence and Knowledge Exploration, Springer, (2015), 449–459.

4. Kumar, Reddy P.K., Reddy V.B. and Singh, Similarity analysis of legal judgments, in: Proceedings of the Fourth Annual ACM Bangalore Conference (2011), 1–4.

5. Nanda, Adebayo K.J., Di Caro, Boella and Robaldo, Legal Information Retrieval Using Topic Clustering and Neural Networks, in: COLIEE@ICAIL (2017), 68–78.

6. Blei, Ng and Jordan, Latent Dirichlet allocation, Journal of Machine Learning Research 3(Jan) (2003), 993–1022.

7. Mikolov, Sutskever, Chen, Corrado G.S. and Dean, Distributed representations of words and phrases and their compositionality, in: Advances in Neural Information Processing Systems (2013), 3111–3119.

8. Mikolov, Chen, Corrado and Dean, Efficient estimation of word representations in vector space, arXiv preprint arXiv:1301.3781 (2013).

9. Le and Mikolov, Distributed representations of sentences and documents, in: International conference on machine learning, (2014), 1188–1196.

10. Koniaris, Anagnostopoulos and Vassiliou, Network analysis in the legal domain: a complex model for European Union legal sources, Journal of Complex Networks 6(2) (2017), 243–268.

11. Lodha and Wagh, Exploratory Analysis of Legal Case Citation Data Using Node Embedding, ICIC Express Letters 13(10) (2019), 883–889.

12. Leibon, Livermore, Harder, Riddell and Rockmore, Bending the law: geometric tools for quantifying influence in the multinetwork of legal opinions, Artificial Intelligence and Law 26(2) (2018), 145–167.

13. Sugathadasa, Ayesha, de Silva, Perera A.S., Jayawardana, Lakmal and Perera, Legal document retrieval using document vector embeddings and deep learning, in: Science and Information Conference, Springer, (2018), 160–175.

14. Lin, Brute force and indexed approaches to pairwise document similarity comparisons with MapReduce, in: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, (2009), 155–162.

15. Blondel V.D., Guillaume J.-L., Lambiotte and Lefebvre, Fast unfolding of communities in large networks, Journal of statistical mechanics: theory and experiment 2008(10) (2008), 1–12.

16. Lambiotte, Delvenne J.-C. and Barahona, Laplacian dynamics and multiscale modular structure in networks, arXiv preprint arXiv:0812.1770 (2008).

17. Chicco and Jurman, The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation, BMC Genomics 21(1) (2020), 6.

18. Martinčić-Ipšić, Miličić and Todorovski, The influence of feature representation of text on the performance of document classification, Applied Sciences 9(4) (2019), 1–27.

19. Chakrabarti, Patodia, Bhattacharya, Mitra, Roy, Mandi, Roy and Nandy, Use of Artificial Intelligence to Analyse Risk in Legal Documents for a Better Decision Support, in: TENCON 2018-2018 IEEE Region 10 Conference, IEEE, (2018), 683–688.

20. Sugathadasa, Ayesha, Synergistic union of word2vec and lexicon for domain specific semantic similarity, in: 2017 IEEE International Conference on Industrial and Information Systems (ICIIS), IEEE, (2017), 1–6.

21. Chalkidis and Kampas, Deep learning in law: early adaptation and legal word embeddings trained on large corpora, Artificial Intelligence and Law 27(2) (2019), 171–198.

22. Guo, Lu and Wei, An Improved LDA Topic Modeling Method Based on Partition for Medium and Long Texts, Annals of Data Science (2019), 1–14.

23. Kamkarhaghighi, Gultepe and Makrehchi, Deep Learning for Document Representation, in: Handbook of Deep Learning Applications, Springer, (2019), 101–110.

24. Kuppevelt and Dijck, Answering legal research questions about Dutch case law with network analysis and visualization, in: Legal Knowledge and Information Systems: JURIX 2017: The Thirtieth Annual Conference, Vol. 302, IOS Press, (2017), 95–100.

25. Zhang, Xu and Zhang, Clustering scientific document based on an extended citation model, IEEE Access 7 (2019), 57037–57046.

26. Kessler M.M., Bibliographic coupling between scientific papers, American Documentation 14(1) (1963), 10–25.

27. Small, Co-citation in the scientific literature: A new measure of the relationship between two documents, Journal of the American Society for Information Science 24(4) (1973), 265–269.

28. Grover and Leskovec, node2vec: Scalable feature learning for networks, in: Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, (2016), 855–864.

29. Mihalcea and Tarau, Textrank: Bringing order into text, in: Proceedings of the 2004 conference on empirical methods in natural language processing, (2004), 404–411.

30. Curiskis S.A., Drake, Osborn T.R. and Kennedy P.J., An evaluation of document clustering and topic modelling in two online social networks: Twitter and Reddit, Information Processing & Management 57(2) (2020), 1–21.

31. Shi, Liu, Yu, Zhang, Li, Wu, Wang, Huang and Li, Bibliometric Analysis of Medical Malpractice Literature in Legal Medicine from 1975 to 2018: Web of Science Review, Journal of Forensic and Legal Medicine 66 (2019), 167–183.

32. Venkatesh R.K., Legal documents clustering and summarization using hierarchical latent Dirichlet allocation, IAES International Journal of Artificial Intelligence 2(1) (2013), 27–35.

33. Nerurkar, Chandane and Bhirud, A comparative analysis of community detection algorithms on social networks, in: Computational Intelligence: Theories, Applications and Future Directions-Volume I, Springer, (2019), 287–298.

34. Mora, Deakin and Reid, Combining co-citation clustering and text-based analysis to reveal the main development paths of smart cities, Technological Forecasting and Social Change 142 (2019), 56–69.

35. Wang, Gong, Liu and Wu, Preventing epidemic spreading in networks by community detection and memetic algorithm, Applied Soft Computing 89 (2020), 1–10.

36. Garza S.E. and Schaeffer S.E., Community detection with the Label Propagation Algorithm: A survey, Physica A: Statistical Mechanics and its Applications 534 (2019), 1–28.

37. Traag V.A., Waltman and van Eck N.J., From Louvain to Leiden: guaranteeing well-connected communities, Scientific Reports 9(1) (2019), 1–12.

38. Yang, Algesheimer and Tessone C.J., A comparative analysis of community detection algorithms on artificial networks, Scientific Reports 6 (2016), 1–18.

39. Baker H.K., Pandey, Kumar and Haldar, A bibliometric analysis of board diversity: Current status, development, and future research directions, Journal of Business Research 108 (2020), 232–246.

40. Ashihara, Chu, Renoust, Okubo, Takemura, Nakashima and Nagahara, Legal Information as a Complex Network: Improving Topic Modeling Through Homophily, in: International Conference on Complex Networks and Their Applications, Springer, (2019), 28–39.

41. Harris Z.S., Distributional structure, Word 10(2–3) (1954), 146–162.

42. Arthur, Modularity and projection of bipartite networks, Physica A: Statistical Mechanics and its Applications 549 (2020), 1–16.