Abstract
Open data have gained considerable traction in government, nonprofit, and for-profit organizations over the last several years. Open judicial data increase transparency of the judiciary and are an integral part of open justice. This article identifies relevant judicial data set types, reviews widely used open government data evaluation methodologies, selects a methodology for evaluating judicial data sets, uses the methodology to evaluate openness of judicial data sets in chosen countries, and suggests actions to improve efficiency and effectiveness of open data initiatives. Our findings show that judicial data sets should at least include court decisions, case registers, filed document records, and statistical data. The Global Open Data Index methodology is the most suitable for the task. We suggest considering actions to enable more effective and efficient opening of judicial data sets, including publishing legal documents and legal data in standardized machine-readable formats, assigning standardized metadata to the published documents and data sets, providing both programmable and bulk access to documents and data, explicitly publishing licenses which apply to them in a machine-readable format, and introducing a centralized portal enabling retrieval and browsing of open data sets from a single source.
Since the introduction of open data initiatives, there has been steady progress in opening government data sets (United Kingdom Cabinet Office, 2017; U.S. General Services Administration, 2017). Open judicial data is an important factor in increasing transparency, participation, and collaboration of citizens and civil society, which enable access to justice and support the fight against corruption. Thus, open data is an integral part of open justice. However, compared to data sets published by the legislative and executive branches of government, the progress in opening judicial data sets has been slower. This article aims to assess the current state of affairs in open judicial data, to identify good practices, and to suggest how the process of opening judicial data can be improved.
Open Knowledge (2017a) defines basic open data as “data that can be freely used, re-used and redistributed by anyone—subject only, at most, to the requirement to attribute and share-alike.” Tauberer and Lessig (2007) suggest eight basic principles to consider when opening government data: complete (all public data are made available), primary (data are as collected at the source, with the highest possible level of granularity and not in aggregate or modified forms), timely (data are made available as quickly as necessary to preserve the value of the data), accessible (data are available to the broadest range of users for the widest range of purposes), machine-processable (data are reasonably structured to allow automated processing), nondiscriminatory (data are available to anyone, with no requirement of registration), nonproprietary (data are available in a format over which no entity has exclusive control), and license-free (data are not subject to any copyright, patent, trademark, or trade secret regulation although reasonable privacy, security, and privilege restrictions may be allowed). 
In addition to these basic principles, additional ideas include online and free (data are available on the Internet at no charge, or at least no more than the marginal cost of reproduction), permanent (data should be made available at a stable Internet location indefinitely and in a stable data format for as long as possible), trusted (data should be digitally signed or include attestation of publication/creation date, authenticity, and integrity), presumptively open (the government and parties acting on its behalf will make public information available proactively, and they will put that information within reach of the public with low to no barriers for its reuse and consumption), documented (documentation about the format and meaning of data should be published), safe to open (data should be published using data formats that do not include executable content), and designed with public input (the public is in the best position to determine what information technologies will be best suited for the applications the public intends to create for itself).
Open data, as defined in Open Knowledge (2017b), must satisfy the two conditions of being technically and legally open. Data are legally open if data licenses allow anyone to freely access, reuse, and distribute the data. Data are technically open if they are available in a machine-readable format and in bulk for a price that is not greater than the cost of reproduction. A machine-readable format is a structured format that enables automatic data processing, and data are available in bulk if the user can download the complete data set. Therefore, open government data are data produced or contracted by the government (or entities controlled by the government) that anyone can freely access, reuse, and redistribute (Open Knowledge, 2017c). The analogous definition for open judicial data would be data produced or contracted by the judicial branch of the government (or entities controlled by the judicial branch of the government) that anyone can freely access, reuse, and redistribute.
According to common law understanding, open justice can be defined as a speedy and public trial (United States Bill of Rights, 1791) and a fair and public hearing by an independent and impartial tribunal established by law (Council of Europe, 1950). According to Jiménez-Gómez (2017), the open justice principle states that judicial procedures should be open to the public, including information of judicial records and public hearings. However, openness in the judiciary has currently expanded beyond this view. Jiménez-Gómez (2014) defines open judiciary as an extension of open government’s philosophy and principles (including transparency, participation, and collaboration) applied and contextualized in the justice field where innovation and information and communication technologies (ICT) will be key tools to its achievement. Elena (2015) cites Naser and Concha (2012) who state that open justice increases judicial access, provides more effective justice service delivery, implements transparency policies, and promotes more open and closer-to-citizens institutions. Opening judicial data sets increases transparency of the judiciary, which is an integral part of open justice and open judiciary. Therefore, open judicial data is an important factor that contributes to open justice and open judiciary.
With the progress of technology and rising concerns about privacy and national security, opening judicial data sets requires considering how the open justice principle is weighed against security and privacy, as discussed by McLachlin (2014). Maintaining the open justice principle has therefore become a challenging task in modern society. It is suggested that the open justice principle should not be considered absolute: the final decision on where to draw the line depends on each particular case and should be left to the courts.
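The two conditions can be read as a simple conjunction. The following Python fragment is a minimal sketch of that reading only; the class and field names are hypothetical and are not taken from Open Knowledge or from any existing tool.

```python
from dataclasses import dataclass


@dataclass
class DataSetDescription:
    # Hypothetical descriptor; every field name is an assumption for illustration.
    license_allows_access: bool
    license_allows_reuse: bool
    license_allows_redistribution: bool
    machine_readable: bool
    bulk_download: bool
    price: float              # price charged to the user
    reproduction_cost: float  # cost of reproducing the data


def is_legally_open(d: DataSetDescription) -> bool:
    # Legally open: the license lets anyone access, reuse, and distribute the data.
    return (d.license_allows_access
            and d.license_allows_reuse
            and d.license_allows_redistribution)


def is_technically_open(d: DataSetDescription) -> bool:
    # Technically open: machine-readable, available in bulk,
    # and priced no higher than the cost of reproduction.
    return d.machine_readable and d.bulk_download and d.price <= d.reproduction_cost


def is_open(d: DataSetDescription) -> bool:
    # Open data must satisfy both conditions.
    return is_legally_open(d) and is_technically_open(d)
```

Under this sketch, a data set published free of charge as PDF scans under an open license would be legally open but not technically open.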
In an assessment of open data impacts in the judiciary branch, Elena, Aquilino, and Pichón Riviére (2014) discuss the benefits of open data, including how it keeps citizens informed about government activities and helps the government increase transparency to be more effective. Also, reuse of public information enables development of various projects and activities. Citizens’ involvement in the development of innovative solutions based on public data sets is beneficial for many stakeholders. Public data access increases credibility, transparency, accountability, and citizens’ confidence in the government helping them to understand the complete process administered by the government and not just its final decisions. They stress that the opening of data alone is not sufficient and promoting these benefits to society is important for their realization.
Sudbeck (2006) recognizes the connection between publishing court records and an increase in judicial accountability, public trust, and confidence. Publishing court records gives citizens more efficient and effective access to the judiciary and saves them time and effort, especially in rural areas. Public access makes it more apparent to citizens how the judiciary works. Finding the balance between transparency in the administration of justice and privacy of involved parties is also identified as crucial.
Gomez-Velez (2005) observed public interest in Internet access to court records as a matter of government transparency: such access helps demystify court actions and contributes to accountability and public confidence. Private sector interest is also recognized, since court records can be used for commercial purposes, for example, to provide online services based on the data. Availability of bulk and aggregated data is significant for such purposes.
Based on a workshop involving participants from the government, academic, and civil sectors, Harrison, Pardo, and Cook (2012) use an ecosystem metaphor to describe a mutual interdependence between the government and interacting entities. In such an ecosystem, the quality and usefulness of government data contribute to economic and social innovation and to accountability. An open government ecosystem is represented as an intersection of economic, legal, and policy domains where the government, innovators, and citizens interact through interdependent relationships. The significance of three issues determining the dynamics of the ecosystem is emphasized: intentionality, value creation, and sustainability. Opening judicial data can improve the open government data ecosystem by enabling the private sector to use open judicial data to provide services to citizens and companies (cf. Open Courts, 2018) and enabling other government institutions to access the data (e.g., judicial decisions in criminal law, rulings on domestic violence, and statistical data).
In most developed countries, the government is divided into the three separate branches of legislative, judicial, and executive. Therefore, open government data sets can also be divided into data sets published by these branches. Presently, the process of opening data sets belonging to the judicial branch of government is widely believed to be slower compared to those belonging to the legislative and executive (Bargh, Choenni, & Meijer, 2016; Elena, 2015; Jiménez-Gómez, 2017).
To apply the open government initiative in the judicial context, Elena (2015) provides guidelines for an open judicial government focused on transparency and access to judicial information (i.e., ensuring that all public information is accessible to the public), features of open data in the judiciary (including accessibility, non-discrimination, reusability, sustainability, and relevance), participation in the judicial system (e.g., active participation in the search of solutions, public hearings, and expert opinions), and collaboration between the judiciary and civil society (e.g., innovative design and implementation of judicial public policies). Initiatives for releasing open data from the judicial sector consider open judicial data as a resource to increase government transparency and accountability (Bargh et al., 2016). Furthermore, these data offer the public an insight into the work of government bodies and can support valuable services when made interoperable with other open data. Compared to similar open data initiatives from other sectors, Bargh et al. (2016) note that the publishing of open judicial data is rarely addressed in the literature.
We intend to identify and analyze worldwide initiatives that focus on opening judicial data sets. We identify the most important judicial data set types, critically review several widely used open government data evaluation methodologies, select the methodology most suitable for evaluating the identified data set types, and use it to comparatively assess the openness of judicial data sets from developed and developing countries. A cross-country comparison identifies the most successful initiatives, and we use these experiences to suggest actions to improve efficiency and effectiveness of less successful initiatives. We also investigate different approaches for the protection of privacy and confidentiality of judicial data as well as the publication of open data licenses, which is an important factor slowing down production and consumption of open judicial data.
The article is structured beginning with related work on open data along with a review of open judicial data in the second section. The third section describes our method to evaluate openness of judicial data sets. The results of the evaluation are outlined in the fourth section followed by a discussion of results, identification of challenges faced when opening judicial data sets, and suggestions for actions to improve open data processes and products in the fifth section. Concluding remarks are offered in the sixth section.
Related Work
The main reason for opening data is to increase transparency and accountability of government institutions and officials, increase their efficiency and effectiveness, and create new business opportunities and new jobs. This section reviews related work on open data, open judicial data, open data assessment methodologies, and open data licenses.
Open Data
As noted by Gray (2015), one of the Open Knowledge directors, publishing open data is of course not sufficient for open governments or open societies. It is just one ingredient in the mix, and no replacement for other vital elements of democratic societies, like robust access to information laws, whistleblower protection, and rules to protect freedom of expression, freedom of the press and freedom of assembly.
In Wirtz, Piehler, Thomas, and Daiser (2016), the resistance of public servants to the implementation of open government data is divided into the five barriers of a perceived legal barrier, perceived bureaucratic decision barrier, perceived organizational transparency, perceived hierarchical barrier, and a perceived risk-related attitude of administrative employees. The perceived risk-related attitude, which reflects an unwillingness to accept new technologies as they might have adverse effects on an employee’s job and responsibilities, was shown to have the most influence on the resistance to the introduction of open government data. Therefore, some care should be taken when choosing public servants to perform open data implementation tasks.
Open Judicial Data
Elena et al. (2014) suggest that the judiciary should publish at least court rulings, statistical data, and budget and administration data (e.g., budget allocation, procurement, and contracting data). Evaluation of these types of data sets was conducted for Argentina, Chile, and Uruguay using a methodology developed by the Center for the Implementation of Public Policies for Equity and Growth based on four levels: descriptive, diagnostic, analytical, and prospective. Their recommendations focus on increasing awareness of the benefits of open judicial data, on how to implement open data policies, and on how monitoring and evaluation of these policies should be performed. An assessment of judicial websites performed by Sandoval-Almazan and Gil-Garcia (2015) considers the four main website components: information (characteristics and organization of information on the website), interaction (possible methods for contacting public officials), integration (vertical integration, including the same content type between different institutions, and horizontal integration, including different content type between the same institution), and participation (the tools provided by the website, e.g., blogs, forums, and chats). A final grade is calculated as an average value of the answers provided through a questionnaire. Results show that most websites lack a user-centric interface as well as basic information related to open data and transparency.
In a study on publishing court decisions, van Opijnen, Peruginelli, Kefali, and Palmirani (2017) only considered online repositories accessible by anybody for free. The questionnaires were answered by 28 EU member states and 3 European courts. Recommendations for improving the accessibility of court decisions included that publishing criteria should be precise and publicly available, that negative selection should be applied to the highest jurisdiction courts and positive selection to the lowest jurisdiction courts, that large case law databases should provide importance tagging, that decisions should be published under licenses that allow reuse (e.g., CC-BY and CC-0), and that decisions should be published in computer-readable formats.
Open Data Assessment Methodologies
Some of the open government data evaluation methods are the 5-Star Linked Data schema (Berners-Lee, 2013), the Open Data Readiness Assessment (ODRA; World Bank, 2015), the Global Open Data Index (GODI; Open Knowledge, 2017d), and Open Data Barometer (ODB; W3C, 2017). In addition to the core purpose of evaluating the openness of a data set, these methods can be used as a guideline and action plan for implementing open data sets.
Tim Berners-Lee, the creator of the World Wide Web and the initiator of the linked data project, suggests a 5-star linked data publication schema. It is a cumulative system, meaning that each additional star assumes that the conditions necessary for the previous stars are fulfilled (Berners-Lee, 2013). The system is described as follows: one star means that data are accessible online in any format under an open license, two stars mean that data are accessible in a structured, machine-readable format, three stars mean that data are accessible in a nonproprietary format, four stars mean that data are published using open standards recommended by W3C, and five stars mean that data are linked with other open data.
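The cumulative character of the schema can be illustrated with a short Python sketch. The function and parameter names below are hypothetical and serve only to show that a higher star is never awarded when a lower condition fails.

```python
def five_star_rating(online_open_license: bool,
                     machine_readable: bool,
                     nonproprietary_format: bool,
                     w3c_open_standards: bool,
                     linked_to_other_data: bool) -> int:
    """Return the number of stars (0 to 5) under a cumulative reading of the schema."""
    criteria = [
        online_open_license,    # one star
        machine_readable,       # two stars
        nonproprietary_format,  # three stars
        w3c_open_standards,     # four stars
        linked_to_other_data,   # five stars
    ]
    stars = 0
    for satisfied in criteria:
        if not satisfied:
            break  # cumulative: stop at the first unmet condition
        stars += 1
    return stars
```

For instance, a data set that uses W3C standards but is published in a proprietary format would receive at most two stars under this reading.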
The ODRA is a publicly accessible methodology for assessing the readiness of a country and an individual institution for evaluation and implementation of the Open Data Initiative. The methodology is action-oriented, which means that it serves to help government institutions select actions to be taken to support the initiative. It spans both the publication of data (supply) and use of data (demand) as well as other aspects, such as skill development, funding research, and funding open data plans. The methodology evaluates readiness in eight dimensions: higher leadership, legal framework, the structure and the capability of institutions, data management procedures and policies, open data demand, citizen participation, funding open data programs, and national infrastructure for technology and skill transfer (Zijlstra, Cerović, & Ivić, 2015). A set of actions providing a basis for an open data action plan is proposed following this analysis.
The GODI measures government data openness in 122 countries each year. It relies on the Open Definition, according to which “open data can be freely used, modified, and shared by anyone for any purpose.” Advantages of GODI as a metric are that it is based on claims of citizens instead of the government, it enables comparison of the same categories of data sets in different countries, it helps citizens to learn about open data and its availability in their countries, and it follows the change of open data over time. Thus, GODI evaluates whether data are indeed published in a way that is accessible to citizens, media, and civil society. Although GODI considers a wide range of government data, none of its data sets belongs to the judiciary.
The ODB analyzes readiness for the implementation and opening of data. It is a part of the World Wide Web Foundation’s work on data openness evaluation methods and is based on ranking three types of inputs. The first is expert opinion, where experts from each country answer questions about open data in their country; the second is detailed assessment, where technical experts provide an assessment based on answers to these questions; and the third is secondary data, where data are based on the assessments of the World Economic Forum, United Nations Development Programme, and World Bank experts. For ranking, three indexes are considered: the readiness index, the implementation index, and the impact index. The readiness index measures readiness for the successful introduction of open data practices, the implementation index measures the quality and quantity of an open data implementation, and the impact index measures the impact of open data on different spheres of life, such as the political, social, and economic spheres.
After reviewing the various open data assessment methodologies, we selected the GODI methodology because it intentionally limits its inquiry to the publication of data and does not consider other aspects of the common open data assessment framework, such as context, use, or impact. This narrow focus enables a standardized, robust, and comparable assessment of open data around the world. Furthermore, it is product-oriented instead of process-oriented, thus simplifying the evaluation process, as it does not require interviewing open data stakeholders, which is a time- and resource-intensive process.
Open Data Licenses
Three types of government data licenses are described in Gray and Darbishire’s (2011) study: case-by-case (published data are protected by copyright and permission to reuse data is given on a case-by-case basis), reuse permitted (copyright is protected, but reuse by the public is explicitly permitted by a license or other act), and public ownership (documents or data sets are exempt from copyright with no limitations on reuse). Some of the most commonly used open data publishing licenses allowing reuse include Creative Commons (CC; Creative Commons, 2017) and Public Domain Dedication and License (PDDL; Open Data Commons, 2007). All CC licenses share a basic set of rights and duties: noncommercial copying and distribution are allowed, authors are attributed for their work, and the license is applicable throughout the world. Authors can decide to add additional rights and duties, such as share-alike (same conditions that are applied to the distribution of the original work are applied to the distribution of derivative work), noncommercial (copying, distribution, and adaptation are allowed only for noncommercial purposes), and no derivatives (only the unchanged original work can be copied and distributed as a whole). CC licenses consist of three layers: a “legal layer” written in the language for lawyers, a “human-readable layer” with the most important elements of the license written in a language that can be understood by a general population, and a “machine-readable layer” using the CC rights expression language (Abelson, Adida, Linksvayer, & Yergler, 2008) that can be understood by computers. In 2008, the Open Data Commons project published an open data license called the PDDL, which was co-opted by Open Knowledge soon after, in 2009. PDDL allows anyone to freely share, modify, and use work for any purpose.
Method
The authors assessed the openness of available judicial data sets in selected countries from August to September 2017 using a customized GODI methodology. The assessment was conducted by analyzing the structure of the judiciary in each country, searching for websites of the relevant judicial institutions, evaluating the data sets using the customized GODI methodology, collecting the results in LibreOffice spreadsheets, and developing a narrative of the state of the open judicial data in each country.
The focus of the research was on the implementation of open data initiatives and not on their readiness (conditions in a country, city, or sector determining if open data initiatives are likely to be successful) or impact (whether open data led to change). The plan was to conduct a comparative analysis of the state of open judicial data in several representative developed and developing countries and to suggest concrete guidelines for improving data openness. The key limiting factors included the language barrier and available resources. Therefore, we decided to assess the state of open judicial data in several East and Southeast European countries whose official languages we understand and to limit the scope to several developed countries selected based on the official language and legal tradition (i.e., civil or common law tradition).
We analyzed the openness of judicial data sets of developed countries (the United States of America, the United Kingdom, and Austria) as well as developing countries (Russia, Slovenia, Croatia, Bosnia and Herzegovina, Serbia, Montenegro, and Macedonia). For a detailed analysis of open data in the Western Balkan countries, see Stojkov, Gostojić, Sladić, Marković, and Milosavljević (2016). Marković, Gostojić, Sladić, Stojkov, and Milosavljević (2016) present preliminary results of the analysis of open data in the judiciary.
Judicial data sets analyzed in this article include court decisions, case registers, filed document records, and statistical data. The choice is based on their influence on transparency and accountability of the judiciary. Brief explanations of each data set type follow.
Court decisions, such as judgments and rulings, should be available after they are pronounced to the parties or after parties receive decisions by mail. Public availability of court decisions increases transparency of the judiciary by showing how justice is administered. Court decisions may contain the case number along with the content and finality of the decision.
Case registers contain metadata collected during the lifecycle of the case (e.g., data related to the parties, hearings, and case movement) and show if actions are taken before deadlines and if the trial is conducted within a reasonable time. A case register may contain the case number, the date and time of the initial act and when the case was formed, data about the parties and judges, the subject matter, the case numbers of connected cases (e.g., the appellate court’s case or the prosecutor’s case), the data about hearings and the movement of the case, and the status of the case (solved or in progress).
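As an illustration of how the listed fields might be represented in a machine-readable form, the following Python sketch models a single case register entry. All class, field, and method names are hypothetical; no reviewed country is claimed to use this structure, and the movement of the case is omitted for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class CaseRegisterEntry:
    # Hypothetical record layout based on the fields listed in the text.
    case_number: str
    initiated_at: datetime   # date and time of the initial act
    formed_at: datetime      # date and time the case was formed
    parties: List[str]
    judges: List[str]
    subject_matter: str
    connected_case_numbers: List[str] = field(default_factory=list)
    hearings: List[datetime] = field(default_factory=list)
    solved_at: Optional[datetime] = None  # None while the case is in progress

    @property
    def status(self) -> str:
        return "solved" if self.solved_at is not None else "in progress"

    def duration_days(self, as_of: datetime) -> int:
        # Days from case formation to resolution, or to `as_of` if unresolved.
        end = self.solved_at if self.solved_at is not None else as_of
        return (end - self.formed_at).days
```

With entries of this kind published in bulk, a reuser could compute how long cases take and compare the result against what is considered a reasonable time.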
Filed document records keep track of the order in which documents are received and how cases are initiated and assigned to judges, contributing to the transparency of the courts and the fight against corruption. Filed document records may contain data about the filing party, the date and time the document was filed, the case number, the manner of submission (e.g., in person or by mail), and the type of the document (e.g., complaint, appeal, or motion).
Statistical data about courts, judges, or subject matter quantify caseload and quality of work to help measure effectiveness and efficiency of courts and judges. The statistical data may contain the number of recent cases and solved or unsolved cases in a given period.
The GODI methodology evaluates data sets using both qualitative and quantitative variables. Qualitative variables are determined by answering five questions: “Is the data collected by the government (or a third party related to or linked to the government)?” (Q1), “Is the data available online?” (Q2), “Where can the data be found?” (Q5), “How much do you agree with the following statement: ‘It was easy for me to find the data.’” (Q6), and “How much human effort is required to use the data?” (Q11). If the answer to questions Q1 or Q2 is negative, the data set is not scored. Otherwise, each data set is scored by summing up six quantitative variables determined by answering six additional questions: “Is the data available online without the need to register or request access to the data?” (Q3), “Is the data available free of charge?” (Q4), “Is the data downloadable at once?” (Q7), “Is the data up-to-date?” (Q8), “Is the data openly licensed or in the public domain?” (Q9), and “Is the data in open and machine-readable file formats?” (Q10). We weighted the quantitative variables as suggested by the GODI methodology.
All of the questions are briefly explained using definitions available from Open Knowledge (2017e). The answer to Q1 is positive if data are collected by the government or an organization officially representing the government and negative for organizations that do not represent the government or organizations not relevant for collecting the data. This question is essential for considering the data as open government data. Q2 considers online availability even when registration is required, and a positive answer is mandatory for all following questions. Q3 gives 15 points for data that the government publishes online without the need to register or request access. If data are not available online or registration is required, then no points are assigned. Our methodology tolerates the registration process as an agreement to terms of use but not as discrimination of some user groups. Q4 assigns 15 points for data available free of charge. If the user has to pay to access the data set, then it is not considered open by the Open Definition. An answer to Q5 is a URL locating the data set since, without access to the data set, it is impossible to complete an evaluation with the GODI methodology. Q6 is a subjective question evaluating how easy it is to find the data set with answers provided using a Likert-type scale (Likert, 1932). Q7 asks whether the data are downloadable at once, and a positive answer awards 15 points. The answer is negative if obtaining the data requires downloading dozens of small pieces of information, using a search interface only, sending requests, or passing a CAPTCHA or other download limits. Q8 assigns 15 points to up-to-date data sets, which is a context-dependent characteristic. Q9 gives 20 points to data sets published using open licenses or in the public domain as legal openness is one of the requirements of the Open Definition. 
Q10 awards 20 points for publishing in open, machine-readable file formats, where a format is considered open if it can be processed with at least one free and open-source software tool. Machine-readability is a significant enhancement of technical usability. Finally, Q11 measures how much human effort is required to use the data on a scale from 1 to 3, which is a subjective score and depends on the context and the purposes of its use.
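The scoring rule can be sketched in a few lines of Python. This is an illustrative reading of the description above, not the authors' spreadsheet or an official GODI tool; the class and field names are hypothetical, while the weights are those stated in the text for Q3, Q4, Q7, Q8, Q9, and Q10.

```python
from dataclasses import dataclass
from typing import Optional

# Points per scored question, as stated in the text; they sum to 100.
WEIGHTS = {
    "no_registration": 15,        # Q3
    "free_of_charge": 15,         # Q4
    "downloadable_at_once": 15,   # Q7
    "up_to_date": 15,             # Q8
    "openly_licensed": 20,        # Q9
    "open_machine_readable": 20,  # Q10
}


@dataclass
class Answers:
    # Hypothetical container for the yes/no answers about one data set.
    collected_by_government: bool
    available_online: bool
    no_registration: bool = False
    free_of_charge: bool = False
    downloadable_at_once: bool = False
    up_to_date: bool = False
    openly_licensed: bool = False
    open_machine_readable: bool = False


def godi_score(a: Answers) -> Optional[int]:
    """Return a score from 0 to 100, or None if the data set is not scored."""
    # Gating questions: data not collected by the government or not
    # available online are not scored at all.
    if not (a.collected_by_government and a.available_online):
        return None
    return sum(points for name, points in WEIGHTS.items() if getattr(a, name))
```

As a hypothetical example, a data set that is online without registration, free of charge, and up-to-date, but offered only through a search interface in a non-machine-readable format with no license information, would score 15 + 15 + 15 = 45 under this sketch.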
Only data sets which can be accessed without registration were included in the survey because most of the questions prescribed by the GODI methodology cannot be answered without access to the data set and its metadata. Also, completeness of published data sets was not captured by this assessment. For example, most countries publish selected decisions only, such as decisions delivered by supreme or appellate courts.
Results
The key characteristics of openness of judicial data as evaluated by the GODI methodology are described in this section and summarized in Table 1. Each subsection describes the organization of the courts in the selected country and how the identified judicial data sets were published, if at all. In some countries, the constitutional court is not considered a part of the judiciary. Therefore, constitutional courts are omitted from this survey.
Table 1. Summary of the Results.
Note. Available at https://goo.gl/xwUh1m
Austria
The Austrian regular courts, which hear civil and criminal cases, are divided into four levels (The Federal Ministry of Justice, 2017). From the bottom to the top of the hierarchy, there are district courts, regional courts, higher regional courts, and the Supreme Court. The Austrian Federal Chancellery publishes court decisions through Das Rechtsinformationssystem des Bundes (RIS), the Legal Information System of the Republic of Austria (Bundeskanzleramt Österreich, 2017).
In addition to case law, RIS contains legislation, decrees, and other legal documents. RIS provides access to anonymized court decisions in three ways: a website search, a web service application programming interface (API), and a mobile application called RIS:App. Decisions are assigned an ECLI identifier and are available in XML, HTML, PDF, and RTF formats. No information is provided on whether the data are openly licensed or in the public domain. In contrast to other reviewed countries, Austria provides access to a data set with complete court decisions. Citizens can file documents with the court, but the filed documents are not published. Also, court case register data are not open.
Along with annual statistical data provided via the website of the Austrian Ministry of Justice (The Federal Ministry of Justice, 2017) in HTML format without licensing information, there are annual reports of the Supreme Court (The Supreme Court, 2017). These reports contain data related to the proceedings before the court and are published in PDF format with no open license information.
Bosnia and Herzegovina
The court organization in Bosnia and Herzegovina (BiH) is complex due to the organization of the state, which includes two entities and one district. The court system consists of three constitutional courts, three supreme courts, 16 county courts, 49 municipal courts, The Higher Commercial Court, and five commercial courts.
The Court Decision Base (Baza sudskih odluka, 2017), owned and operated by The Judicial Documentation Center, publishes judicial decisions and other judicial information. A case registry and a registry of filed documents are not available online, while statistical data about courts are available annually.
Croatia
The structure of regular courts in Croatia consists of municipal courts, which are courts of first instance established for the territory of one or more municipalities; county courts, established for the territory of several municipal courts; and the Supreme Court of the Republic of Croatia as the highest instance. The Case Law Portal (Case Law Portal, 2017) publishes decisions of Croatian courts in HTML and PDF formats. Access to court decisions is protected by CAPTCHA verification, and no information on an open license is provided. Court case register data are published on the e-Case portal in HTML format (Portal e-Predmet, 2017). Access is again controlled by CAPTCHA, and there is no open license information. Filed document records are not published. The Ministry of Justice of the Republic of Croatia (Ministry of Justice of the Republic of Croatia, 2017) publishes annual statistical reports for Croatian courts in PDF format with no open license information.
Macedonia
The highest instance of the Macedonian judiciary is the Supreme Court of the Republic of Macedonia. First instance courts of general jurisdiction are basic courts, while appellate courts are established for the territory of several basic courts. The Judicial Portal of the Republic of Macedonia (Judicial Portal of the Republic of Macedonia, 2017) provides a search engine for court decisions. Decisions are available in PDF format, and copyright belongs to the Judicial Portal of the Republic of Macedonia. The portal provides a separate section for each court where information on hearing schedules can be found. Statistical reports are also published on these pages in PDF format. The period covered by statistical reports varies from court to court, ranging between 1 month and 1 year. Case register data and filed document data are not published.
Montenegro
General jurisdiction courts in Montenegro are organized in four tiers: basic courts, high courts, the Appellate Court of Montenegro, and the Supreme Court of Montenegro as the highest court. Web presentations of the courts in Montenegro are accessible on the Courts of Montenegro (The Courts of Montenegro, 2017) portal. When a court is selected, its decisions are accessible in HTML format without any license information. A special portal section enables access to decisions from all courts. Individual courts, with several exceptions, publish their statistical reports in MS Word or PDF formats without license information. Court register data and filed document records are not available online.
Russia
Courts of general jurisdiction in the Russian Federation are headed by The Supreme Court of the Russian Federation and consist of federal courts and courts of constituent entities (Federal Constitutional Law, 2011). Federal courts of general jurisdiction include supreme courts of republics, courts of territories, regions, federal cities, autonomous regions, autonomous circuits, district courts, city courts, interdistrict courts, military courts, and specialized courts. Cases of general jurisdiction within constituent entities are heard by the Justices of the Peace.
The Information and Analytics Support Center State Automated System (SAS), referred to as “Justice,” was established by the Russian Federal government and provides a web portal for access to data of general jurisdiction courts (Internet Portal of SAS “Justice,” 2017). The portal publishes court case data and court decisions. Within court case data, the identification number and filing date of the document initiating the case are provided. Data are available only through search, and personal names are displayed as the last name and the initial of the first name. For resolved cases, decision texts are available in HTML format. However, no information is provided regarding the licensing of published data. Some websites of Russian courts provide search for their own court cases and decisions. Statistical reports are published annually and semiannually in Microsoft Excel format on the website of The Judicial Department at the Supreme Court of the Russian Federation (Judicial Department at the Supreme Court of the Russian Federation, 2017). A section of this website dedicated to open data provides lists of divisions of the Judicial Department at the Supreme Court of the Russian Federation in CSV format.
Serbia
The courts in Serbia are organized into general and special courts. General courts include The Supreme Court, four appellate courts, 25 higher courts, and 66 basic courts. Special courts include The Magistrates Appellate Court, 44 magistrate courts, The Administrative Court, The Commercial Appellate Court, and 16 commercial courts.
The Court Portal (Portal sudova Srbije, 2017) provides public access to records of filed documents. Selected decisions are available at the portals of The Supreme Court, appellate courts, and The Administrative Court. The Legal Information System (Pravno informacioni sistem, 2017) supports a case law database containing selected decisions available online, but access is not free of charge. Statistical data about courts are provided annually.
Slovenia
The Slovenian court system consists of the general courts (The Constitutional Court, The Supreme Court, 4 higher courts, 11 county courts, 44 municipal courts) and the special courts (The Higher Labor Court, 4 labor courts, and The Administrative Court).
The Sodna Praksa (Sodna praksa, 2017) portal publishes court decisions. The e-sodstvo portal (Portal e-sodstvo, 2017) holds case registry and filed document records, but access is not available to the public. Statistical data about the courts are available annually.
The United Kingdom
In the United Kingdom, the justice system is composed of courts and tribunals with The Supreme Court as the highest court. The structure of the judiciaries in England, Wales, Scotland, and Northern Ireland differs, although the structures in England and Wales share some similarities. Information about the judiciary is therefore published separately for England and Wales (Courts and Tribunals Judiciary of England and Wales, 2017), for Scotland (Scottish courts and tribunals, 2017), and for Northern Ireland (Northern Ireland courts and tribunals, 2017).
The judiciary in England and Wales publishes delivered decisions on its website in PDF format under an Open Government License. Court case records in civil proceedings are available at the Case Tracker portal (Case Tracker for Civil Cases, 2017) in HTML format under the Open Government License. The Ministry of Justice publishes quarterly statistics for England and Wales courts on the UK Government website (Quarterly court statistics for England and Wales, 2017). Data are available in PDF and Microsoft Excel formats under an Open Government License, and other statistics related to the courts of England and Wales are available on the Open Justice (Open Justice, 2017) web portal.
The British and Irish Legal Information Institute (BAILII, 2017) collects and publishes British and Irish legislation and case law for free. Court decisions are available in HTML and RTF formats. BAILII refers to copyright given by the owners of legal material, and in the case of the UK judiciary, it is often Crown copyright.
Although the quality and the quantity of published data sets in England, Wales, Scotland, and Northern Ireland differ, the data sets considered by this research share certain similarities. Selected decisions are available, filed document records are not published, and some statistical reports are present. Case register data are publicly available only for England and Wales courts. For this research, information collected for England and Wales is used to assess judicial data openness in the UK.
The United States of America
The judiciary in the United States of America is organized into federal courts and state courts. Federal courts include The Supreme Court, 13 courts of appeals, 94 district courts, and special jurisdiction courts. State courts include a Supreme Court, courts of appeals, trial courts, and special jurisdiction courts. As the status of data openness varies considerably between the states, only federal courts are reviewed in this article.
The Public Access to Court Electronic Records (PACER, 2017) service offers electronic case and docket records from federal appellate, district, and bankruptcy courts. In addition, CourtListener, a free legal research website operated by a nonprofit organization and containing millions of legal opinions from federal and state courts, and Cornell Law School’s Legal Information Institute (LII, 2017) publish selected decisions delivered by The Supreme Court, federal courts of appeals, and other federal courts. Statistical data about courts are published by the Administrative Office of the U.S. Courts each quarter in PDF format.
As can be seen in Table 1, most available open data sets were scored similarly. Overall, published data are not provided in a machine-readable format and are not published in bulk, although programmable access is provided in some cases; information about the legal status of the data is rarely available; and open data portals do not syndicate the data. On the other hand, the range of published data sets differs considerably. Each reviewed country publishes statistical data about its courts, although the level of detail differs; some reviewed countries publish delivered decisions in an open format; and few reviewed countries publish case register data and filed document records. Interestingly, although we anticipated a larger difference between the scores of developed and developing countries, the results suggest that there are no significant differences between them.
Discussion
Compared to the quality and quantity of open data sets published by the legislative and executive branches, the quality and quantity of open data sets in the judiciary is usually the lowest. This conclusion is supported by the fact that open government data scores available at Open Knowledge (2017d) are higher than the open judiciary data scores presented in this article. Also, Elena (2015) states that judicial branches continue to be among the least willing institutions to implement policies on transparency and access to information, and Jiménez-Gómez (2017) argues that although many open government initiatives have been implemented around the world, most are related to the executive and legislative powers and institutions.
Obstacles in opening government data sets recognized in Michener and Ritter’s (2017) study are referred to as the “three-Ps” of open data resistance representing professional, political, and personal privacy concerns. Professional resistance comes from the possibility of assessing the quality of work based on open data sets. Political resistance reflects lack of readiness to dedicate both human and financial resources to publish data. Personal privacy is primarily affected by judicial data sets, as they can reveal personally sensitive information that causes irreversible damage once published. Although the judiciary provides partial solutions to correct damages caused to parties by wrong court decisions (e.g., appeals and rehabilitation), once those decisions are made public, they can remain online indefinitely.
Elena (2015) also states that the judicial branch is the most conservative, formal, and hierarchical branch, which suggests why the judiciary data openness is lower than the legislative or executive data openness. Recommendations for designing and implementing an open data policy for judicial systems include promoting and enabling the environment (promoting cultural change in the judiciary, promoting collaborative work between the judiciary and other branches of the government, creating communities of practice, promoting debate on open data legislation and regulation, and implementing open data training), implementation of open data policies (defining which data sets are to be open, developing collaborative partnerships with data users, promoting open data use, and using open data to underpin accountability), and monitoring and evaluating open data policies.
Differences in legal systems also influence the significance of open judiciary data. In countries with common law legal systems, case law has greater importance for the public because court decisions represent a source of law. Therefore, access to delivered decisions is not just an optional feature for citizens but a need. On the other hand, for countries with civil law legal systems, publishing case law has value more similar to the value of other government data.
As is the case with other government data sets, judicial data sets are also error-prone. Publishing incorrect or incomplete data can impair citizens’ trust in the rule of law and judicial institutions. Migration from old to new information systems can result in incomplete data because revising data collected over multiple years is a challenging and time-consuming task. Nevertheless, addressing data quality issues today reduces such problems in the data collected and reported in the future.
Opening data sets published by the judiciary is also a challenge because of the need to protect privacy and ensure confidentiality. A universal recipe for opening judicial data does not exist because different countries take different approaches to these protections. Specifically, data need to be anonymized and redacted before publishing. Anonymization means deleting or replacing personally identifiable data, and redaction means deleting or replacing confidential data. One method for semiautomatic anonymization and redaction of judicial decisions, proposed by Sladić, Gostojić, Milosavljević, Konjović, and Milosavljević (2016), uses standardized legal document formats and formalized anonymization rules to facilitate this process. Two other methods for privacy protection, the restricted access and open access procedures, are described in Bargh et al.’s (2016) study. Reidentification (Conroy & Scassa, 2015), the process of revealing the identity of an individual by combining published anonymized data with other data, is another important issue that should be considered when opening judicial data sets. Data should be anonymized in such a way as to minimize the possibility of reidentification.
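The distinction between anonymization and redaction can be illustrated with a minimal rule-based sketch. It is not the method of Sladić et al. (2016); the function names, the sample sentence, and the account-number pattern are assumptions made purely for illustration, and a production system would need far richer rules.

```python
import re


def initials(full_name: str) -> str:
    """Reduce a full name to initials, e.g. 'Ana Kovac' -> 'A. K.'."""
    return " ".join(part[0].upper() + "." for part in full_name.split())


def anonymize(text: str, personal_names: list[str]) -> str:
    """Anonymization: replace personally identifiable data (here, names)."""
    for name in personal_names:
        text = text.replace(name, initials(name))
    return text


def redact(text: str, confidential_patterns: list[str]) -> str:
    """Redaction: replace confidential data matched by regular expressions."""
    for pattern in confidential_patterns:
        text = re.sub(pattern, "[REDACTED]", text)
    return text


# Hypothetical decision text and rules (not taken from any real case).
decision = "The plaintiff Ana Kovac claims 5000 EUR from account 123-456789-01."
anonymized = anonymize(decision, ["Ana Kovac"])
published = redact(anonymized, [r"\b\d{3}-\d{6}-\d{2}\b"])
print(published)
```

In this sketch the list of names is supplied by a human, which corresponds to the semiautomatic character of the process. It does not address reidentification, which depends on what other data can be combined with the published text.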
Until a complete opening of judicial data sets is available, the public can find statistical data about judges and courts valuable because it can reveal the quality of their work. Transparency of judicial data has an important role in increasing public trust in the judiciary and the fight against corruption, as explained in Granickas’s (2014) study. Publishing data about judges (e.g., name, biography, court of employment, date of employment, history of cases, statistical data about workload, and average time needed to make a decision) and courts (e.g., name, contact data, hearing schedule, judicial decisions, and statistical data) can improve transparency and minimize corruption.
The first precondition for opening judicial datasets is the standardization of formats for the representation of documents, metadata, and identifiers. These should enable relatively straightforward representation of the content of the documents in machine-readable format in the near future and should be based on W3C recommendations and linked data principles. The standardized format should include an electronic signature to confirm authenticity and the integrity of documents.
Instead of publishing judicial decisions in HTML, Microsoft Word, or PDF formats, machine-readable XML formats, such as Akoma Ntoso (Palmirani & Vitali, 2011), LegalDocML (OASIS, 2017a), or CEN MetaLex (Boer et al., 2010), should be adopted. These formats could be customized to the local language and legal system, enabling automatic anonymization and redaction of judicial decisions and thus their timely and complete publication. Furthermore, documents written in these formats can be automatically transformed into other formats as needed.
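The claim that structured XML can be transformed automatically into other formats can be sketched as follows. The element names (`judgment`, `meta`, `court`, `date`, `body`, `paragraph`) are a simplified, hypothetical vocabulary chosen for brevity; they are not the actual Akoma Ntoso, LegalDocML, or CEN MetaLex schema.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical, simplified decision markup (not a real legal XML schema).
SAMPLE_XML = """<judgment>
  <meta>
    <court>Example Court</court>
    <date>2017-03-01</date>
  </meta>
  <body>
    <paragraph>The appeal is dismissed.</paragraph>
    <paragraph>Costs are awarded to the respondent.</paragraph>
  </body>
</judgment>"""


def judgment_to_html(xml_text: str) -> str:
    """Render the structured decision as a small HTML fragment."""
    root = ET.fromstring(xml_text)
    court = root.findtext("meta/court", default="")
    date = root.findtext("meta/date", default="")
    lines = [f"<h1>{html.escape(court)}, {html.escape(date)}</h1>"]
    for paragraph in root.iterfind("body/paragraph"):
        lines.append(f"<p>{html.escape(paragraph.text or '')}</p>")
    return "\n".join(lines)


def judgment_to_text(xml_text: str) -> str:
    """Render only the body paragraphs as plain text."""
    root = ET.fromstring(xml_text)
    return "\n".join(p.text or "" for p in root.iterfind("body/paragraph"))


html_view = judgment_to_html(SAMPLE_XML)
text_view = judgment_to_text(SAMPLE_XML)
print(html_view)
print(text_view)
```

Because the court and date are explicit elements rather than visual formatting, the same source yields both views without manual editing, which is much harder to achieve when the source is a PDF or word-processor file.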
Metadata should be published in one of the standardized formats based on RDF and OWL. The machine-readable formats suggested above also specify a metadata set that can be assigned to court decisions allowing for the advanced retrieval and browsing of judicial documents. ELI (European Commission, 2017a), ECLI (European Commission, 2017b), URN:LEX (Spinosa, Francesconi, & Lupo, 2017), or LegalCiteML (OASIS, 2017b) formats should be used to identify court decisions to enable globally unique identification of court decisions in a machine-readable format. Globally unique identifiers, metadata, and machine-readable document formats provide easier integration of judicial information systems and integration of these information systems with those owned and operated by other organizations. Case registry and filed document records should be published in CSV, XML, JSON, and RDF formats. Some relevant XML formats include LegalXML Electronic Court Filing (OASIS, 2017c) and National Information Exchange Model Justice (2017).
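To illustrate why a standardized identifier helps machine processing, the sketch below splits an ECLI into its colon-separated parts (the prefix, a country code, a court code, a year, and an ordinal). The regular expression is a simplified approximation of the ECLI format rather than a full validator, and the sample identifier is invented for illustration.

```python
import re
from typing import NamedTuple, Optional


class Ecli(NamedTuple):
    country: str
    court: str
    year: int
    ordinal: str


# Simplified pattern (an assumption, not the normative ECLI grammar):
# ECLI:<2-letter country>:<court code, 1-7 chars>:<4-digit year>:<ordinal>
ECLI_PATTERN = re.compile(
    r"^ECLI:([A-Z]{2}):([A-Z][A-Z0-9]{0,6}):(\d{4}):([A-Z0-9.]{1,25})$"
)


def parse_ecli(value: str) -> Optional[Ecli]:
    """Return the parts of an ECLI, or None if the value does not match."""
    match = ECLI_PATTERN.match(value.strip().upper())
    if match is None:
        return None
    country, court, year, ordinal = match.groups()
    return Ecli(country, court, int(year), ordinal)


# Invented identifier used only to exercise the parser.
parsed = parse_ecli("ECLI:AT:OGH0002:2017:EXAMPLE.001")
print(parsed)
print(parse_ecli("Decision no. 12/2017"))
```

Once identifiers can be parsed this way, decisions can be grouped by country, court, or year without any knowledge of national citation conventions.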
Linked open data (LOD) is open data published according to linked data guidelines. The linked data paradigm drove the transition from a document-oriented web, through a linked data-oriented web, to the semantic web (W3C, 2015). Accordingly, linked open government data (LOGD) is data produced or contracted by the government that anyone can freely access, reuse, and redistribute and that is published according to linked data guidelines. Publishing judicial data sets as LOD is a promising solution to the machine-readable publication of court decisions, case registries, filed document records, and statistical data.
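As a rough illustration of what publishing a decision as linked data could look like, the sketch below emits N-Triples statements using Dublin Core terms. The `example.org` URIs, the identifier, and the choice of properties are assumptions for illustration only; no particular vocabulary is prescribed here.

```python
DCT = "http://purl.org/dc/terms/"  # Dublin Core terms namespace


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def triples_for_decision(uri: str, identifier: str, date: str, cites: list) -> list:
    """Describe one decision and link it to the decisions it cites."""
    lines = [
        f"{iri(uri)} {iri(DCT + 'identifier')} {literal(identifier)} .",
        f"{iri(uri)} {iri(DCT + 'date')} {literal(date)} .",
    ]
    for cited in cites:
        lines.append(f"{iri(uri)} {iri(DCT + 'references')} {iri(cited)} .")
    return lines


# Hypothetical URIs and identifier.
triples = triples_for_decision(
    "https://example.org/decision/1",
    "ECLI:XX:COURT:2017:1",
    "2017-03-01",
    ["https://example.org/decision/0"],
)
print("\n".join(triples))
```

The third statement is what makes the data linked: the cited decision is referenced by its own URI, so a client can follow the link to retrieve its description.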
The second precondition is to make information about the legal conditions under which data are published available at government websites. Therefore, it is necessary to develop and publish open data licenses or to amend the legislation and regulation in force. In most jurisdictions, intellectual property laws prevent use, reuse, and redistribution of data without explicit permission of the copyright holder. Although laws, regulation, and judicial decisions are exempt from copyright in many jurisdictions, this exemption does not apply to assigned metadata. If data are not in the public domain, then it is necessary to specify explicitly the license under which they are published, and the license should, if possible, be published in a machine-readable format.
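A machine-readable license statement can be as simple as a license URL in the data set's metadata record, which lets software check the legal status without a human reading the website. The sketch below assumes a JSON record with a `license` field and a small allowlist of license URLs; both the field name and the allowlist are illustrative assumptions, not an established standard for judicial data.

```python
import json

# Assumed allowlist of license URLs treated as open in this sketch.
OPEN_LICENSE_URLS = {
    "https://creativecommons.org/publicdomain/zero/1.0/",
    "https://creativecommons.org/licenses/by/4.0/",
}


def license_status(record: dict) -> str:
    """Classify a metadata record by its declared license URL."""
    license_url = record.get("license")
    if license_url is None:
        return "unspecified"
    if license_url in OPEN_LICENSE_URLS:
        return "open"
    return "not recognized as open"


# Hypothetical metadata record for a statistical data set.
record = json.loads("""
{
  "title": "Court statistics 2017",
  "publisher": "Example Ministry of Justice",
  "license": "https://creativecommons.org/publicdomain/zero/1.0/"
}
""")

print(license_status(record))
print(license_status({"title": "Selected decisions"}))
```

The second call returns "unspecified", which mirrors the situation found in most of the surveyed countries: the data are online, but their legal status cannot be determined automatically.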
Third, data access should be enabled both through an API and in bulk. Data accessed in bulk are usually static, and data accessed through APIs are usually dynamic. High-quality data available in bulk are necessary for the implementation of high-quality APIs (i.e., if the data do not satisfy the quality requirements, then neither will an API offering access to the data). Therefore, government institutions should publish quality data in bulk first and check whether the published data satisfy users’ requirements. Only if these requirements are satisfied should they invest in the development of an API to offer additional functionality.
Finally, open judicial data sets and their metadata should be published individually (i.e., each judicial institution should publish the data and metadata for which it is responsible). However, an open data portal should also be introduced (by a supreme court or a justice ministry), with the metadata describing the data sets being syndicated by the portal to enable centralized retrieval and browsing of open judicial data sets.
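The relationship between bulk and programmable access can be sketched with a toy case register. The records, field names, and function names below are hypothetical; the point is only that both access modes read the same underlying data, so an error in the data surfaces in both.

```python
import csv
import io

# Hypothetical case register records (not real cases).
REGISTER = [
    {"case_number": "P-1/2017", "court": "Example Basic Court",
     "filed": "2017-01-10", "status": "pending"},
    {"case_number": "P-2/2017", "court": "Example Basic Court",
     "filed": "2017-02-03", "status": "resolved"},
]


def bulk_export(records: list) -> str:
    """Bulk access: serialize the whole register as one static CSV document."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def query(records: list, **filters: str) -> list:
    """API-style access: return only the records matching all filters."""
    return [
        record for record in records
        if all(record.get(key) == value for key, value in filters.items())
    ]


dump = bulk_export(REGISTER)
pending = query(REGISTER, status="pending")
print(dump)
print(pending)
```

A real API would sit behind an HTTP endpoint and read from a database, but the dependency is the same: the filter can only be as reliable as the records it filters.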
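The division of labor between publishing institutions and a central portal can be sketched as metadata harvesting: each institution keeps its own catalog, and the portal copies only the descriptions and links. The catalog structure, institution names, and URLs below are assumptions for illustration.

```python
def harvest(catalogs: dict) -> list:
    """Collect metadata entries from each institution into one index.

    Only metadata is copied; each entry keeps a link to the data set
    hosted by the institution that is responsible for it.
    """
    index = []
    for publisher, entries in catalogs.items():
        for entry in entries:
            index.append({
                "publisher": publisher,
                "title": entry["title"],
                "url": entry["url"],
            })
    return index


def search(index: list, keyword: str) -> list:
    """Case-insensitive title search across all harvested entries."""
    needle = keyword.lower()
    return [item for item in index if needle in item["title"].lower()]


# Hypothetical per-institution catalogs.
CATALOGS = {
    "Example Supreme Court": [
        {"title": "Selected decisions", "url": "https://example.org/sc/decisions"},
    ],
    "Example Ministry of Justice": [
        {"title": "Annual court statistics", "url": "https://example.org/moj/stats"},
        {"title": "Decisions register", "url": "https://example.org/moj/register"},
    ],
}

portal_index = harvest(CATALOGS)
hits = search(portal_index, "decisions")
print(hits)
```

A user searching the portal finds data sets from both institutions in one place, while responsibility for the data themselves stays with the original publishers.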
Conclusion
In this article, crucial judicial data set types were identified, several widely used open data evaluation methodologies were reviewed, and the GODI methodology was selected as the most suitable for evaluating the identified data sets and applied to assess the openness of judicial data sets in selected countries. Challenges faced when opening judicial data sets were identified, and actions were suggested to improve open judicial data initiatives.
Court decisions, case registers, filed document records, and statistical data were identified as the most relevant judicial data sets. The results of the evaluation suggest that the openness of data sets was scored similarly in each country included in the survey, but not all identified judicial data sets were present in each country. The main drawbacks of the published data sets are that data are not available in bulk, are not available in a machine-readable format, and are neither released in the public domain nor published with an explicitly specified open license. Each country publishes statistical data about its courts, with some publishing delivered decisions in an open format and a few publishing data about case registers and filed document records.
Publishing statistical data about judges and courts can be an initial step to accomplish before implementing a full opening of judicial data sets. Some actions to consider for enabling more effective and efficient opening of judicial data sets are to anonymize and redact data before publishing to support the protection of privacy and confidentiality, to publish legal documents and legal data in standardized machine-readable formats developed in the legal informatics community (e.g., LegalDocML and LegalCiteML), to assign standardized metadata and identifiers to the published documents and data, to provide both programmable and bulk access to documents and data, to explicitly publish open data licenses which apply to them in a machine-readable format, and to introduce a centralized portal enabling retrieval and browsing of open data sets from a single source.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
