Abstract
We present a solution to data ownership in the surveillance age in the form of an ethically sustainable framework for managing personal and person-derived data. This framework is based on the concept of Datenherrschaft – the mastery that all natural persons should have over data that they themselves produce or that is derived from them. We give numerous examples, tie cases to robust ethical analysis, and discuss the technological dimensions.
Introduction
We have entered an age of true mass surveillance, something that in the last millennium was possible only in dreams – or nightmares. During the last decade, various state and corporate actors have made an active and concentrated effort to collect and analyse Internet and other digital communications, and the widely reported revelations on mass surveillance by Edward Snowden [7] in 2013 permanently brought this topic into public discussion. The consolidation of communications on a single global communications network and the disappearance of the excessive cost of surveillance due to advances in computing have created a brave new world where surveillance is constant, global and near-real-time.
In this article, we focus on mass surveillance of individuals that aims to collect information that is as comprehensive as possible. Collected data can reveal private or sensitive information on individuals, and potentially harmful information can be derived from it. Collected data can be – and, indeed, has been – used for nefarious purposes in the past, by individuals, private companies and governments alike. Protection of this kind of personal data has recently begun to be codified into legislation as a response to this misuse. Perhaps the most prominent examples of such legislation are the General Data Protection Regulation (GDPR) [76] and the forthcoming updated ePrivacy directive in the European Union [28]. In the United States, there is no comprehensive privacy law that protects individuals from the use of their personal data by the private sector. However, California was the first state to introduce its own privacy law, the California Consumer Privacy Act (CCPA), in June 2018. California is in the vanguard because the tech giants are headquartered there, and thus privacy policy standards compliant with the CCPA are likely to be adopted in other states as well [5]. The CCPA is one example of the EU-led data protection trend that gives more protection to individuals, an appealing idea that many jurisdictions have followed [84].
Surveillance data is, in essence, both personal and person-derived data. By personal data we mean information that defines personal attributes or characteristics of an individual, thus making it possible to directly identify people with it [76]. By person-derived data we mean information that is collected from an individual but does not focus on personal information (i.e. personal data), but rather on the actions of the individual in networked society, such as location information, browsing history, and so on. Even though this separation is somewhat artificial, it helps our analysis, as person-derived information may have undesired consequences for individuals, for example through profiling [see 42], even if their identities are not immediately revealed.
Whether data is gathered via targeted or untargeted surveillance, physical or online, it is most likely tied to a person and thus is personal or person-derived. This information is stored, analysed, and used to make decisions that can affect the person and those near them, among others. Due to its nature, surveillance data is usually collected covertly and without the knowledge of its targets. Those who gather surveillance information do not usually want their targets to know that they are targeted at all. Indeed, any inquiries for surveillance data gathered on people will probably be met with an amused but firm blanket denial from relevant agencies. Thus, one could argue that the data collection can be assumed to happen anyway, as it takes quite drastic measures not to be targeted for mass surveillance on the Internet.
It can be argued that we, as human beings, should have a right to assert some kind of control over how data derived from us is used and combined with other data. This is because personal data is a representation of us and nowadays an inseparable part of our being. Likewise, person-derived data can reveal so much about us that people would be surprised if they could “see” the situation. In this article we discuss how to secure the privacy and rights of the individual from surveillance and invasion of privacy. In a just society, at least, there would be an attempt to do so. A government – let alone any other actor – must not wield arbitrary power over its citizens. The state should respect the rights of its citizens; both Locke [57] and Rawls [74], for example, require society to be a just one.¹
¹ Locke and Rawls can be characterised as philosophers of the social contract approach, alongside others such as Jean-Jacques Rousseau, Thomas Hobbes and Immanuel Kant.
Thus, there exists a clear need for a robust, fair and ethically sound way to combine these two seemingly contradictory requirements: on one hand, lawful interception of communications and use of surveillance data for justified aims, and on the other hand, individual control over personal and person-derived data to ensure freedom. In this article we propose a solution to this problem in the form of Datenherrschaft, and provide a thorough examination and discussion on the issues this kind of a solution would face in the real world.
The key dilemma in combining mass surveillance with privacy is: should we be allowed to assert control over data derived from our persons, and if yes, how? Thus, our Research Questions (RQs) can be formulated as follows:
RQ1: What are the problems of uncontrolled Internet mass surveillance from an ethical perspective?
RQ2: Could the concept of Datenherrschaft act as an ethically justified basis for developing privacy legislation and thus give the main control of personal information back to individuals?
In this article we address these research questions and contribute by combining privacy and digital self-governance with properly justified use of personal and person-derived data by third parties – up to and including state-level institutions and actors – through Datenherrschaft, mastery over data. It is a concept that gives control of information to individuals that are the source of information, and thus aims to balance the biased power structure between data collectors and people.
This article is structured as follows. In Section 2 we discuss surveillance, and mass surveillance in particular, and how it affects the information society. In Section 3 we briefly examine the technical aspects of Internet mass surveillance. In Section 4 we introduce Datenherrschaft in the form in which it has previously been used in the literature. In Section 5 we discuss our findings. Finally, in Section 6 we conclude the paper with our closing remarks and highlight directions for necessary future work in this field.
We are living in a networked information society, where the transformative properties of technology [97, p. 8] and the role of information as a central resource [65, pp. 271–272] have been realised in multiple aspects of daily life. It is an interconnected society, with practically unlimited connectivity between citizens, organisations, and nations. In practice, this is realised through a global information network, i.e. the Internet. Even though the term information society is widely used but imprecise [6,65,85,97], we see that it presciently underlines the nature of the society we are living in. While information control and surveillance are not new phenomena (see e.g. [70] for discussion on information control in an organisational context), the drive towards information control and surveillance by interest groups such as Internet companies and governments is an emergent phenomenon [18,105].
In this networked information society, data has become one of the most valuable assets. As the value of information – and especially well-ordered information²
² Well-ordered information is structured, searchable data that is usually stored in a database, in contrast to unstructured data, i.e. raw data, which can be in any format and is not indexed or categorised in a clear way.
Multiple definitions for surveillance can be found in the literature, depending on the field and point of view of the author. However, certain consistent characteristics are visible in the definitions, such as watch or continuous observation [91] of, focused attention [59] on, or monitoring of [19, p. 499] a target, which can be, for example, an aerospace, cyberspace, surface, or subsurface area, a place, a person, or a thing [44]. In this paper, we use the term surveillance to denote information technology that is based on data collection from individuals, also known as dataveillance [19, p. 499].
We observe that criminal activities and prevention thereof are a common theme in the language used to describe surveillance, and similar language is likewise used as justification for mass surveillance. This connection has quite probably contributed to the negative connotation attributed to surveillance [9,31,100]. This may be problematic if taken for granted, as the negative connotation conveys the idea that surveillance is obviously necessary for catching “bad people”. However, mass surveillance is directed towards everybody and not just criminals, who can avoid most mass surveillance methods given reasonable technical expertise. Thus, the fuel of mass surveillance seems to be fear, and not the rights of citizens. However, the majority of people should not need to be under surveillance, especially if we are living in a democracy that should protect the freedom and rights of members of society. People should be regarded as not guilty unless evidence to the contrary exists, so why is there a need for surveillance aimed at innocents in the first place? Moreover, mass surveillance may also be, and very likely is, used for economic purposes that cannot be justified with crime prevention – a problematic state of affairs in itself. One reason for collecting data is the rise of the data economy. The data economy has been identified as a new and flourishing area of the economy that changes the world – and at the same time, it is seen as problematic [2,79], as it is based on data colonialism, which has normalised the exploitation of humans through personal data [21].
In the case of individual or targeted surveillance, the goal is limited to a single known target, and the information gathering effort is premeditated. By contrast, in mass surveillance, the main goal is to gather as much information as possible on the population.
Thus, there exists a difference between personal and mass surveillance. Personal surveillance is [19, p. 499]
“[…] the surveillance of an identified person. In general, a specific reason exists for the investigation or monitoring.”
whereas mass surveillance is
“[…] the surveillance of groups of people, usually large groups. In general, the reason for investigation or monitoring is to identify individuals who belong to some particular class of interest to the surveillance organisation.”
Mass surveillance is less focused on gathering information about a particular target, and rather focused on gathering information on a wider group of people. Surveillance parameters used for target selection include, but are not limited to, information such as the identity and nature of the target (e.g. the name of an individual, organisation, or physical location), the methods to be used, and restrictions on the use of those methods (spatial, temporal, or contextual; e.g. personal phone calls are not included in a surveillance case related to financial misuse, or surveillance is limited by time or physical boundaries). As mass surveillance is not “personal”, it can be experienced as a constant gaze that you cannot predict or expect, an overseer from the shadows of the digital Panopticon.
Mass surveillance – the Panopticon of the new millennium
The Panopticon theory of surveillance is based on the works of Jeremy Bentham, an 18th-century philosopher, who introduced the Panopticon as a prison that relies on continuous surveillance and complete lack of privacy for inmates [10]. The concept of Panoptic surveillance was further refined by Michel Foucault in Discipline and Punish: The Birth of the Prison [29]. The Panopticon has also been used in a more modern context, such as electronic and workplace monitoring [58,104].
In essence, the Panopticon is a model for efficient surveillance and self-imposed control. The key idea is that perceived surveillance is as good as actual surveillance, and that surveillance – whether perceived or real – changes the behaviour of the subject towards whatever is “desirable”. This combination makes it possible to control and modify the behaviour of subjects even with perceived surveillance, making collective surveillance easier and less resource-intensive. When subjects perceive surveillance to be constant, they adapt their behaviour to match this perception, regardless of whether that perception is founded in reality or not.
The Panopticon was initially presented as a prison design that would make the inmates supervise themselves, and lead to change in inmate behaviour. In a Panopticon prison, the cells are continuously illuminated and arranged in a circle. The wall facing towards the centre is transparent, making everything in the cell visible to an observer. The prison guards are stationed in a tower, placed in the structure’s centre. Windows on the tower are opaque when viewed from the outside, and transparent when viewed from the inside. This makes it possible for the guards to observe the prisoners at any time. Because of the opaque windows, the prisoners are not able to see the guards at all, and cannot know whether they are being watched or not. This perception of surveillance essentially forces the prisoners to adjust their behaviour to the knowledge that they can be under observation at any given moment. In this manner, panoptic surveillance changes the behaviour of its subjects by forcing self-censorship and adaptation to conform to the expected behaviour standard.
A classic example of a Panopticon in literature is the telescreen in George Orwell’s Nineteen Eighty-Four, a television-like device installed in every home. It provides the same functionality as a television screen, but also makes it possible for the authorities to observe those who are in the room. This observation happens without the knowledge of, or any indication to, those in the room. People are in turn aware that someone could be watching them through the telescreen at any moment. This kind of ubiquitous surveillance without any notification makes the telescreen an excellent example of a panoptic surveillance device.
Devices with capabilities similar to the telescreen do exist in real life, even though they have not been used in the same manner. For example, Microsoft’s Xbox 360 and Xbox One gaming consoles already had the capability, through the Kinect motion detection sensor, to identify whether there were people in the room and what they were doing, and to record audio of what they were saying. The motion detection accuracy of the Kinect sensor [99] had been tested to be sufficient for applications such as gait analysis [101] and Parkinson’s disease monitoring [30], so it was definitely accurate enough to conduct surveillance. When Microsoft introduced Xbox One in 2013, one of the pre-release announcements was that the Kinect sensor was mandatory for the device to function. This requirement was later rescinded, partly due to resistance from consumers and privacy advocates.
The presence of smart devices in living rooms has been growing steadily, and as the prevalence of such devices increases, security and privacy aspects also become salient. In March 2017, documents detailing an attack code-named Weeping Angel, which exploits vulnerable Samsung smart TVs [98], were published. Weeping Angel can turn a compromised TV into a device capable of monitoring its surroundings using the built-in microphone. Gathered data can be further processed and sent to the attacker. This kind of surveillance capability and method is eerily similar to the previously described telescreen example from Orwell’s classic book.
The Panopticon theory provides us with a compelling background against which to examine the different manifestations of surveillance. The Panopticon indeed evokes mental images of an “all-seeing eye”, a concept perhaps best exemplified by fictional entities such as Sauron in The Lord of the Rings. Sauron was depicted as a demigod capable of observing anything and anyone in the world with his giant eye, and it took the heroes great effort to avoid his gaze.
The Panopticon as a model has been widely criticised in the literature as both inaccurate and unsuitable for modelling real-world surveillance [see e.g. 6,8,13,36]. Haggerty [36] argues that the Panopticon is no longer a suitable model of surveillance, based on the vast number of different refinements to the model, and because “[e]ach new ‘opticon’ points to a distinction, limitation, or way in which Foucault’s model does not completely fit the contemporary global, technological or political dynamics of surveillance.” Indeed, the sheer number of different refinements indicates that the Panoptic model is not fully capable of capturing all nuances of surveillance.
Indeed, it is possible, and even tempting, to create ‘Yet Another Opticon’, to define a new Panopticon model to fit most surveillance scenarios. We argue that this is ultimately counter-productive. We should treat the Panopticon as what it is, a convenient tool for illustrating what complete and total surveillance could encompass, without trying to extend its scope. With mass network surveillance it has become possible to gain and control information about people, who cannot see who is watching them and why. The unit cost for surveillance has become a matter of physical resource – processing power or data storage – instead of human resource. This is the key difference between old world mass surveillance schemes in East Germany or the Soviet Union, and modern mass Internet surveillance systems.
Surveillance and behaviour
Human beings have a tendency to behave rationally only in economic models, where the idea of people making rational, informed choices is assumed to be the norm. Actual human beings behave in unforeseeable ways and make decisions that are not optimal or rational for them, at least apparently. We all make “bad” choices all the time.
According to existing literature and research there is a strong connection between surveillance and behaviour. Surveillance or just the mere knowledge that others are paying attention changes our behaviour, sometimes quite dramatically. Already in the 18th century, Bentham [10] was keenly aware of the effect that constant surveillance has on its targets, and he was certainly not the first to perceive this phenomenon.
Solove [86] argues that we all have something to hide, whether we know it or not, and failure to realise this is due to a misunderstanding of the basic nature of privacy. He observes that those who support the nothing to hide argument generally tend to see privacy as negative – a necessity for hiding something harmful or illegal. Indeed, Schneier observes [82] that those who espouse the nothing to hide argument “accept the premise that privacy is about hiding a wrong.”
The chilling effect [80] is an example of surveillance affecting behaviour. The chilling effect can also be argued to apply directly to network surveillance. When one is aware that all Internet traffic is monitored for “suspicious” content, it is reasonable to expect that one will avoid any behaviour that could be interpreted as even close to being suspicious. This is because “even lawful conduct may nonetheless be punished because of the fallibility inherent in the legal process” [80, p. 695], especially if it falls close to the border between accepted and unaccepted conduct.
Recently, Stoycheff [89] has examined the connection between surveillance and the chilling effect. She found that a perceived hostile opinion climate negatively affects the willingness to speak out online, and that if people are aware of surveillance and feel that the surveillance is justified, they tend to conform their opinions to the perceived majority and stay silent otherwise.
The effect of direct surveillance has been studied widely. Extant literature includes studies on the generosity of donations [14,53,67,88], the effect of images of eyes [40] and robotic eyes [15], and the effect on littering [27], among others.
The reaction to perceived surveillance seems a profoundly human condition that exists irrespective of the technology used. Overt, direct surveillance provokes change in behaviour of its targets towards more desirable patterns from the viewpoint of those performing the surveillance. When this observation is put into context of mass Internet surveillance, a de facto chilling effect on online behaviour is revealed. Its root cause is the knowledge that all actions taken online are monitored and stored for further analysis. However, the intended targets of this surveillance – the terrorists, criminals and spies – can continue their operations normally, as long as they know how to avert surveillance by using strong cryptography and strict operational security procedures.
Metadata as source for mass surveillance
To put it succinctly, metadata is data about data. A common argument for using metadata instead of the actual communications data and context for surveillance purposes is that, as it contains no details of the content of the exchange, no privacy violation happens. This is quite easily demonstrated to be untrue. Bruce Schneier noted that “metadata equals surveillance” [83], giving an example of a detective hired to eavesdrop on a target. In this case, the surveillance data is who the target interacted with, where the target was located at a particular time, and so on – it is not “just” metadata.
Thus, it is possible to infer more information from metadata than immediately meets the eye. Metadata gives a strong indication of the contents and context of the data from which it is derived, in addition to providing a whole new layer of information on the whereabouts and activities of the surveillance target. While discussing surveillance and metadata, Schneier [81] gives an example of a Stanford study [62] on what information can be deduced from phone metadata. The researchers collected data such as when the calls were made, from whom and to whom, but without any knowledge of the actual messages in the phone conversation. The researchers could identify people with medical conditions, owners of firearms and drug offenders, just from observing metadata. Thus, in many cases metadata is person-derived data, even when it does not contain any personal information itself but rather helps in gathering information out of sight on the actions of individuals.
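The kind of inference described above can be illustrated with a minimal sketch. It is not the method of the cited study; the call records, the phone numbers, the directory of organisations and the function name are all hypothetical and exist only to show how records without any call content can still yield sensitive conclusions.

```python
from collections import defaultdict

# Hypothetical directory mapping phone numbers to the kind of organisation
# behind them (an assumption for this sketch, e.g. built from public listings).
DIRECTORY = {
    "555-0101": "cardiology clinic",
    "555-0102": "firearms dealer",
    "555-0103": "pizza delivery",
}

# Metadata-only call records: (caller, callee, duration in seconds).
# No call content appears anywhere in this data.
CALLS = [
    ("alice", "555-0101", 420),
    ("alice", "555-0101", 610),
    ("alice", "555-0103", 45),
    ("bob", "555-0102", 300),
    ("bob", "555-0103", 30),
]

# Categories treated as sensitive in this illustration.
SENSITIVE = {"cardiology clinic", "firearms dealer"}


def infer_sensitive_contacts(calls, directory, sensitive):
    """Return, per caller, the sensitive categories contacted and how often."""
    profile = defaultdict(lambda: defaultdict(int))
    for caller, callee, _duration in calls:
        category = directory.get(callee)
        if category in sensitive:
            profile[caller][category] += 1
    return {caller: dict(cats) for caller, cats in profile.items()}


inferences = infer_sensitive_contacts(CALLS, DIRECTORY, SENSITIVE)
print(inferences)
```

Even this toy version shows that repeated calls to a single kind of organisation suggest something about the caller that the caller never stated.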
Zappalà [103] discusses the problems of using metadata to wage war. While metadata is on one hand considered to be circumstantial, derived from the actual data and not indicating anything about the original data, on the other hand metadata is actually very accurate, as is evidenced by the study discussed above. Nevertheless, we encounter some grey areas when we consider what actions can be taken based on metadata. While metadata-based mass surveillance can lead to potential targeted surveillance of a person of interest in many cases, sometimes the actions taken are orders of magnitude more severe, ranging up to using lethal force. General Michael Hayden, former director of the NSA and the CIA, has stated on the record that “we kill people based on metadata.”³
³ This quote can be found at 18:01 on the video The Johns Hopkins Foreign Affairs Symposium Presents: The Price of Privacy: Re-Evaluating the NSA. Available online at
The concept of a data double is discussed in the context of surveillance by Haggerty and Ericson [37]. We all have a data double, a digital representation of ourselves. It is a product of all that we read, write or interact with in the digital world. On a conceptual level, at least, our data doubles accrue matter over time, as all our actions online are “stored” in our data double. Whether this kind of data double actually exists physically depends on whether someone has been able to collect online actions of individuals to a single entry.
It can be quite convincingly argued that from the point of view of a system that only observes the digital world, we as people are our data doubles, at least as far as the system is concerned. Mass surveillance definitely falls into this category, and can be seen as processing, analysing, and examining our data doubles. As the operators of mass surveillance systems are commonly state level authorities, they do indeed have the power to interact with and affect us physically. This shows that the coupling between our physical selves and data doubles is quite intuitive.
The surveillant assemblage is a concept introduced by Haggerty and Ericson [37]. It can be thought of as a widely distributed and decentralised surveillance apparatus that incorporates information from many unrelated sources, consolidating and distilling data to form a coherent and comprehensive data double of its targets. As most aspects of society and life are integrated into the surveillant assemblage by providing source data to surveillance, it becomes increasingly difficult to avoid leaving traces of oneself into the system. Haggerty and Ericson describe this progress as the “disappearance of disappearance”; that it is getting increasingly hard to disappear into the crowd, maintain anonymity, or hide from monitoring by social institutions. Especially, Haggerty and Ericson argue [37, pp. 619–620], it takes effort and causes you to forfeit social benefits and privileges such as voting, social security, banking, and even using the Internet in general.
Another related surveillance concept is rhizomatic surveillance [37]. It is used to describe surveillance characterised by growth through expanding use and a levelling effect on hierarchies. This means that even aspects of information society completely unrelated to surveillance experience function creep in the direction of surveillance. The continuous incorporation of seemingly unrelated features and functions of society into a distributed but yet connected surveillance apparatus describes the concept of rhizomatic surveillance quite accurately.
The surveillant assemblage is remarkably accurate in representing the experience of living in the networked information society. For example, in Finland it is very challenging to use societal services without a computer, an Internet connection, and online banking credentials – the de facto online ID in Finnish society. To say that without an online presence one might as well not exist is not a great exaggeration in some, if not many aspects of life in the networked information society.
Now consider what would happen if all data in private and public databases and online services were consolidated and then used to profile a person. The resulting dossier would be very accurate, and were it to be exploited in bad faith, the damage done could be irrevocable. This is why the electric Panopticon is such a frightening concept, even though the actual risk of this happening to an individual can be considered negligible.
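How little machinery such consolidation needs can be shown with a short sketch. The three data sources, their field names and the shared identifier are invented for illustration, and real record linkage is considerably messier, but the principle of merging unrelated records into one data double is the same.

```python
from collections import defaultdict

# Three hypothetical, unrelated data sources that happen to share an identifier.
BANK = [{"id": "person-1", "employer": "Acme Oy", "monthly_income": 3200}]
TELECOM = [{"id": "person-1", "home_cell_tower": "Turku-12"}]
SOCIAL = [{"id": "person-1", "interests": ["cycling", "privacy law"]}]


def build_data_doubles(*sources):
    """Merge the records from every source into one profile per identifier."""
    doubles = defaultdict(dict)
    for source in sources:
        for record in source:
            fields = {k: v for k, v in record.items() if k != "id"}
            doubles[record["id"]].update(fields)
    return dict(doubles)


doubles = build_data_doubles(BANK, TELECOM, SOCIAL)
print(doubles["person-1"])
```

Once a common identifier exists, every additional source enriches the same profile at essentially no extra cost.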
Surveillance in practice
Surveillance has ceased to be a resource-intensive activity, and has nowadays become more dependent on the efficiency of data acquisition [81, pp. 23–28]. In the past, personal data was spread across various information systems, many of them not even connected to a network, or not in electronic form at all. This made combining personal data from different systems into a centralised, accessible database prohibitively expensive, if not outright impossible. Two developments have changed this: the cost of computing has gone down and the cost of data storage has plummeted.
Lyon [61] identifies three layers of the “surveillance iceberg” that describes mass surveillance: accessing data in transit, accessing stored data, and using spyware to compromise individual devices. Data can essentially be in three distinct states: at rest, in transit, or in use. All three kinds of data can be used in mass surveillance.
Data at rest refers to data which is stored and is not actively used or transmitted elsewhere. Files stored on a hard drive, flash drive or cloud service that are not actively used are data at rest. It is not possible to intercept data that is not in transit. For example, social media companies generate a lot of data on their users based on their use of the service, and store this data in their systems. In many cases, this data is never transmitted outside the company’s secure networks, and therefore it is outside the scope of regular traffic interception. Against this background, it is clear that gaining access to this data is a priority for any intelligence organisation willing to perform mass surveillance, as it provides huge amounts of user data and metadata that is seldom on the move.
Access to data at rest can be obtained via various methods, ranging from breaking into the information systems the data is hosted on to legislating access for surveillance purposes. If data can be accessed from the Internet by legitimate users, the same applies to malicious users. Data warehouses often use virtualisation of servers to improve operational efficiency and flexibility. Various methods for escaping virtual environments exist, for example by leveraging vulnerabilities in the hypervisor and gaining illegitimate access to data and processes on the physical hardware [69,73]. The servers themselves may be vulnerable and compromised directly, as has happened previously with Heartbleed (CVE-2014-0160) and the Windows Print Spooler Remote Code Execution Vulnerability (CVE-2021-34527).
One interesting method for protecting data at rest is homomorphic encryption (see e.g. [1]), which facilitates computation on encrypted data. Normally data at rest is (or at least any valuable data should be) encrypted using strong encryption, making it difficult for adversaries to access. However, conventional encryption must be removed before the data can be used in any way, which makes the data vulnerable while it is in use. In the case of homomorphic encryption, the data is not decrypted in any phase of computation and therefore remains secure.
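To make computation on encrypted data concrete, the following sketch uses a toy version of the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts. Paillier is chosen here only for its simplicity and is not necessarily the scheme discussed in [1]; it supports addition only, the function names are our own, and the hard-coded primes are far too small for real use.

```python
import math
import secrets


def keygen(p=1789, q=1861):
    """Generate a toy Paillier key pair (the tiny primes are an assumption)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1
    # With g = n + 1, L(g^lam mod n^2) equals lam mod n, so mu = lam^-1 mod n.
    mu = pow(lam, -1, n)  # modular inverse, Python 3.8+
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n with fresh randomness r."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover the plaintext using L(x) = (x - 1) // n."""
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Add two plaintexts by multiplying their ciphertexts; no key is needed."""
    n, _ = pub
    return (c1 * c2) % (n * n)


pub, priv = keygen()
c_sum = add_encrypted(pub, encrypt(pub, 20), encrypt(pub, 22))
print(decrypt(pub, priv, c_sum))
```

The party calling `add_encrypted` only ever handles ciphertexts and the public key, which is the property that makes such schemes attractive for data hosted by third parties.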
Regardless of protection methods for data stored on third party platforms, as a necessity of hosting, the hosting party will have access to metadata on when and how the data is used and accessed. These access patterns can be used to compromise the confidentiality of the stored information [94].
Data in transit is data that is actively being transferred in some manner. Data in transit can be observed or intercepted as it is transmitted, and the necessary tools depend on the medium. On the Internet, practically all sensitive communications are transmitted over secure connections, whether the data in question is online banking data, company internal communications, or discussion forum login information.
Modern encryption standards are designed to withstand attacks from practically omnipotent adversaries, but in some cases they can be circumvented. AlFardan et al. [4] estimate that about 50% of Transport Layer Security (TLS) protected traffic at the time was secured with RC4, an insecure stream cipher that has since been deprecated. Storing traffic for later analysis therefore became a fruitful approach to breaking encrypted communications. Should there be an identified pattern of interest in captured traffic, decrypting RC4-encrypted communications data is not a very difficult task for any sufficiently resourceful actor. After the encryption has been stripped away, this data can then be mined for further intelligence. As a result, communications encrypted with RC4 are perpetually at risk of being compromised at a later date.
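One reason why RC4 is considered insecure is that its keystream is measurably biased. The sketch below implements textbook RC4 and estimates how often the second keystream byte is zero over many random 128-bit keys; for an ideal cipher the rate would be 1/256, whereas for RC4 it is known to be roughly twice that. This only illustrates one well-known bias and is not a reproduction of the attack in [4]; the trial count, the seed and the function names are arbitrary choices.

```python
import random


def rc4_keystream(key, length):
    """Return the first `length` bytes of the RC4 keystream for `key`."""
    # Key-scheduling algorithm (KSA).
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm (PRGA).
    out = []
    i = j = 0
    for _ in range(length):
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) & 0xFF])
    return out


def second_byte_zero_rate(trials, seed=0):
    """Estimate P(second keystream byte == 0) over random 16-byte keys."""
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    zeros = 0
    for _ in range(trials):
        key = bytes(rng.getrandbits(8) for _ in range(16))
        if rc4_keystream(key, 2)[1] == 0:
            zeros += 1
    return zeros / trials


rate = second_byte_zero_rate(20000)
print(f"observed: {rate:.4f}, uniform expectation: {1 / 256:.4f}")
```

When the same plaintext is encrypted many times under different keys, biases of this kind let an observer recover plaintext bytes statistically, which is why stored RC4 traffic remains a long-term liability.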
Some protocols may allow the use of older, unsafe cipher suites in an effort in backwards compatibility. In some configurations TLS may allow fallback to SSL 3.0 for legacy systems. It is possible to exploit this option to force communications ot take place with unsafe cipher suites and subsequently breakable security. The POODLE attack [64] accomplishes exactly this by forcing the use of SSL 3.0 for client-server communications. The fact that communications encrypted by RC4 can be decrypted later given sufficient but reasonable time provides a great motivation for storing massive amounts of communication data for later analysis.
While TLS version 1.3, introduced in 2018, has significant improvements such as perfect forward secrecy, it has not made downgrade attacks obsolete. Lee, Shin & Hur [54] identify a vulnerability in commonly used network stacks and leverage it to demonstrate a TLS downgrade attack from TLS 1.3 to earlier, insecure versions on popular browsers. There are also various weaknesses that have been found in implementations of TLS 1.3 [77].
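One configuration-level precaution against protocol fallback is for a client to refuse older protocol versions outright. The sketch below uses Python's standard ssl module; the helper name make_strict_client_context is an assumption introduced here. It does not address the specific network stack vulnerability described in [54], and only illustrates how a minimum protocol version can be pinned.

```python
import ssl


def make_strict_client_context(minimum=ssl.TLSVersion.TLSv1_2):
    """Return a client context that refuses protocol versions below `minimum`."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    # Handshakes that negotiate anything older than `minimum` will fail.
    context.minimum_version = minimum
    return context


if __name__ == "__main__":
    ctx = make_strict_client_context()
    print(ctx.minimum_version.name)
```

A stricter deployment could pass ssl.TLSVersion.TLSv1_3 as the minimum, at the cost of losing connectivity to servers that do not support it.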
The same principles can be applied to compromising VPN connections. Adrian et al. [3] describe Logjam, an attack against TLS that exploits a common insecure version of the Diffie-Hellman key exchange protocol [26]. Logjam is a downgrade attack that forces vulnerable servers to use a weaker version of Diffie-Hellman. This attack made it possible for the authors to compromise connections to 7% of the top million web sites indexed by Alexa at the time. Logjam is based on precomputation and on exploiting the weak Diffie-Hellman groups used in export-grade versions of the protocol.
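The role of precomputation can be illustrated with a toy example. The sketch below is not the method of [3]: it uses the much simpler baby-step giant-step algorithm on a 31-bit group, whereas real export-grade groups are far larger and require far heavier machinery. The group parameters P and G and the function names are assumptions chosen for illustration. The point it shows is that the expensive table depends only on the group, so one precomputation serves every key exchange that reuses that group.

```python
import math
import secrets

# Toy 31-bit group (an assumption for illustration only).
P = 2**31 - 1  # a Mersenne prime
G = 7


def precompute_table(p=P, g=G):
    """Baby steps: depends only on the group, so it is computed once and reused."""
    m = math.isqrt(p - 1) + 1
    table = {}
    value = 1
    for j in range(m):
        table.setdefault(value, j)  # maps g**j % p to j
        value = (value * g) % p
    return m, table


def discrete_log(target, m, table, p=P, g=G):
    """Giant steps: recover some x with g**x % p == target using the shared table."""
    step = pow(g, -m, p)  # g^(-m) mod p
    gamma = target
    for i in range(m):
        if gamma in table:
            return i * m + table[gamma]
        gamma = (gamma * step) % p
    raise ValueError("target is not a power of g")


if __name__ == "__main__":
    m, table = precompute_table()  # done once per group
    for _ in range(2):  # reused for every observed key exchange
        a = secrets.randbelow(P - 2) + 1  # client secret
        b = secrets.randbelow(P - 2) + 1  # server secret
        A, B = pow(G, a, P), pow(G, b, P)  # public values visible on the wire
        x = discrete_log(A, m, table)
        # The eavesdropper derives the same shared key as the endpoints.
        assert pow(B, x, P) == pow(A, b, P)
    print("shared keys recovered from public values")
```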
Data in use is data that is actively being used in a computational process. It is stored in volatile memory or can be considered to be otherwise temporarily stored, such as in a temporary file, folder, or swap file. Gaining access to data in use requires at least some kind of control over the computational platform or the process that is processing the data. This in turn effectively requires targeted surveillance; access could be gained through vulnerable server software or specifically crafted malware, for example. Gaining physical access to the platform also enables attackers to read the contents of RAM and gain access to unencrypted data or encryption keys stored in volatile memory [17].
Real surveillance is done with massive amounts of collated metadata [81], and there the most privacy sensitive problems arise: the gathering of data and its subsequent consolidation and combination with other records, and then making assumptions based on meanings derived from incomplete or incoherent data. Gathering and storing information for later analysis has become the go-to tool for mass surveillance. The reasoning behind this development is actually quite sound: because of the complexity of Internet communication systems and the sheer amount of data transferred, combined with the difficulty of pinpointing the target of surveillance, it quickly becomes more feasible to just gather as much data as possible and perform analysis later.
One chilling result of systematic data gathering is the ability to compound surveillance records over time. If every single thing that a person does online is stored permanently, this information can be used at a later date, even years or decades after it was originally captured. The target has very limited power to affect how potentially harmful information is stored and used – that is, if the target is even aware of the existence of such data.
Security and privacy concerns have driven secure solutions for network communications to become more and more prevalent, but the default assumption of openness can be seen in the nature of various Internet protocols. HTTP and FTP traffic, for example, were originally sent by default in the clear, without any privacy or security. Nowadays, secure versions of most Internet protocols are used by default. In wireless networks, all receivers within range of the transmitter are able to listen in on the traffic sent by any other host; the fact that on the chipset level, most wireless local area network interfaces choose by default to ignore packets not meant for them can be considered to be an act of politeness, not a feature that brings robust privacy or security to wireless networking.
In our context, the communication medium is the Internet, but in principle it does not matter what kind of medium or encoding is used. These certainly have an effect on potential forms of surveillance, though. Computer networks are relatively well-organized by necessity, and this suggests that networks are also well monitored and administered. An administrator of a network is able to perform surveillance on their own network as a by-product of normal administration and troubleshooting duties. Essentially, no traffic is safe from the view of the administrator or operator of a network. This makes the service provider a single point of failure regarding security and privacy, and also a very tempting target for infiltration or some other kind of clandestine information gathering.
Finally, we observe that technical solutions to individual vulnerabilities and threats are important, but their in-depth analysis is out of the scope of this paper. This is because our solution is not technical and it does not aim to address specific concerns on particular technology, vulnerabilities or risks, but rather to create a guideline for the whole field of computing on how to protect the privacy and rights of individuals while simultaneously granting the possibility to use personal and person-derived information in various processes in a human-centric manner.
Datenherrschaft
In this section we examine the problems that arise when data ownership comes into conflict with mass surveillance. As a solution, we propose applying the concept of Datenherrschaft borrowed from Kainu & Koskinen [45] and further used in the discussion on ethical and legal mastery of patient information [see e.g. 48,50]. We evaluate this concept and its suitability to the problem of protecting the rights and privacy of citizens in the networked information society, while making it possible to conduct lawful surveillance. This issue is relevant especially in the European Union, where the GDPR was approved in 2016 and has been enforced since May 25th, 2018 [76]. It has already had a global effect on privacy, as the regulation affects all actors present in the EU market [34]. The GDPR is driven by a philosophical approach with roots in privacy as a fundamental human right, which is in line with the concept of Datenherrschaft [see 33].
Ownership of data has been extensively discussed in literature from the business side. Data is vital to most, if not all, businesses. Referring to previous work by Davenport et al. [22] and Strassmann [90], Redman crystallises issues of data ownership in business quite succinctly [75]: “Indeed, the politics of data ownership are among the most brutal in many enterprises.” Data can be so critical to businesses that the whole existence of a company can depend on who owns a certain piece of data. In the larger picture, data is reshaping the scientific fields and shifting paradigms [16]. Data collection has formed its own field of the economy, called the data economy, in which information collected from individuals has a major role, but not without ethical problems [52,72]. These aspects emphasise the need for research on ownership of information and data, as ownership defines where power lies in the information society.
We argue that it is not unreasonable to extend this characterisation to data ownership relationships between organisations and customers, or governments and citizens. A common quip about the revenue model of “free” services is that “if you are not paying anything for your service, you are not the customer, you are the product being sold”. As a real-world example, questions of data ownership and the use of user-generated data for profit have been at the heart of the discussion concerning Facebook since its founding in 2004 [11,43,87]. The aim, which is already supported by the GDPR, should be to protect the rights of citizens and give them control – mastery – over their personal and intimate information.
The questions of data privacy, data ownership and what can be done with collected data are central for the information society. Well-defined ownership of data is paramount to a society based on information. All collected data, whether originating from industrial or business processes or people and their behaviour, is valuable.
Now consider the information that is gathered with systematic mass surveillance. This contains data about all aspects of the lives of its targets, from their personal communications and online behaviour to location information and personal connections, and even personal health information. The latter can be argued to be clearly within the realm of acceptable privacy [see 46,48]. If patient records are considered to be such critical information as to warrant extreme protections, isn’t it also justifiable that other information of comparable compromising potential should also be afforded the same protection when possible? There is no silver bullet solution, at least not an obvious one, for the problems that arise from these issues.
Datenherrschaft – mastery over data
Mass surveillance generates massive amounts of data. From the perspective of targets of surveillance, that data is derived from their persons and activities, and contains significant amounts of personally identifiable information [95]. Here we have an opportunity to examine the ethical dimensions of mass surveillance, and approach it through ownership of data. We use the aforementioned concept of Datenherrschaft together with mass surveillance to explore possible ethically sound foundations for legitimate surveillance.
The definition for Datenherrschaft is given by Kainu & Koskinen [45, p. 54] as: “the legal right to decide the uses of, and continuing existence of, in a database or another compilation, collection or other container or form of data, over an entry, data point or points or any other expression or form of information that an entity has, regardless of whether they possess said information, with the assumption that sufficient access to justice is implemented for a citizen to have this power upheld in a court of law.”
The difference between Datenherrschaft – mastery over data – and ownership as property rights can be condensed to four key differences. First, Datenherrschaft is nontransferable [47, p. 102]. While ownership of data can be transferred to another, the mastery over data is permanently bestowed and thus prevents the irrevocable transfer of “ownership” – like human rights. Koskinen notes that the mastery over patient information can only be given to the person from whom the information is derived.
Second, Datenherrschaft is not a compensation or a reward [47, p. 102]. Koskinen observes that while work done by someone is seen as a justification for granting that someone immaterial property rights to the result of that work, in some contexts this is not applicable. In the context of healthcare, the compensation comes in the form of salary, so there is no need to compensate the person who compiles data about the health of a patient into patient information with ownership of that immaterial property.
Third, Datenherrschaft is bestowed by default to the person from whom the data or information is derived [47, p. 103]. It is commonplace to pass on intellectual property rights to parties that have done no actual intellectual work in creating protected information. For example, in Finland, it is possible to transfer the intellectual property rights of an invention made by an employee to the employer. In contrast, Datenherrschaft is inherently a part of the originator of the data [45], and thus cannot be transferred to another entity as it is inalienable from the person. To summarise, Datenherrschaft should be understood in a similar way to human rights, rather than as an object of economic transactions, as seems to be the case nowadays.
The fourth and final difference is that while intellectual property rights are originally based on an artistic process or efforts to create the property, Datenherrschaft can rather be used to protect the individual's right to privacy and control over information about themselves, and thus should not be compared with property rights [47, p. 103]. Thus, we see that the idea of legislating personal information with IPRs [96] is not the right path, as people are a source of information and, as individuals, are usually the weaker party compared to corporations due to having fewer resources to defend their rights.
Ethical data-driven surveillance
Datenherrschaft has been previously used to examine and outline ethically sound and justified approaches to the ownership of patient information [47–51]. Patient information systems contain significant amounts of data on a person and their private matters. Patient information should thus be seen as sensitive and as needing explicit protection from misuse and unnecessary observation by others.
While Datenherrschaft regarding patient records and health information has already been discussed in the literature, surveillance and Datenherrschaft have only been discussed in a cursory manner by Hakkala [38]. This work aims to fill this research gap by opening the discussion on the applicability of Datenherrschaft to surveillance data and offering a new approach instead of relying on the problematic property view of information.
Concepts such as the Panopticon give us tools to observe how omnipresent surveillance works and how all aspects of society are being harnessed as sources of data for surveillance. Datenherrschaft gives us a framework through which we can observe surveillance in an ethically justifiable manner, and gives the potential to create practices that are better at protecting the individual citizen of the networked information society.
Koskinen [50] illustrates the problems with patient information ownership. They observe that traditionally the ownership of intangible things such as information is governed through intellectual property rights. Patient information, however, is not suitable for being governed by the same legal framework. Even though patient information is immaterial, it is still irrevocably bound to the source of the original information – the patient.
Now, let us consider the data that is used as the basis of mass surveillance. This, similarly to patient information, is immaterial information that is bound to the originating source – the person. This data is also used to make surveillance-related decisions that can have an effect on the person on various levels, even including their health. This parallel that can be drawn between patient information and data that is used to conduct surveillance serves as the basis of the argument that Datenherrschaft can also be used as a solution in protecting citizens of the networked information society from undue violations of privacy.
In an ideal world, the information that is gathered with the express purpose of tracking people should be governed by the principles of Datenherrschaft. The person from whom the information is directly derived should have control over that information, even though it is used by other parties. Indeed, the concept of the data double previously discussed in Section 2 gives some credibility to the demand for application of Datenherrschaft. If the information is, in the virtual world, considered to represent a person (or even to be that person), it is only natural that the person should have a degree of control over the data double.
If we examine information that can potentially be used for surveillance, we immediately encounter the problem of defining the information to be protected by Datenherrschaft. Particularly, we should be able to identify the different kinds of data that are collected from us. Previous research, as observed above, has focused on patient records, but if we are to extend the scope of Datenherrschaft to other kinds of data, we must be able to accurately identify the information to which the principle of mastery over data is to be applied. As most surveillance is driven by metadata, this is where we must proceed next.
Datenherrschaft and metadata
When we consider metadata from the perspective of Datenherrschaft, we immediately encounter some questions. At what point does data derived from data cease to be under the mastery of the source of the data? Should we also own and have mastery over data about data from us? This is a central issue with applying Datenherrschaft to personal data used for surveillance.
One clear requirement for managing and combining metadata in any larger form or shape is that it cannot be identifiable or traceable to a single individual, a reasonably sized group of individuals, or a certain demographic. Anonymisation of data, in other words, is a clear prerequisite when any larger sets of personally identifiable data are processed.
Unfortunately, even anonymised data can be de-anonymised (see e.g. [25,66,92]). With current technology, algorithms, and a sufficiently large body of metadata, it is possible to trace entries back to the individual in most, if not all, cases. Thus metadata has to be considered as a part of personal, person-derived data, even when some claim that metadata is anonymous.
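A simple way to see why removing names is not enough is a linkage attack, in which records are joined with an auxiliary data set on attributes that look harmless in isolation. The sketch below is a generic illustration and not necessarily the technique used in [25,66,92]. All records, field names and the choice of quasi-identifiers are invented for this example.

```python
from collections import defaultdict

# All records below are invented for illustration.
ANONYMISED = [
    {"postcode": "20500", "birth_year": 1984, "sex": "F", "visited": "clinic.example"},
    {"postcode": "20500", "birth_year": 1990, "sex": "M", "visited": "news.example"},
    {"postcode": "20100", "birth_year": 1990, "sex": "M", "visited": "forum.example"},
]
PUBLIC_REGISTER = [
    {"name": "Alice", "postcode": "20500", "birth_year": 1984, "sex": "F"},
    {"name": "Bob", "postcode": "20500", "birth_year": 1990, "sex": "M"},
    {"name": "Carol", "postcode": "20100", "birth_year": 1975, "sex": "F"},
    {"name": "Dave", "postcode": "20100", "birth_year": 1990, "sex": "M"},
    {"name": "Erik", "postcode": "20100", "birth_year": 1990, "sex": "M"},
]
QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")


def link(anonymised, register, keys=QUASI_IDENTIFIERS):
    """Re-identify records whose quasi-identifiers match exactly one known person."""
    candidates = defaultdict(list)
    for person in register:
        candidates[tuple(person[k] for k in keys)].append(person["name"])
    reidentified = {}
    for record in anonymised:
        names = candidates.get(tuple(record[k] for k in keys), [])
        if len(names) == 1:  # a unique match pins the record to one individual
            reidentified[names[0]] = record["visited"]
    return reidentified


if __name__ == "__main__":
    for name, site in link(ANONYMISED, PUBLIC_REGISTER).items():
        print(name, "->", site)
```

In this toy data two of the three records are re-identified; the third stays ambiguous only because two people in the register share the same attribute combination.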
Can we infer from this societal requirement – it is indeed societal, as it stems from legislation and the perceived need for such – that we should also have mastery over surveillance data and metadata that is derived from and used to surveil us? We continue to exert influence, albeit non-personalised, on how data about us is handled. An obvious counter-argument is that surveillance is by necessity a covert operation and that to give the subject or subjects of surveillance a notification, let alone some actual power over this surveillance process, would be ridiculous and counter-productive.
We come into apparent conflict with Datenherrschaft, which argues that the individual should have the right to decide how their data is used. This conflict can be resolved with an inherent property of Datenherrschaft – mastery over one's data confers not only rights but also responsibilities. Justifiable situations exist where the Datenherrschaft of a person over their own data can be overridden. Koskinen [50] outlines that prioritising life, health and liberty over possessions is justifiable, and thus Datenherrschaft can be superseded in these cases.
Why Datenherrschaft matters
We see that parts of the ideal of Datenherrschaft already exist in some legislation, especially in the GDPR, which underlines that people have an intrinsic right to privacy and protection of personal information. However, the rights of other people can supersede and override an individual’s mastery over their data whenever there are ethically “higher” values to be defended by violating Datenherrschaft. Therefore individual rights can be limited in some justified cases, such as in states of emergency, when combating epidemic diseases, and when individuals violate the rights of others. The guideline for limiting Datenherrschaft should be to be as non-intrusive as possible and to respect Datenherrschaft as much as possible. These exceptional situations should be codified separately with extreme deliberation.
The existence of the above limitations on Datenherrschaft means that a person cannot exercise their Datenherrschaft to prevent lawful and ethically justified use of their data. Investigations that are conducted according to legal guidelines certainly fulfil this criterion, but now we face another problem: what should be considered as lawful use, i.e. how should the law be written?
The difference between Datenherrschaft and the protection of privacy and personal information in, e.g., the GDPR is that Datenherrschaft clearly articulates that the individual, as the source of information, possesses mastery over this information. Thus, the idea is not that privacy and personal information need only to be protected. The core idea is that the control of information use is given to people and that they should possess legitimate power to decide how information is used. The strength of Datenherrschaft lies in that it is not so detailed as to be mere legal jargon. Instead, it straightforwardly states that people have the right to decide how information is used, and when they so decide, the use of information needs to end. There is a demand for clear rules that can be understood by laymen, even though there is also a need for detailed legal texts. The kind of statement that Datenherrschaft offers is needed to clarify what the rights of people should be when information derived from them is used. If this approach is taken seriously, it could be the foundation for self-regulation of data collection and hence drive policies in such a direction where people could truly decide how information is used – if organisations commit to following the idea of Datenherrschaft.
The legislative discussion in detail is clearly out of the scope of this article, rather being a task for professionals of jurisprudence. The observations made earlier in this paper give us a starting point when considering the negative effects of surveillance. Balancing these negative effects with the rights of the citizen is a difficult task, but Datenherrschaft has the potential to be the underlying guideline that facilitates ethically acceptable and justifiable choices, as it gives the real possibility to act against unwanted use of information. Thus, Datenherrschaft should be considered in a similar manner as a constitutional right that can be limited only in certain, well-justified circumstances: collecting information for necessary governmental tasks, or protecting justified rights of others that would be jeopardised without accessing information under Datenherrschaft. Purely commercial use is not a justification for overriding Datenherrschaft.
Currently surveillance is performed by both public and private actors: companies such as Facebook and Google have made their entire business dependent on practices that approach (or practically are) surveillance. The EU is actively taking the stance that individual rights for privacy and control over information are stronger than the rights of private companies, as is evidenced by the adoption of the GDPR. However, by further implementing Datenherrschaft we could underline this and make it explicit. In the case of mass surveillance, Datenherrschaft should be used to provide general ground rules and any violations should be justifiable. Next, we examine the issues that must be taken into account when violations of Datenherrschaft are considered.
One counterargument from a governmental viewpoint is that without mass surveillance we may be unable to find real security threats to society, as surveillance is the best tool for finding them. However, if this is considered plausible logic, then we can also justify systematic house searches, breaking the inviolability of letters etc. by merely stating that with these violations of our basic rights, the real criminals or terrorists can be caught. The problem is that in that kind of society, the government itself has transformed into one that is a threat to the people themselves [see e.g. 57,63].
The idea of allowing Datenherrschaft to be overridden is to secure the rights of others, and thus the override has an ethically defensible goal, even though not entirely without gaps. However, to override Datenherrschaft, the threat to others should be significant in its quality and probability. Especially in mass surveillance, this seems to be a problematic issue. With mass surveillance, most individuals who are targeted are not a real threat to anyone. Thus, their Datenherrschaft should not be violated, as doing so limits their freedom and privacy without proper justification. Constitutional rights can be overridden only on a very strong basis. The same should apply to Datenherrschaft, and we claim that with mass surveillance, this strong ethical basis is missing. Democratic societies should be based on the rights of individuals, and Datenherrschaft offers a ground rule – empowering the individuals – to resist data colonialism, where people are seen as mere exploitable subjects under algorithmic control [see 21].
Hildebrandt [42] shows that there is a need for transparency to illuminate the logic behind information collecting technologies and algorithms. As in the definition of Datenherrschaft – the legal right to decide the uses of,… – it is vital that people have the right to decide how information is used, and this can be achieved only through transparency, for which Hildebrandt offers five recommendations for the use of data mining algorithms in profiling users [42]. First, collaboration between engineers, designers, end-users and affected parties is required. All parties affected by data mining activities should have representation in the design phase of the data collection and analysis process. Second, data collecting mechanisms should be made visible so that it can be seen what these data collecting systems can do. Third, all trials should be reported to reveal the possible problems and biases of those algorithms. Fourth, the data and the methods used should be verifiable by lawyers and software engineers when public interest requires it. Fifth, the results of any data mining operations should be constantly evaluated and examined to confirm that they conform to the rule of law.
Likewise, there is a need to make non-technical aspects of data collection and use more transparent – why and for what purposes the information is used by different actors – to ensure that people can decide the use of information that Datenherrschaft protects.
There should be a clear justification for surveillance and thus for violating Datenherrschaft. When someone is targeted for surveillance, probable cause and clear procedure should be the norm. Additionally, if the result of surveillance is that no severe crimes were committed, the target of surveillance should be informed, all collected material should be presented, and people should be given the right to verify for what purposes the information is used and when. All gathered information should be erased afterwards. Likewise, there should be further discussion on compensation for individuals whose Datenherrschaft has been violated. To avoid unnecessary and heavy-handed use of surveillance, the process, including what follows afterwards, should be sufficiently rigid that surveillance is not done just in case when there is no severe threat to others.
In light of Datenherrschaft and demands set for justified exceptions for surveillance, mass surveillance – as it is defined in Section 2 – cannot be accepted. We need new informed ways to use the information collected from people in such a manner that people can understand and control it. This is not an easy task, but that does not make it any less important; rather, it is an issue that deserves focus.
Discussion
Surveillance in itself is a commonly used tool, albeit one with significant drawbacks for both its users and targets. Nevertheless, calls for more surveillance in society in exchange for increased security are not uncommon at all. A cynic might even say that people bring the bad effects of mass surveillance upon themselves by demanding more surveillance in the hope of increased security.
A central problem in justifying surveillance is where to draw the line between the acceptable and unacceptable. In his book, Rule [78] observes a key issue in privacy: “there is no natural line of separation between the realm of the private and personal matters of legitimate interest to others” [78, p. 2]. Indeed, there exists a definite set of situations where surveillance and intrusion of privacy have historically been deemed as acceptable. Most prominent among these are criminal investigations into more serious crimes such as murder, treason, and major financial and narcotics crimes.
A common yet naive argument used to justify surveillance is that everyone does it. Nearly every sovereign nation has its own secret service, police or intelligence agency responsible for intelligence gathering and espionage. It is indeed the prerogative of a sovereign nation whether to engage in espionage or not, and to target it as they see fit, be it on other nations, organisations, or individuals. Likewise, businesses have joined this constant surveillance of individuals, groups and other entities, even though the legal and ethical justification for such surveillance is highly questionable in many cases.
Justification of surveillance by governmental authorities
However, surveillance can be justified in some situations. Few can argue against just and laudable goals, such as the need to fight crime and terrorism, and these two are indeed commonly cited reasons for using surveillance. Another aspect of generally acceptable surveillance is monitoring known dangerous places and massive events with large crowds, for example, so that emergency response is not delayed unduly in the case of an accident or other undesirable event.
What is more interesting in this regard is that many may even consider surveillance a good thing. This is evidenced by the prevalence of people who espouse the “I have nothing to hide” argument. They are perhaps confident that surveillance will target people other than themselves, and even if they should be targeted, nothing bad will happen as they are innocent and indeed have nothing to hide, and thus nothing to fear. Such attitudes towards surveillance have been found in studies [see e.g. 41].
The existence of valid reasons for surveillance, and even for mass surveillance in some cases, makes it very difficult for a society to limit the use of these powerful tools to the bare minimum of acceptable use cases and scenarios. Interception of private messages by law enforcement authorities is referred to as lawful interception. This is done acting under the colour of law, usually with a court order or a warrant, depending on the jurisdiction. Lawful interception of protected communications is an important part of law enforcement and criminal justice as a method for obtaining intelligence on unlawful activities.
Lawful interception is, for the majority of cases, targeted surveillance. The goal is to obtain particular information, or to follow only certain people, thus inherently narrowing the scope of surveillance to only intended targets. In most civilised countries, for example, the permit for telecommunications interception must be granted by a district court or equivalent, and the surveillance must be specifically targeted to a certain case, person, and topic.
However cynical it may seem, we can postulate that when a tool for targeted surveillance, such as monitoring telecommunications, network traffic, or traditional mail, becomes useful in existing mass surveillance, the umbrella of tools and methods deemed as lawful interception in a jurisdiction tends to expand to cover the use of new, useful tools. This is regardless of the original intent of the law regarding mass surveillance. This can be speculated to be a kind of function creep, where something that has been found useful in smaller scale is simply taken into use in a larger context. The concept of rhizomatic surveillance discussed in Section 2 describes this development quite accurately.
Using data mining for identifying anomalies from Internet traffic requires storing and analysing massive amounts of data to establish a baseline. While this can be seen as lawful interception, establishing the baseline requires processing a lot of data from individuals that are not being suspected of any crime.
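The dependence of anomaly detection on everyone's data can be seen even in a very small statistical sketch. The example below flags hosts whose traffic volume deviates strongly from a baseline computed over all hosts. The z-score rule, the threshold, the host names and all figures are assumptions made for illustration; real systems use far richer features and models.

```python
import statistics


def build_baseline(history):
    """Summarise ordinary traffic; `history` maps each host to daily volumes in MB."""
    volumes = [v for host_volumes in history.values() for v in host_volumes]
    return statistics.mean(volumes), statistics.stdev(volumes)


def flag_anomalies(observations, baseline, threshold=3.0):
    """Return hosts whose volume deviates more than `threshold` standard deviations."""
    mean, stdev = baseline
    return sorted(
        host
        for host, volume in observations.items()
        if abs(volume - mean) / stdev > threshold
    )


# Invented figures: every host contributes to the baseline, suspected or not.
HISTORY = {
    "host-a": [100, 110, 95],
    "host-b": [105, 98, 102],
    "host-c": [107, 93, 101],
}
TODAY = {"host-a": 104, "host-b": 99, "host-c": 400}

if __name__ == "__main__":
    print(flag_anomalies(TODAY, build_baseline(HISTORY)))
```

Note that build_baseline has to read the history of every host, including those that are never flagged, which is exactly the processing of unsuspected individuals' data described above.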
Dictatorships and oppressive regimes have used surveillance as a method for control and suppression throughout history, and there are no indications that the method has lost its effect, at least judging by the continued use of surveillance in contemporary societies. One timely example is the downfall of Afghanistan, which also has a connection to the surveillance of people. Here, personal information collected from people who have worked with the collapsed regime, or who have otherwise acted against the interests of the Taliban, can be extremely dangerous for those individuals. What is notable is that most of this information is collected via private companies, such as social media platforms. This underlines the fact that mass surveillance is a global phenomenon that crosses the boundaries of governments and companies and cannot be controlled by any single actor.
Rise of surveillance society
It is possible to create – and we argue it is already forming – an information society that has surveillance built-in. Information from communication networks and services can be both observed in real-time and also stored for later access. Both approaches carry with them serious threats to security and privacy of citizens. When we consider surveillance from the perspective of the networked information society, finding a solution to this problem is paramount.
A massive amount of data is generated as a by-product of normal operations in all aspects of society and business. To provide services, most companies must collect at least some data about their customers. Examples include the collection of usage patterns and related communication information of a smartphone, telemetry data gathered to ensure that the operating system of a computer is functioning correctly, and customer information on what was bought, when, and where. Many software vendors have feedback systems for collecting data on how their product is functioning and for helping to resolve potential errors. All of these are part of providing a service, one that is explicitly wanted and paid for by the customer.
As noted above, managing communications infrastructure and providing communication services generates a lot of personally identifiable data on who owns which device, where it is connected to, and which devices it is used to communicate with. This data is again generated as a by-product of normal operations. There are thus legitimate reasons for this kind of mass gathering of data, especially for improving user experience and product development in general. The important question is, when does this data gathering become surveillance? This naturally depends on what data is collected, how and for what purpose it is used, how well it is anonymised, and finally, and perhaps most importantly, in what manner and how securely it is stored.
Consider the scenario where all of the information that is generated as a by-product of the normal operations of an information society can be accessed, analysed and used in surveillance activities. Lyon argues that the use of Big Data techniques for analysing intelligence data is changing surveillance itself in three ways [60]: by increasing reliance on software, by shifting the focus of surveillance to predicting the future rather than analysing current and past events, and by improperly adapting techniques from other disciplines to surveillance.
Forced trust in surveillance society
Methods for controlling such institutionalised infrastructure surveillance are rarely, if ever, technical, but rather based on softer controls such as legislation and customs. The main problem in this case is trust, and specifically the lack of it. How far can we trust that data collected from us to provide a service will not be used against us in the future?
A concept that describes this situation is known as forced trust [39]. In the context of information systems and information society, the users of an information system or the members of society are at a significant disadvantage regarding which systems they are supposed to use and in which systems their data is processed. Decisions on what systems are used, who builds them and who administers them are something that ordinary users cannot affect in any way. Thus they are forced to trust that the security and privacy of such systems are in order, that their data is handled properly, and that there are no adverse effects from using such systems. In the case of critical governmental information systems that are responsible for many services in information societies, this is a significant problem. Surveillance systems arguably fall into this category as well, and the data gathered through surveillance is something that ordinary people have no power over. That is, unless we implement Datenherrschaft and give mastery over data to individuals, even in such situations.
Consequences of surveillance society for individuals and organisations
When mass Internet surveillance and data gathering are combined with modern efficient data analysis techniques, new associations and links between people, events and concepts are created that, however improbable or trivial, can be used to show connections and intent not based in reality [60]. The use of such information in a malicious manner is a serious concern for individual security and privacy. Even if this data is used in good faith to pursue justice and uphold “national security”, mistakes are bound to happen because of the disconnect between reality and the events and intent inferred from data. Connections will be made between people and events that have no basis in reality. For example, correlating people's location data with the movements of a target of interest can result in a person being targeted for surveillance just because they are in the same physical location as a person under targeted surveillance.
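The location example can be made concrete with a minimal sketch. It assumes that location data is reduced to hypothetical (cell identifier, hour) observations and that a person is associated with a target after a fixed number of shared observations; the function name `co_located`, the threshold and all data are illustrative assumptions rather than a description of any real system.

```python
def co_located(target_pings, other_pings, min_shared=2):
    """Return people sharing at least `min_shared` (cell, hour) observations
    with the target. Physical proximity is the only evidence considered."""
    target_set = set(target_pings)
    return sorted(
        person
        for person, pings in other_pings.items()
        if len(target_set & set(pings)) >= min_shared
    )


# Hypothetical observations: (cell identifier, hour of day).
target = [("cell_12", 8), ("cell_12", 9), ("cell_40", 17)]
others = {
    # Takes the same morning train as the target, with no other connection.
    "commuter": [("cell_12", 8), ("cell_12", 9), ("cell_33", 17)],
    "tourist": [("cell_40", 17)],
    "remote_worker": [("cell_77", 8)],
}

flagged = co_located(target, others)
print(flagged)
```

Under these assumptions the commuter is associated with the target solely because of a shared morning route, which is the kind of connection without basis in reality that the paragraph above describes.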
If an illegal act is erroneously associated with a person, what would the effect on society be if those false allegations were aggressively prosecuted to the full extent of the law? For example, terrorism is naturally a very serious crime to be charged with. In the current societal climate, however, even the mere suspicion or illusion of association with terrorism can be dangerous. The publicly stated purpose of many mass surveillance programs is to increase national security and combat terrorism, so such crimes are routinely investigated in those programs. It therefore stands to reason that erroneous accusations arising from them would be of the more serious kind. As was noted previously when discussing the chilling effect, the legal system is not perfect, and even such erroneous accusations may lead to sentences. This would understandably enhance the chilling effect, as people would be more inclined to stay away from any topics of controversy to maintain an appearance of blamelessness.
These observations apply to any nation with a strong government and permissive legislation on mass surveillance. For example, one of the main problems for US-based tech companies is that they have no means of defending against allegations of cooperation with intelligence agencies. If the US government wants access to data in the possession of a US company, it can issue a national security letter. The letter compels cooperation under the threat of imprisonment for company executives, and also forces everything related to it to be kept secret. The recipient of a national security letter thus cannot divulge to any third party that they have in fact received one. Warrant canaries [32] – public statements on the company web page stating that no warrants have been issued to the company – have been formulated as a defence against this kind of government action, but their effectiveness is as yet unclear.
The problem with assurances of non-compliance with authorities is that cooperation is practically mandatory. The US is a key Internet hub with a significant share of traffic routed through it, and the US is also one of the key suppliers of Internet technologies. A significant part of big data is also stored in data centres in locations under US jurisdiction. For example, Lavabit, a secure email service provider used by Edward Snowden, was forced to shut down so that the owner of the company could avoid prosecution and possible imprisonment [56]. Naturally, this option is not available to technology giants such as Google, Microsoft or Apple; they must comply with any valid requests for data by the government, as shutting down a company that large is simply not an option, and they cannot risk the consequences that would arise from even discussing the details in public.
European data economy as an example of possibilities of Datenherrschaft
Building a European data economy is part of the European digital single market strategy and it is based on the free movement of non-personal data [20]. However, the data economy is also heavily based on the use of personal information, which in turn is based on GDPR and the portability of personal data. Data portability is relevant to the data economy, as it provides the basis for transferring information and for controlling the personal data used in digital societies [23]. Thus, to overcome the lack of trust and the risks to an ethical society, there is a need for a new approach to the data economy that is people-centric and transparent [52]. As a concept, Datenherrschaft is in line with GDPR’s aim to protect individual privacy and to give “ownership” of information to individuals. Especially when data is seen as an asset or resource [12,24,35], Datenherrschaft is a strong statement, as it gives individuals mastery over that asset. It also helps individuals protect themselves from corporations, as it fundamentally differs from property rights, as Koskinen noted: “The Datenherrschaft differs substantially from property rights in four specific ways. First, when ownership of property can be moved from one party to another one that is not case with Datenherrschaft. Datenherrschaft is irremovable from the individual who has it. It is the individuals choice of to make or not to make the criminal act, and is not removable form what the actor then is – even the driving forces behind the act can be interpreted. Datenherrschaft can be only be given to the person from whom the information is. It is notable that the person cannot give up the Datenherrschaft even they want to. This makes Datenherrschaft so unique.” [48]
However, before it can be applied, Datenherrschaft opens up questions and areas of research, some of which are already briefly covered in this article. Nevertheless, there is one specific outcome of the concept that makes the difference. If data gathered from an individual is seen as an inseparable part of that individual, we need to investigate new ways to distribute information and to simplify the agreements between individuals and the organisations that use the information. One proposed approach for this is called MyData, where agency for individuals is seen as a central issue, supported by data operators who provide tools to individuals for controlling information sharing and consent [55,71]. Although the MyData approach is not yet ready or perfect, it shows that the current system can be altered if there is the will to do it. The European model for the data economy could be based on the idea of Datenherrschaft, as it is a strong statement towards a human-centric data economy, which is also visible in and supported by GDPR.
However, legislation cannot be replaced with Datenherrschaft. Legislation needs to be more accurate and give more detailed guidance for actors under it. These kinds of details are missing from the concept of Datenherrschaft. In this weakness also lies the strength of Datenherrschaft. By being straightforward and short, it is intuitively approachable and could serve as the understandable goal which different actors could aim for. It could be seen as soft law.
Soft law refers to various law-like norms that do not fulfil the characteristics of legislation, as soft laws are created in a different way from regular law. A government authority may formulate soft laws to support specific legislation and its application. Soft laws can also be created through co-operation between governmental officials and private actors as norms that should be followed. Likewise, specific branches of industry or professional groups may enforce norms that they should comply with (for example, standards of accounting) [93].
Future work in this area should include a comprehensive analysis of the applicability of Datenherrschaft to the case of lawful interception and surveillance. The nature of personal and privacy-sensitive information should be analysed through the different ethical theories, in the same way as Koskinen [47] has done for patient and health information. This will shed light on the suitability of Datenherrschaft to this problem and give an ethical foundation on which to build the future of the networked information society.
A key observation made by Koskinen [47] about Datenherrschaft is that it seems to be intuitively suitable for other kinds of private information as well, not just patient records. The idea of applying Datenherrschaft to other kinds of data is thus tempting. Surveillance has its own inherent problems with privacy and freedom of speech, and thus approaches for managing surveillance and the vast data associated with it in a morally and ethically justifiable manner are highly desirable. Datenherrschaft has the potential to be the ethically sustainable solution to this difficult problem and the initial step that is desperately needed in managing person-derived data. Further research is required before this potential can become reality.
Conclusion
In this paper we highlight the ethical problems of mass surveillance: it violates the privacy of individuals and creates a society where people are seen as potential criminals and thus as objects to be controlled, or as data sources to be used by companies or other organisations to gain (economic) benefits. This answers our first research question. However, we have shown that there are other options. By adopting new ways of treating personal information, such as the concept of Datenherrschaft, we treat individuals with respect and endorse their ethical right to control information collected from themselves. Our main contribution lies at this crossroads of technological possibilities, legislative limitations and ethical justification. Our second research question was whether Datenherrschaft can act as an ethically justified basis for developing privacy legislation. Datenherrschaft bridges the gap between the fields of technology, jurisprudence and ethics by being simple enough to act as the foundation for all of them, while leaving room for further development and adjustment depending on the field and context in which it is applied. We therefore conclude that it is an ethically justified basis for the further development of privacy regarding surveillance and the use of personal data.
Finally, we state that when we justify violations of personal rights with the goal of the greater good and the protection of society, extreme care must be taken. It is a tradition in writings on this topic to quote – albeit disputedly – Cardinal Richelieu: “If you give me six lines written by the hand of the most honest of men, I will find something in them which will hang him.” We must consider very carefully to whom, if at all, we give those metaphorical six lines of text on ourselves. And if we choose to entrust our personal and private information into the hands of others, we must contemplate whether they do indeed have our best interests at heart or not. To be sure of this, there should be strong support for it – which Datenherrschaft can offer as a common goal for different actors to aim for.
