
How to Start Practicing Privacy in an Effective Way


Preface


I’ve long had a desire to talk about privacy and about Free and Open Source Software. I mentioned the latter in my guide for ripping Blu-Rays, but I haven’t yet made a dedicated post on the former. Or rather, I haven’t released one on this site. As I mentioned in Is This Thing On? (Yes, it is!), before this website became what it is right now, it existed on Freenet, and there was one article there that I didn’t bring over when remaking the site, because I felt it was not quite up to snuff. It was an argumentative piece about FOSS in which I argued for nothing meaningful, while at the same time talking out of my ass on a topic I didn’t know very well, and giving a lot of generalisations and simplifications that would likely antagonize anyone who had any knowledge of the matter.

Just because that one post didn’t work, didn’t mean I’d given up, though. At the moment, this is the fourth document I’m writing related to privacy and/or security, with the other three having some 3000 words total. I have yet to post any of them, because they too feel incomplete, insufficient and, most of all, meaningless. I want to talk about these things, but when I start writing, I quickly realize that I’m just meandering without getting anything specific done, writing with no purpose or specific message to convey.

Part of the reason for this is that, unlike with some other pieces I wrote, when I write about privacy or free software, I don’t sit down with a goal. I don’t think “I want to explain this” or “I want to show how to do that”. I simply want to write something on the matter. Another reason is that the topic of free software is somewhat distant for me, as I’m not a software developer (whether by trade or by hobby), while the topic of privacy is so incredibly broad that I struggle to maintain a message while also relaying all the info there is to relay. The end result is posts that go on tangents, are poorly structured, and leave you none the wiser once you’ve finished reading them.

I hope this time will be different. I seek to write something that will let me learn and encourage others to do the same. Something that will include a personal story, making it engaging, while also providing practical knowledge, making it valuable. I will only know if I succeeded or failed after it’s over.


Who's Out for Our Data?


Operating in the digital space for any period of time, you’ll quickly realize a certain truth – privacy and convenience are, more often than not, exclusive. Not that they have to be exclusive (plenty of free software shows that this does not have to be the case), but that the way the “mainstream” internet is organised, seeking privacy often means giving up convenience, while seeking convenience typically involves giving up privacy. Every service works better when personalised, every registration happens faster when connected to a Facebook account, every form is filled quicker when your browser has autofill enabled. The current digital paradigm can be described with one phrase – convenience is king.

That is, at least, the consumer’s perspective. More precisely, it’s what those who deliver services for us want us to think. But consider this – creating intuitive, convenient and (often) free websites and programs requires lots of time and money, which has to be paid back somehow. And not just paid back, but also expanded upon, to make the investment worth it. How can Google operate so many services – YouTube, Gmail, Search, Drive, Android – completely for free, while remaining not just profitable, but one of the richest companies the tech industry has ever seen? The key to that is data collection. Behind the scenes, an enormous amount of data is collected, analysed and stored on just about everything that occurs on the internet. Big Data, as this operation is called, is one of the fastest growing industries in the world right now, and one of the biggest sources of revenue for technology companies. The free service is far from free for the owner, who has to supply customers with servers, storage space, support, and more. When you choose an e-mail provider, notice how some of them are paid, while others are free. In the case of the paid ones, it’s fairly easy to assume that the operational costs are covered by the fee you pay them, but what about the free ones? Nothing is truly free, so there must be something else you’re giving in return, and more often than not, it turns out to be data. Going along with the e-mail example, Mailbox.org is a paid provider of e-mail services, while Gmail (owned by Google) provides a completely free e-mail service. Mailbox.org’s privacy policy provides a detailed list of all information that is gathered and why, and in that list there is no mention of the content of the e-mails you send or receive(1). Google’s privacy policy, on the other hand, states “We collect the content you create, upload, or receive from others when using our services. This includes things like email you write and receive […]”(2), which is an obvious privacy concern.
Just to be clear, the distinction between paid and free services is mostly for illustrative purposes; nowadays, paid services are no less inclined to track everything you do, while others elect to not do so despite being free. Price as a measure of privacy is a rule of thumb that, as years go on, keeps losing relevance.

This kind of snooping will not surprise many people. That files you create and share get tracked and analysed is somewhat obvious, and as much as it is an invasion of privacy, it’s also something many people have come to expect and treat as a necessary evil if they want to enjoy the internet. What many may not be aware of is just how much spying occurs even when no specific requests are made, when no files are saved or uploaded, when no forms are filled and no data explicitly given. If this sounds like news to you, I recommend paying a visit to Cover Your Tracks (previously known as Panopticlick), which will show you the extent of this data gathering, known as browser fingerprinting. Most companies are not fond of sharing such information with their users, only mentioning it off-handedly in their privacy policies, but it’s fair to assume that the majority of large websites (and even many smaller ones) utilize this kind of data gathering to some extent. Case in point – Google’s reCAPTCHA. If you were on the web in the late 2000s and early 2010s, you no doubt grew accustomed to typing out distorted text from an image to prove to a website that you’re not a bot. This sort of technology is still present on most websites today, but you may very well not notice it. That’s because the software most often used for this purpose, reCAPTCHA, owned by Google, now uses behavioural analysis to supplement the more traditional means of detecting bots. Even before you click the checkbox next to “I am not a robot”, the JavaScript code embedded in the website tracks and analyses various data points, like mouse movement inside the browser window, and uses that to determine the probability of you being a human or not(3).
As a result, by the time you’ve finished doing whatever it is you’re doing, and you go to click on the CAPTCHA verification, the website has already gathered enough data about you to let you skip the text typing or image selection and mark you as a human.
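
To make the browser fingerprinting mentioned above a bit more concrete, here is a rough Python sketch of the general idea: take a handful of details a browser reveals, and squash them into a single identifier. This is my own simplification and not how any particular tracker works; the attribute names and values are made up for illustration, and the `fingerprint` function is hypothetical.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Hash a set of browser attributes into one short identifier."""
    # Sort the keys so the same attributes always serialise the same way.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical values, invented for this example.
visitor_a = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0",
    "screen": "1920x1080",
    "timezone": "Europe/Berlin",
    "language": "en-GB",
    "fonts": ["DejaVu Sans", "Liberation Serif"],
}
# Same browser setup, but in a different time zone.
visitor_b = dict(visitor_a, timezone="America/Chicago")

print(fingerprint(visitor_a))
print(fingerprint(visitor_b))
```

None of these details is personal on its own, but a single differing value is enough to produce a different identifier, and the same combination produces the same identifier on every visit, with no cookie or login involved.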


Why do They Care?


We know, then, that both precise data, like Twitter followers or uploads on Instagram, and more ephemeral information, like behaviour on a webpage or location data, can be, and often is, tracked. What of it, though? It’s easy to rely on the fact that someone knowing so much about you is inherently scary to many people, but many people deflect such arguments with a response along the lines of “If I have nothing to hide, I have nothing to fear”. If that is something going through your head, consider this: would you, if asked, allow a stranger access to your mailbox, letting them read and inspect any and all letters and packages that are addressed to you? Most people would say no, yet there’s very little difference between that and what they do online. If your mailbox is anything like mine, it’s mostly mundane. Maybe you ordered a book, maybe there’s a letter about a doctor’s appointment, maybe you get a newspaper. Not much of it would be considered sensitive information, and none of it is incriminating. And yet, from a young age, most of us are told that prying into someone else’s mail is very rude, and some people even develop a habit of shredding documents and parcels that come their way, so as to not allow a potential recycling worker or a dumpster diver insight into their post, either. But if the sentiment above is valid, then there should be no point in that. If there’s nothing to hide (there is no incriminating or illegal content being sent or received), there should be no harm in a third party having access to it.

You see, “nothing to hide” does not mean “everything to show”. My browser history contains nothing wrong in the eyes of the law, or indeed most people, but it’s still something I would be quite wary of showing to anyone (and I don’t even look at much porn), so there’s no reason to give Google, Apple or Mozilla access to it, either. Even when we don’t have anything we need to hide, there are many things we choose to hide, because we prefer to keep it for our eyes only. The distinction between real and digital life becomes even more interesting when you consider the double standard applied to various situations. If someone online asked you for a photo of your face, you’d likely deny their request. That’s not something a stranger should have easy access to. Yet the same face may be plastered all over your social media platforms, along with more identifiable information, and leaving the computer space for a while, everyone you’ve ever passed on the street or in a store was able to take a look at your face, while also getting a pretty good idea of where you live – if you regularly take a bus in Minneapolis, it’s likely that’s where you live, and not in, say, Ontario. The same parents who may preach to their kids the idea of stranger danger, may happily post frequent status updates and images relating to their children online, giving any would-be dangerous individual access to a great deal of information about their child’s way of life, daily habits and such.

Another reason to restrict access to our data is that, once it is collected, it becomes difficult to track it and have control over where it’s used and by whom. Imagine that you attend middle school. You have a secret that you decide to tell your friend, and they promise to keep your secret safe with them. Unfortunately for you, your friend’s name is Facebook, so even if they keep it “safe”, their idea of safety involves being very entrepreneurial with your secrets and selling them to other students (“trusted third-parties”) in exchange for cash, candy, trinkets or whatever else middle school students might involve themselves with these days. The exchange involves a non-disclosure agreement, so while every person in the chain promises to keep your data safe, you have no way of controlling who’s involved in the chain or how far it spreads. And with your secret being spread so far and wide, it’s only a matter of time before someone spills the beans and it goes out into the public space, analogous to a data leak. Suddenly, a ton of your information is out in the open for anyone to grab, because the defences of a certain company got breached, despite you never even making any deals with that company.

Some services may try to appeal to privacy-conscious individuals by only collecting anonymous data. This often means that what’s collected isn’t data, but metadata (data about data). If I make a phone call, the data would be the content of the phone call, a recording of what was said, while metadata would be things like who contacted who, when the call took place and how long it lasted. This may seem like the lesser evil, but such data can often be de-anonymised, or the content of the data can be inferred from metadata. Imagine that you’re given a list of locations that someone’s been to over the course of several weeks. You don’t know who that individual is, only where they’ve been – you only have metadata. This would be considered anonymous by some, but it’s in fact very easy to figure out personal information – after all, if I see that your phone seems to stay in place for long stretches of time during the night at a certain location, I can be fairly sure that’s where you live, and the same information during the day lets me know where you study or work. Then, consider how different kinds of metadata can be put together. Suppose that there’s this person A, who you don’t know any personal details of. But you know that they have recently received a letter from a local medical centre, and contacted a suicide hotline a few days later. It’s not hard to figure out that they’ve gotten diagnosed with some debilitating disease, like cancer, even if you were never able to read the content of the letter or hear what was being spoken about on the phone. Anonymous data is only anonymous if it is mixed with other people’s anonymous data, or if it’s very small in quantity. With enough data points, anonymous data can still be used to identify a person(4).
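
The location example above can be sketched in a few lines of Python. This is only an illustration of the reasoning, under my own assumptions: the pings, place names, and hour ranges are all invented, and `most_common_place` is a hypothetical helper, not a tool anyone is known to use.

```python
from collections import Counter
from datetime import datetime

# Hypothetical "anonymous" pings: (timestamp, place label), no name attached.
pings = [
    ("2021-01-04 02:00", "Elm Street 12"),
    ("2021-01-04 10:00", "City College"),
    ("2021-01-04 14:00", "City College"),
    ("2021-01-04 19:00", "Supermarket"),
    ("2021-01-04 23:30", "Elm Street 12"),
    ("2021-01-05 03:00", "Elm Street 12"),
    ("2021-01-05 11:00", "City College"),
    ("2021-01-05 20:00", "Gym"),
]


def most_common_place(pings, hours):
    """Return the place seen most often during the given hours of the day."""
    counts = Counter(
        place
        for stamp, place in pings
        if datetime.strptime(stamp, "%Y-%m-%d %H:%M").hour in hours
    )
    return counts.most_common(1)[0][0] if counts else None


NIGHT = set(range(22, 24)) | set(range(0, 6))  # 22:00 to 05:59 (assumed)
DAY = set(range(9, 17))                        # 09:00 to 16:59 (assumed)

print("Probably lives at:", most_common_place(pings, NIGHT))
print("Probably works or studies at:", most_common_place(pings, DAY))
```

Nothing in the list names a person, yet counting where the phone sits at night and during the day is enough to guess a home address and a workplace or school.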

The third and last point I’m going to bring up is that we don’t know what our data is used for once it is acquired. Some uses are obvious – data given to a grocery shop through a loyalty card will likely be used to improve the selection of goods, and give you relevant coupons to encourage you to shop at that store instead of another. But what if that data could be used in another way, for example by analysing your shopping patterns and determining that you might be pregnant, before you’ve had the chance to tell the people near you about it? This is not a hypothetical, but a summary of a real story, of a teenage customer of Target(5). In this case, the data given to a store was used by that store to increase its own financial gains. Some people tend to treat that as somehow tolerable, but I’d have to disagree – I’d rather spend my money on things I actually want and enjoy, not on wants and desires imposed on me by an algorithm or by a carefully crafted advertisement. But what if the data is gathered from many different sources, and compiled to produce something that few people would expect possible, such as manipulation of emotions and political positions? Once again, it’s not a hypothetical case, but reality – the Cambridge Analytica scandal of 2018, which many of you may remember, was a rather successful attempt at doing just that(6). You can hear more about how such manipulation can take place from the company’s CEO, Alexander Nix himself, in a presentation given in 2016 at the Concordia Annual Summit(7). Many conspiracy theorists fear vaccines, claiming that they’re an attempt by Bill Gates and his crew to implant microchips into their bodies. Likewise, they denounce ideas of a vaccination index, where one could see if someone was vaccinated or not, as unlawful and morally corrupt.
At the same time, these people get most of their information on vaccines from Facebook groups, YouTube videos and Google searches, giving a select few companies a very detailed look into their thought processes and beliefs. It’s likely that, given your Facebook profile, YouTube watch history and Google search history, I could – with fairly high probability – deduce whether you’re someone who’s willing to take a vaccine or not, and your personal stance on the topic can be made even more clear by the comments you post (publicly) and information you share with others (often also publicly). There’s no need for a microchip to be planted in your body – the infrastructure needed to identify anti-vaxxers and inhibit their daily lives with penalties and the like already exists, and is not something imposed by any government, but something most people chose to involve themselves with willingly.


That's Big Data - What About Big Brother?


We behave differently when we are being observed(8). If everyone was being observed, or at least suspected that they were being observed, their behaviour would be limited. This is a key component of social pressure, but more importantly for our discussion, it influences what people feel comfortable (or even safe) doing or saying. When doing or saying anything that does not favour whoever’s in charge of the surveillance can cause you problems, any kind of dissent becomes risky. A surveillance state, like the one portrayed in George Orwell’s book “1984”, is therefore one where the values of free speech and democracy are severely limited if not non-existent. As Edward Snowden claims in his book “Permanent Record”, the United States National Security Agency has the infrastructure necessary to process internet traffic coming to and from the country, flagging any potentially dangerous requests and search queries appropriately(9). It is not unreasonable to believe that, at the moment, this technology is used primarily to flag people researching things like home-made explosives, paedophilic content, hitman services, and so on, but since the system is already established, adjusting it to look for, let's say, dissidents to the current sitting president, would be fairly easy.

At this point, a question ought to be asked – is there anything, other than good faith, preventing such technology from being used in an autocratic, dictatorial fashion? After all, the technical properties of the NSA’s surveillance program do not differ much from those of the Chinese Great Firewall, which locates and tracks people who go against the Chinese Communist Party in any way. And if you think you’re safe because you live in Europe, or other parts of Asia, Africa, Australia or South America, think again; most internet traffic sooner or later goes to or at least leads through servers located on US soil, putting your traffic under the jurisdiction of the NSA, and even if it doesn’t, several countries cooperate with the US government’s surveillance, in a program known as the Five Eyes(10) (which over time was expanded to include more countries in what’s known as Nine Eyes and even Fourteen Eyes), further limiting where your data can go without being put in jeopardy.

While state-led surveillance is not something I or most other people have the capacity to overthrow (especially given that I do not live in the US), our lives being progressively more and more intertwined with various services available online means that private companies like Google, Amazon, Microsoft, Facebook and Apple, along with their sophisticated machine-learning algorithms, exert more and more influence on us and are similarly capable of censorship, concealing information and discriminating against particular views. As such, another reason to seek greater privacy online is to preserve democracy and freedom of expression, both of people in industrial societies, and of those still coming to grips with the modern way of life, whose struggles with dictatorship and oppression have begun to extend from their physical lives into their digital existence, as well(11). Even if you don’t feel like you’re particularly at risk of being affected by this kind of censorship – perhaps you’re not politically active, or live in a country with good privacy laws – if freedom of speech is something you believe in, you ought to act accordingly, and protect other people’s freedoms as well. If you wait and do nothing until the problem reaches you, in your privileged position, then as with climate change, it’ll be too little too late.


There's Too Much Information and Not Enough Advice


Suppose that the above convinced you to take your data into your own hands, or perhaps you were convinced of it before already, and are now looking to put that conviction into practice. The problem is that privacy violations are present all around us, and so the knowledge to be had about it is vast, often overwhelming. A lot of this information is generalized, making it useful for some and useless for others, and often lacks a clear solution for us to implement. Hearing about Facebook’s invasions of privacy is all well and good if I’m interested in the topic, but it offers no solutions to the problem – beyond “stop using Facebook” – and is actually useless for someone like me, who hasn’t had a Facebook account for over half a decade. I do not mean to throw writers of these articles under the bus; their work is valuable and important, but it needs supplementing in order to be of practical use to others. The solution of just not using a service is often the best one, but just as often it simply isn’t feasible, or at least not easily for someone uninitiated. How can you stop using Facebook, if it's the main (or even only) way of communicating with your relatives? If you hear about the invasiveness of Windows 10, how can you stop using it if you need a system to operate, and Linux-based alternatives feel daunting, and half your apps supposedly don’t even work on them?

Privacy, similar to veganism or off-the-grid living, is a lifestyle with many interweaving aspects, and trying to tackle it all in one go will only lead to exhaustion and discouragement. It is not a binary, but a spectrum; there isn’t a choice between being private and not being private, but instead a multitude of choices that can give you more or less privacy. Which choices you end up making depends on what you’re comfortable with, and what will give you the most “bang for your buck”. Changing your web browser from Google Chrome to Mozilla Firefox, for example, will be a considerable step up in terms of privacy at almost no cost, whereas something like avoiding a social media platform with all your friends on it can be too much of a burden to bear, even if there are much less harsh alternatives that still give good results. Ideas like “I will not use any social media” or “I will only use free and open source software” are great to keep as goals, but breaking these rules every once in a while is not something to be ashamed of. Humans are imperfect, and even if our goal is complete privacy and autonomy, the road is paved with problems to deal with and compromises to make. I have been able to make a near-complete transition from Windows to Linux, but there remained one or two programs that simply would not function on my new setup. I did not let that stop me – I transitioned anyway – but I also didn’t cry about losing them. When I need to use them, I simply fire up Windows, do my thing and leave.


How to Practice Privacy


So, reading articles doesn’t help much if you just want to get started, as the information contained within is often too generic. What can you do instead, though? How can you get information that’s more specific to your needs? By taking a good look at your needs. Instead of reading about solutions to various problems, and seeing which of these problems are the ones you experience, you can instead look at what you do, what problems arise from that, and then look for solutions to them. This can be done either through general introspection, or through actually gathering data on your activities. The first option involves asking yourself questions, like “Do I really need X?” and “Are there any better alternatives to doing X?”. Taking YouTube as an example, I first asked myself if I really need to use it. I determined that, although it’s not a necessity, there is so much content there that I do not want to give up that, in principle, I do need it. Then came the question of alternatives to YouTube. Here, let’s distinguish between the front-end, meaning the website youtube.com and the YouTube mobile app, and the back-end, meaning the content hosted on the servers. The alternative to the latter exists in the form of Dailymotion, Vimeo, LBRY, PeerTube and others, but because the content is my main reason for needing YouTube, and so little of what I watch there is available on these other platforms, they are not viable alternatives to me. The front-end, however, can be replaced in a more privacy-oriented manner with: FreeTube, a desktop application; youtube-dl, a command line tool; and NewPipe, a mobile app. They let me access the content I enjoy with much less worry about Google’s invasive scripts. In most cases, they also improve the user experience through a better interface. They have their downsides, for example not letting me post comments, but since that’s not something I was doing to begin with, I can live without it. 
Asking myself the right questions allowed me to find more accurate information.

The second option is more involved, but it can also lead to better results. By keeping a sort of diary of everything you do, you can scrutinize your behaviour much better than if you relied on just your memories and assumptions. It also helps determine the weight of each choice you can make. I sought an alternative to YouTube because I spend a lot of time with the platform, but for you it might be of little importance if you only go there occasionally. Likewise, I don’t feel determined to seek out radical changes to my smartphone use, because I use it so little that it’s of little importance to my privacy, but if you have your phone with you everywhere you go and use it on a constant basis, doing something like installing a different operating system or getting more privacy-oriented social media apps can be a very good use of your time.
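
As a small illustration of how a diary can show the weight of each choice, here is a Python sketch that counts what share of the logged entries each activity takes up. The entries are made up, and `share_per_activity` is a hypothetical helper of my own; any spreadsheet program can do the same with a pivot table.

```python
from collections import Counter

# Hypothetical diary entries, one per logged time slot.
entries = [
    "YouTube", "YouTube", "email", "YouTube",
    "phone", "email", "YouTube", "reading (offline)",
]


def share_per_activity(entries):
    """Return (activity, share of all slots) pairs, largest share first."""
    counts = Counter(entries)
    total = sum(counts.values())
    return [(activity, count / total) for activity, count in counts.most_common()]


for activity, share in share_per_activity(entries):
    print(f"{activity}: {share:.0%}")
```

In this invented example YouTube takes up half of the entries, so that is where a change would matter most, while the phone barely registers.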


What I Will Do


I have already asked myself introspective questions many times, so there’s not much use in doing that again, but I’ve never kept a log of my activities, which is something I’d like to change. Although I consider my digital life to be fairly secure already, it won’t hurt to see just how far I can take it while remaining reasonable, and for that, I will use an activity log. The way I will do that is with a spreadsheet. Logging absolutely every activity and its precise duration would be too cumbersome, so what I chose instead is to write a short summary of each 20-minute interval. Many activities take a lot longer than that, and at the same time, going to some random website for 30 seconds – a website that I will likely never look at again – is hardly a privacy breach worth mentioning. I chose the 20-minute interval because I felt it to be a good compromise between detail and bother. Having to remember and fill in the sheet in 10- or 5-minute intervals would get annoying rather quickly, while choosing 30- or 60-minute intervals would risk losing track of smaller activities. A period of 20 minutes feels detailed enough without being cumbersome, and by the end of the two-week period (which is how long I plan to keep it up), I will have just over 1,000 data points (14 days of 72 intervals each gives 1,008; around 600, if we exclude the time I’m asleep), which should be more than enough to give an accurate picture of my use of technology.
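
If you would rather generate the empty sheet than type out every row by hand, something like the following Python sketch would do it. It is only one possible layout: the column names, the start date, and the `write_log_template` function are my own assumptions for the example, not a description of the exact spreadsheet I use.

```python
import csv
import io
from datetime import datetime, timedelta


def write_log_template(handle, start, days=14, slot_minutes=20):
    """Write one empty row per time slot to `handle`; return the row count."""
    writer = csv.writer(handle)
    writer.writerow(["slot_start", "device", "activity"])  # assumed columns
    end = start + timedelta(days=days)
    current = start
    rows = 0
    while current < end:
        # Leave device and activity blank, to be filled in during the day.
        writer.writerow([current.strftime("%Y-%m-%d %H:%M"), "", ""])
        current += timedelta(minutes=slot_minutes)
        rows += 1
    return rows


# Demo with an in-memory buffer and a hypothetical start date.
buffer = io.StringIO()
rows = write_log_template(buffer, datetime(2021, 3, 1))
print(rows, "slots")
print(buffer.getvalue().splitlines()[:3])
```

To get an actual file, pass a handle opened with `open("activity_log.csv", "w", newline="")` instead of the buffer; the resulting CSV opens in any spreadsheet program. Changing `slot_minutes` lets you try a coarser or finer interval.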

One thing worth mentioning is that my current lifestyle is rather unusual, which will be reflected in the activity log. Without getting into the details, almost all the time I have in the day is free time, but it will not stay like that forever (obviously). Once my daily routine has changed, so will my activities and my use of technology. The practice of keeping an activity log will remain useful, however, as I will be able to just do this two-week trial again and use the knowledge gained to adjust my internet and tech use to be more privacy-oriented.

An activity log is a very flexible tool. While I will use it to judge my dependence on technology, it could just as well be used to study general daily habits, how much food you eat, how often you go outside, and more. It all depends on the kind of data you choose to gather.


Conclusion


In this post, I provided various reasons to care for one’s privacy, and laid out a framework with which one can begin to practice more privacy-conscious behaviour. After my two-week plan is done, I will make another post, discussing my findings and possible improvements. Keep in mind that there are aspects of privacy that I will not talk about, for example the problems with digital payment options like credit cards, as that’s not something I understand too well myself. Instead, I will focus on the software and hardware, the programs we use and the services we interact with in the computer sphere. In the meantime, I highly encourage you to carry out such an activity test for yourself, to see what you can change about your life to protect your data.


Sources


(1) - Mailbox.org - Data Protection and Privacy Policy, September 2019.

(2) - Google Privacy Policy, September 2020.

(3) - Ars Technica - Google’s reCAPTCHA turns “invisible,” will separate bots from people without challenges, March 2017.

(4) - "Anonymous" Location Data Problems - Computerphile, January 2021.

(5) - The New York Times - How Companies Learn Your Secrets, February 2012 (Onion link).

(6) - The New York Times - How Trump Consultants Exploited the Facebook Data of Millions, March 2018 (Onion link).

(7) - Cambridge Analytica - The Power of Big Data and Psychographics, September 2016.

(8) - BBC Future - How being watched changes you - without you knowing, February 2014.

(9) - Edward Snowden - "Permanent Record", Chapter 20: Heartbeat, 2019.

(10) - Global Information Society Watch - Unmasking the Five Eyes' global surveillance practices, 2014.

(11) - Global Voices - Censorship.

