We all use apps. We know that they capture information about us. But exactly how much information? I have worked as a software engineer both at Apple and at a medium-sized technology company. I've seen the good and the bad. And my experience at Apple makes me far more comfortable with the system Apple and Google have proposed for COVID-19 exposure notification. Here's why.
Apple respects user privacy
When I worked on the Apple Watch, one of my tasks was to record how many times the Weather and Stocks apps were launched and report that information back to Apple. Recording how many times each app is launched is easy. Reporting that data back to Apple is much more involved.
Apple emphasizes that engineers should always keep customer security and privacy in mind. There are a few basic rules; the two most relevant here are:
- Collect information only for a legitimate business purpose
- Don't collect more information than you need for that purpose
The second rule deserves a little expansion. If you're collecting general usage data (how often do people check the weather?), you must not accidentally collect anything that could identify the user, such as the city they're looking at. I didn't realize how strictly Apple enforced these rules until I was assigned to record user data.
Once I was recording how many times the Weather and Stocks apps were launched, I hooked into Apple's internal framework for reporting data back to the company. My first revelation was that the framework strongly encouraged you to send back numbers, not strings (words). By reporting no strings, your code can't inadvertently capture the user's name or email address. You are specifically warned not to report file paths, which may include the user's name (for example, /Users/David/Documents/MySpreadsheet.numbers). You are also not allowed to play tricks like encoding letters as numbers to send strings back (such as A = 65, B = 66, and so on).
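The idea behind a numbers-only reporting API can be sketched as follows. This is a hypothetical illustration in Python, not Apple's actual framework; all names here are invented. The point is that a payload that refuses anything but integers cannot smuggle out a name, an email address, or a file path.

```python
from numbers import Integral

# Hypothetical sketch of a numbers-only telemetry payload, in the spirit
# of the framework described above. Names and API are invented.
class UsageReporter:
    def __init__(self):
        self.events = {}  # event ID -> count

    def record(self, event_id: int, count: int) -> None:
        # Accept only integers. No strings means no accidental names,
        # email addresses, or file paths in the payload, and no
        # letters-encoded-as-numbers trickery via string inputs either.
        if not isinstance(event_id, Integral) or not isinstance(count, Integral):
            raise TypeError("telemetry values must be integers, not strings")
        self.events[event_id] = self.events.get(event_id, 0) + count

reporter = UsageReporter()
reporter.record(1001, 1)  # e.g. event 1001 = "Weather app launched"
try:
    # This is exactly the kind of mistake the API is designed to block:
    reporter.record(1002, "/Users/David/Documents/MySpreadsheet.numbers")
except TypeError as e:
    print(e)
```

The type check is a blunt instrument, but that bluntness is the feature: privacy is enforced by the shape of the API rather than by the discipline of each individual engineer.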
Then I was told that I could not check my code into Apple's source control system until the privacy review committee had inspected and approved it. This wasn't as scary as it sounds. A few engineers wanted a written justification for the data I was submitting and for its business purpose. They also reviewed my code to make sure I wasn't accidentally collecting more than intended.
Once I had been approved to use Apple's data reporting framework, I was allowed to check my code into the source control system. If I had tried to check in my code without authorization, the build server would have refused to build it.
When the next beta build of watchOS came out, I could see in the reporting dashboard how many times the Weather and Stocks apps were launched each day, broken down by OS version. But nothing more. Mission accomplished, privacy maintained.
TechCo largely ignores user privacy
I also wrote iPhone apps for a mid-sized technology company that shall remain nameless. You have heard of it, though: it has thousands of employees and multi-billion dollar revenues. I'll call it TechCo, partly because its approach to user privacy is unfortunately all too common in the industry. It cared much less about users' privacy than Apple does.
The app I worked on recorded every user interaction and reported the data back to a central server. Each time you performed an action, the app captured which screen you were on and which button you tapped. No attempt was made to minimize the captured data, nor to anonymize it. Every event sent back included the user's IP address, username, real name, language and region, a timestamp, the iPhone model, and much more.
To be clear, none of this behavior was malicious. The company's goal was not to spy on its users. The marketing department simply wanted to know which features were most popular and how they were used. Above all, the marketers wanted to know where people dropped out of the "funnel."
When you buy something online, the buying process is called a funnel. First, you look at a product, say a pair of sneakers. You add the sneakers to your shopping cart and tap the checkout button. Then you enter your name, address, and credit card number, and finally you tap Buy.
At every step of the process, people drop out. They decide they don't really want to spend the money, or they get distracted and never finish the purchase.
Companies spend a lot of time figuring out why people drop out at each step of the funnel. Reducing the number of steps reduces the number of opportunities to drop out. Remembering your name and address from a previous order and filling them in automatically means, for example, that you don't have to re-enter that information, which removes one chance for you to abandon the process. The ultimate reduction is Amazon's patented 1-Click ordering: tap a single button and your sneakers are on their way.
TechCo's marketing department wanted more information about why people dropped out of the funnel, which it would then use to tune the funnel and sell more products. Unfortunately, no one ever thought about users' privacy when collecting this data.
Most of the data was collected not by code we wrote ourselves, but by third-party libraries we added to our app. Google Firebase is the most popular library for collecting user data, but there are many others. We had half a dozen of these libraries in our app. Although they provided roughly the same features, each one captured some unique metric that marketing wanted, so we had to include them all.
The data was stored in a large database searchable by any engineer. That was helpful for confirming that our code worked as intended: I could launch our app, tap through a few screens, and look up my account in the database to verify that my actions were recorded correctly. However, the database hadn't been designed to restrict access. Anyone who could query it could see everything in it, so I could just as easily look up the actions of any of our users. I could see their real names and IP addresses, when they logged on and off, what actions they took, and what products they paid for.
Some of the more senior engineers and I knew this was poor security, and we told TechCo's management that it needed improvement. Test data should be available to all engineers, but production user data should not be. Real names and IP addresses should be stored in a separate, secured database. The general database should identify users only by non-identifying user IDs. And data that isn't needed for a specific business purpose shouldn't be collected at all.
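The separation we were asking for can be sketched in a few lines. This is a hypothetical illustration, not TechCo's actual code: the analytics database sees only an opaque ID, while the secret that maps it back to a real account lives in a separate, locked-down identity service.

```python
import hashlib
import hmac

# Hypothetical sketch of the data separation we proposed (not TechCo's code).
# The key below would live only in a separate, secured identity service;
# here it is a placeholder.
SECRET_KEY = b"kept-only-in-the-secure-identity-service"

def analytics_id(username: str) -> str:
    # Use HMAC rather than a plain hash so that engineers browsing the
    # analytics database can't reverse the ID simply by hashing a list
    # of known usernames.
    return hmac.new(SECRET_KEY, username.encode(), hashlib.sha256).hexdigest()[:16]

# What the general analytics database would store: no real name, no IP.
event = {
    "user": analytics_id("jdoe"),  # opaque, non-identifying
    "screen": "checkout",
    "action": "tapped_buy",
}
print(event["user"])
```

With this split, the funnel analysis the marketers wanted still works, because every event from the same user carries the same opaque ID; only the identity service can connect that ID back to a person.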
But the marketers preferred the kitchen-sink approach, gathering all the data available. From a functional point of view, that wasn't entirely unreasonable, because the extra data let them go back and answer questions about usage patterns they hadn't thought of when we wrote the app. But just because something can be done doesn't mean it should be done. Our security complaints were ignored, and we eventually stopped making them.
The app hadn't been released outside the United States while I was working on it. It is probably not legal under the European General Data Protection Regulation (also known as GDPR; see Geoff Duncan's article, "Europe's General Data Protection Regulation Makes Privacy Global," 2 May 2018). I assume it will be changed before TechCo releases it in Europe. Nor does the app comply with the California Consumer Privacy Act (CCPA), which aims to let California residents know what data is being collected about them and control its use in certain ways. So the app may have to change in a big way to meet GDPR and CCPA requirements soon.
Privacy is baked into the COVID-19 exposure notification proposal
With those two stories in mind, consider the COVID-19 exposure notification technology proposed by Apple and Google. The proposal is explicitly not contact tracing: it identifies neither you nor anyone you come in contact with.
(My explanation below is based on published descriptions, such as Glenn Fleishman's article, "Apple and Google Partner for Privacy-Preserving COVID-19 Contact Tracing and Notification," 10 April 2020. Apple and Google continue to refine parts of the project; read that article's comments for major updates. Glenn has also received ongoing briefings from the Apple/Google partnership, and he has confirmed this retelling.)
The current draft proposal has a very Apple-like, privacy-conscious flavor. Everything is opt-in, both broadcasting information and your choice to report a positive COVID-19 diagnosis. Your phone never sends any personal information about you. Instead, it broadcasts a Bluetooth beacon with a unique ID that cannot be traced back to you. The ID is derived from a randomly generated diagnosis encryption key that is generated fresh every 24 hours and stored only on your phone. Even that ID isn't trackable: it changes every 15 minutes, so it can't be used on its own to identify your phone. Only the last 14 keys (14 days' worth) are retained.
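The key-and-ID scheme is easier to see in code. The sketch below is illustrative only: the real Exposure Notification specification derives its rolling IDs with HKDF and AES, while this Python version substitutes a simple HMAC to show the shape of the design, one random key per day and a rolling ID re-derived from it every 15 minutes.

```python
import hashlib
import hmac
import os

# Illustrative sketch only. The real Apple/Google spec uses HKDF and AES;
# HMAC-SHA256 stands in here to show the structure of the scheme.

def new_daily_key() -> bytes:
    # A fresh random key, generated on-device each day and never
    # leaving the phone unless the user reports a positive diagnosis.
    return os.urandom(16)

def rolling_id(daily_key: bytes, interval: int) -> bytes:
    # interval = which 15-minute window of the day (0..95).
    msg = b"EN-ID" + interval.to_bytes(4, "big")
    return hmac.new(daily_key, msg, hashlib.sha256).digest()[:16]

key = new_daily_key()
ids = [rolling_id(key, i) for i in range(96)]  # 96 windows in 24 hours

# Each broadcast ID looks random on its own, so observers can't link
# two broadcasts; but anyone holding the daily key can re-derive all 96.
print(len(ids))  # 96
```

That last property is the clever part: the IDs are unlinkable to an eavesdropper, yet publishing one small daily key later is enough to let other phones recognize every ID it produced.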
The phone records all the IDs it receives from other nearby phones, but not where they were received. The list of Bluetooth IDs you have encountered is stored on your phone, not sent to a central server. (Apple and Google recently confirmed that they will not approve any app that uses this exposure notification system and also records location.)
If you test positive for COVID-19, you use a public health app that interacts with Apple and Google's framework to report your diagnosis. You will probably have to enter a code or other information to validate the diagnosis, which prevents the apps from being used for false reports that would cause unnecessary alarm and undermine trust in the system.
When the app confirms your diagnosis, it triggers your phone to upload up to the last 14 days' worth of daily encryption keys to servers controlled by Apple and Google. Fewer keys may be uploaded, depending on when the exposure might have occurred.
If you have the service turned on, your phone regularly downloads all the daily diagnosis keys posted by devices whose users' diagnoses have been verified. The phone then performs cryptographic operations to see if it can match the IDs derived from each key against any of the Bluetooth IDs it captured during the period that key covers. If there is a match, you were nearby, and you receive a notification. (Proximity is a complicated question, given Bluetooth's range and how imprecisely devices can measure how far apart they are.) Even without an app installed, you receive a notification from your phone's operating system; with an app, you get more detailed instructions.
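The matching step described above can be sketched as follows. Again, this is an illustration rather than the real implementation (which derives IDs with HKDF and AES per the published spec); the HMAC-based `rolling_id` here is a stand-in, and the important point is that the whole comparison runs on-device.

```python
import hashlib
import hmac
import os

# Illustrative sketch of the on-device matching step (the real spec
# uses HKDF/AES; HMAC-SHA256 is a stand-in).

def rolling_id(daily_key: bytes, interval: int) -> bytes:
    msg = b"EN-ID" + interval.to_bytes(4, "big")
    return hmac.new(daily_key, msg, hashlib.sha256).digest()[:16]

def check_exposure(published_keys, observed_ids) -> bool:
    # Re-derive every ID each published diagnosis key could have
    # broadcast, and intersect with the IDs this phone overheard.
    # Nothing about the result is ever sent back to the server.
    observed = set(observed_ids)
    for key in published_keys:
        for interval in range(96):  # 96 fifteen-minute windows per day
            if rolling_id(key, interval) in observed:
                return True
    return False

sick_key = os.urandom(16)
overheard = [rolling_id(sick_key, 42)]        # we stood near this phone
print(check_exposure([sick_key], overheard))  # True
print(check_exposure([os.urandom(16)], overheard))
```

Because the phone downloads every published key and does the comparison locally, the server never learns whose phone matched, or whether anyone matched at all.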
At no time does the server know anyone's name or location, just a set of randomly generated encryption keys. It never even sees the actual Bluetooth IDs your phone captured, which might otherwise let someone who recorded IDs in public places identify you. In fact, your phone never sends any data to the server at all unless you prove to the app that you tested positive for COVID-19. Even if a hacker or an overreaching government agency took over the server, it could not identify users. And because your phone discards all keys more than 14 days old, even cracking your phone would reveal little long-term information.
In reality, there is more than one server, and the process is more complicated, but this broad overview shows how Apple and Google are building in privacy from the first moment, avoiding the kinds of mistakes TechCo made.
Apple claims to respect users' privacy, and my experience suggests the claim is true. I am much more willing to trust a system developed by Apple than one created by almost any other company or by a government. It's not that another company or government would necessarily set out to abuse users' privacy; it's just that, outside of Apple, many organizations either lack an understanding of what it means to bake in privacy from the start or have competing interests that undermine their efforts to do the right thing.