In the wake of the Cambridge Analytica episode, we must look at Aadhaar as the foundation for India’s data policy.
A nation of 1.3 billion people, India is in urgent need of a debate around data policy, privacy, and security. The country’s internet population exceeds that of the United States, is four times that of Japan and Russia, and is expected to cross 500 million by the end of the year. In light of this fact, we have been naive in our quest to design and implement a thriving data policy across private and government domains.
Data capturing, as we witness it today, has its origins in the on-demand economy model. Inspired by Uber’s success, developers built utility apps on the same model. Driven by greater smartphone accessibility and ownership, businesses found a developing market for these utility apps in consumers who were not sceptical about sharing their data online.
By 2016, there were on-demand apps for food, clothing, home services, hospitality, and a host of other sectors. These apps promised lucrative revenues for service representatives: Uber drivers, for instance, made in excess of Rs 100,000 per month in the first year of the company’s operations in India, while the starting salary for an engineering graduate in any company is Rs 25,000. Projections estimated the on-demand workforce to be in excess of seven million by 2020, attracting close to 30 million consumers annually.
With most apps being free to download and use, citizens everywhere marvelled as their digital workflow was not only enhanced, but also eased. Cabs, food, clothes, books, and even dog-sitters could be ordered with a few taps on the phone. Unfortunately, everyone was too consumed by the likes of Facebook, Google, Amazon, Apple, Uber, and others to worry about the cost they were bearing in return for these services.
The apps required users to share their data as a matter of ‘voluntary obligation’. To avail themselves of cab services, users were asked to share their location, email, contact number, and card details, and even to allow the service to track their location when they were not using the app.
Online, data was being collected in numerous ways. A conventional social network like Facebook already had one’s email address; basic details such as name, age, education, and professional experience; photo and video content; and places visited, all shared voluntarily on the network. User credentials for these social networks were used to access third-party apps, including ones that apparently told users who their twin was, which celebrity they resembled, who or what they were in their previous birth, and which infinity stone they deserved.
E-commerce websites and platforms absorbed the user data. The Pandora’s box of discount and promotional coupons was opened, and users, especially in the 25-55 age bracket, moved online to avail themselves of most services. Data became the lighthouse for ships sailing the sea of the e-commerce boom. For instance, consistent use of a cab service app helped the parent company understand supply-demand dynamics. The location and frequency of services helped cab, food, beauty, and other utility industries design supply chains to ensure maximum service in a minimum time frame. The heavily criticised surge pricing model from Uber was also part of this dynamic, ensuring service even when demand exceeded supply. Eventually, this mountain of data became too big to go unnoticed by companies, users, and regulators.
Anyone with a rudimentary understanding of data management will understand the difference between data and knowledge. Raw data, collected from users via apps, websites, and other virtual products, is transformed into meaningful knowledge using data analytics and other relevant models. In this process, often termed Big Data, data sets are evaluated to identify trends, patterns, associations, and interactions pertaining to users that can help a company enhance its service capacity and capabilities, and most importantly, its profits. The derived knowledge is then employed to create a personalised experience for each user using relevant algorithms.
The GAFA (Google, Amazon, Facebook, Apple) foursome is the king of this jungle of virtual products when it comes to data capturing, analysis, and implementation. Google’s synchronised browsing history ensures one sees advertisements for products one might have browsed on Amazon or any other website. The publicly available data on Facebook helps businesses on the world’s biggest social networking site tailor advertisements to location, interests, and ideologies, as Donald Trump’s team may or may not have discovered to their delight.
The ‘personalised user experience’ model has been put to good use by digital advertising agencies, and rightfully so. To boost conversions and subsequent transactions, advertisers now use models that take into account the user’s location, interests, usage history, and often, their interactions. Eventually, it boils down to four major components: services, data, knowledge, and profiling.
Cambridge Analytica’s work with the data they sourced was somewhat similar. However, even if one is to wildly assume that the company alone helped Donald Trump win the presidency, there is nothing wrong with the way voters were approached online, for personalised advertising and selective user targeting will be the norm in the future. There wouldn’t be any reason to be surprised if the Congress and the BJP were found to have employed similar models in the past, or to employ one in 2019. The problem, however, is with how the data for this personalised targeting was collected.
Speaking about the Cambridge Analytica fiasco, the Information Technology Minister was rash in threatening Facebook founder Mark Zuckerberg with a future summons for data violation, given that there is no real data policy in India. Data is indispensable for services, and terming ‘Big Data’ a threat is as sensible as opting for ballot papers over electronic voting machines. The focus must be to strengthen the data ecosystem, not dismantle it, and this is what the Indian government must keep in mind while designing its maiden data policy.
In drafting a data policy for India, the country’s diversity must be taken into account. While people in rural villages may be willing to share their personal information, there might be resistance from the urban pockets, as seen in the case of Aadhaar. The data policy must also be progressive in nature and should not constrain existing businesses or future prospects, for the digital economy is expected to generate new employment opportunities throughout the 2020s.
To ensure user profiling does not exceed the rational requirements of a service provider, the data policy must focus on informational privacy. It should revolve around the three Vs – volume (the mountain, as noted), velocity (the rate at which data is used in real time), and variety (the sources of the data). Artificial intelligence, the Internet of Things, and developments pertaining to Big Data and analytics must also be covered under the same policy.
The challenge, however, lies in the manner in which data is shared. With elaborate virtual products and amalgamations like GAFA in play, data shared by a user on one platform might end up on another, thus making the data source, processor, and controller hard to identify. This is where the Indian government must take a cue from the European Union’s General Data Protection Regulation (GDPR) of 2016.
Enforceable as law in all member states from 25 May 2018, GDPR has user privacy as its primary area of focus. Imposing extensive control over the collection, processing, and use of data, the legislation prohibits, barring specific exceptions, collecting sensitive data that could be used for racial, ethnic, political, philosophical, professional, health, or sexual profiling. The law requires purpose specification, data minimisation, data quality, and security safeguards, including a mandatory 72-hour deadline for a company to report a data breach to the regulator once it becomes aware of it.
The buck does not stop there, for users retain control of their data even after collection. They can confirm how the data is being used, access it to validate or rectify it, port it to another service, restrict its processing under certain circumstances, and disallow automated decisions at the service end, such as auto-renewals. Given that the legislation is applicable to any entity across the world that uses the data of users from the EU, the law is indeed a giant leap towards data protection and privacy.
To get started with the data policy, the government needs a starting point that unites over a billion people, given that the rural-urban convergence will ensure a 100 per cent increase in the number of internet users by 2030, taking the number close to one billion. This is where Aadhaar comes into play as the foundation of India’s data policy.
For long now, the debate around Aadhaar has been misconstrued. Citing it as the government’s tool to snoop on citizens is stretching it, while listening to lawyers defending its security by quoting the measurements of the walls guarding its servers is excruciating. As the case for Aadhaar languishes in the Supreme Court, the measures taken by the Unique Identification Authority of India (UIDAI) could form the starting point for India’s data policy, for no other service will draw the number of users Aadhaar has. The sheer scale and importance of Aadhaar in the context of data protection and privacy make it the ideal model for private players to replicate in their operations.
To make a suitable case for Aadhaar as India’s unique identification entity, as a reference for private entities like Uber operating in India, and as the starting point of our data policy, these factors must be stressed:
First, the territorial scope must be defined by both government and private bodies to inform users where their data might be used. Data minimisation must prohibit the collection of sensitive data. Informed consent must dictate the processing and controlling of data, and a notice must be given to the users each time their data is put to use. While ‘consent’ and ‘notice’ fatigue in the form of email alerts and mobile notifications could take over in the long run, users will have the option of exercising greater control over their data through individual participation.
Users must be given the right to validate, at any instant, the purpose for which their data was collected, and to object to or question its usage in a manner which does not hinder operations. Most importantly, they should have the right to be forgotten – that is, to have their data deleted after they stop using a service, even if it is Aadhaar.
No data policy can be without its exceptions and exemptions, and these must be clearly defined. For instance, if the health records linked to my Aadhaar card are used for Modicare in the future to map the occurrence of non-communicable diseases across India, the purpose, exemptions in place, and data processors and controllers must be clearly notified. If the same data is then shared with a third-party agency, the legislation must ensure secrecy of the data and anonymity of the individual.
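As a rough illustration of what secrecy of the data and anonymity of the individual could mean before a record leaves the controller, the sketch below replaces the identifier with a keyed hash and keeps only the fields the stated purpose needs. The field names, the key, and the function `pseudonymise` are hypothetical and not drawn from any UIDAI system. Keyed hashing is also pseudonymisation rather than full anonymisation, since whoever holds the key can still link tokens back to people.

```python
import hashlib
import hmac

# Hypothetical secret held only by the data controller (an assumption for this sketch).
SECRET_KEY = b"controller-held-secret"

# Hypothetical fields that the stated purpose (disease mapping) actually needs.
ALLOWED_FIELDS = {"district", "age_band", "diagnosis"}


def pseudonymise(record: dict, id_field: str = "id_number") -> dict:
    """Return a copy fit for sharing: a keyed-hash token replaces the ID, other fields are minimised."""
    token = hmac.new(
        SECRET_KEY, record[id_field].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # Keep only the allowed fields; name, ID number, and anything else are dropped.
    shared = {key: value for key, value in record.items() if key in ALLOWED_FIELDS}
    shared["token"] = token
    return shared


record = {
    "id_number": "0000-0000-0000",  # made-up placeholder, not a real identifier
    "name": "Example Person",
    "district": "Pune",
    "age_band": "40-49",
    "diagnosis": "type 2 diabetes",
}
shared = pseudonymise(record)
```

The same person always maps to the same token, so a third party can count cases per district without ever seeing a name or an ID number.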
In case of a data breach, a leak, or illegal sharing with external parties, the relevant agencies must be penalised and the affected users must be compensated. Stringent deadlines, as in GDPR, must be set for agencies, both government and private, to report data loss. We must not, as in the cases of Cambridge Analytica, Uber, or Yahoo, end up debating a leak years after it has occurred.
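A reporting deadline of this kind reduces to simple date arithmetic. The sketch below assumes a 72-hour window borrowed from GDPR; an Indian law could well choose a different one. The function names and timestamps are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# 72 hours mirrors the GDPR deadline; this value is an assumption, not Indian law.
REPORTING_WINDOW = timedelta(hours=72)


def reporting_deadline(detected_at: datetime) -> datetime:
    """Latest moment at which a breach detected at `detected_at` may be reported."""
    return detected_at + REPORTING_WINDOW


def reported_on_time(detected_at: datetime, reported_at: datetime) -> bool:
    """True if the report was filed within the window."""
    return reported_at <= reporting_deadline(detected_at)


detected = datetime(2019, 1, 1, 9, 0, tzinfo=timezone.utc)  # made-up timestamp
on_time = reported_on_time(detected, detected + timedelta(hours=48))
late = reported_on_time(detected, detected + timedelta(days=30))
```

Under these assumptions a report filed two days after detection passes, while one filed a month later does not.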
Given how the use of Aadhaar unites our population, having an independent Data Protection Authority or Ministry in 2019 will not be a bad idea either, for it could educate the masses about online data sharing and regulate companies operating in India.
The UIDAI has already implemented many of the components discussed here in its data management mechanisms. It therefore makes sense for the authority to elaborate further on the work it has done, for this could not only help it navigate the judicial shenanigans, but also let it lead a nation still trying to crack a data policy. The road will not be without hiccups, for the need for urgent improvisation shall arise time and again, as we witnessed in the case of GST and demonetisation. The mistakes made today could transform into data marvels tomorrow for the world to look up to.
For long now, users have been ignorant of their presence in the digital ecosystem as products that are being consumed by vendors for data capturing. In our pursuit of efficient on-demand utilities, we have allowed ourselves to be fooled and flattered at the same time. The Cambridge Analytica episode must serve as a reminder of our vulnerabilities in the digital world, the inevitability of content manipulation, and the indispensability of a data policy, for empty threats and high walls are not an assurance of data security in the twenty-first century.