In August 2017 the Supreme Court of India ruled that the right to privacy is a fundamental right, protected by the Constitution of India [1]. This led to the first draft of the Personal Data Protection Bill in 2018. After several consultations and debates with various stakeholders, the Personal Data Protection Bill 2019 (PDP Bill 2019) was tabled in the Indian Parliament by the Ministry of Electronics and Information Technology (MeitY) in December 2019 [2]. A key clause in the Bill refers to the Privacy by Design policy [3]. This clause has triggered a flurry of discussions around what it means for India’s tech industry to follow Privacy by Design when dealing with personal data and, more recently, with non-personal data [4, 5, 6].
This case study focuses on the privacy of business data. We first explain why Privacy by Design is also very important when dealing with the data of business customers. Then we describe the thought process and the technical considerations that Tally Solutions Pvt Ltd has developed and followed for decades when dealing with the data of its millions of MSME customers. Lastly, we share a few tactical considerations that are useful when putting Privacy by Design into practice.
To ground our discussion on data privacy, let us first understand what counts as private data in the context of a business. The bulk of the information used or processed by businesses is implicitly public - for example, the existence of a brand, a product, or a company is public data since it is traded across multiple businesses. Here are some example criteria for identifying data as private.
Classifying data as private or public is a very important decision that should be made conservatively - when in doubt about a specific data point, assume it is private.
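The "when in doubt, assume private" rule can be sketched as a default-deny classifier. This is only an illustrative sketch, not a description of any production system: the field names, the `KNOWN_PUBLIC_FIELDS` allowlist, and both helper functions are hypothetical. The point it shows is that a field is treated as public only if it has been explicitly reviewed and listed, and everything else, including fields nobody has seen before, falls through to private.

```python
from enum import Enum
from typing import Any, Dict, Tuple


class Sensitivity(Enum):
    PUBLIC = "public"
    PRIVATE = "private"


# Hypothetical allowlist: only fields that were explicitly reviewed and judged public.
KNOWN_PUBLIC_FIELDS = {"company_name", "brand_name", "product_name"}


def classify_field(field_name: str) -> Sensitivity:
    """Return PUBLIC only for allowlisted fields; anything else defaults to PRIVATE."""
    if field_name.strip().lower() in KNOWN_PUBLIC_FIELDS:
        return Sensitivity.PUBLIC
    return Sensitivity.PRIVATE


def split_record(record: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a record into (public, private) parts using the default-private rule."""
    public: Dict[str, Any] = {}
    private: Dict[str, Any] = {}
    for key, value in record.items():
        target = public if classify_field(key) is Sensitivity.PUBLIC else private
        target[key] = value
    return public, private
```

With this shape, adding a new field to a record never exposes it by accident; someone has to make a deliberate decision to add it to the allowlist.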
For a business that deals with personal data, one facet of compliance with data privacy laws is ‘getting consent’ to access and/or store the customer data necessary to provide its services. However, the notion of ‘getting consent’ does not apply in the same manner when the customer is a business. Hence, the onus to articulate and practice data privacy for business customers lies mostly with the companies that serve them.
Also, for a business, having full control over access to its data (e.g. financial data) is crucial to guarding its competitiveness. So, a company serving business customers should have a transparent and well-defined process for handling its customers’ data, so that those customers are always both aware of and in control of who can access their data.
Last but not least, an upfront focus on data privacy is a perceptible competitive advantage. Attention to data privacy from day one can ensure that the company’s products are built with privacy considerations at their core from the ‘design’ stage itself, instead of being bolted on at a later stage. For example, such a focus on privacy by design is arguably among the drivers that give Apple a competitive edge and help it achieve premium positioning for its devices in the computing and smartphone markets.
Tally Solutions Pvt. Ltd. is an Indian multinational company that provides enterprise resource planning software. The software handles accounting, inventory management, tax management, payroll, etc., and is used by nearly 2 million customers [7].
Tally serves a few million businesses today. Much of the architecture of the current product is on-premises, so control of customer data rests fully with the customer, including when the customer uses the remote services that the product provides. Very shortly, Tally will provide customers anytime, anywhere access to their data, with third-party services integrated with the application as indicated in the picture above.
For Tally to support data exchange with such diverse stakeholders, including millions of business customers, it is essential that the business customers have full and exclusive control over who can and cannot see their data. Tally uses a set of principles when designing a system to handle its customers’ data in a way that fully protects their privacy - whether the data resides at the customer’s premises, is traveling to or through Tally’s backend systems, or is stored in Tally’s backend systems.
1. Customer data will lie on their devices and on the Tally backend
A business customer may need to store data on their premises, on other devices that they use, or on the Tally backend. Here is how Tally ensures privacy of the data regardless of where it is stored:
2. Customer data will move between their devices and the Tally backend
A business customer’s data may need to move among the customer’s various devices and also between those devices and the Tally backend. Here is how Tally ensures privacy of the data in transit:
3. Customers will integrate with third parties
A business customer’s data may need to travel through the Tally backend to avail services from various third parties as shown in the diagram above. Here is how Tally ensures privacy of the data while it is passing through the Tally backend:
4. Customers will avail data-based services including analytics
For any business customer’s data that is used to enrich the analytics database, Tally follows a set of rules to ensure that the data it receives in decipherable form is never identifiable and exists only in anonymized and aggregated form.
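One common way to keep analytics data "anonymized and aggregated" is to store only per-group statistics and to suppress any group that contains too few distinct customers. The sketch below illustrates that idea under stated assumptions: the row layout `(customer_id, segment, value)`, the function name, and the `MIN_GROUP_SIZE` threshold are all hypothetical, and the source does not say that Tally's rules take this form. A size threshold on its own is also not a complete anonymity guarantee; it is one gate among several that a real system would need.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Set, Tuple

MIN_GROUP_SIZE = 5  # hypothetical threshold; a real value needs a privacy review


def aggregate_anonymously(
    rows: Iterable[Tuple[str, str, float]],
    min_group_size: int = MIN_GROUP_SIZE,
) -> Dict[str, Dict[str, float]]:
    """Aggregate (customer_id, segment, value) rows into per-segment statistics.

    Customer identifiers are used only to count distinct customers and never
    appear in the output. Segments with fewer than `min_group_size` distinct
    customers are dropped entirely.
    """
    customers: Dict[str, Set[str]] = defaultdict(set)
    values: Dict[str, List[float]] = defaultdict(list)
    for customer_id, segment, value in rows:
        customers[segment].add(customer_id)
        values[segment].append(value)

    result: Dict[str, Dict[str, float]] = {}
    for segment, ids in customers.items():
        if len(ids) < min_group_size:
            continue  # suppress small groups that could point to one business
        result[segment] = {
            "customers": len(ids),
            "mean_value": mean(values[segment]),
        }
    return result
```

Only the returned dictionary would be written to the analytics store in this sketch; the raw rows are never persisted alongside it.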
Data about how the product is used can provide deep insights that lead to improvements in product design. Tally uses anonymization techniques while gathering such data so that customer identity cannot be reverse engineered in the backend, thus completely protecting the customer’s privacy.
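As an illustration of anonymizing usage data at the point of collection, the sketch below keeps only an allowlisted set of non-identifying fields from a usage event and coarsens the timestamp to a calendar day. It is a minimal sketch and not Tally's actual technique: the field names, the `ALLOWED_FIELDS` set, and the function itself are hypothetical, and we assume the scrubbing runs before the event leaves the customer's device, so that identifiers never reach the backend in the first place.

```python
from datetime import datetime, timezone
from typing import Any, Dict

# Hypothetical allowlist of fields that describe product usage, not the customer.
ALLOWED_FIELDS = {"feature", "action", "app_version"}


def anonymize_usage_event(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `event` with only allowlisted fields and a coarse date.

    Anything not on the allowlist (customer id, licence number, IP address,
    free-text fields, ...) is dropped rather than hashed, so there is nothing
    in the output to reverse.
    """
    cleaned = {key: value for key, value in event.items() if key in ALLOWED_FIELDS}
    timestamp = event.get("timestamp")
    if isinstance(timestamp, datetime):
        # Keep only the UTC calendar day; precise times can help re-identify a user.
        cleaned["day"] = timestamp.astimezone(timezone.utc).date().isoformat()
    return cleaned
```

Dropping fields instead of hashing them is a deliberate choice in this sketch: a hash of a small identifier space can often be reversed by enumeration, whereas a field that was never sent cannot.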
Many small business customers want to have physical control of their business data and not worry about the possible risks of it going outside their office. Their data resides inside their premises. While customers can access their data through a web browser from anywhere, the system is designed so that no data is stored in the Tally backend system. This allows customers to remotely access the reports that they need, without the privacy concerns of data being stored elsewhere.
The Data Privacy by Design (DPbD) principles outlined above should help clarify the technology and engineering decisions and choices involved in building a product or service that deals with customers’ data. However, execution while adhering to DPbD also involves important tactical considerations. The data-aware tech industry has only recently embraced the importance of DPbD, and hence there are naturally few documented precedents for the industry to learn from. To the extent that we have had some experience with executing under DPbD, we share here the tactical pointers that may be useful for others. These pointers are neither objective nor comprehensive. However, we believe the industry as a whole would benefit greatly if more such pointers were shared by other players.
The skillsets required by these two subtasks largely overlap, and hence it is common practice for both to be carried out by the same team. However, the pressure to unblock the development of analytics products (that promise lucrative business impact in the short term) could lead to inadvertent yet significant compromise of the DPbD principles. To prevent such compromise, it is recommended that the team responsible for algorithm development for subtask A be kept at arm’s length from the team responsible for algorithm development for subtask B. In fact, when possible, the former team should include at least one outsider who 1) has diverse experience in building data-aware algorithms across a variety of contexts, 2) has in the past architected gates and filters for data privacy, and 3) has no incentive to make ‘data availability’ easier in order to expedite the development of the envisioned analytics products.