DataKind Bangalore: Using Data to Improve Development


On the eve of the birthday of M.K. Gandhi, India’s founding father, two very different groups of technologists are buckled up onboard flights to the United States. One surrounds a man who has risen from poverty to the position of the country’s Prime Minister. Soon, their plane begins its descent into the sun-drenched hillsides around Silicon Valley. The second comprises a trio of young middle-class professionals who’ve applied for extended leave from their day jobs to visit New York.

Peering out at the towering skyline from their windows, they dwell on the upcoming Second Annual Summit of the movement that they helped launch globally a year ago. Despite these surface differences, the two groups find their eyes glazing over the same dreams: harnessing the power of technology and internet connectivity to build a better, brighter India.

Who are they? The first, as you must have guessed, is the retinue of Narendra Modi, spearheading his ambition for a Digital India. Less obvious, and the subject of this three-part series, is DataKind Bangalore, with its diverse initiatives for improved governance and accountability.

DataKind Banga-what? Mouthful alert. So let’s review that- one word at a time.

DataKind is a global nonprofit that unites pro-bono data scientists with social sector organizations to address critical humanitarian problems within a project-based framework. Since its launch in New York City in 2011, DataKind’s volunteers have undertaken a range of exciting initiatives– from scraping website data on Indonesian agricultural prices and Mozambique’s microfinance, to exploring poverty levels through satellite imagery of electric lighting in Bangladesh and roof materials in Uganda, to identifying trends in the needs of the distressed by mining their SMS text in India, the US and the UK.

This breadth of impact and depth of expertise has only been possible through a vibrant worldwide community represented at DataKind’s Chapters in Dublin, San Francisco, Singapore, the UK and Washington DC, and of course, Bangalore.

Yes, Bangalore. The city Indians would like to call the Silicon Valley of the East. And what Silicon Valley itself would like to dislike as the Outsourcing Capital of the world. Except now, Bangalore is ‘insourcing’. DataKind’s local Chapter, founded in 2014, has been harnessing the country’s top tech talent to take on its own greatest challenges.

Within just a year of operations, their tally of volunteers hit a staggering 700. So could India’s bemoaned Brain-Drain be quietly rebounding into a Brain Gain? Perhaps part of the pro-bono participants’ passion pertains to how Bangalore’s is the only Chapter situated in a developing country.


Of course, all members of DataKind’s international network confront the ‘wicked’ problems that bedevil poverty alleviation. But when you experience this wickedness first-hand, when it’s cackling in your face on a daily basis, you’re far more inclined to land an algorithmic slap on its cheek.

One of the most stinging issues, possibly one that brought Modi into power, was a lack of transparency and accountability and an almost resigned acceptance of corruption and inefficiency. And as the trio in New York soon realized at DataKind HQ, governance had unintentionally become a Chapter theme of sorts. Four of their six nonprofit partners thus far had resolved to support public bodies with data-driven decision making, or at least to build societal consensus on the need thereof.

As another interesting insight at the Global Summit, the Bangalore trio noted that even in developed nations like the UK, the supply of well-trained data scientists still fell short of demands from the private and public sectors. What did this portend for India?

In parallel and on the opposite coast, Modi had been pitching to several CEOs to invest in his country’s IT infrastructure. This tied in with the 17-point Digital India vision he announced three months ago, which concludes with ‘I dream of an India where every Netizen is an Empowered Citizen’. But as former Microsoft researcher Kentaro Toyama elaborates in his book ‘Geek Heresy’, mere provision of internet and mobile technologies, without investing first in human capacity to handle them (and the resulting information deluge), would ring hollow. An empty promise.

Volunteers at DataKind Bangalore have been fortunate to belong to the narrow segment of digital elite equipped with the industry knowledge and cognitive capabilities to leverage these tools. And it turns out that 6 of the 17 points in Modi’s mandate could be linked to issues of Good Governance. So if there is any measure of evaluating just how truly efficacious the ICT4D mandate could prove for India, and particularly in transparency and accountability, DataKind Bangalore and its projects with local NGOs provide an exciting testing ground.

Likewise, this current Chapter theme of Governance will form the focus of this series, though future extensions may explore outstanding DataKind Bangalore projects in other areas such as education, agriculture and microfinance. The remainder of this entry outlines the workings of DataKind’s typical project cycle, and sets the stage for more detailed explorations that will follow in the coming weeks.

Given this backdrop of the non-profit and technologist landscape, how does possessing data lead to any sustainable change? The answer: it doesn’t. Not per se, at least. Then again, DataKind Bangalore isn’t a group of number crunchers alone. Think of it rather, as an innovation and strategy hub. Likewise, its leaders follow a system.

First, a rigorous scoping and outreach process helps determine which organizations hold sufficient management capacity and clearly defined data science problems for the collaboration to prove worthwhile.

Secondly, doors are opened to volunteers from not only the IT industry but a variety of fields including economics, design, journalism, anthropology, and business development. This diversity enriches the ideation process, while also providing many participants with their first on-the-job taste of programming and statistics.

Thirdly, through a defined sequence of community events, the nonprofit’s challenge is hacked and hewed much like Michelangelo sculpting David out of a block of marble. Project Accelerator Nights (evening brainstorming sessions that lead to problem formulation) and DataJams (sessions of data cleaning and exploratory analysis) then culminate in DataDives (weekend hackathons on clearly defined challenges).


If partners believe that the resulting proposition would boost social impact, a specially selected DataCorps project then fully integrates it into the host organization over a six-month period.

For every David, there’s a Goliath lurking out there somewhere. And the world we inhabit today teems not only with Big, but Giant Data. This isn’t just the statistics computed to furnish in a Non-Profit’s annual reports or the World Bank’s tables, or even decade-end census figures. Neither is it the information gleaned from large-scale randomized controlled trials on policy effectiveness.

Sure, all of that is pretty and polished. But to (grossly) twist a John Legend classic, quantitative analysts today have to learn to love data with ‘all its curves and edges, all its perfect imperfections’. And this could either pop up mercilessly in real time (through the spread of social media, mobile devices and sensor devices) or turn musty over years in impregnable government PDFs.

So no matter what fancy statistical technique DataKind may have planned, the first step of problem solving remains the same. All available data, whether from partners or scraped off the net, must be tamed and standardized into a format suitable for computers to perform their magic on. Once this foundation is laid, applications of data science to governance could be classified broadly into two use cases. We will explore each with a pair of DataKind Bangalore partners.

The first centres on the executive wing of public administration, specifically its interface with citizens at the municipal ward level. Hell hath no fury like a Smartphone owner scorned. Naturally, public officials often feel overwhelmed and too understaffed to deal with the volume and variety of citizens’ complaints. As a first remedy, duplicates must be cleared, i.e. the separate entries that many citizens lodge for the same issue must be merged. The remaining complaints must then be allocated to the appropriate authority for resolution.
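To make that two-step remedy concrete, here is a minimal sketch in Python. It is not the method used by any partner named in this series: the field names, the routing table and the rule that two complaints are duplicates when ward, category and normalized location match are all assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical routing table: complaint category -> responsible authority.
ROUTING = {
    "pothole": "Roads Department",
    "garbage": "Solid Waste Management",
    "streetlight": "Electrical Department",
}

def normalize(text):
    """Lowercase and collapse whitespace so near-identical entries match."""
    return " ".join(text.lower().split())

def deduplicate(complaints):
    """Group complaints that share ward, category and normalized location."""
    groups = defaultdict(list)
    for c in complaints:
        key = (c["ward"], c["category"], normalize(c["location"]))
        groups[key].append(c["id"])
    return groups

def allocate(groups):
    """Produce one ticket per distinct issue, routed to an authority."""
    tickets = []
    for (ward, category, location), ids in groups.items():
        tickets.append({
            "ward": ward,
            "category": category,
            "location": location,
            "reports": len(ids),  # how many citizens reported this issue
            "authority": ROUTING.get(category, "General Helpdesk"),
        })
    return tickets

# Made-up sample data, for illustration only.
complaints = [
    {"id": 1, "ward": 12, "category": "pothole", "location": "MG Road  near Metro"},
    {"id": 2, "ward": 12, "category": "pothole", "location": "mg road near metro"},
    {"id": 3, "ward": 7, "category": "garbage", "location": "5th Cross"},
]
tickets = allocate(deduplicate(complaints))
```

Real complaint text is far messier than this, so exact matching on a cleaned-up location string would miss many duplicates. A production system would more plausibly compare coordinates or use fuzzy text matching.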

For example, ISIF 2015 award winner (and coincidentally one of DataKind Bangalore’s inaugural partners), Janaagraha has leveraged its ‘I Change My City’ online platform and mobile app to empower over 2 million Bangalore citizens to lodge over 36,000 complaints on daily hassles such as potholes, garbage left in the open, streetlights, etc.

With some practice on previous years’ data, a computer can soon begin to predict where and when they are likely to emerge, and calculate the probability that they will be resolved. Machine Learning, Mamma Mia! The next entry in this series will explore the mechanics of such an analysis both for the established Janaagraha initiative as well as the newly commenced e-Governments Foundation project in the neighbouring metropolis of Chennai.
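For a rough feel of what "calculating the probability that a complaint will be resolved" can mean, the sketch below estimates it per category from past outcomes. This is a deliberately simple stand-in (a smoothed historical frequency), not a full machine learning model and not the analysis used on any project mentioned here. The data and the function name `resolution_rates` are hypothetical.

```python
from collections import Counter

def resolution_rates(history, alpha=1.0):
    """Estimate P(resolved | category) from past (category, resolved) pairs.

    Uses add-alpha (Laplace) smoothing so that categories with very few
    past complaints are not assigned a probability of exactly 0 or 1.
    """
    total = Counter()
    resolved = Counter()
    for category, was_resolved in history:
        total[category] += 1
        if was_resolved:
            resolved[category] += 1
    return {
        cat: (resolved[cat] + alpha) / (total[cat] + 2 * alpha)
        for cat in total
    }

# Made-up history of past complaints: (category, was it resolved?)
history = [
    ("pothole", True),
    ("pothole", False),
    ("pothole", True),
    ("garbage", True),
]
rates = resolution_rates(history)
# rates["pothole"] is (2 + 1) / (3 + 2) = 0.6
```

A real model would also take into account where and when each complaint was lodged, which is what makes it possible to predict where and when new ones are likely to emerge.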

The second approach turns to the judiciary and public finances by visualizing data over time or in specific areas. This allows for identifying trends to take action (for public officials themselves) or demand good governance (for citizens and activists). For example, a brief mapping exercise with the Bangalore Police helped them deduce the location of organized gangs (mostly around open public spaces) and then snatch up and enchain some unassuming chain-snatchers. But more importantly, such visualization converts endless and inscrutable reams of data into a clear and visually engaging narrative.  The final installments of this series will compare applications of data cleaning and visualization to two freshly minted DataKind Bangalore partnerships.

First, DAKSH and its Rule of Law project aim to throw light on another category of the overwhelmed government employee- judges at the District, State and National levels. By mapping and quantifying India’s notoriously high case pendency across courts, DAKSH aims to foster informed public debate and develop sustainable solutions.

Second, the Centre for Budget and Government Accountability from New Delhi is striving to develop a detailed data Portal on Union and State budgets in India since 2005 and expose any discrepancies between funds allocated and those actually spent. With both partners, DataKind will help discipline and visualize unruly giant data for a simplified user experience that provides not only intelligible insights but impetus for informed action.
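The core comparison behind such a budget portal, allocation versus actual spending, can be sketched in a few lines of Python. The scheme names, figures and 10% threshold below are invented for illustration. They are assumptions and do not reflect any real budget data or the partner's own methodology.

```python
def utilization_gaps(records, threshold=0.10):
    """Flag budget lines whose spending differs from the allocation
    by more than `threshold`, expressed as a fraction of the allocation."""
    flagged = []
    for r in records:
        allocated, spent = r["allocated"], r["spent"]
        if allocated <= 0:
            continue  # skip lines with no allocation to avoid dividing by zero
        gap = (allocated - spent) / allocated  # positive means underspent
        if abs(gap) > threshold:
            flagged.append((r["scheme"], r["year"], round(gap, 3)))
    return flagged

# Made-up figures, in arbitrary currency units.
records = [
    {"scheme": "Scheme A", "year": 2013, "allocated": 100.0, "spent": 95.0},
    {"scheme": "Scheme B", "year": 2013, "allocated": 200.0, "spent": 120.0},
]
flagged = utilization_gaps(records)
```

In this toy data only Scheme B is flagged, since 40% of its allocation went unspent. The hard part in practice is the step before this one: getting years of budget documents into consistent rows like these.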

So there we have it: common citizens in the world’s largest democracy harnessing internet technologies for improved transparency and accountability. The world has changed dramatically since Gandhi overthrew a colonial regime through the power of a clear national message and a transformed culture of community movements. It remains to be seen whether embedding technology and data-driven decision-making within organizations can help create a similar impact on the dramatically different challenges of the present day.

Two groups who believe in this potential- Prime Minister Modi and DataKind Bangalore- may have now caught the flight back to India to achieve their mission. But now it’s time for you to fasten your seatbelts. Stay tuned as we embark on new adventures with two fascinating methodologies applied to pioneering and passionate partners in the Silicon Valley of the East. No matter how long the seed needs to take root, and whether this experiment fails or succeeds- it’s definitely a journey you don’t want to miss.

Abhishek Pandit is a Strategy Consultant at ChaseFuture