Do you remember reading news reports a couple of months back about Cyberabad police arresting a man from Faridabad, Vinay Bhardwaj, for data theft of 66.9 crore individuals?
It was reported as the ‘biggest’ data theft bust, and according to the police, the stolen data belonged to major companies like Facebook, Amazon, Big Basket, PhonePe, Netflix, PolicyBazaar, among others. They also stated that the accused had been holding data from 138 categories containing sensitive information of government, private organizations, and individuals.
Cyberabad police revealed that the accused operated from a website called “InspireWebz” and sold these databases via Google Drive links.
Searching the keyword “InspireWebz” led me to a whole new corner of the Indian internet where ‘social media marketing experts’ moonlight as resellers of personal user data of crores of Indians.
When I first googled the name “InspireWebz”, their website was still active, but it has since gone offline. However, according to the Internet Archive captures of their website (available since mid-2020), they sold dodgy-looking social media marketing tools like bulk WhatsApp/SMS senders and data extractors.
I could not find anything about databases of Indians available for sale on their website.
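For readers who want to check archived captures of a defunct site themselves, here is a minimal Python sketch of one way to do it, using the Internet Archive's public Wayback availability endpoint. This is only an illustration and not necessarily how I looked things up (the Archive's web interface works just as well). The function names and the `example.com` domain are placeholders of my own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public Wayback Machine availability endpoint.
WAYBACK_API = "https://archive.org/wayback/available"


def build_query(site, timestamp=None):
    """Build the lookup URL. `timestamp` is an optional YYYYMMDD string;
    the API returns the capture closest to that date."""
    params = {"url": site}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{WAYBACK_API}?{urlencode(params)}"


def closest_snapshot(payload):
    """Return (snapshot_url, snapshot_timestamp) from a decoded API
    response, or None when no capture is available."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def lookup(site, timestamp=None):
    """Query the API over the network (hypothetical helper name)."""
    with urlopen(build_query(site, timestamp), timeout=10) as resp:
        return closest_snapshot(json.load(resp))


if __name__ == "__main__":
    # Placeholder domain; swap in the site you want to check.
    print(lookup("example.com", "20200601"))
```

The parsing is kept separate from the network call so that the logic can be checked against a saved response without hitting the Archive.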
Luckily, some of their socials are still active (at the time of writing this sentence). Their YouTube channel is especially interesting: it has 2.56k subscribers and contains tutorials for the digital marketing tools they sold on their website.
According to the news reports, police also arrested another Faridabad man, Madan Gopal Sharma, in connection with this data theft. From the (now inactive) Instagram mutuals of Inspirewebz, it was easy to find Madan Gopal Sharma, aka Monty Bhai, who also runs a digital marketing company called ShivaanshSoft.
Their website “ShivaanshSoft.com” is now defunct, but it sold digital marketing tools similar to Inspirewebz's. The Internet Archive capture list for their website has tons of weird NSFW URLs and posts.
They also have a YouTube channel, which has 1.38k subscribers, and its videos are similar to the ones on Inspirewebz’s YouTube. At this point, I got a bit curious about the bulk message senders and data extraction tools that both of these companies advertised and sold.
I did a quick search for tools like WhatsApp Bulk Sender and Justdial Data Extractor. To my surprise, I found dozens of similar-looking websites, FB pages, and YouTube videos selling the same tools at different prices. I could not identify the original creator of these tools, but most of these pages and posts were created after mid-2020.
It became clear to me that the same toolkit bundle is available for white labelling to a wide community of people – all self-proclaiming themselves as social media marketing experts.
This is when I started to find little breadcrumbs here and there from some of these companies and individuals also advertising databases of crores of Indians available for sale along with the toolkit. Here are a few of several such pages (all still online):
And it makes sense why these databases are included in the toolkit. To use tools like bulk email, SMS, and WhatsApp message senders, buyers also need phone numbers and email addresses in bulk to send messages to.
It is odd how these pages are still active after all the news reports. I can only assume they haven't seen the news, have forgotten about the existence of these pages, or just don't care. If I could find so many of them still advertising databases for sale, I wonder how many were deleted in the past couple of months.
I skimmed through the YouTube channels of some of these companies/individuals. Like Inspirewebz and ShivaanshSoft, all of them contain tutorials for the same tools, presented by different people.
But here is the interesting bit about these tutorials – they share their personal computer screens in these videos. For example, to explain how to use WhatsApp Bulk Sender, they share their personal and unblurred WhatsApp chat screen in the video. Their Windows desktop with all the files and folder names is also visible. It's a lot of information for anyone looking.
Check the filenames in this example:
In one of the videos from Monty Bhai's ShivaanshSoft YouTube channel, the person sharing the screen in the video nonchalantly opens a folder containing hundreds of these Excel spreadsheets:
In one of their other videos, they momentarily opened a sticky note containing all of their company usernames and passwords and didn't think to edit it out of the final video. I got the impression that they're not serious people.
Considering how recklessly they were sharing their desktop screens in these videos, I wondered whether one of these individuals had also accidentally leaked a Google Drive link to the databases.
It didn’t take me long to find a YouTube video (still active) in which a voice enthusiastically advertises a database containing personal details of more than a hundred crore Indians available for sale. A Google Drive link is open in his web browser, with the URL visible in the address bar, as he excitedly scrolls through the database list. 🤷‍♂️
I thought to myself, could this be it?
I screen-grabbed the URL from the video and opened the Google Drive link. And... there it was. The whole database in front of my eyes. I was staring at hundreds of spreadsheets and zipped files. It was too overwhelming for me to wrap my head around the sheer volume of data available in these files.
I kept thinking that it shouldn't have been this easy to find these databases on the open web, through unsecured Google Drive links that anyone could reach with, as I have shown, very little sleuthing.
Here is a little video I prepared of me scrolling this list. There were tons of files in this folder.
With a spring in my step after finding one database link, I did a few more searches using different Google dorks to see if I could follow any more trails that could lead me to more such open links.
And sure enough, I discovered a few more Drive and Megaupload links carelessly dumped (probably by some of the buyers) on document-upload websites like Scribd - again available in plain sight for anyone looking.
Here are some of the redacted screenshots of more copies of these databases:
I checked whether the database list matched the one described by Cyberabad police. They reported that the database was divided into 138 categories, and the one I found had 138 categories too.
Analysing all of these files in detail is beyond the scope of this post. As the sole writer of this free substack, I don't have the time, resources, or ability to chase this goose.
I do have some thoughts though after glancing through some of the files. A disclaimer – my observations are only based on my very limited and superficial reading of a few files from this enormous database.
Data in a majority of these categories seem to be obtained from scraping websites like Google, Justdial, IndiaMart, LinkedIn, etc. (possibly using their own data extraction tools from the toolkit). This is not necessarily personal or accurate data and is publicly available online.
There are a bunch of files with only phone numbers or email addresses – sorted location-wise or category-wise. Fraudsters could use these to send targeted spam messages or phishing emails in bulk. But no other personally identifiable information about any individual exists in these files.
Data in certain categories, like user data from banks and private companies, does contain confidential personal information. The data format and header row names in these files are too specific to be made up. This points to some kind of data theft from within these companies.
Rather than a data breach by an outside attacker, it seems more likely that these files were stolen either by employees within these organisations or by third-party vendors with whom many banks and companies share their data for various operational purposes. There is precedent for this: internal data theft of this kind has happened several times in the past.
I am not sure how the Cyberabad police got to the 66.9 crore number, but the accuracy of this data needs to be studied. According to the last modified dates, many of these files are quite old. The validation and verification of personal user data in these files can only be performed with the assistance of companies to which the data belongs.
The folks that Cyberabad police have arrested are only a few of hundreds of people on the Indian internet running the same hustle. How many of them are they going to arrest? I don't believe any of these ‘social media marketing experts’ are masterminds behind this data theft. They are all small-time resellers who stumbled upon these tools and files, looking to make a quick buck.
The confidential user data in these files is not a result of any single massive data leak but a compilation of data stockpiled over the years from different groups of data brokers. It will be next to impossible to find original sources of these leaks.
Also, how do you contain a data theft like this? Different versions of this database are available on hundreds of different open drives and personal computers of people who have purchased or sold these databases.
The Digital Personal Data Protection Bill is yet to be presented in the Indian Parliament. From my understanding, I don't see how data thefts of this nature can be prevented even after these laws come into existence. It will take an organisation-wide cultural shift within companies to treat the personal and confidential information of their customers as personal and confidential. (yeah, not gonna happen in this country)
I will leave you with this fascinating YouTube video from a database reseller explaining the whole personal database-selling business in detail: how these databases are obtained, how accurate the data is, and nifty little tricks like how to sell the same Class X students database two years later as the Class XII students database.
At the end of the video, he says once you join this business, you become part of a community that works together as friends and brothers and shares and exchanges databases as per the demands of their clients.
If this enormous compilation of data of allegedly 66.9 crore individuals is any sign, it must’ve been a very fulfilling friendship.
thank you, this was an interesting read. a suggestion - you used a lot of technical knowledge to find out information. you could share some of those techniques in the article, it'd be helpful to us non-tech people. thanks
By that logic, should websites like https://www.apollo.io/ be banned? But everyone seems to be using them.