Collecting Fake Profiles on LinkedIn

I had this idea while working on The Selenium Project, where I build small bots to automate daily tasks, although collecting fake profiles on LinkedIn is otherwise unrelated to that work. I wanted to see whether the idea would really work. There were several problems along the way, and in this post I’ll explain everything from the idea to the code.

My Selenium Project

When I started learning and experimenting with Python, I wondered whether we could automate the way we browse the internet: code that signs in automatically, then follows, likes, or clicks a button. That thought led to the Selenium Project.

Scraping LinkedIn? Hold up!

I chose LinkedIn because:

  1. It is a professional platform (it tends to have less spam than others)
  2. It was gradually getting spammed anyway, and I wanted to give it a shot

When I first started automating LinkedIn tasks like logging in, following, and viewing other profiles, I realized it wasn’t an easy task. Here are some problems with automating LinkedIn:

  • LinkedIn easily detects bot activity
  • We cannot view everyone’s profile easily
  • Search and scraping are strictly limited on this platform

However, I wasn’t scraping all the data from LinkedIn. I made some tweaks to how Selenium works, and in the end it worked.

Did I really collect Fake Profiles?

Well, here’s the thing.

To really identify a fake profile, we would need to scrape all of a profile’s activity, from posts to likes, and examine the profile photo. Even then, we might need some help from machine learning.

I had just started with Selenium and didn’t want to stop my project before it even began. Then I had an idea, a simple one, but it does the job.

Fake Profiles, of Whom?

We can’t go after every random profile we find and investigate its authenticity. It might be possible, but it wasn’t for me at that time. However, there were some easy targets: celebrities and CEOs with both unique names and designations. Yes!

Here’s what I did:

  1. Pick a celebrity name, preferably a CEO
  2. [Automation starts]
  3. Search for their name
  4. Collect all profile links
  5. Find profiles that contain their exact name
  6. Success!

I managed to get 50-75 profiles per search.

* Not all of them are fake, but a majority, around 90%, were fake or spam.

I chose big CEOs because there were many fake profiles in their names, and they were an easy target compared to any other profile category on LinkedIn.

So, let’s get started with the code now!

Requirements

  • Selenium (pip install selenium)
  • Python 3+
  • Chrome Web Driver (ChromeDriver)

Remember, the major version of your ChromeDriver must match the major version of your Chrome browser.
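
A quick way to compare the two is to look at the first number of each version. The sketch below assumes both versions are dotted strings such as the ones shown on chrome://version and by chromedriver --version. The helper names and the sample versions are hypothetical and only illustrate the check.

```python
def major_version(version):
    """Return the leading number of a dotted version string."""
    return int(version.strip().split(".")[0])


def versions_compatible(browser_version, driver_version):
    """True when the browser and the driver share the same major version."""
    return major_version(browser_version) == major_version(driver_version)


# Hypothetical version strings, used only as an example.
same_major = versions_compatible("114.0.5735.199", "114.0.5735.90")
different_major = versions_compatible("115.0.5790.102", "114.0.5735.90")
print(same_major, different_major)
```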

Code to Collect Fake Profiles on LinkedIn

As I said before, I used Python, Selenium, and Chrome Web Driver.

If you’re trying this, make sure you have everything set up.

Start Session

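from selenium import webdriver
from selenium.webdriver.chrome.service import Service
import time

# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service("C:/WebDrivers/chromedriver.exe"))
driver.get('https://www.linkedin.com/uas/login')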

Login

from selenium.webdriver.common.by import By

# Fill in the login form and submit it
text_area = driver.find_element(By.ID, 'username')
text_area.send_keys("your_email_here")
text_area = driver.find_element(By.ID, 'password')
text_area.send_keys("your_password_here")
submit_button = driver.find_elements(By.XPATH, '//*[@id="app__container"]/main/div/form/div[3]/button')[0]
submit_button.click()

If the XPath of the submit button doesn’t work, you can find a new one easily. We use the element id and XPath to enter data and click buttons:

  1. Right-click on the required element (button or field)
  2. Select the Inspect option
  3. In the HTML code, look for the id attribute of the element

In the HTML the id appears as an attribute, for example id="button-login". In a CSS selector the same id is written with a leading ‘#’, like #button-login.

Sometimes the id method doesn’t work. In that case, repeat the same steps, i.e. right-click the element in the inspector and copy its XPath instead.
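
To show how the two notations relate, here is a minimal sketch that turns an id written like #button-login into an XPath of the same shape as the ones used in this post. The helper name id_selector_to_xpath is hypothetical, and the resulting string still has to be passed to Selenium yourself.

```python
def id_selector_to_xpath(selector):
    """Turn '#some-id' (or a bare 'some-id') into an XPath that matches that id."""
    element_id = selector.strip().lstrip("#")
    return f'//*[@id="{element_id}"]'


xpath = id_selector_to_xpath("#button-login")
print(xpath)
```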

List

# Names to search for
s=["Amitabh Bachchan","Jeff Bezos","Mark Zuckerberg"]
l=len(s)

Search and crawl

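The function searches for a name and prints the links that contain an exact match.

from selenium.webdriver.common.by import By

def search(m):
    driver.get("https://www.linkedin.com/feed/")
    # "ember41" is an auto-generated id and may differ in your session
    search_box=driver.find_element(By.XPATH, "//*[@id=\"ember41\"]/input")
    search_box.send_keys(m,"\n")
    time.sleep(5)
    print("Searching",m)
    # Collect the href of every link on the results page
    res=[]
    for a in driver.find_elements(By.XPATH, './/a'):
        res.append(a.get_attribute("href"))
    # Build the lowercase, hyphenated form of the name, e.g. "jeff-bezos"
    n=m.lower().replace(" ","-")
    print(n)
    # Keep only the links that contain the hyphenated name
    for link in res:
        if link and n in link:  # some anchors have no href
            print(link)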
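
The matching step inside search() can be tried without a browser. This is a minimal sketch that assumes the profile links contain the lowercase, hyphenated name, as the script expects. The function names and the sample links are hypothetical and invented for illustration.

```python
def name_to_slug(name):
    """Lowercase a display name and replace spaces with hyphens."""
    return name.lower().replace(" ", "-")


def filter_profile_links(links, name):
    """Keep the links whose URL contains the hyphenated form of the name."""
    slug = name_to_slug(name)
    # Anchors without an href come back as None, so skip those.
    return [link for link in links if link and slug in link]


# Hypothetical sample links, invented for illustration only.
sample_links = [
    "https://www.linkedin.com/in/jeff-bezos-0000000/",
    "https://www.linkedin.com/in/someone-else-1111111/",
    None,
    "https://www.linkedin.com/feed/",
]
matches = filter_profile_links(sample_links, "Jeff Bezos")
print(matches)
```

Keeping the filter separate from the browser code makes it easy to check the matching rule on a handful of links before running a real search.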

Iterator & Result

for z in s:
    search(z)

End of code for Fake Profiles on LinkedIn

2. Brute Force Collector

This method also collects fake profiles on LinkedIn, but it focuses on a single name and returns more results, around 50-70 links per search, because it goes through the search results pages one by one and gathers more links for one query.

Query start

m="Jeff Bezos"
final_list=[]

The only thing that makes this script intelligent is that it collects all links and then filters them to match the exact keywords or names we’re looking for. It also goes through more search results pages. The number of pages is easy to adjust, which brings in more results per search query.

from selenium.webdriver.common.by import By

def collect_matches(n, num):
    # Store every link on the current page that contains the hyphenated name
    for a in driver.find_elements(By.XPATH, './/a'):
        link=a.get_attribute("href")
        if link and n in link:  # some anchors have no href
            print("Link ",num," : ",link)
            num=num+1
            final_list.append(link)
    return num

def search(m, pages=5):
    driver.get("https://www.linkedin.com/feed/")
    # "ember41" is an auto-generated id and may differ in your session
    search_box=driver.find_element(By.XPATH, "//*[@id=\"ember41\"]/input")
    search_box.send_keys(m,"\n")
    time.sleep(5)
    print("Searching",m)
    # Build the lowercase, hyphenated form of the name, e.g. "jeff-bezos"
    n=m.lower().replace(" ","-")
    print(n)
    # First results page
    num=collect_matches(n, 1)
    current=driver.current_url
    # Following results pages, starting at page 2; adjust "pages" for more results
    for j in range(2, 2+pages):
        driver.get(current+"&page="+str(j))
        time.sleep(2)
        num=collect_matches(n, num)

search(m)
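
The script builds each results page address by appending "&page=" to the current URL, which assumes that the URL already has a query string. Below is a minimal sketch of a more careful way to do this with the standard library. The with_page helper and the sample URL are hypothetical and not part of the original script.

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse


def with_page(url, page):
    """Return url with its 'page' query parameter set to the given number."""
    parts = urlparse(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["page"] = str(page)  # adds the parameter or overwrites an old value
    return urlunparse(parts._replace(query=urlencode(query)))


# Hypothetical results URL, for illustration only.
base = "https://www.linkedin.com/search/results/all/?keywords=Jeff%20Bezos"
page_urls = [with_page(base, p) for p in range(2, 7)]
for url in page_urls:
    print(url)
```

Each address in page_urls could then be passed to driver.get() in place of the string concatenation.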

The Result – Collects 70+ Profiles with exact match

for x, link in enumerate(final_list):
    print(x," : ",link)
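
Because the script stores every matching link it sees, the same profile may appear more than once in final_list, for example with different query parameters. That is an assumption about the results, not something measured here. A minimal sketch for removing such repeats follows; the helper names, the trk parameter, and the sample links are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(link):
    """Drop the query string, the fragment, and any trailing slash."""
    parts = urlsplit(link)
    return urlunsplit((parts.scheme, parts.netloc, parts.path.rstrip("/"), "", ""))


def unique_links(links):
    """Remove duplicates while keeping the first-seen order."""
    return list(dict.fromkeys(normalize(link) for link in links))


# Hypothetical collected links, invented for illustration.
collected = [
    "https://www.linkedin.com/in/jeff-bezos-0000000/",
    "https://www.linkedin.com/in/jeff-bezos-0000000?trk=abc",
    "https://www.linkedin.com/in/jeff-bezos-2222222/",
]
deduped = unique_links(collected)
print(deduped)
```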

Sample Result Data

Searching Jeff Bezos

Fake Profiles collected on LinkedIn
* Not all of them are fake profiles on LinkedIn

Even though we are using a unique name, there are probably some real people with the same name. I have checked the results, though, and in an average search about 90% are fake.

Note: This is a fun project. Data scraping and excessive use of this script might get your LinkedIn account banned, so use it for educational purposes only. Cheers, have fun.

This post is published as part of my project called “The Selenium Project”, where I automate the boring stuff, mostly using Python and Selenium. If you find it interesting, check it out and drop a star on the GitHub repository.
