Job Board Data Pollution: An Introduction
A series unraveling the web of job board data pollution: fraud, expired listings, and duplication. Learn how to navigate toward a cleaner job market analysis.
Have you ever been asked to pay a fee to enter data into a spreadsheet? To wire the CEO of a wine import company $5,000 in exchange for a check, a job, and a case of vintage Cab? Been pressed for your Social Security Number via a Hotmail account or tripped over sentences like “seeking for a part time job you can work from home and earn $500 and above daily but urgent and apply now best time is now really”? Maybe you’ve clicked through job ads from dream companies only to find the same listing somewhere else, and then in many places, all detailing different instructions and leaving you with nothing but tangled fistfuls of your own hair.
If you answered yes to any of the above, we’ll bet you’ve spent at least thirty seconds on the average online job board.
Earlier this week, we debunked the myth of Ghost Jobs and showed how the appearance of inflated job listings is primarily a problem of inflated data, and not of job listings themselves. Here’s the thing: bad job data comes from job boards. Consider this our PSA: job boards are factory farms of data pollution and any analysis of the job market that draws from them will be distorted.
How does this happen?
Let’s start with what a job board is and how it operates. Job boards make a profit by promoting employers’ job postings on their website. Generally, the sites are free to use for job seekers, who flood the boards by the thousands to shoot off applications as easily and broadly as possible. The more open jobs these sites appear to offer, the more applicants they’ll attract, and a larger applicant pool will, in turn, secure more investment from hiring managers.
This feedback loop generates unintended consequences; when it comes to data pollution, three misbegotten offspring of the job boards skip to the front of the line: fraud, expired listings and duplication.
There’s no bigger fan of the low barrier to posting on job boards than the con artist. Protected by the apparent legitimacy of good company (Fortune 100 companies post to job boards, as do most reputable companies), fraudsters spin out B.S. postings to phish for personal information, credit card and Social Security numbers, any crumbs of data that might lead to a successful theft from job seekers. Forbes reports that job fraud is on the rise, with 14 million people exposed to job scams in the first quarter of 2022 alone. These jobs obviously do not correspond to the real job market, but when providers sell job board data, they don’t separate the wheat from the chaff. Millions of “jobs” that never had any intention of leading to a hire are thus incorporated into the data, giving the impression that job listings don’t actually correspond to hires.
If not a bigger threat to job seekers themselves, duplicate jobs are definitely the more pervasive threat to data quality at large: there are just so many of them. Millions of jobs are aggregated from other job boards, then syndicated to a network of other job sites or reposted elsewhere to drive traffic and satisfy the marketing priorities of hiring managers. This rapid multiplication of a single job into many duplicate listings gluts the data with false signals.
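To make the inflation concrete, here is a minimal sketch of how one opening syndicated across several boards can be collapsed back into a single job. It is an illustration under stated assumptions, not a description of any provider's actual pipeline: the field names (`company`, `title`, `location`, `source`), the sample records, and the choice of a normalized company/title/location key are all hypothetical. Real deduplication generally needs fuzzier matching than this.

```python
import re
from collections import defaultdict

def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())

def dedup_key(listing):
    """Hypothetical matching key: normalized company, title, and location."""
    return tuple(normalize(listing[f]) for f in ("company", "title", "location"))

def collapse_duplicates(listings):
    """Group listings that share a key; each group stands for one real job."""
    groups = defaultdict(list)
    for listing in listings:
        groups[dedup_key(listing)].append(listing)
    return groups

# Made-up sample data: one job copied to three boards, plus one unrelated job.
listings = [
    {"company": "Acme Corp.", "title": "Data Analyst", "location": "Austin, TX", "source": "board_a"},
    {"company": "ACME Corp", "title": "Data  Analyst", "location": "Austin TX", "source": "board_b"},
    {"company": "Acme Corp", "title": "Data Analyst", "location": "Austin, TX", "source": "board_c"},
    {"company": "Globex", "title": "Recruiter", "location": "Denver, CO", "source": "board_a"},
]

groups = collapse_duplicates(listings)
raw_count = len(listings)      # what a naive count of listings would report
unique_count = len(groups)     # distinct jobs after collapsing duplicates
print(f"{raw_count} listings, {unique_count} distinct jobs")
```

In this toy sample, a naive count reports twice as many openings as actually exist, which is the kind of false signal duplication introduces.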
Another flaw in the job ad model is the duration of listings on job boards. Companies looking to advertise an opening often purchase ad space on job boards for a fixed term, say 30 days. Even if the job is filled on day 7, the listing continues to appear on boards for the remaining 23 days of the term. These expired jobs form yet another thorn in the side of accurate, timely data.
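The arithmetic behind that example can be sketched in a few lines. This is only an illustration: the function name `stale_days`, the dates, and the reading of "day 7" as seven days after posting are assumptions made for the example.

```python
from datetime import date, timedelta

def stale_days(posted, term_days, filled):
    """Days a paid listing stays live after the job has already been filled.

    posted: date the ad went live
    term_days: length of the purchased ad term, in days
    filled: date the job was filled, or None if it is still open
    """
    ad_end = posted + timedelta(days=term_days)
    if filled is None or filled >= ad_end:
        return 0  # still open, or filled after the ad had already ended
    return (ad_end - max(filled, posted)).days

# Hypothetical dates matching the 30-day term and day-7 fill described above.
posted = date(2023, 11, 1)
filled = posted + timedelta(days=7)
print(stale_days(posted, 30, filled))  # 23
```

Under these assumptions, the listing advertises a job that no longer exists for 23 of its 30 days on the board.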
In a series of posts over the next few weeks, we’ll walk you through these sources of data pollution in greater depth. To give you a comprehensive view of the problem, we’ll explain how fake and duplicated jobs are generated and by whom, how these warped elements disrupt accurate analysis, the kinds of misguided conclusions they often point to, and, crucially, how to detect and avoid the job board data trap.
It’s difficult enough marshaling actionable predictions about the ever-shifting job market when the dataset is pristine. Doing it with corrupt data is an absolute shot in the dark. Better job data means better predictions. That’s why LinkUp never sources from job boards. It pulls directly from over 60,000 company websites, all over the country and around the globe, every day.
Job Board Data Pollution Blog Series:
Job Board Data Pollution: The Nefarious Underworld of Fake Jobs (Part 1)
Job Board Data Pollution: How Duplication Bloats the Labor Market (Part 2)
Job Board Data Pollution: Expired Listings Slow Data Down (Part 3)