NYU Tandon School of Engineering Research Team Tackles Tough Data Science Challenge, Vows to Update Its Public Database Weekly Through November Elections
Donald Trump, Planned Parenthood are Top Advertisers
A team of researchers from NYU Tandon School of Engineering and NYU Shanghai has released the first in-depth analysis of Facebook political advertising, as part of an ongoing cybersecurity research project.
Conceived by Computer Science and Engineering Assistant Professor Damon McCoy, the Online Political Ads Transparency Project has built easy-to-use tools to collect, archive, and analyze political advertising data. The researchers, including NYU Tandon doctoral student Laura Edelson and Shikhar Sakhuja NYUSH ‘19, pledged to improve the transparency of Facebook’s archive by releasing weekly updates of all political advertisements collected through the November election. The team also plans to use its complex data scraping methods to reveal similar information for Twitter.
Although Facebook became the first major social media company to launch a searchable archive of political advertising, for both Facebook and Instagram, in May 2018, McCoy found the archive difficult to use, requiring time-consuming manual searches. He decided to apply versions of the data scraping techniques he had previously used against criminals, including human traffickers who advertised and used Bitcoin.
McCoy and his team praised Facebook for its pioneering transparency in establishing a public archive and for its plan to launch an API (application programming interface) that will enable large-scale analysis; however, Facebook has not specified when in 2018 it will launch this API. “We wanted to quickly give voters easy tools to understand who is advertising and what they are advertising, as well as how much is being spent to influence votes and the targets of the ads,” McCoy said.
His team analyzed more than 267,000 political ads that primarily ran between May 2018 and July 2018 and reported:
- Facebook and Instagram users viewed political ads at least 1.4 billion times – and impressions may have reached nearly 3.9 billion. (Facebook’s data provide only ranges.) The NYU Tandon team is quickly adapting its web crawler to add information on videos and images.
- Political spending equaled at least $13.9 million and could have been five times that – the uncertainty is due to the ranges provided in the original data.
- Males aged 25-34 were targeted the most.
- The most ads per capita appeared in Washington, DC, followed by Nevada, Colorado, and Maine. The fewest appeared in Delaware, Nebraska, and New Hampshire.
- The Facebook archive’s misidentifications – for example, mistaking a political action committee for a person, or labeling a political-themed clothing outlet as a political entity – are among numerous hurdles to meaningful and automated analysis.
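Because the archive reports impressions and spending only as ranges, totals such as "at least 1.4 billion" and "nearly 3.9 billion" are presumably obtained by summing the low and high ends of each ad's range separately. The following is a minimal sketch of that arithmetic, not the researchers' actual code: the `AdRecord` type, its field names, and the sample range values are all hypothetical and invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class AdRecord:
    # Hypothetical record layout: each metric is stored as a (low, high) range.
    impressions_low: int
    impressions_high: int
    spend_low: int   # US dollars
    spend_high: int  # US dollars


def total_bounds(ads):
    """Sum the low and high ends of every range to get overall min/max totals."""
    return {
        "impressions_min": sum(a.impressions_low for a in ads),
        "impressions_max": sum(a.impressions_high for a in ads),
        "spend_min": sum(a.spend_low for a in ads),
        "spend_max": sum(a.spend_high for a in ads),
    }


# Made-up sample data; these ranges are illustrative, not real archive buckets.
sample = [
    AdRecord(1_000, 4_999, 0, 99),
    AdRecord(10_000, 49_999, 100, 499),
    AdRecord(0, 999, 0, 99),
]

totals = total_bounds(sample)
print(totals)
```

The gap between the minimum and maximum totals widens with every ad added, which is why the reported figures can differ by a factor of several.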
A heat map showing how Facebook political advertising varied widely from state to state.
Top political advertisers and their minimum impressions and spending:
- The Trump Make America Great Again Committee: 4,127 ads, 26.4 million impressions, $190,400
- Planned Parenthood Federation of America: 3,389 ads, 24.5 million impressions, $188,800
- AAF Nation, LLC (manufacturer of political-themed clothing): 862 ads, 18.4 million impressions, $78,900
- National Rifle Association: 213 ads, 18.3 million impressions, $58,000
- Beto for Texas (Democrat running for Senate): 377 ads, 13.0 million impressions, $194,400
- Priorities USA Action and Senate Majority PAC: 2,794 ads, 12.9 million impressions, $120,600
- NowThis (liberal-leaning media company): 35 ads, 11.6 million impressions, $7,400
- Donald J. Trump for President, Inc.: 5,396 ads, 11.3 million impressions, $83,700
- 4Ocean, LLC (focused on reducing ocean pollution): 78 ads, 10.6 million impressions, $68,200
- Care2 (creates social networking around causes): 557 ads, 10.1 million impressions, $99,900
The data also revealed substantial online advertising by candidates in congressional and state races.
A significant number of ads – 43,573 – did not comply with Facebook’s new requirement that political ads list sponsors and were therefore shut down, but the NYU researchers’ daily archiving captured these “unvetted sponsor” ads. The researchers noted that some of the offenders may have been caught off guard by the policy change. They also noted that while Facebook reduced the time it takes to shut down these ads from 26.4 days to 5.6 days, the delay remains longer than the ads typically run.
The team reported the top five unvetted sponsors as identified by Facebook and their minimum impressions and spending:
- American AF: 253 ads, 8.2 million impressions, $103,800
- National Rifle Association of America/NRA: 56 ads, 7.9 million impressions, $78,500
- I’ll Go Ahead and Keep My Guns, Thanks (listed as a media company): 26 ads, 7.6 million impressions, $120,300
- China Xinhua News: 44 ads, 6.8 million impressions, $6,000
- Walmart: 18 ads, 5.8 million impressions, $51,900
Visit the project at https://online-pol-ads.github.io and download the data at https://github.com/online-pol-ads/FBPoliticalAds/tree/master/RawContentFiles.
An Interview with Shikhar:
The Gazette caught up with Shikhar Sakhuja ‘19, who spoke of the motivation and hard work behind the project:
1. How long have you been working in this research team? What are your responsibilities?
I have been working with the team since June 2018. I have been sharing responsibilities with another researcher in both research and engineering. I laid the groundwork for mining the data from Facebook, which we eventually cleaned and analyzed.
2. What motivated you to join the project in the first place?
I joined the research team, led by Professor Damon McCoy, and the project because of their inclination towards solving problems that have a social impact and have a broader scope beyond the traditional world of Computer Science.
3. What have you learned through this experience?
Through this project, I have discovered a deep interest in the field of Social Network analysis. We live in an age where everything is driven by social networks, even the American Elections, apparently. I would love to keep doing research in the field. The project gave me an opportunity to hone my skills in Data Mining, Machine Learning, Database Design and Management, and Software Engineering in general.