NYU Tandon Data Scientists Tackle Archives Built by Social Media Companies; the Need for Manual Labor and Artificial Barriers Sometimes Thwarts Transparency
President Trump, Texas Senate Candidate Beto O’Rourke, and Senate Republican PAC are Big Spenders
Facebook Advertisers Lean Left, Google Advertisers Lean Right
Using cutting-edge machine learning and data scraping tools, computer scientists at the New York University Tandon School of Engineering today released the first database and analysis of political advertising based on more than 884,000 ads identified by Google, Twitter, and Facebook.
The team launched their user-friendly Online Political Ads Transparency Project in July with data from Facebook, which was the first company to provide it. But the researchers were forced to switch techniques when Facebook blocked their data collection two weeks later. Today’s report is the first to include not only data from Facebook (including Instagram) but also data newly shared by Twitter and Google.
Although they found numerous roadblocks to meaningful transparency – ranging from faulty archives constructed in haste by the social media giants to varying definitions of “political advertising” and throttling of data collection by Facebook – NYU Tandon Computer Science and Engineering Assistant Professor Damon McCoy and his team nonetheless reported meaningful insights:
President Donald Trump and his PAC registered the largest number of ads of any candidate, due in large part to the preponderance of small, micro-targeted advertising. Virtually all were aimed at raising funds during the study period, September 9-22, 2018. The researchers found similar dominance by President Trump in their initial, Facebook-only, analysis.
The Democratic candidate for Senate from Texas, Beto O’Rourke, continued to be the apparent largest spender, mostly seeking small donations from outside his state via Facebook and Twitter. Although O’Rourke was the rare federal candidate unaffiliated with a PAC, he resembled other candidates in using social media to raise funds outside his own district, McCoy noted.
The Senate Leadership Fund, a Republican Super PAC, was the largest spender on Google and across all three platforms combined.
Priorities USA, a left-leaning PAC, was among the big spenders, but exact figures are not available because it collaborated on ad placements with other PACs.
Left-leaning organizations are the big spenders on Facebook and Twitter; on Google, the trend is reversed.
Facebook apparently carries the most political ads, but Google apparently ranks higher in impressions and spending. This is due in part to the large number of small, micro-targeted ads on Facebook (60 percent) and to the fact that the majority of spending on Google (61 percent) is by PACs, which are more likely to have large budgets. But analysis is muddied by the fact that both Google and Facebook disclose only ranges; only Twitter discloses exact spending and impressions. Each of the giants also defines “political advertising” differently. For example, Facebook alone includes non-media for-profit companies promoting slanted political content, companies selling merchandise with political messages, and solar panel firms with environmental messages. Google and Twitter, meanwhile, limited their reporting to only federal candidates, at least initially.
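To illustrate why range-only disclosure muddies cross-platform comparison, the following is a minimal sketch, not the researchers' actual method, of how a total can only be stated as a lower and upper bound when some ads report a range and others an exact figure. The `AdRecord` type, the `spend_bounds` function, and the sample figures are hypothetical and are not drawn from the report.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class AdRecord:
    """One ad's disclosed spend (hypothetical record shape)."""
    sponsor: str
    spend_low: float
    # None means the platform disclosed an exact figure rather than a range.
    spend_high: Optional[float] = None


def spend_bounds(ads: Iterable[AdRecord]) -> Tuple[float, float]:
    """Return the (minimum, maximum) total spend consistent with the disclosures."""
    low = 0.0
    high = 0.0
    for ad in ads:
        low += ad.spend_low
        # An exact figure contributes the same amount to both bounds.
        high += ad.spend_low if ad.spend_high is None else ad.spend_high
    return low, high


# Made-up sample data: two ranged ads and one exact ad.
sample = [
    AdRecord("Sponsor A", 0, 99),
    AdRecord("Sponsor A", 100, 499),
    AdRecord("Sponsor B", 250.0),
]
print(spend_bounds(sample))
```

The gap between the two bounds widens with every ranged ad, which is why a platform carrying many small ads reported only as ranges is hard to rank precisely against one that reports exact figures.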
PACs accounted for 23 percent of the spending on Facebook during the study period.
The very top spenders during the study period on Facebook, though, were Facebook itself and its own Instagram – Facebook to publicize its responses to Russian election hacking and Instagram to spread a get-out-the-vote message. But the researchers pointed out that the company seemed to overcharge itself, based upon impressions.
Collaborators on the Online Political Ads Transparency Project are NYU Tandon doctoral student Laura Edelson, NYU Shanghai visiting undergraduate student Shikhar Sakhuja, and Ratan Dey, a former NYU doctoral student studying under Professor Keith Ross and now an assistant professor of practice in computer science at NYU Shanghai.
McCoy conceived the project to build easy-to-use tools to collect, archive, and analyze political advertising data. Although Facebook became the first major social media company to launch a searchable archive of political advertising, for both Facebook and Instagram, in May 2018, McCoy found the archive difficult to use, requiring time-consuming manual searches. He decided to apply versions of the data scraping techniques he had previously used against criminals, including human traffickers who advertised and used Bitcoin.
Despite the difficulty the team subsequently encountered accessing Facebook data, they report it has by far the most comprehensive political archive among the three social media companies. The report outlines problems with the API – an interface with other platforms – introduced in beta form by Facebook to allow researchers access to its archives.
Google’s data is the easiest for the public to access, as a BigQuery dataset, available in its entirety via the Google Cloud service. But it is updated in real time, with no archiving, so the NYU researchers are capturing the data daily, to share and archive.
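A daily capture of a dataset that is overwritten in place could be sketched as follows. This is an illustrative assumption about one way to keep dated snapshots, not a description of the NYU team's pipeline. The `snapshot` function and the `fetch_rows` callable are hypothetical; `fetch_rows` stands in for whatever query returns the current rows.

```python
import json
from datetime import date
from pathlib import Path
from typing import Callable, Dict, List


def snapshot(
    fetch_rows: Callable[[], List[Dict[str, object]]],
    archive_dir: Path,
    day: date,
) -> Path:
    """Write the current rows to a file named for the day, once per day."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"ads-{day.isoformat()}.json"
    # Keep the first capture of the day; a rerun does not overwrite it.
    if not target.exists():
        rows = fetch_rows()
        target.write_text(
            json.dumps(rows, sort_keys=True, indent=2), encoding="utf-8"
        )
    return target
```

Run once a day, for example `snapshot(fetch_rows, Path("archive"), date.today())`, this leaves one dated file per day, so earlier states of the data remain available after the live copy changes.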
Twitter has no easily accessible political ad archive, so the NYU research team is scraping all political advertising data identified by Twitter and is sharing and archiving it for the public as well.
Although the researchers used the September period for comparison purposes, they have now compiled data from late May through October 3, with a gap of about six weeks while Facebook blocked their data scraping. They praised the social media companies for implementing fixes the team recommended and continue to work toward transparency.
The work was funded in part by the National Science Foundation under a grant to McCoy for research that explores bias and the manipulation of online data.
Visit the project and download data at: https://online-pol-ads.github.io.
About the New York University Tandon School of Engineering
The NYU Tandon School of Engineering dates to 1854, the founding date for both the New York University School of Civil Engineering and Architecture and the Brooklyn Collegiate and Polytechnic Institute (widely known as Brooklyn Poly). A January 2014 merger created a comprehensive school of education and research in engineering and applied sciences, rooted in a tradition of invention and entrepreneurship and dedicated to furthering technology in service to society. In addition to its main location in Brooklyn, NYU Tandon collaborates with other schools within NYU, one of the country’s foremost private research universities, and is closely connected to engineering programs at NYU Abu Dhabi and NYU Shanghai. It operates Future Labs focused on start-up businesses in downtown Manhattan and Brooklyn and an award-winning online graduate program. For more information, visit http://engineering.nyu.edu.
Graph available to download at http://dam.engineering.nyu.edu/?c=2119&k=6b8d27e947
SOURCE NYU Tandon School of Engineering