Facebook is giving a small group of academic researchers access to a new tool that aggregates real-time data from the platform.
This week, a handful of academic research teams will gain access to a new tool from Facebook designed to aggregate near-universal real-time data on the world’s biggest social network, according to TechCrunch.
Facebook Inc, now rebranded as Meta, has been embroiled in a series of controversies, most notably the 2018 Cambridge Analytica scandal, in which millions of Facebook users had their personal data collected without consent by the consulting firm Cambridge Analytica for use in political campaigns. The resulting accusation was that Facebook’s Application Programming Interface (API) allowed data to be collected from users’ friends without their knowledge. In the three years that followed, the company had to shut down thousands of APIs, but it is now beginning to restore broad access to them for academic research.
Kiran Jagadeesh, the Facebook product manager who spearheaded the project with the Facebook Open Research & Transparency (FORT) team, said in response to TechCrunch’s preview of the academic research API that he believes the move ‘is just the beginning’. He noted that the Researcher API is a beta version of the toolkit that will eventually be rolled out.
The API, which was first announced at F8 earlier in the year, is Python-based and runs in JupyterLab, an open-source notebook interface. The new Researcher API comes with caveats and conditions, however. The first is that it will only be made available to a small group of established academic researchers through an invite-only system. The company plans to expand access beyond the initial test group in February next year, when it will incorporate feedback from the trial into a broader academic launch.
Another caveat is that the Researcher API operates in a tightly controlled environment known as a ‘digital clean room’. Researchers with API access enter the environment through a Facebook Virtual Private Network (VPN), where they can collect data and crunch numbers, but only the analysis, not the raw data, can be exported.
The stated justification for this controlled environment is to protect user privacy and prevent analysed data from being re-identified. Analysts and critics of the California-based company may not be persuaded, however, especially as all of the public data the Researcher API gathers is already available, though difficult to aggregate and analyse with Facebook’s existing tools.
At launch, the API will give access to four buckets of real-time Facebook data: pages, groups, events and posts. The tool pulls only from public data and only from sources within the US and the EU. For groups and pages, at least one administrator must be located in a supported country for the data to be made available through the API.
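The scope rules described above can be pictured as a simple filter. The following is only an illustrative sketch in Python, not the Researcher API itself: the `Source` class, its fields, the `SUPPORTED_REGIONS` set and the `is_in_scope` function are all hypothetical names, and the treatment of events and posts by a single origin region is an assumption, since the article does not spell out how their location is determined.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical region codes; the article only says "the US and the EU".
SUPPORTED_REGIONS = {"US", "EU"}


@dataclass
class Source:
    """Hypothetical record for one of the four data buckets."""
    kind: str                      # "page", "group", "event" or "post"
    is_public: bool
    admin_regions: List[str] = field(default_factory=list)
    origin_region: Optional[str] = None


def is_in_scope(source: Source) -> bool:
    """Apply the scope rules as reported: public data from supported regions only."""
    if not source.is_public:
        return False
    if source.kind in {"page", "group"}:
        # At least one administrator must be in a supported region.
        return any(r in SUPPORTED_REGIONS for r in source.admin_regions)
    # Assumption: events and posts are judged by a single origin region.
    return source.origin_region in SUPPORTED_REGIONS
```

Under these assumptions, a public group with one administrator in the EU and another elsewhere would be in scope, while a private group would not be, wherever its administrators are.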
Through the tool, researchers can analyse large swaths of raw text using sentiment analysis, a methodology that helps track the valence and emotion of what people say on a given topic. They can also access related information such as group and page descriptions, creation dates and post reactions.
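To give a rough sense of what sentiment analysis involves, here is a minimal lexicon-based sketch in Python. It is not Facebook’s method and uses nothing from the Researcher API: the `VALENCE` word list, `valence_score` and `summarize` are made up for illustration, and real research would rely on a far larger lexicon or a trained model. The summary returns only aggregate counts, in keeping with the clean-room idea that analysis, rather than raw text, is what leaves the environment.

```python
import re
from collections import Counter

# Toy lexicon for illustration only; the words and scores are assumptions.
VALENCE = {"good": 1, "great": 2, "love": 2, "bad": -1, "awful": -2, "hate": -2}


def valence_score(text):
    """Mean valence of the lexicon words found in text (0.0 if none match)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scored = [VALENCE[t] for t in tokens if t in VALENCE]
    return sum(scored) / len(scored) if scored else 0.0


def summarize(posts):
    """Count posts per sentiment label, exposing aggregates rather than raw text."""
    counts = Counter()
    for post in posts:
        score = valence_score(post)
        if score > 0:
            counts["positive"] += 1
        elif score < 0:
            counts["negative"] += 1
        else:
            counts["neutral"] += 1
    return dict(counts)
```

Calling `summarize` on a list of post texts yields a dictionary of label counts, which is the kind of aggregate result a researcher could export without exposing any individual post.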
Raw images and other multimedia data will not be included, nor will comments or user data such as age and gender. The API will not collect data from Instagram, although Jagadeesh acknowledged that the sister platform would be valuable to researchers and added that the team will explore ways to make Instagram data available.
The team will work closely with academic researchers to develop and build out the current tools, and Meta has already invited 23 researchers from around the globe to take part.
Some of the researchers who had completed the onboarding process and agreed to its privacy policies were granted access yesterday, November 14. Facebook requires anyone seeking access to agree to privacy constraints.
Only a handful of academic institutions have access to the research API for now, though the FORT team has announced plans to grant access to other groups, including the media.