Population Estimates Based on Social Media Scraping
Submission ID: 6232
Date: Wednesday, 4:30 PM to 6:00 PM
Session: Session B - W4:30 - 6:00 PM
Primary Presenter
Cong Ye, American Institutes for Research
Abstract
Social media has become a major channel through which school districts reach their audiences, and districts usually broadcast important announcements there. This creates an opportunity to study district policy at the population level with no burden on districts and much less effort from researchers than surveys require. In addition, much of the process can be automated, so statistics can be generated quickly for timely reports on important topics, such as school closures during the COVID-19 pandemic.
One challenge for this approach is developing an automated tool to code the announcements, because human coding would be relatively expensive and time-consuming. This challenge is addressed with topic modeling techniques, which categorize documents (i.e., announcements in this case) into topics by iteratively estimating the probability that each document belongs to each topic.
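The abstract does not say which topic model is used. As a minimal sketch, assuming latent Dirichlet allocation (LDA) fitted with a collapsed Gibbs sampler, the iterative estimation of document-topic probabilities could look like the following. The function name `fit_lda`, the toy announcements, and all parameter values are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def fit_lda(docs, n_topics=2, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (an assumed model, not the paper's).

    docs is a list of token lists. Returns one list of topic
    probabilities per document; each list sums to 1.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Count tables updated as topic assignments change.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Start from a random topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_d.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_d)

    # Iteratively resample each token's topic given all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed document-topic probabilities from the final counts.
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + n_topics * alpha
        theta.append([(doc_topic[d][k] + alpha) / denom for k in range(n_topics)])
    return theta


# Toy, made-up announcements; real input would be scraped posts.
announcements = [
    "schools closed tomorrow due to covid outbreak",
    "all schools closed remote learning covid",
    "football game friday night come cheer",
    "band concert friday night in the gym",
]
tokens = [text.lower().split() for text in announcements]
for text, probs in zip(announcements, fit_lda(tokens)):
    print([round(p, 2) for p in probs], text)
```

Each announcement can then be coded with its most probable topic, although with a corpus this small the topics themselves are not meaningful.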
Another major challenge is that school districts use social media differently. Some have dedicated staff managing their accounts, and some may not use social media at all. Therefore, simple mean estimates for the school district population are likely to be biased. This challenge is addressed with weighting adjustments, which are feasible because the population data available for school districts (e.g., the Common Core of Data) are rich and of high quality.
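The abstract does not specify the form of the weighting adjustment. As a minimal sketch, assuming simple post-stratification on a single district characteristic, each observed district can be weighted by the ratio of the population count to the sample count in its stratum. The function name `poststratified_mean`, the strata, and all numbers are hypothetical.

```python
from collections import Counter


def poststratified_mean(sample, population_counts):
    """Post-stratified mean (an assumed adjustment, not the paper's).

    sample is a list of (stratum, y) pairs for districts observed on
    social media; population_counts maps each stratum to the number of
    districts in the full population (e.g., from a population frame).
    """
    sample_counts = Counter(stratum for stratum, _ in sample)
    missing = set(population_counts) - set(sample_counts)
    if missing:
        raise ValueError(f"no sampled districts in strata: {sorted(missing)}")

    # Weight = population count / sample count within the stratum.
    weights = [population_counts[s] / sample_counts[s] for s, _ in sample]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, sample))
    return weighted_sum / sum(weights)


# Made-up data: y = 1 if the district announced a closure.
# Urban districts are over-represented among social media users here.
sample = [
    ("urban", 1), ("urban", 1), ("urban", 1), ("urban", 0),
    ("rural", 0), ("rural", 0),
]
population_counts = {"urban": 100, "rural": 300}

unweighted = sum(y for _, y in sample) / len(sample)
weighted = poststratified_mean(sample, population_counts)
print(unweighted, weighted)
```

In this toy case the unweighted mean is 0.5, while the weighted estimate is 0.1875, because rural districts make up three quarters of the population but only a third of the observed sample.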
The results are validated by comparison with aggregated administrative data.
Category
Paper > Data Science, Big Data, and Administrative Records