Study | Sample | Predictor variables | Outcome variables | Best-performing ML approach | Explanatory power of best-performing model | Limitations |
---|---|---|---|---|---|---|
Benítez-Andrades et al. [47] | 494,025 posts containing ED-related words on Twitter | Posts with ED-related words | (1) Posts written by users who identify online as suffering from EDs; (2) posts that promoted having an ED; (3) informative posts; (4) scientific posts | Bidirectional encoder representations from transformer–based models (RoBERTa) | (1) 83%; (2) 89%; (3) 84%; (4) 94% | Outcome based on content of posts, not validated measures of EDs; only identifies content of people who publicly acknowledge their ED on Twitter; specific predictors unknown |
Chancellor et al. [40] | 62,000 posts with removed pro-ED content or ED content remaining publicly available on Instagram | Combinations and frequencies of different ED hashtags and captions | Whether the post was removed or still publicly available | Logistic regression | 69% | Only identifies content of people who publicly acknowledge their ED on Instagram |
Chancellor et al. [41] | 26 million posts from 100,000 users who post pro-ED content on Instagram | Mental illness severity (MIS; low, medium, high) in a user’s previous posts based on the content of hashtags | MIS (low, medium, high) based on the content of hashtags | Multinomial logistic regression | 81% | MIS inferred from posts, not validated measures |
Chancellor et al. [39] | 877,000 pro-ED photo posts shared on Tumblr, 569 of which were removed by Tumblr | Text, hashtag, and photo content from Tumblr posts | Whether the post would be/was removed by Tumblr for violating community guidelines | Deep neural network | 89% | Specific predictors unknown |
De Choudhury [37] | 55,334 posts collected from 18,923 Tumblr blogs that mentioned common ED and anorexia symptomatology tags | Social, affective, cognitive, and linguistic style expression in posts | (1) Whether a post shares any kind of anorexia-related content; (2) whether a post relates to the proana or the pro-recovery community | Support vector machine classifier | (1) 83%; (2) 81% | Outcome based on content of posts, not validated measures of EDs; only identifies content of people who publicly acknowledge their ED on Tumblr |
Hwang et al. [45] | 185,950 posts and 3,528,107 comments from a weight management subcommunity on Reddit | 4 types of emotional eating behaviours and 5 types of feedback derived from Latent Dirichlet Allocation topic modelling | Emotional eating diagnosis based on authors’ expertise | Stochastic gradient descent | 91% | Outcome based on content of posts, not validated measures of EDs |
Sadeh-Sharvit et al. [48] | 231 adult women on Prolific who contributed their internet browsing history over the past 6 months | Keywords related to EDs, daily visits to social media, fraction of searches on Google or Bing, activity rates, participant age | ED status (clinical/subclinical ED, high risk for an ED, or no ED) based on responses to validated surveys | GentleBoost | 53% | Small sample size |
Wang et al. [44] | 119,825,361 posts on Twitter from 72,047 users, of which 1,797,239 posts were from 3380 users who self-identify with an ED on Twitter | User engagement and activity, posting preference, interaction diversity, psychometric properties of posts | ED status (ED or non-ED user) | Support vector machine | 97% | ED status determined by self-identifying ED on Twitter; only identifies content of people who publicly acknowledge their ED on Twitter |
Yan et al. [46] | 4759 posts from 6 ED-related subcommunities on Reddit | Relationships between keywords within the text of each post | Whether users need immediate mental health support for an ED, based on the expertise of two clinical psychologists | Logistic regression | 96% | Required human coders with extensive expertise |
Zhou et al. [43] | 18,288 posts on Twitter with ED-related words | ED-related words in posts | ED-related topic clusters/themes | Correlation Explanation (CorEx) topic model | 78% | Outcome based on content of posts, not validated measures of EDs; only identifies content of people who publicly acknowledge their ED on Twitter |
Zhou et al. [42] | 123,977 posts on Twitter with ED-related words | Posts with ED-related words | (1) ED-relevant and ED-irrelevant posts; (2) ED-promotional and education and ED-laypeople posts | Convolutional neural network (CNN) and long short-term memory (LSTM) | (1) 89%; (2) 90% | Outcome based on content of posts, not validated measures of EDs; only identifies content of people who publicly acknowledge their ED on Twitter |