Table 2 Research using ML to detect eating disorder risk via social media and internet data

From: Potential benefits and limitations of machine learning in the field of eating disorders: current research and future directions

| Study | Sample | Predictor variables | Outcome variables | Best performing ML approach | Explanatory power of best performing model | Limitations |
| --- | --- | --- | --- | --- | --- | --- |
| Benítez-Andrades et al. [47] | 494,025 posts containing ED-related words on Twitter | Posts with ED-related words | (1) Posts written by users who identify online as suffering from EDs; (2) posts that promoted having an ED; (3) informative posts; (4) scientific posts | Bidirectional encoder representations from transformers (BERT)-based models (RoBERTa) | (1) 83%; (2) 89%; (3) 84%; (4) 94% | Outcome based on content of posts, not validated measures of EDs; only identifies content of people who publicly acknowledge their ED on Twitter; specific predictors unknown |
| Chancellor et al. [40] | 62,000 posts with removed pro-ED content or ED content remaining publicly available on Instagram | Combinations and frequencies of different ED hashtags and captions | Whether the post was removed or still publicly available | Logistic regression | 69% | Only identifies content of people who publicly acknowledge their ED on Instagram |
| Chancellor et al. [41] | 26 million posts from 100,000 users who post pro-ED content on Instagram | Mental illness severity (MIS; low, medium, high) in a user’s previous posts based on the content of hashtags | MIS (low, medium, high) based on the content of hashtags | Multinomial logistic regression | 81% | MIS inferred from posts, not validated measures |
| Chancellor et al. [39] | 877,000 pro-ED photo posts shared on Tumblr, 569 of which were removed by Tumblr | Text, hashtag, and photo content from Tumblr posts | Whether the post would be/was removed by Tumblr for violating community guidelines | Deep neural network | 89% | Specific predictors unknown |
| De Choudhury [37] | 55,334 posts collected from 18,923 blogs on Tumblr that mentioned common ED and anorexia symptomatology tags | Social, affective, cognitive, and linguistic style expression in posts | (1) Whether a post shares any kind of anorexia-related content; (2) whether a post relates to the pro-ana or the pro-recovery community | Support vector machine classifier | (1) 83%; (2) 81% | Outcome based on content of posts, not validated measures of EDs; only identifies content of people who publicly acknowledge their ED on Tumblr |
| Hwang et al. [45] | 185,950 posts and 3,528,107 comments from a weight management subcommunity on Reddit | 4 types of emotional eating behaviours and 5 types of feedback based on Latent Dirichlet Allocation topic modelling method | Emotional eating diagnosis based on authors’ expertise | Stochastic gradient descent | 91% | Outcome based on content of posts, not validated measures of EDs |
| Sadeh-Sharvit et al. [48] | 231 adult women on Prolific who contributed their internet browsing history over the past 6 months | Keywords related to EDs, daily visits to social media, fraction of searches on Google or Bing, activity rates, participant age | ED status (clinical/subclinical ED, high risk for an ED, or no ED) based on responses to validated surveys | GentleBoost | 53% | Small sample size |
| Wang et al. [44] | 119,825,361 posts on Twitter from 72,047 users, of which 1,797,239 posts were from 3,380 users who self-identify with an ED on Twitter | User engagement and activity, posting preference, interaction diversity, psychometric properties of posts | ED status (ED or non-ED user) | Support vector machine | 97% | ED status determined by self-identifying ED on Twitter; only identifies content of people who publicly acknowledge their ED on Twitter |
| Yan et al. [46] | 4,759 posts from 6 ED-related subcommunities on Reddit | Relationships between keywords within the text of each post | Whether users need immediate mental health support for an ED based on expertise of two clinical psychologists | Logistic regression | 96% | Required human coders with extensive expertise |
| Zhou et al. [43] | 18,288 posts on Twitter with ED-related words | ED-related words in posts | ED-related topic clusters/themes | Correlation Explanation (CorEx) topic model | 78% | Outcome based on content of posts, not validated measures of EDs; only identifies content of people who publicly acknowledge their ED on Twitter |
| Zhou et al. [42] | 123,977 posts on Twitter with ED-related words | Posts with ED-related words | (1) ED-relevant and ED-irrelevant posts; (2) ED-promotional and education and ED-laypeople posts | Convolutional neural network (CNN) and long short-term memory (LSTM) | (1) 89%; (2) 90% | Outcome based on content of posts, not validated measures of EDs; only identifies content of people who publicly acknowledge their ED on Twitter |

ED, eating disorder; ML, machine learning