Determining IBD Trigger Foods using Machine Learning and Python (Part 1)


This post is the first in a three-part series describing my attempt to identify “safe” and “unsafe” foods for Inflammatory Bowel Disease (IBD; including Ulcerative Colitis and Crohn’s Disease) using various machine learning techniques and Python. Part I below walks through background on IBD, what’s known about its relationship with food, and my roadmap for tackling the problem of finding safe foods for IBD sufferers to eat. Parts II and III will expand upon the methodologies/code and findings of my analysis, respectively. Along the way I’ll also be detailing my experience as a mentee in the ChiPy Mentorship Program (an incredible program), as well as some details of my personal battle with Crohn’s disease and diet.


What is IBD?

Inflammatory bowel disease, or IBD, is a name used to describe a group of conditions, the most common of which are Ulcerative Colitis (UC) and Crohn’s disease (CD). These conditions are characterized by gastrointestinal tract (gut) inflammation, the location of which depends on the type of IBD (UC is limited to the colon, while CD can involve any part of the GI tract from the mouth to the anus, most commonly the small intestine).


Symptoms of IBD range from mild annoyances to potentially life-threatening issues. The most common symptoms include diarrhea; abdominal pain/cramping/bloating; fever and fatigue; unintended weight loss (from avoiding food and/or absorption issues); and blood in the stool. More serious complications, including ulcers, malnourishment, bowel obstruction, fistulas, fissures, and colon cancer, are also relatively common. As a result of these complications, it’s estimated that 67-75% of CD patients will have at least one surgery during the course of their lifetime, while ~20% of UC patients will require surgery (Crohn’s & Colitis Foundation [CCFA], n.d.).


Unfortunately, IBD is a chronic, lifelong illness with no current cure, and it affects an estimated 1.6 million people in the United States alone (CCFA, n.d.), with global estimates potentially reaching ~20 million (estimate based on Kaplan, 2015; Kappelman, Moore, Allen, & Cook, 2013; M’Koma, 2013). Despite the large number of people with IBD, understanding of the disease remains limited. Currently IBD is believed to be an autoimmune condition (i.e., a condition in which a person’s immune system attacks healthy cells in the body by mistake). While the cause remains unknown, it’s thought to be a combination of genetic and environmental factors (viruses, bacteria, diet, stress, etc.).


What’s food got to do with it?

The role that food plays in IBD isn’t completely clear-cut and can be a controversial subject. Although food is not thought to be the direct cause of IBD, many people believe that eating (or avoiding) certain foods plays a role in aggravating or alleviating symptoms, or in helping to manage “flares” (i.e., particularly symptomatic periods). For example, one study of dietary practices among people with IBD found that 57% of participants felt diet could trigger a flare (Aggarwal, Burns, Mclaughlin, & Limdi, 2015).


However, to date, the effects of diet on established disease remain poorly studied, and [to my knowledge] no foods have been identified by researchers as common triggers among UC and/or CD sufferers. This is in stark contrast to conditions such as Celiac disease or lactose intolerance, for which there is an exact list of foods to be avoided. Rather, safe/unsafe foods for an IBD sufferer are generally thought to be specific to that individual. What’s more, diet isn’t as clear-cut as being able or unable to eat certain foods: often there are foods that are thought to be tolerated either in small amounts or outside of flares, but only a small subset of “safe” foods that can be eaten during flare-ups.


Regardless of whether or not foods do indeed trigger flares, the uncertainty and fear around finding a “safe” diet can contribute to food avoidance and malnourishment among IBD patients, a population that, if anything, needs more than the typical nutrient intake (Jowett et al., 2004).


Why it’s personal to me

In February 2016 I was diagnosed with Crohn’s disease, after a colonoscopy revealed I had 10 ulcers. I’m lucky to have a doctor who’s up-to-date on the research, and with almost no trial and error we found a medication that has eliminated my ulcers (I now have zero) and my regular gut pain. However, I still find that eating even the most minute amounts of certain foods can trigger flares that last days or even weeks. After significant trial and error, I’ve finally found a diet that keeps me symptom-free (going on six months now). That trial and error was painful at times, even with how lucky I’ve been with my medication. There are many people, my mom included, who have not been so lucky.


The Project

With [anonymized] IBD sufferer food survey data graciously provided by, I hope to bring greater clarity to food’s impact on IBD. While the data will ultimately determine the course of the analysis, it’s my hope that through machine learning I can find relevant evidence of food-based relationships for IBD sufferers that can help patients eat with confidence, and perhaps even lead to targeted medical research. Some of the questions that will drive my analysis include:
  1. Are there any universal food tolerances or intolerances?
  2. What foods can be grouped as triggering common reactions, if any? If such groups exist, what nutritional or chemical composition do they have in common?
  3. Are there any identifiable clusters of patient type with regard to food intolerances?
  4. Can food tolerance/intolerance be predicted with a reasonable degree of probability for an IBD sufferer with only a few “known” safe/unsafe foods?
To address those questions, I will use Python, a number of data manipulation/analysis libraries (such as NumPy and pandas), a wide variety of machine learning techniques (using scikit-learn and apyori), and several visualization and presentation tools (including matplotlib, seaborn, and Jupyter Notebook). In addition to the data provided by, I will utilize the USDA’s food database API to analyze the nutrient composition of surveyed foods. Ultimately, should the results warrant it, the final stage of my project would be to develop a web-based tool where IBD patients can fill out a survey with known food reactions and receive back model expectations for other foods they might or might not be able to tolerate, with feedback from those same people continually improving the predictive power of the models.
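To give a rough feel for what question 2 might look like in code, here is a minimal sketch of the idea behind association-rule mining: finding foods that tend to be reported as triggers together. This is not the actual analysis (that comes in Part II), and it does not use apyori; it computes by hand, with only the standard library, the support and confidence measures that association-rule mining is built on. The data, the food names, the thresholds, and the helper names `support` and `pair_rules` are all hypothetical and exist only for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each set lists the foods one respondent reported
# as triggers. These are made-up values, NOT the survey data.
reported_triggers = [
    {"popcorn", "raw kale", "coffee"},
    {"popcorn", "raw kale"},
    {"coffee", "milk"},
    {"popcorn", "raw kale", "milk"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of respondents whose trigger set contains every food in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for food pairs.

    The thresholds are arbitrary illustration values, not tuned settings.
    """
    foods = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(foods, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # the pair is reported together too rarely to consider
        # Check the rule in both directions: a -> b and b -> a.
        for antecedent, consequent in ((a, b), (b, a)):
            # Confidence: of those who react to the antecedent,
            # the share who also react to the consequent.
            confidence = pair_support / support({antecedent}, transactions)
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, pair_support, confidence))
    return rules


rules = pair_rules(reported_triggers)
for antecedent, consequent, supp, conf in rules:
    print(f"{antecedent} -> {consequent}: support={supp:.2f}, confidence={conf:.2f}")
```

On this toy data, the only pair that clears both thresholds is popcorn and raw kale, in both directions. With real survey responses, the interesting follow-up would be checking what foods in such a pair have in common nutritionally.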
A more detailed methodology will follow in Part II, so stay tuned!



Aggarwal, D., Burns, H., Mclaughlin, J. T., & Limdi, J. K. (2015). PWE-048 Dietary practices in patients with inflammatory bowel disease-food for thought.
Crohn’s & Colitis Foundation (CCFA), (n.d.). Fact Sheet–About Surgery for IBD. [Fact sheet]. Retrieved
Crohn’s & Colitis Foundation (CCFA), (n.d.). The Facts About Inflammatory Bowel Diseases. [Fact sheet]. Retrieved
Jowett, S. L., Seal, C. J., Phillips, E., Gregory, W., Barton, J. R., & Welfare, M. R. (2004). Dietary beliefs of people with ulcerative colitis and their effect on relapse and nutrient intake. Clinical Nutrition, 23(2), 161-170.
Kaplan, G. G. (2015). The global burden of IBD: from 2015 to 2025. Nature Reviews Gastroenterology & Hepatology, 12(12), 720-727.
Kappelman, M. D., Moore, K. R., Allen, J. K., & Cook, S. F. (2013). Recent trends in the prevalence of Crohn’s disease and ulcerative colitis in a commercially insured US population. Digestive Diseases and Sciences, 58(2), 519-525.
M’Koma, A. E. (2013). Inflammatory bowel disease: an expanding global health problem. Clinical Medicine Insights. Gastroenterology, 6, 33.
Statistics. (n.d.). Retrieved September 22, 2017, from
