Community Data
Patient-Reported Sources
15,600 anonymized reports gathered from publicly accessible patient communities. No personal data is stored or linked to individuals.
Reddit — r/UlcerativeColitis
Online Community
The largest English-language UC patient community on Reddit. Posts and comments filtered for symptom mentions, medication discussions, and food experiences. Usernames stripped on ingestion.
IBD Forums & Communities
Dedicated IBD Platforms
Structured patient forums dedicated to Crohn's disease and ulcerative colitis, including community threads focused on treatment experiences and quality-of-life topics.
Patient Registries
Structured Registries
Publicly available aggregated registry data from IBD patient advocacy organizations. Used for demographic and prevalence cross-validation against community reports.
Scientific Literature
Research Databases
20,000+ peer-reviewed abstracts and full-text papers indexed from PubMed / MEDLINE, alongside trial registry records and live research news.
PubMed / MEDLINE
Primary Scientific Source
The National Library of Medicine's database of biomedical literature. All peer-reviewed scientific content in UCInsights is sourced exclusively from PubMed. Abstracts are fetched via the Entrez API.
Google News RSS — UC Feed
Live Research News
Real-time UC and IBD research news aggregated from Google News via RSS. Powers the live research news feed in the app. Updated continuously; notifications triggered for new clinical findings.
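A minimal sketch of how an RSS feed like this might be parsed and checked for new items, using only Python's standard library. The sample XML, the `parse_items` and `new_items` helpers, and the set of already-seen links are illustrative assumptions; the app's actual feed handling is not described here.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Placeholder RSS 2.0 document standing in for a fetched feed (not real headlines).
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Placeholder feed</title>
    <item>
      <title>Placeholder headline A</title>
      <link>https://example.com/a</link>
      <pubDate>Mon, 01 Jan 2024 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Placeholder headline B</title>
      <link>https://example.com/b</link>
      <pubDate>Tue, 02 Jan 2024 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_items(xml_text):
    """Return one dict per <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # RSS dates use the RFC 822 format, which email.utils can parse.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return items


def new_items(items, seen_links):
    """Keep only items whose link has not been seen before (notification candidates)."""
    return [it for it in items if it["link"] not in seen_links]
```

Calling `new_items(parse_items(SAMPLE_FEED), seen_links)` on each poll would yield only the entries that have not yet been shown, which is one simple way to decide when a notification is due.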
Clinical Trial Data (ClinicalTrials.gov)
Trial Registry
Referenced for ongoing and completed UC clinical trials. Used to contextualize medication data and identify emerging therapies discussed in the community before PubMed publication.
Dataset Summary
Source | Volume | Type | Update cycle
Reddit r/UlcerativeColitis | 9,200+ | Community reports | Quarterly
IBD Forums | 4,100+ | Community reports | Quarterly
Patient Registries | 2,300+ | Structured registry | Annually
PubMed / MEDLINE | 20,000+ | Scientific abstracts | Monthly
Google News RSS | Live | News articles | Real-time
Personal tracker (on-device) | User data | Health logs | Real-time
AI & Technology
Tools Used for Analysis
The AI and computational tools applied to transform raw data into platform insights.
Claude AI (Anthropic)
Primary analysis engine
Used for sentiment analysis, natural language understanding, co-occurrence detection, and synthesis of patient reports with scientific literature. Also powers the in-app AI search and community assistant.
Statistical Analysis (Python)
Correlation & prevalence
Pearson correlation matrices, confidence interval calculation, Laplace-smoothed food safety scores, and minimum-threshold filtering implemented in Python (scipy, pandas, numpy).
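As a rough illustration of the scoring steps named above, the sketch below computes a Laplace-smoothed score, a confidence interval, and a minimum-threshold filter for per-food report counts. It uses only the standard library so that it runs on its own. The add-one smoothing form, the Wilson interval, the `MIN_REPORTS` value, and all function names are assumptions made for the example, not the platform's documented formulas.

```python
import math

MIN_REPORTS = 5  # hypothetical minimum number of reports before a food is scored


def laplace_score(safe, total):
    """Add-one (Laplace) smoothed share of reports describing the food as tolerated."""
    return (safe + 1) / (total + 2)


def wilson_interval(safe, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    p = safe / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def score_foods(counts):
    """counts maps food -> (safe_reports, total_reports); sparse foods are dropped."""
    results = {}
    for food, (safe, total) in counts.items():
        if total < MIN_REPORTS:
            continue  # minimum-threshold filtering: too few reports to score
        results[food] = {
            "score": laplace_score(safe, total),
            "ci": wilson_interval(safe, total),
            "n": total,
        }
    return results
```

With made-up counts such as `{"rice": (8, 10), "kale": (1, 2)}`, only `rice` would be scored, since `kale` falls below the threshold. Smoothing pulls small-sample scores toward 0.5, so a food with 2 of 2 positive reports does not outrank one with 80 of 100.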
PubMed Entrez API
Literature retrieval
NCBI's Entrez API is used to fetch abstracts, MeSH terms, and metadata for all scientific papers. Queries are scoped to the Ulcerative Colitis MeSH heading (indexed as "Colitis, Ulcerative") to ensure relevance.
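A minimal sketch of how MeSH-scoped requests to the Entrez E-utilities could be constructed. It only builds the request URLs and performs no network calls. The helper names, the `retmax` default, and the way extra terms are combined are assumptions for illustration; the `esearch.fcgi` and `efetch.fcgi` endpoints and the `db`, `term`, `id`, `retmode`, `retmax`, and `retstart` parameters belong to NCBI's public E-utilities interface.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# MeSH descriptor for ulcerative colitis, restricted to the MeSH Terms field.
UC_MESH_TERM = '"Colitis, Ulcerative"[MeSH Terms]'


def esearch_url(extra_terms="", retmax=100, retstart=0):
    """Build an ESearch URL that returns PubMed IDs for UC-scoped queries."""
    term = UC_MESH_TERM
    if extra_terms:
        # Hypothetical convention: AND any extra terms onto the MeSH scope.
        term = f"{UC_MESH_TERM} AND ({extra_terms})"
    params = {
        "db": "pubmed",
        "term": term,
        "retmode": "json",
        "retmax": retmax,
        "retstart": retstart,
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_url(pmids):
    """Build an EFetch URL that returns full PubMed records as XML for the given IDs."""
    params = {"db": "pubmed", "id": ",".join(pmids), "retmode": "xml"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"
```

A typical flow would call the ESearch URL first to collect PubMed IDs, then pass them in batches to the EFetch URL; the XML records returned include the abstract text and MeSH headings.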
NER Pipeline
Anonymization & entity extraction
Named-entity recognition used to strip identifying information from community posts before ingestion, and to extract symptom names, medication names, and food mentions for structured tagging.
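The two roles described above, scrubbing identifiers and tagging entities, can be sketched as follows. This is a deliberately simplified stand-in: it uses regular expressions and a small lookup lexicon rather than a trained named-entity recognition model. The patterns, the placeholder tokens, the lexicon contents, and the function names are all hypothetical.

```python
import re

# Reddit-style handles (u/name or /u/name) and email addresses.
USERNAME_RE = re.compile(r"(?<!\w)/?u/[A-Za-z0-9_-]+")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

# Tiny illustrative lexicon; a real pipeline would use a far larger vocabulary
# or a statistical NER model instead of exact word matching.
LEXICON = {
    "medication": {"mesalamine", "prednisone", "infliximab"},
    "symptom": {"bloating", "cramping", "urgency"},
    "food": {"rice", "coffee", "dairy"},
}


def scrub(text):
    """Replace handles and email addresses with neutral placeholders."""
    text = EMAIL_RE.sub("[email]", text)
    return USERNAME_RE.sub("[user]", text)


def tag_entities(text):
    """Return the lexicon terms found in the text, grouped by entity type."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return {label: sorted(tokens & terms) for label, terms in LEXICON.items()}
```

Running `scrub` before `tag_entities` keeps identifiers out of whatever is stored, so the structured tags are derived from text that has already been anonymized.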
Data ethics: UCInsights does not collect, store, or sell any personally identifiable information. Community data is sourced exclusively from publicly accessible posts in compliance with each platform's terms of service. Personal health data logged in the app remains on your device unless you explicitly choose to share it. We do not use personal tracker data to train models.