Cross Disciplinary Consultancy: Negation Detection Use Case

Despite considerable effort since the turn of the century to develop Natural Language Processing (NLP) methods and tools for detecting negated terms in chief complaints, few standardised methods have emerged. Those methods that have emerged (e.g. the NegEx algorithm) are confined to local implementations with customised solutions.

June 18, 2019

Leveraging Discussions on Reddit for Disease Surveillance

In recent years, individuals have been using social network sites like Facebook, Twitter, and Reddit to discuss health-related topics. These platforms have consequently become new avenues for research and for applications such as disease surveillance. Reddit, in particular, can potentially provide more in-depth contextual insights than Twitter, and Reddit members discuss potentially more diverse topics than Facebook members. However, identifying relevant discussions remains a challenge in large datasets like Reddit.

January 21, 2018

Opioid Surveillance using Social Media: How URLs are shared among Reddit members

Nearly 100 people per day die from opioid overdose in the United States. Further, prescription opioid abuse is assumed to be responsible for a 15-year increase in opioid overdose deaths. However, with increasing use of social media comes increasing opportunity to seek and share information. For instance, 80% of Internet users obtain health information online, including from popular social interaction sites like Reddit, which had more than 82.5 billion page views in 2015 [3].

January 21, 2018

Towards Tracking Opium Related Discussions in Social Media

In recent years, the use of social media has increased at an unprecedented rate. For example, the popular social media platform Reddit had 83 billion page views from over 88,000 active sub-communities (subreddits) in 2015. Members of Reddit made over 73 million individual posts and over 725 million associated comments in the same year [1].

August 20, 2017

Combining Text Mining and Data Visualization Techniques to Understand Consumer Experiences of Electronic Cigarettes and Hookah in Online Forums

Since their introduction to the US market in 2007, electronic cigarettes (e-cigarettes) have posed considerable challenges to both public health authorities and government regulators, especially given the debate – in both the scientific world and the community at large – regarding the potential advantages (e.g. helping individuals quit smoking) and disadvantages (e.g. renormalizing smoking) associated with the product [1].

October 10, 2017

New challenges in tobacco surveillance: monitoring the prevalence of electronic cigarettes and hookah use in the United States

Since the 1990s, tobacco control strategies, at least in the United States and some developed countries, have had considerable success in reducing the number of new smokers and encouraging existing smokers to quit through the creation of a regulatory infrastructure designed to monitor tobacco sales, limit advertising for tobacco products, and "denormalize" smoking in public places.

July 24, 2017

A web-based platform to support text mining of clinical reports for public health surveillance

PyConTextKit is a web-based platform that extracts entities from clinical text and provides relevant metadata - for example, whether the entity is negated or hypothetical - using simple lexical clues occurring in the window of text surrounding the entity. The system provides a flexible framework for clinical text mining, which in turn expedites the development of new resources and simplifies the resulting analysis process.
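The idea of using lexical clues in a window of text around an entity can be illustrated with a minimal Python sketch. This is not PyConTextKit's implementation or API: the cue lists, the function names (`preceding_window`, `annotate`), and the window size are hypothetical, and the sketch simplifies by inspecting only the tokens that precede the entity and by treating words such as "but" as scope boundaries.

```python
import re

# Hypothetical cue lists for illustration only; they are not the
# lexicon shipped with PyConTextKit or NegEx.
NEGATION_CUES = {"no", "not", "denies", "without"}
HYPOTHETICAL_CUES = {"if", "should", "unless"}
SCOPE_BREAKERS = {"but", "however"}


def preceding_window(tokens, index, window):
    """Collect up to `window` tokens before `index`, nearest first,
    stopping early at a scope-breaking word."""
    context = []
    for tok in reversed(tokens[max(0, index - window):index]):
        if tok in SCOPE_BREAKERS:
            break
        context.append(tok)
    return context


def annotate(text, targets, window=4):
    """Return one record per target entity found in `text`, flagged as
    negated or hypothetical according to cues in the preceding window."""
    tokens = re.findall(r"[a-z']+", text.lower())
    records = []
    for i, tok in enumerate(tokens):
        if tok not in targets:
            continue
        context = preceding_window(tokens, i, window)
        records.append({
            "entity": tok,
            "negated": any(t in NEGATION_CUES for t in context),
            "hypothetical": any(t in HYPOTHETICAL_CUES for t in context),
        })
    return records


if __name__ == "__main__":
    for record in annotate("Patient denies fever but reports cough",
                           {"fever", "cough"}):
        print(record)
```

In this sketch "fever" is flagged as negated because "denies" falls inside its window, while "cough" is not, because the scan stops at "but". A production system would need a far richer lexicon and handling of cues that follow the entity.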

May 02, 2019

Evaluating Syndrome Definitions in the Extended Syndromic Surveillance Ontology

The Extended Syndromic Surveillance Ontology (ESSO) is an open source terminological ontology designed to facilitate the text mining of clinical reports in English [1,2]. At the core of ESSO are 279 clinical concepts (for example, fever, confusion, headache, hallucination, fatigue) grouped into eight syndrome categories (rash, hemorrhagic, botulism, neurological, constitutional, influenza-like-illness, respiratory, and gastrointestinal). In addition to syndrome groupings, each concept is linked to synonyms, variant spellings and UMLS Concept Unique Identifiers.
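The structure described above (a concept linked to syndrome categories, synonyms, variant spellings, and UMLS identifiers) can be sketched as a small Python data model. This is only an illustration under stated assumptions: the class and function names are hypothetical, the example entry's syndrome assignment and synonyms are invented for demonstration rather than taken from ESSO, and the identifier is a placeholder, not a real UMLS CUI.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One clinical concept with its syndrome groupings and lexical variants."""
    name: str
    syndromes: set
    synonyms: set = field(default_factory=set)
    cuis: set = field(default_factory=set)


def build_index(concepts):
    """Map every lower-cased name and synonym to its concept."""
    index = {}
    for concept in concepts:
        for term in {concept.name, *concept.synonyms}:
            index[term.lower()] = concept
    return index


def syndromes_for(index, term):
    """Return the syndrome categories for a term, or an empty set if unknown."""
    concept = index.get(term.lower())
    return set(concept.syndromes) if concept else set()


# Illustrative entry only: the grouping, synonyms, and identifier below
# are assumptions made for this sketch, not values copied from ESSO.
EXAMPLE_CONCEPTS = [
    Concept(
        name="fever",
        syndromes={"constitutional"},
        synonyms={"febrile", "pyrexia"},
        cuis={"CUI-PLACEHOLDER"},
    ),
]

if __name__ == "__main__":
    index = build_index(EXAMPLE_CONCEPTS)
    print(syndromes_for(index, "Pyrexia"))
```

Indexing by synonym lets a text mining step normalise surface forms found in a report to a single concept before counting cases per syndrome.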

May 02, 2019

Using cKASS to facilitate knowledge authoring and sharing for syndromic surveillance

Mining text for real-time syndromic surveillance usually requires a comprehensive knowledge base (KB) which contains detailed information about concepts relevant to the domain, such as disease names, symptoms, drugs, and radiology findings. Two such resources are the Biocaster Ontology [1] and the Extended Syndromic Surveillance Ontology (ESSO) [2]. However, both these resources are difficult to manipulate, customize, reuse and extend without knowledge of ontology development environments (like Protege) and Semantic Web standards (like RDF and OWL).

May 02, 2019

This website is supported by Cooperative Agreement # 6NU38OT000297-02-01 Strengthening Public Health Systems and Services through National Partnerships to Improve and Protect the Nation's Health between the Centers for Disease Control and Prevention (CDC) and the Council of State and Territorial Epidemiologists. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of CDC. CDC is not responsible for Section 508 compliance (accessibility) on private websites.
