Hello!

This is a site about my work on hidden Markov models (HMMs) and part-of-speech tagging, along with some applications in healthcare, specifically electronic health record (EHR) data and high-cost claimant analysis.

Background: This page covers some of the statistical background for part-of-speech tagging with hidden Markov models, as outlined in the excellent (and free!) Speech and Language Processing book by Jurafsky and Martin.

Model workflow: This page covers, at a high level, the steps needed to go from data to a working HMM.

R Examples: Making heavy use of the HMM package, I will show two examples of HMMs as POS taggers. Later I may add examples from other data, such as medical insurance claims (synthetic, of course!).

Python Examples: Pomegranate is a Python package that offers an easy interface for building HMMs, and I may include an example of it. My main focus for this page, though, will be creating an HMM from scratch in Python using NumPy ndarrays, including the use of log10 transformations to model longer sequences without running into numerical underflow.
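
To give a rough idea of why the log10 transformation helps, here is a minimal sketch of Viterbi decoding done entirely in log10 space with NumPy. It is not the code from the Python Examples page: the function name `viterbi_log10`, the two-state setup, and all of the probabilities are made up for illustration. Working with sums of logs instead of products of probabilities keeps the scores finite even for sequences thousands of tokens long.

```python
import numpy as np


def viterbi_log10(start_p, trans_p, emit_p, observations):
    """Return the most likely state path and its log10 probability.

    start_p: (n_states,) initial state probabilities
    trans_p: (n_states, n_states) trans_p[i, j] = P(state j | state i)
    emit_p:  (n_states, n_symbols) emit_p[i, k] = P(symbol k | state i)
    observations: sequence of integer symbol indices
    """
    # log10(0) gives -inf, which is what we want for impossible events.
    with np.errstate(divide="ignore"):
        log_start = np.log10(np.asarray(start_p, dtype=float))
        log_trans = np.log10(np.asarray(trans_p, dtype=float))
        log_emit = np.log10(np.asarray(emit_p, dtype=float))

    n_obs = len(observations)
    n_states = log_start.shape[0]
    score = np.full((n_obs, n_states), -np.inf)
    backptr = np.zeros((n_obs, n_states), dtype=int)

    score[0] = log_start + log_emit[:, observations[0]]
    for t in range(1, n_obs):
        # candidates[i, j]: best log score in state i at t-1, then moving to j.
        candidates = score[t - 1][:, None] + log_trans
        backptr[t] = np.argmax(candidates, axis=0)
        score[t] = np.max(candidates, axis=0) + log_emit[:, observations[t]]

    # Follow the back-pointers from the best final state.
    path = [int(np.argmax(score[-1]))]
    for t in range(n_obs - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    path.reverse()
    return path, float(np.max(score[-1]))


# Hypothetical toy model: 2 hidden states (tags), 3 observable symbols (words).
start_p = [0.6, 0.4]
trans_p = [[0.3, 0.7],
           [0.8, 0.2]]
emit_p = [[0.5, 0.4, 0.1],
          [0.1, 0.3, 0.6]]

path, log_score = viterbi_log10(start_p, trans_p, emit_p, [0, 2, 0])
print(path)  # [0, 1, 0]
```

In plain probability space, a sequence of a few thousand observations from this toy model would have a best-path probability far below the smallest positive float64 and would round to zero. In log10 space the same score is just a large negative number.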

HMM Applications

Natural language processing:

  1. Part-of-speech (POS) tagging
  2. Speech recognition
  3. Sign language recognition

Healthcare:

  1. Disease progression / phenotyping
  2. Behavior modeling

Scientific research:

  1. Biological sequence analysis
  2. Parkinson’s disease detection using gait analysis

Finance:

  1. Time series prediction