Hello!
This is a site about work I have done related to hidden Markov models (HMMs) and part-of-speech tagging, as well as some applications in healthcare, specifically electronic health record (EHR) data and high-cost claimant analysis.
Background: This covers some of the statistical background for part-of-speech tagging applications of hidden Markov models, as outlined in the excellent (and free!) Speech and Language Processing book by Jurafsky and Martin.
Model workflow: This page covers, at a high level, the steps needed to go from data to a working HMM.
R Examples: Making heavy use of the HMM package, I will show two examples of HMMs as POS taggers. I may later add examples from other data, such as medical insurance claims (synthetic, of course!).
Python Examples: Pomegranate is a Python package that offers an easy interface for building HMMs, and I may include an example of it. My main focus for this page, though, will be creating an HMM from scratch in Python using NumPy ndarrays, including the use of log10 transformations to model longer sequences without running into numerical underflow problems.
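As a rough illustration of the log10 idea, here is a minimal sketch of Viterbi decoding on NumPy ndarrays. It is not the code from the Python Examples page: the function name `viterbi_log10`, the two states (0 = NOUN, 1 = VERB), the three-word vocabulary, and every probability below are made up for this example. Working in log10 space turns products of probabilities into sums, so the score of a long sequence stays a finite number rather than collapsing to zero.

```python
import numpy as np


def viterbi_log10(obs, start_p, trans_p, emit_p):
    """Return (best state path, log10 probability of that path).

    obs     : sequence of observation indices
    start_p : (n_states,) initial state probabilities
    trans_p : (n_states, n_states), trans_p[i, j] = P(state j | state i)
    emit_p  : (n_states, n_symbols), emit_p[i, k] = P(symbol k | state i)
    """
    # log10(0) is -inf, which is a valid "impossible" score, so silence the warning.
    with np.errstate(divide="ignore"):
        log_start = np.log10(np.asarray(start_p, dtype=float))
        log_trans = np.log10(np.asarray(trans_p, dtype=float))
        log_emit = np.log10(np.asarray(emit_p, dtype=float))

    n_states = log_start.shape[0]
    n_steps = len(obs)
    score = np.full((n_steps, n_states), -np.inf)
    backptr = np.zeros((n_steps, n_states), dtype=int)

    score[0] = log_start + log_emit[:, obs[0]]
    for t in range(1, n_steps):
        # cand[i, j]: best score ending in state i at t-1, then moving to state j.
        cand = score[t - 1][:, None] + log_trans
        backptr[t] = np.argmax(cand, axis=0)
        score[t] = np.max(cand, axis=0) + log_emit[:, obs[t]]

    # Follow the back-pointers from the best final state to recover the path.
    path = [int(np.argmax(score[-1]))]
    for t in range(n_steps - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    path.reverse()
    return path, float(np.max(score[-1]))


# Hypothetical toy tagger: states 0 = NOUN, 1 = VERB;
# symbols 0 = "dogs", 1 = "chase", 2 = "cats". All numbers are invented.
start_p = np.array([0.8, 0.2])
trans_p = np.array([[0.3, 0.7],
                    [0.8, 0.2]])
emit_p = np.array([[0.6, 0.1, 0.3],
                   [0.1, 0.8, 0.1]])

path, logp = viterbi_log10([0, 1, 2], start_p, trans_p, emit_p)
print(path, logp)  # path is [0, 1, 0] (NOUN VERB NOUN), log10 probability about -1.19

# A 2000-step sequence: the raw probability is far below what float64 can
# represent, but the log10 score is still an ordinary finite number.
long_obs = [0, 1] * 1000
long_path, long_logp = viterbi_log10(long_obs, start_p, trans_p, emit_p)
print(len(long_path), long_logp)
```

The same trick carries over to the forward algorithm, with the caveat that sums of probabilities then need a log-sum-exp style step instead of a plain maximum.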
HMM Applications
Natural language processing:
- Part-of-speech (POS) tagging
- Speech recognition
- Sign language recognition
Healthcare:
- Disease progression / phenotyping
- Behavior modeling
Scientific research:
- Biological sequence analysis
- Parkinson’s disease detection using gait analysis
Finance:
- Time series prediction