Machine Learning Intern

Min Experience

0 years

Location

remote

Job Type

internship

About the job

This job is sourced from a job board.

About the role

I designed and built a full-stack, AI-driven system that automates the extraction of structured data from unstructured clinical PDFs. I developed a hybrid OCR pipeline using EasyOCR and PyMuPDF, combined with custom regex and lightweight NLP models, to ensure 100% data field completion. I engineered custom spaCy NER models to accurately capture patient demographics, accident details, and provider information across varied medical document layouts. I also implemented real-time report generation workflows that convert the structured data into professionally formatted medical chronology PDFs. My solution improved extraction accuracy by 35% compared to standard libraries, increased processing speed for large datasets, and enforced HIPAA-aligned practices for data fidelity.
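
As a rough illustration of the regex stage mentioned above, the sketch below pulls a few labeled fields out of text that has already been extracted from a PDF. The field names, label formats, and the `extract_fields` helper are hypothetical and chosen only for this example; the posting does not describe the actual patterns, and the OCR and spaCy NER stages are not shown.

```python
import re

# Hypothetical field patterns; the real fields and label formats are not
# given in the posting and are assumed here purely for illustration.
FIELD_PATTERNS = {
    "patient_name": re.compile(
        r"Patient(?:\s+Name)?\s*:\s*(?P<value>[^\n]+)", re.IGNORECASE
    ),
    "date_of_birth": re.compile(
        r"(?:DOB|Date\s+of\s+Birth)\s*:\s*(?P<value>\d{1,2}/\d{1,2}/\d{4})",
        re.IGNORECASE,
    ),
    "provider": re.compile(r"Provider\s*:\s*(?P<value>[^\n]+)", re.IGNORECASE),
}


def extract_fields(text):
    """Return one value per field, or None when a pattern finds nothing."""
    record = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        record[name] = match.group("value").strip() if match else None
    return record


# Made-up sample text standing in for OCR output.
sample = "Patient Name: Jane Doe\nDOB: 01/02/1990\nProvider: Example Clinic\n"
record = extract_fields(sample)
```

Fields left as `None` mark where a fallback step, such as an NER model, would need to fill the gap.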

About the company

Simplify is a common application for jobs & internships. Autofill job applications anywhere on the web, get notified when new jobs open, & seamlessly track your applications.

Skills

python
data science
javascript
pandas
machine learning
natural language processing (nlp)
spacy