Data Scientist - Natural Language Processing (remote)
MultiPlan Inc., Remote
JOB DESCRIPTION
Do you have a passion for Natural Language Processing (NLP), Deep Learning (DL), Information Retrieval, Machine Comprehension, or Question Answering/Conversational AI? Are you excited about using your skills to help individuals improve their health?
Our Data Science team applies Advanced Analytics, Artificial Intelligence and Machine Learning to produce solutions that reduce customer healthcare costs and improve outcomes and business processes.
ROLE SUMMARY
Your work will require deep familiarity with MultiPlan’s data and a drive to find ways to deliver value through advanced feature engineering. From your models you will deliver actionable insights that can be incorporated into existing products as well as new programs. As a data scientist on a cross-functional team, you will work closely with Data & Engineering to deliver production-ready code that is robust and ready to scale.
We are looking for an experienced Natural Language Processing (NLP) Data Scientist with strong technical skills, a passion for creating technology to improve health care, and an attitude of creativity and continual learning.
You’ll enjoy the flexibility to work remotely from anywhere within the U.S. as you take on some tough challenges.
PRIMARY RESPONSIBILITIES
- Develop Natural Language Processing (NLP) models from unstructured healthcare data such as provider notes and EMR-attached documentation; identify opportunities, pose business questions, and make valuable discoveries leading to prototype development and product improvement.
- Solve unique and complex problems within healthcare payments and claims processing with broad impact on the business and patient care.
- Design, develop, tune, and test machine learning and deep learning solutions and systems.
- Make use of state-of-the-art NLP model architectures such as BERT (and derivatives like BioBERT, RoBERTa, etc.), BiLSTM, and XLNet in NLP pipelines.
- Collaborate with peers and senior leadership to ensure activities are appropriately integrated into the strategic direction, as well as the mission and values of the company.
- Collaborate with technology teams in developing technical solutions that will result in accurate and efficient data collection, data management, and data reconciliation.
- Autonomously drive the development of machine learning models on a cloud-based infrastructure.
- Mine and analyze large structured and unstructured datasets. Identify the data attributes that influence the outcome, define and monitor metrics, create data narratives, and build tools to drive decisions.
- Work across diverse teams, perspectives, and opinions, and quickly build consensus.
- Create visualizations of data that make distributions, trends, and results easy to understand for business leaders.
- Diagnose business needs; analyze business processes, data flows, and technical artifacts; and extract an understanding of how the system or business process works as input into projects and solutions.
- Proactively develop, maintain, and evangelize technical knowledge in Machine Learning and adjacent technologies. Keep up to date on current trends and best practices.
- Perform assessments and listen to internal/external customers to understand and anticipate their needs and determine their priorities in the context of the overall enterprise.
- Operate within established methodologies, procedures, and guidelines; work independently in a fast-paced, agile environment.
REQUIREMENTS
To pursue actionable knowledge through data in this role, a Data Scientist II must have the following:
- Bachelor’s or Master’s degree in Statistics, Operations Research, Applied Mathematics, Computer Science, Robotics, Physics, or a similarly quantitative/technical field
- 5+ years of experience designing and developing in Python
- Deep understanding of NLP tools and frameworks such as Spark NLP, BERT, spaCy, HuggingFace, Flair, NLTK, etc.
- 5+ years of experience querying relational databases using SQL
- 5+ years of experience in statistical computing and graphics using tools such as PySpark, Domino Data Lab or other data science platforms
- Knowledge of Git/GitHub integration with CI/CD tools
- Problem-solving, critical-thinking, creativity, organizational, design, interpersonal, and communication (written, verbal, and listening) skills