About the Role
Major Accountabilities:
Work with technical and domain experts (SMEs) to plan and deliver on data42 (and partner) objectives relying on preclinical data assets.
Supervise the operation of end-to-end data pipelines enabling the delivery of high-value preclinical data.
Be responsive to new preclinical data needs raised by data42 partners.
Take responsibility for the ongoing development, maintenance, and individualization of the preclinical data pipelines.
Support the implementation of essential data quality processes and metrics to ensure high-quality data products are delivered to collaborators.
Support data unification and harmonization efforts performed by preclinical data SMEs.
Support the team’s semantics approach to ensure data and metadata standards are met.
Implement and support the FAIR principles through the data pipelines implemented and operated within this role.
Collaborate with governance bodies and data strategy roles to ensure alignment of activities.
Support the linking of preclinical data to other data assets including clinical data, in-vitro assay data and molecular/omics data.
Support strategic decision making around data standards such as CDISC SEND.
Serve as a single point of contact for all preclinical data pipeline related needs, questions, and concerns within data42.
Define the roadmap and strategic planning of the preclinical data pipeline team.
In conjunction with a scrum master, lead the team using an agile methodology.
Collaboratively maintain documentation of the preclinical data pipelines.
Key Performance Indicators
Demonstration of scalable platform implementation and operation.
Improving metrics of data quality and FAIR compliance.
Responsiveness to changes to preclinical data requirements/priorities.
Collaboration with other data pipeline teams.
Impact on the organization:
Drive one of the most important data assets through the data42 platform, at scale, bringing insight-generation opportunities to hundreds of consumers of data42’s platform and data.
Ideal Background:
Master’s degree or higher in Computer Science, Data Engineering, Biostatistics, or Bioinformatics
Experience/Professional requirements:
8+ years of experience working with preclinical in-vivo study data, e.g. from toxicology and pharmacokinetics studies
Understanding of preclinical in-vivo study design, conduct, and data collection
Detailed understanding of the CDISC SEND standard
Experience with big data processing platforms, e.g. Spark
Operational experience running large data operations
Knowledge of data pipeline design and architectural decisions/trade-offs
Proficiency in SQL; PySpark experience desirable
Experience with semantic technologies and approaches
Experience working in an environment that leverages agile methodologies.
Honesty and Transparency are core attributes
Why Novartis: Helping people with disease and their families takes more than innovative science. It takes a community of smart, passionate people like you. Collaborating, supporting and inspiring each other. Combining to achieve breakthroughs that change patients’ lives. Ready to create a brighter future together? https://www.novartis.com/about/strategy/people-and-culture
Join our Novartis Network: Not the right Novartis role for you? Sign up to our talent community to stay connected and learn about suitable career opportunities as soon as they come up: https://talentnetwork.novartis.com/network