**Job Function**:
Data Analytics & Computational Sciences
**Job Sub Function**:
Data Science
**Job Category**:
Scientific/Technology
**All Job Posting Locations**:
Cornellà de Llobregat, Barcelona, Spain; Madrid, Spain
Johnson and Johnson Innovative Medicine (J&J IM), a pharmaceutical company of Johnson & Johnson, is recruiting for a Vector Data Engineer. The primary location for this position is Barcelona, Spain; the secondary location is Madrid. This is a hybrid role.
Our expertise in Innovative Medicine is informed and inspired by patients, whose insights fuel our science-based advancements. Visionaries like you work in teams that save lives by developing the medicines of tomorrow.
**Position Summary**:
**Key Responsibilities**:
- Design and implement vector embedding models to unify diverse biomedical data modalities, including:
  - EEG, PPG, accelerometry, and biosensor signals
  - Speech and physiological recordings
  - Medical imaging (MRI, PET)
- Develop quality control protocols for mobile-captured images and transform pixel-based images into vector representations.
- Adapt and extend custom embedding methods from academic and research settings to build scalable foundation models.
- Collaborate with cross-functional teams to integrate multimodal data for machine learning and clinical insight generation.
- Contribute to the identification of digital biomarkers and predictive patterns in neurological and psychiatric conditions.
**Qualifications**:
- MS/PhD in Computer Science, Electrical Engineering, Biomedical Engineering, or a related field.
- Minimum 3 years of experience in multimodal data modeling, machine learning, or biomedical signal processing.
- Familiarity with large pre-trained multimodal models (e.g., CLIP, MedCLIP, Flamingo) and biomedical/time-series adaptations (BioBERT, TimeSformer).
- Strong proficiency in Python, with hands-on experience using PyTorch, TensorFlow, Hugging Face Transformers, and scikit-learn.
- Signal & Speech Processing: Experience with Librosa, SpeechBrain, or similar libraries for acoustic and temporal feature extraction.
- Neuroimaging: Proficiency in NiBabel for MRI and PET imaging, and familiarity with FSL, FreeSurfer, and Nilearn for neuroimaging analysis.
- Multimodal Fusion: Experience with CLIP-like architectures and contrastive/self-supervised learning for multimodal data integration.
- Understanding of clinical trial data and longitudinal monitoring frameworks, as well as experience with large-scale dataset curation and embedding evaluation.
**Strategic Impact**:
- AI can perform semantic queries and reasoning over governed datasets.
- Vector database infrastructure scales efficiently and complies with governance and lineage standards.
- Insight generation is accelerated across discovery, translational, and clinical domains.
#JRDDS