Let’s breathe life into great tech ideas! With 3,000 people globally, Intellias is a company where benchmark technological solutions are born. Join in and take your part in digitalizing the world.
We are exploring cutting-edge OCR and metadata extraction from PDF documents. OCR and document intelligence are rapidly evolving fields, with open-source models like DeepSeek OCR and LightOn OCR pushing the boundaries.
We are seeking an experienced engineer to help us build high-precision solutions for PDF-to-Markdown and PDF-to-HTML conversion, particularly for complex documents with diverse layouts.
Key Responsibilities:
* Research, evaluate, and fine-tune open-source OCR and document intelligence models for text and layout extraction from complex PDFs.
* Develop end-to-end solutions for PDF-to-Markdown / PDF-to-HTML conversion, preserving text structure, formatting, and layout accuracy.
* Build tools for data preprocessing, annotation, and quality evaluation of OCR outputs.
* Implement post-processing techniques, text alignment, and metadata extraction to improve model precision.
* Collaborate closely with research and engineering teams to integrate OCR pipelines into production-ready systems.
* Stay current with advancements in document AI, multimodal learning, and OCR research.
Required Skills & Experience:
* 5+ years of experience in Machine Learning, with at least 2 years focused on OCR, Document AI, or vision-language models.
* Strong expertise in Python, PyTorch, and Hugging Face Transformers (training, fine-tuning, inference).
* Solid understanding of computer vision and its implementation.
* Hands-on experience deploying LLMs / VLMs on vLLM or similar high-performance inference frameworks.
* Deep understanding of OCR pipelines, layout parsing, and document structure recognition (PDFs, scanned docs, tables, mixed content).
* Familiarity with cloud infrastructure and GPU-based inference pipelines.
* Research-oriented mindset with the ability to experiment, analyze results, and iterate quickly.
* Excellent communication and documentation skills.
At Intellias, where technology takes center stage, people always come before processes. We're dedicated to cultivating a tech-savvy environment that empowers individuals to unlock their true potential and achieve extraordinary results. Our customized benefits not only prioritize your well-being but also fuel your professional growth, making this opportunity an ideal match for tech enthusiasts like you.