Data Scientist, Agentic AI

Atlanta, GA or Jersey City, NJ

EXL (NASDAQ: EXLS) is a global analytics and digital solutions company that partners with clients to improve business outcomes and unlock growth. We bring together domain expertise with robust data, powerful analytics, cloud, and AI to create agile, scalable solutions and execute complex operations for the world’s leading corporations. EXL was founded on the core values of innovation, collaboration, excellence, integrity, and respect, and creates value from data to enable faster decision-making and transform operating models. Key industries include Insurance, Healthcare, Banking and Financial Services, Media, and Retail, among others.

EXL is headquartered in New York, and our team is over 55,000 strong, with more than 50 offices spanning six continents. For more information, visit www.exlservice.com.

About the Role

We’re seeking a Junior Data Scientist with hands-on experience in agentic AI systems, large language models (LLMs), and transformer-based architectures. As a member of our Digital Solutions team, you’ll contribute to building and optimizing intelligent systems that reason, adapt, and act autonomously. This is a dynamic role suited for candidates who are eager to innovate with state-of-the-art language models and vector-based search technologies.

Key Responsibilities

Design and optimize prompts for task-specific performance across Claude, GPT, LLaMA, and open-source LLMs.
Build and improve retrieval-augmented generation (RAG) pipelines that use vector search (e.g., Pinecone, Weaviate, FAISS, Chroma, pgvector).
Convert unstructured data (e.g., PDFs, scanned docs, images) into structured formats using OCR, document parsing, document classification, and document-centric Vision LLMs.
Build and maintain agentic AI workflows using frameworks like LangChain, LangGraph, or AutoGen.
Develop autonomous agent systems capable of multi-step reasoning and execution.
Fine-tune foundation models (e.g., LLaMA, BERT, GPT, Mistral) using Hugging Face Transformers, OpenAI APIs, or LangChain.
Apply transformer architectures and embedding techniques to domain-specific problems.
Collaborate with cross-functional teams, including senior ML engineers and product managers, to deliver scalable GenAI solutions.
Stay up to date with advancements in LLM fine-tuning, RAG strategies, and autonomous agent research.

Required Qualifications

Bachelor’s or Master’s degree in Computer Science, Data Science, Machine Learning, or a related technical field.
2–5 years of experience in AI/ML applications, GenAI development, or LLM-focused roles.
Hands-on experience with prompt engineering and LLM-based task chaining.
Experience extracting structured data from unstructured documents using popular Python libraries and models such as pytesseract, EasyOCR, docTR, vision-parse, LayoutLM, and Donut.
Proficiency in Python, including ML libraries such as PyTorch, Hugging Face Transformers, pandas, and scikit-learn.
Working knowledge of agentic AI frameworks (LangChain, AutoGen, LangGraph, or CrewAI).
Experience querying and integrating vector databases (e.g., Pinecone, Weaviate) for semantic search and RAG.
Strong foundation in data analysis and the ability to extract useful insights from LLMs to guide decisions.

Nice to Have

Experience working with Palantir Foundry or AIP platforms.
Understanding of prompt engineering and instruction optimization.
Experience integrating LLMs into production pipelines.
Exposure to reinforcement learning or self-improving AI agents.
Contributions to open-source LLM/AI projects.