AI/ML Engineer

We are seeking a highly skilled Lead AI/ML Engineer with 8 to 10 years of experience to spearhead the development, optimization, and deployment of cutting-edge AI models. In this role, you will bridge the gap between state-of-the-art Generative AI (LLMs, RAG) and resource-constrained environments (Edge AI, Embedded Systems). You will design robust NLP pipelines, optimize complex deep learning models for real-time execution, and deploy them onto NVIDIA Jetson platforms.

If you thrive on making massive models run efficiently on tiny hardware, this role is for you.

Key Responsibilities :

– LLM & RAG Systems : Design, architect, and implement production-grade Retrieval-Augmented Generation (RAG) pipelines utilizing advanced Vector Databases (e.g., Pinecone, Milvus, Qdrant, Chroma) and orchestration frameworks like LangChain or LlamaIndex.

– Model Fine-Tuning : Fine-tune open-source LLMs and Vision-Language Models (VLMs) using techniques like LoRA, QLoRA, and P-tuning for domain-specific applications.

– Core NLP Tasks : Develop and maintain core NLP pipelines including Named Entity Recognition (NER), text classification, sentiment analysis, and semantic search using Hugging Face Transformers.

– Prompt Engineering : Architect complex, system-level prompts and implement guardrails to ensure deterministic, safe, and context-aware model outputs.

– Model Compression : Apply advanced optimization techniques, including quantization (INT8/FP16 calibration, PTQ, QAT), pruning, and knowledge distillation, to reduce model footprint while keeping accuracy loss to a minimum.

– Compilers & Runtimes : Convert and optimize deep learning models from PyTorch to deployment-ready formats using ONNX Runtime and NVIDIA TensorRT.

– Hardware Deployment : Deploy, benchmark, and profile models directly on NVIDIA Jetson platforms (Jetson Nano, Orin Nano, Orin NX/AGX), ensuring optimal utilization of GPU and DLA (Deep Learning Accelerator) cores.

– Embedded Linux : Develop within Embedded Linux environments, including writing efficient C++/Python wrapper code, managing dependencies, and flashing/configuring Jetson boards.

– Real-time Applications : Architect end-to-end, low-latency, real-time AI applications capable of processing streaming data (text, audio, or video) at the edge.

Required Skills & Qualifications :

– Core Languages : Expert-level proficiency in Python and standard data science libraries (NumPy, Pandas, Scikit-learn). Familiarity with C++ for edge deployment is a strong plus.

– Deep Learning Frameworks : Advanced hands-on experience with PyTorch and the Hugging Face ecosystem.

– Edge Optimization Tools : Deep technical understanding of TensorRT, ONNX, OpenVINO, and quantization frameworks (e.g., BitsAndBytes, AWQ, GPTQ).

– Frameworks & Infrastructure : Proven experience with LangChain, Docker, and specialized Vector Databases.

– Hardware Ecosystem : Direct experience working with NVIDIA Jetson Nano or higher-end Jetson Orin hardware modules, JetPack SDK, and DeepStream SDK.

Experience :

– 8-10 years of professional experience in AI/ML engineering, with at least 3 years specifically focused on Edge AI deployment or LLM engineering.

Are you interested in this position?

Apply by clicking on the “Apply Now” button below!

#AlbionarcJobs #FintechJobs #AsiaJobs #MiddleEastCareers #TechTalent #FintechRecruitment #FinanceOpportunities

Armendes Ltd
