Ikastar
Website: ikastar.com
Job details:
Company Description
Ikastar is a professional IT services company committed to delivering exceptional value to both customers and employees. With extensive experience in consulting, the company specializes in providing premium software services through Managed Services, Staff Augmentation, and End-to-End solutions. Ikastar takes pride in offering tailored solutions designed to meet the unique needs of its clients, fostering innovation and excellence. Join a company dedicated to driving success and advancing technology in diverse industries.
Role Description
We’re looking for a Senior Python professional with deep experience in Generative AI, RAG, and production-grade ML systems. In this role, you’ll design and ship retrieval-augmented solutions that reliably ground LLM outputs in trusted data, building everything from ingestion pipelines and vector search to evaluation, monitoring, and deployment. You’ll collaborate with product, data, and platform teams to turn complex business problems into robust, scalable AI capabilities.
What You’ll Do
- Design & Build RAG Systems: Architect end-to-end pipelines—document ingestion, chunking/segmentation, embeddings, vector storage, retrieval orchestration, and response generation with LLMs.
- Productionize Generative AI: Implement scalable services/APIs for chat, Q&A, summarization, agents/tools, and workflow automations.
- Optimize Retrieval Quality: Experiment with chunking strategies, hybrid search (BM25 + dense), re-ranking, metadata filtering, query rewriting, and multi-hop retrieval.
- Model Integration: Integrate with OSS and managed LLMs (e.g., Azure OpenAI, OpenAI, Anthropic, AWS Bedrock) and choose models based on task, latency, and cost.
- Evaluation & Guardrails: Build evaluation frameworks (factuality, grounding, hallucination rates, latency, cost), red-team prompts, and safety/PII guardrails.
- MLOps & Reliability: Own CI/CD, model/data versioning, observability, feature flags, canary releases, and dependency management for AI services.
- Data & ETL Pipelines: Implement robust pipelines for structured/unstructured data (PDFs, HTML, Confluence, SharePoint, Slack, databases) with incremental updates.
- Performance & Cost: Profile bottlenecks, tune memory/CPU/GPU usage, cache intelligently, and optimize embedding/model/token costs.
- Security & Compliance: Enforce data residency, encryption, secret management, and tenant isolation; support audits and compliance (e.g., SOC 2, ISO).
- Mentorship & Leadership: Provide technical leadership, code reviews, design documents, and mentor engineers on LLM/RAG best practices.
Must-Have Qualifications
- Python: Strong expertise (typing, asyncio, packaging, testing, performance profiling).
- RAG Architecture: Proven experience building RAG systems in production.
- Vector Search: Hands-on experience with vector databases (e.g., FAISS, Pinecone, Weaviate, Milvus, Qdrant, Elasticsearch/OpenSearch k-NN).
- LLM Tooling: Practical experience with LangChain, LlamaIndex, or custom orchestration frameworks; function/tool calling; agent patterns.
- Embeddings & Models: Familiarity with OpenAI/Azure OpenAI, Cohere, Anthropic, and Hugging Face; selecting embedding models (e.g., text-embedding-3, bge) suited to the task.
- Data Processing: Experience with PyPDF, BeautifulSoup, unstructured, Tesseract, or similar tools for document parsing and OCR.
- APIs & Services: Experience building and operating FastAPI/Flask microservices; gRPC/REST; auth (OAuth2/JWT).
- Cloud & DevOps: Experience deploying on Azure/AWS/GCP; containers (Docker), orchestration (Kubernetes), infra-as-code (Terraform), monitoring (Prometheus/Grafana, OpenTelemetry).
- Testing & Quality: Unit/integration tests, offline/online evals for LLMs, dataset curation, prompt/version management.