R Systems
Website: rsystems.com
Job details:
The Data Modeler will serve as a core contributor to the Databricks optimisation initiative, responsible for transforming complex business requirements — including investment, portfolio, and financial reporting use cases — into scalable, high-performance data models that power analytical and operational workloads. The ideal candidate brings deep expertise in Lakehouse architecture, dimensional and data vault modelling, hands-on Databricks experience, and a solid understanding of investment domain concepts such as portfolio management, asset classes, securities, trades, and risk metrics.
Key objectives of this engagement include improving query performance, reducing pipeline processing time, standardising data assets across layers, and enabling self-serve analytics at scale — particularly for investment reporting, P&L attribution, and regulatory compliance use cases.
Data Modeling, Dimensional Modeling, Modeller, Star Schema, Databricks Unified Data Analytics Platform
Data Modelling & Architecture
▸ Design and develop enterprise-scale conceptual, logical, and physical data models aligned with business and technical requirements.
▸ Build and optimise data models across Medallion Architecture layers (Bronze → Silver → Gold) in Azure Databricks / Delta Lake.
▸ Define and enforce data modelling standards, naming conventions, and best practices across the engagement.
▸ Collaborate with data architects to ensure models are aligned with the overall Lakehouse architecture strategy.
Databricks Optimisation
▸ Analyse existing Delta tables, schemas, and pipelines to identify performance bottlenecks and structural inefficiencies.
▸ Implement optimisation strategies including Z-Ordering, partitioning, data skipping, liquid clustering, and Delta table maintenance (OPTIMIZE for file compaction, VACUUM for removing stale files).
▸ Design efficient file layout strategies to minimise data scans and improve read/write performance.
▸ Tune Spark configurations and cluster settings in collaboration with data engineers for optimal pipeline execution.
▸ Evaluate and recommend transitions between data formats (e.g., Parquet → Delta) where applicable.
Data Governance & Quality
▸ Define and implement data quality rules, constraints, and validation frameworks within Unity Catalog.
▸ Establish and maintain data lineage documentation, cataloguing, and metadata management.
▸ Implement column-level security, row-level security, and access control policies in Unity Catalog.
▸ Ensure all models comply with organisational data governance standards and regulatory requirements.
Collaboration & Stakeholder Engagement
▸ Partner with business analysts, data engineers, and BI developers to translate requirements into robust data models.
▸ Conduct data modelling workshops and design reviews with cross-functional teams.
▸ Document models using ER diagrams, data dictionaries, and lineage maps to ensure knowledge transfer.
▸ Provide technical guidance and mentoring to junior data engineers on modelling best practices.
Investment Domain — Data Modelling
▸ Design and model investment data entities including portfolios, securities, instruments, trades, positions, benchmarks, and counterparties.
▸ Build data models to support core investment workflows: portfolio valuation, NAV calculation, P&L attribution, risk exposure, and performance reporting.
▸ Model reference data sets for asset classes (equities, fixed income, derivatives, alternatives), market data feeds, and pricing data.
▸ Support regulatory and compliance reporting requirements such as MiFID II, EMIR, Basel III, or FATCA through well-governed data models.
▸ Collaborate with investment analysts, quants, and front-office teams to understand data consumption patterns and model accordingly.
▸ Design models that integrate data from custodians, order management systems (OMS), execution management systems (EMS), and fund administrators.
▸ Ensure investment data models support both historical point-in-time analysis and current-state reporting requirements.
Migration & Modernisation
▸ Support data platform migrations, including on-premises to Azure Databricks and Azure Data Factory (ADF) to Microsoft Fabric transitions.
▸ Assess legacy data models and schemas and redesign them for cloud-native Lakehouse deployment.
▸ Contribute to ADF pipeline design and Databricks notebook development where required.
Required Skills & Qualifications
Investment Domain Knowledge — Must Have
▸ 6+ years of experience working in Investment Management, Asset Management, Wealth Management, or Capital Markets data environments.
▸ Strong understanding of investment instruments: equities, fixed income, derivatives (futures, options, swaps), ETFs, and alternative assets.
▸ Hands-on experience modelling investment data concepts: portfolios, positions, trades, NAV, P&L, benchmarks, and risk metrics.
▸ Experience integrating data from financial data vendors such as Bloomberg.
▸ Understanding of front-to-back investment operations: trade lifecycle, settlement, reconciliation, and corporate actions.
Technical Skills — Must Have
▸ 6+ years of hands-on experience in data modelling (dimensional, relational, and/or Data Vault 2.0).
▸ Strong proficiency in Azure Databricks, Delta Lake, and PySpark.
▸ Deep understanding of Medallion/Lakehouse Architecture (Bronze, Silver, Gold layers).
▸ Proven expertise in Databricks performance optimisation techniques:
– Partitioning, Z-Ordering, Liquid Clustering
– OPTIMIZE, VACUUM, Auto-Optimize, Delta table compaction
– File size management and data skipping strategies
▸ Experience with Unity Catalog for data governance, access control, and lineage.
▸ Proficiency in SQL (complex queries, window functions, CTEs, query optimisation).
▸ Experience with Python / PySpark for data transformation and pipeline development.
▸ Familiarity with Azure Data Factory (ADF) and Microsoft Fabric.
▸ Hands-on experience with data modelling tools such as erwin, dbdiagram.io, Lucidchart, or equivalent.
Soft Skills & Professional Attributes
▸ Strong analytical and problem-solving skills with a structured approach to complex data challenges.
▸ Excellent written and verbal communication; ability to convey technical concepts to non-technical stakeholders.
▸ Proven ability to manage multiple deliverables simultaneously in a fast-paced project environment.
▸ Self-driven, proactive, and capable of taking ownership end-to-end.
▸ Collaborative mindset with experience working in Agile / Scrum delivery models.
Education & Certifications
▸ Bachelor's or Master's degree in Computer Science, Information Systems, Data Engineering, or a related field.
▸ Databricks Certified Data Engineer Associate / Professional — preferred.
▸ Microsoft Certified: Azure Data Engineer Associate (DP-203) — preferred.
▸ Any recognised data modelling or data architecture certification is an advantage.