Datarul - Enterprise Data Governance Platform
Enterprise SaaS data governance platform for managing metadata, lineage, and PII across client databases. Built as a polyglot microservices system (.NET Core, Python, Go) on Kubernetes, with Business Glossary, Data Dictionary, Report Catalog, Data Lineage, and Data Quality features. Includes automated metadata discovery across SQL/NoSQL databases, AI-assisted PII detection, and interactive lineage visualization built with React, D3.js, and RedisGraph.
Project Details
Role
Senior Software Engineer
Timeline
July 2023 - February 2025
Tech Stack
.NET Core
Python
Go
React
TypeScript
PostgreSQL
MongoDB
Elasticsearch
Redis
RabbitMQ
Docker
Kubernetes
GraphQL
gRPC

Key Features
- Business Glossary: Term standardization and corporate knowledge management with synonym detection
- Data Dictionary: Automated metadata discovery across SQL/NoSQL databases with scheduled imports and change tracking
- Report Catalog: Centralized report repository with impact analysis, version control, and cross-environment consistency
- Data Lineage: Interactive graph visualization showing end-to-end data flow from source systems through transformations to reports
- AI-assisted classification and tagging of sensitive data (PII, PCI, PHI)
- Natural language search across metadata using Elasticsearch
- Role-Based Access Control (RBAC) with Active Directory/LDAP integration and granular permissions
- RESTful and GraphQL APIs for integration with existing enterprise tools
- Scheduled metadata synchronization with configurable import frequencies and conflict resolution
- Historical versioning of metadata changes with audit trails
- Multi-tenancy support with data isolation between clients
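
As an illustration of the change tracking mentioned under Data Dictionary, here is a minimal sketch of how two metadata snapshots might be compared between scheduled imports. The `Column` type and `diff_snapshots` function are hypothetical names chosen for this example, not the platform's actual code, and the sketch assumes a snapshot is simply a list of column records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """One discovered column; a hypothetical shape for this sketch."""
    table: str
    name: str
    data_type: str


def diff_snapshots(old, new):
    """Compare two metadata snapshots keyed by (table, column name)."""
    old_by_key = {(c.table, c.name): c for c in old}
    new_by_key = {(c.table, c.name): c for c in new}

    # Keys present only in the newer snapshot were added since the last import.
    added = sorted(new_by_key.keys() - old_by_key.keys())
    # Keys present only in the older snapshot were dropped.
    removed = sorted(old_by_key.keys() - new_by_key.keys())
    # Keys present in both but with a different type were altered.
    retyped = sorted(
        key
        for key in old_by_key.keys() & new_by_key.keys()
        if old_by_key[key].data_type != new_by_key[key].data_type
    )
    return {"added": added, "removed": removed, "retyped": retyped}


previous = [
    Column("orders", "id", "int"),
    Column("orders", "total", "numeric"),
    Column("orders", "note", "text"),
]
current = [
    Column("orders", "id", "bigint"),
    Column("orders", "total", "numeric"),
    Column("orders", "email", "text"),
]
changes = diff_snapshots(previous, current)
```

A diff of this kind could feed both the audit trail and the conflict-resolution step, since each entry identifies exactly which asset changed between imports.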
Challenges
- Indexing metadata across heterogeneous data sources (Oracle, SQL Server, PostgreSQL, MongoDB, etc.)
- Building a lineage parser that handles complex SQL with CTEs, subqueries, and window functions
- Implementing graph traversal for impact analysis across many data dependencies
- Designing multi-tenant architecture with strict data isolation
- Balancing eventual consistency across microservices with audit and data-integrity requirements
- Supporting deployment in both air-gapped on-premises environments and cloud infrastructure
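
The impact-analysis challenge above comes down to finding every asset downstream of a changed one. The following is a rough sketch of that idea, assuming the lineage graph is held as a plain in-memory adjacency dictionary rather than in a graph database. The function name `downstream_impact` and the asset names are made up for illustration.

```python
from collections import deque


def downstream_impact(edges, start):
    """Return every asset reachable from `start`, in breadth-first order.

    `edges` maps an asset name to the assets that consume it directly.
    """
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in edges.get(node, []):
            # The visited set keeps the walk finite even if lineage has cycles.
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


# Hypothetical lineage: source table -> staging -> warehouse -> reports.
lineage = {
    "crm.customers": ["stg.customers"],
    "stg.customers": ["dw.dim_customer"],
    "dw.dim_customer": ["report.churn", "report.revenue"],
    "report.churn": ["dw.dim_customer"],  # deliberate cycle
}
impacted = downstream_impact(lineage, "crm.customers")
```

Breadth-first order lists the nearest dependents first, which tends to match how an analyst reads an impact report; a depth limit could be added if only nearby assets matter.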
Solutions
- Event-driven microservices using domain-driven design with bounded contexts
- Custom ANTLR-based SQL parser that generates abstract syntax trees for lineage extraction
- Graph database (RedisGraph) with optimized traversal for impact analysis
- Multi-tenant PostgreSQL schema with row-level security
- Adapter pattern for pluggable integration with new data sources
- Elasticsearch cluster with custom analyzers for metadata search
- Containerized services on Kubernetes, enabling hybrid on-premises and cloud deployments
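
To make the adapter pattern from the list above concrete, here is a small sketch of what a pluggable source interface could look like. All class and function names (`MetadataSource`, `SqlSource`, `DocumentSource`, `discover`) are assumptions for this example, and the adapters read from in-memory data instead of real database connections.

```python
from abc import ABC, abstractmethod


class MetadataSource(ABC):
    """Common interface every source adapter exposes (hypothetical)."""

    @abstractmethod
    def list_columns(self):
        """Return a list of {"container", "field", "type"} dicts."""


class SqlSource(MetadataSource):
    """Adapts rows shaped like (table, column, type) from a SQL catalog."""

    def __init__(self, rows):
        self._rows = rows

    def list_columns(self):
        return [
            {"container": table, "field": column, "type": data_type}
            for table, column, data_type in self._rows
        ]


class DocumentSource(MetadataSource):
    """Adapts one sample document per collection, inferring field types."""

    def __init__(self, samples):
        self._samples = samples

    def list_columns(self):
        return [
            {"container": collection, "field": key, "type": type(value).__name__}
            for collection, document in self._samples.items()
            for key, value in document.items()
        ]


# Supporting a new source means registering one more adapter here.
ADAPTERS = {"sql": SqlSource, "document": DocumentSource}


def discover(kind, payload):
    """Pick the adapter for `kind` and return its normalized metadata."""
    return ADAPTERS[kind](payload).list_columns()


sql_columns = discover("sql", [("orders", "id", "int")])
doc_columns = discover("document", {"users": {"email": "a@b.c", "age": 3}})
```

Because every adapter returns the same normalized shape, the downstream indexing and lineage code would not need to know which kind of database the metadata came from.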
Project Gallery

Business Glossary - Creating corporate memory and standardizing business terms

Data Dictionary - Managing database assets under one roof

Report Catalog - Monitoring reporting tools and tracking changes

Data Lineage - Visualizing and analyzing data flow with diagrams

Data Quality - Monitoring data accuracy, consistency, and reliability