About us

Hello, I’m Soni

Real-Time Streaming Specialist | Data Architect | Data, Automation and Analytics | Tech Writer

I am a data architect with 13 years of experience, specializing in streaming technologies. I have a passion for setting up infrastructures, designing architectures, and building robust data pipelines that seamlessly integrate with DevOps practices. During my time at AWS and Amazon, I honed my skills as a Data Architect/Engineer, successfully taking multi-petabyte workloads to production while ensuring minimal operational burden. My expertise spans both Streaming and Batch Analytics, allowing me to handle real-time applications with precision and efficiency.

I’m committed to democratizing knowledge and breaking down the complexities of Big Data. For the past two years, I’ve been writing about Spark Streaming, contributing 28 blog posts on the topic. I’ve also had the privilege of coaching over 100 individuals, helping them crack data interviews and grow their careers in data. My work covers not only technical insights but also broader topics like salary negotiation and financial literacy, empowering professionals in the data industry.

What is my expertise?

Spark, PySpark: 90%
Real-Time Data Streaming (Delta, Kafka, Kinesis, Event Hub): 90%
SQL, Python: 90%
Databricks, Delta Live Tables: 85%
Databases (Redshift, Snowflake, BigQuery, Oracle & Postgres): 80%
Modern Data Stack (Data Build Tool, Dagster & Metabase): 70%
Continuous Integration & Continuous Deployment (GitHub Actions, Azure DevOps): 70%
Data Visualization (Tableau, Power BI, Metabase): 70%
NoSQL Databases (Dynamo, Cosmos & Mongo): 60%
Terraform & CDK: 50%

2021-2023

Specialist Data Architect | Databricks

  • Solved streaming and scaling challenges with Big Data across customers
  • Authored blogs with best practices and solution accelerators
  • Improved productivity for Audantic by reducing their lines of code by 66% and their Lakehouse development time by 86%
  • Significantly reduced a customer's carbon footprint and co-authored a blog post detailing the design and implementation
  • Worked with strategic customers to solve challenging technical problems, provide business value, guide the product roadmap, and ensure success on the Databricks platform
  • Collaborated with teams at large on strategic programs to scale the organization, design and build internal accelerators, and share best practices
  • Hired and mentored new talent

2015-2021

Data Engineer / Data Architect | Amazon / AWS

  • Built a real-time data processing platform and CI/CD pipelines with ACID guarantees while ensuring operational excellence
  • Created and supported 50+ datasets, processing multiple TBs of data per day and serving 600+ analysts and economists
  • Built a GDPR- and CCPA-compliant data lake
  • Strategized customers' cloud adoption journeys and leveraged the power of the cloud to run big data analytics at scale
  • Interacted with customers to identify their pain points across the system and drove projects to remove them
  • Worked with customers as a trusted advisor to drive visibility to best practices, access to internal resources, expertise on the very latest features, and accountability for the AWS platform to help them achieve their business goals and objectives
  • Migrated high-volume tables (100-350 TB) from Oracle to Redshift using a massively parallel processing architecture, which improved the availability of critical datasets by 96%
  • Worked with the relevant departments to develop and deliver a roadmap ensuring that systems ran smoothly during projected peak traffic

2014-2015

ETL & Reporting Developer | ZS Associates

  • Designed and developed a data warehouse spanning 23 countries by transforming domain knowledge acquired from the client into a technical artifact
  • Solely responsible for data warehouse performance tuning and recognized for quickly diagnosing system issues

2012-2014

Business Intelligence Reporting | Tata Consultancy Services

  • Developed Key Performance Indicators for business reports using SAP Business Objects to drive analytics
  • Awarded the LIREL honor (Leadership, Integrity, Respect for individual, Excellence, Learning & sharing), named for the five core values of Tata Consultancy Services

Aug 2015 - May 2017

Master of Computing Science in Big Data

Simon Fraser University, Canada

2022

Databricks Developer

2021

Azure Fundamentals

2019

AWS Developer

2018

AWS Solutions Architect
