From our blog

Solving Delta Table Concurrency Issues

on September 30, 2023

Delta Lake is a powerful technology for bringing ACID transactions to your data lakes. It allows multiple operations to be performed on a dataset concurrently. However, concurrent operations can be tricky and may raise exceptions such as ConcurrentAppendException, ConcurrentDeleteReadException, and ConcurrentDeleteDeleteException. In this blog post, we will explore why these issues occur, how to handle them effectively using a Python function, and how to avoid them through table design, isolation levels, and an understanding of write conflicts.
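The full post describes its own Python function. As a rough idea of what handling these exceptions can look like, here is a minimal retry-with-backoff sketch. The helper name `run_with_retry`, its parameters, and the choice to match exceptions by class name (so the snippet runs without delta-spark installed) are assumptions made for illustration, not the post's actual implementation.

```python
import random
import time

# Class names of the Delta Lake conflict exceptions named in the excerpt.
# Matching by name is an assumption that keeps this sketch dependency-free.
RETRYABLE = {
    "ConcurrentAppendException",
    "ConcurrentDeleteReadException",
    "ConcurrentDeleteDeleteException",
}


def run_with_retry(operation, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation(); retry with exponential backoff on conflict exceptions."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # Re-raise anything that is not a known conflict, or the last failure.
            if type(exc).__name__ not in RETRYABLE or attempt == max_attempts:
                raise
            # Exponential backoff plus jitter so competing writers spread out.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            sleep(delay)
```

A write would then be wrapped as `run_with_retry(lambda: do_merge())`, where `do_merge` is a hypothetical function that performs the Delta operation.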

Continue reading

Databricks SQL Dashboards Guide: Tips and Tricks to Master Them

on September 15, 2023

Welcome to the world of Databricks SQL Dashboards! You’re in the right place if you want to learn how to go beyond just building visualizations and add some tricks to your arsenal. This guide will walk you through creating, managing, and optimizing your Databricks SQL dashboards. 1. Getting Started with Viewing and Organizing Dashboards. Accessing your dashboards: navigate to the workspace browser and click “Workspace” in the sidebar.

Continue reading

Optimizing Databricks SQL: Achieving Blazing-Fast Query Speeds at Scale

on September 12, 2023

In this data age, delivering a seamless user experience is paramount. While there are numerous ways to measure this experience, one metric stands tall when evaluating the responsiveness of applications and databases: the P99 latency. Especially vital for SQL queries, this seemingly esoteric number is, in reality, a powerful gauge of the experience we provide to our customers. Why is it so crucial?
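For readers new to the metric, P99 latency is the value that 99% of requests complete at or below. The sketch below computes it with the nearest-rank method; the function name `percentile_nearest_rank` and the sample latencies are hypothetical, and other percentile definitions (for example, interpolated ones) may give slightly different numbers.

```python
def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile (integer pct in 1..100) by nearest rank."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding issues.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical query latencies in milliseconds: 1, 2, ..., 100.
latencies_ms = list(range(1, 101))
p99 = percentile_nearest_rank(latencies_ms, 99)
print(p99)  # 99: 99 of the 100 queries finished within 99 ms
```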

Continue reading

Simplifying Real-time Data Processing with Spark Streaming’s foreachBatch with working code

on June 6, 2023

A comprehensive guide to implementing a fully operational streaming pipeline that can be tailored to your specific needs. In this working example, you will learn how to parameterize the foreachBatch function. Spark Streaming & foreachBatch: Spark Streaming is a powerful tool for processing streaming data. It allows you to process data as it arrives, without having to wait for the entire dataset to be available.
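One common way to parameterize a foreachBatch handler is a closure: a factory function takes the parameters and returns the two-argument callable that foreachBatch expects. The sketch below illustrates the idea; `make_batch_writer` and the table name are hypothetical and not taken from the full post, which may use a different approach.

```python
def make_batch_writer(target_table, mode="append"):
    """Build a foreachBatch handler bound to a target table and write mode."""

    def write_batch(batch_df, batch_id):
        # foreachBatch passes each micro-batch as a DataFrame plus its batch id.
        batch_df.write.format("delta").mode(mode).saveAsTable(target_table)

    return write_batch


# Usage with a streaming DataFrame (assumes an active Spark session and a
# streaming source named stream_df):
#   query = (
#       stream_df.writeStream
#       .foreachBatch(make_batch_writer("events_silver"))
#       .start()
#   )
```

Because the parameters are captured when the handler is created, the same factory can serve several streams that write to different tables.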

Continue reading