Posts

Showing posts with the label Cloud

The Cloud: Why Does It Matter?

  A floating cotton candy that brings rain, hail and snow.

Since modern-day technology is a relatively new invention, we tend to warp the meanings of words a lot. For instance, a tablet used to be medicine, but now it's a portable computer with touch input. Naming conventions also draw inspiration from nature, like "ecosystem," "virus," "bug," and "mouse." Today I would like to talk about the cloud, a word that is now used well outside conversations about the weather. I first learnt about the intricacies of clouds back in Geography class, from cirrus clouds to cumulonimbus clouds, as well as what they meant. Spot a cumulonimbus cloud? Brace for a thunderstorm! (You're welcome!) Now, "cloud" metaphorically represents remote data storage and processing: ubiquitous, vast, and intangible, like an actual cloud. "So, Toni, what's so special about the cloud?" To answer, let's imagine a cloud-less 2023...

Data Wrangling: Best Practices For Working With Big Datasets

  A matrix of dots that you did not bother to count

Let's face it, working with exponentially expanding datasets can be both exciting and overwhelming. Imagine dealing with a dataset of 22 million rows. That's a lot of information to process! The question is, can your ETL process handle it? This is a problem that I faced this week: I had to improve the performance of the process that updates a dashboard, because it was taking a decade to update. The first question that crossed my mind was, "What is the actual size of this dataset?" At first, I mistakenly assumed that the dataset contained between 1 and 6 million rows. A quick COUNT(*) query made it clear that my estimate was way off, and it also gave me some clarity about the problem. That dataset was a behemoth; it probably had its own gravitational pull! The sad truth was that the process was not scalable, and it was clear that immediate improvements were necessary. Here are three tips that I h...