Posts

Showing posts with the label SQL

Learning SQL : Make it a Habit

Image
Pronounced 'seek-will' short for Structured Query Language Like it or not, SQL has been the backbone of data management for decades with 70% of companies use SQL as their primary tool for accessing and analyzing data  (Stack Overflow Developer Survey 2023). My first encounter with SQL was back in 2020, during the pandemic, in a virtual Enterprise Database Systems class. We were writing simple queries, such as SELECT * FROM students WHERE grade = 'A'.  At the time, I thought, “That’s all there is to it? How hard can this be?” By the middle of 2022, I was working with massive, real-world datasets that demanded writing complex queries spanning hundreds of lines of code. Suddenly, SQL wasn’t so simple anymore.  The depth required in professional settings can be overwhelming, but it’s also rewarding. Let me be honest: SQL isn’t my favorite language. But as I’ve learned, you don’t have to love something to be good at it.  (I once disliked Physics, but I still managed to ...

Improving SQL Query Performance : Indexes

Image
  An index is analogous to the Table of Contents in a book.  Prior to exploring what indexes entail, I'd like to begin by discussing 'what SQL is.' SQL, often pronounced as 'sequel,' stands for Structured Query Language .  It is known for being declarative, enabling you to specify what you want without needing to focus on how to achieve it. However, beyond your initial query (SELECT * FROM menu WHERE food = 'burger'), the computer still needs to know how to go about finding those burgers, and that's where an Execution Plan comes into play. An SQL Execution Plan is a detailed outline or strategy devised by the database management system (DBMS) to execute a query efficiently, specifying the steps required such as scanning, filtering, and joining data to retrieve the requested information. Now, imagine scanning a table like flipping through pages in a book – the more pages, the longer it takes. So, keep in mind that scanning time is directly proportional ...

FAQ : How can I get into the field of data?

Image
  A simple pie chart that looks like Pac-Man Many people frequently approach me with questions like, 'Toni, how did you get a job in Data Analytics?' or 'Do you have any advice for someone trying to break into the field?'  As someone who was in their shoes just a year ago, I understand the confusion and frustration that comes with trying to find your way in.  While I don't consider myself a data guru, I've gained a reputation as an expert in the field. However, I want to make it clear that I don't have some grand story about how I solved a complex data problem. In fact, I still classify myself as a beginner, constantly learning and using my blog as a way to reflect on the problems I encounter.  Despite this, I firmly believe that anyone can get a job in this field with the right mindset, network, portfolio, and a bit of luck. So, to help those trying to get into the data field, here are six essential points to keep in mind: Excel is now your best friend This...

Data Wrangling : Best Practices For Working With Big Datasets

Image
  A matrix of dots that you did not bother to count Let's face it, working with exponentially expanding datasets can be both exciting and overwhelming. Imagine dealing with a dataset of 22 million rows - that's a lot of information to process!  The question is, can your ETL process handle it?  This is a problem that I faced this week, and I had to find a way to improve the performance of the process that updates a dashboard as it was taking a decade to update. The initial question that crossed my mind was, "What was the actual size of this dataset?" At first, I mistakenly assumed that the dataset contained between 1 and 6 million rows. A quick COUNT(*) query made it clear that my estimate was way off and also provided me with some clarity to the problem. That dataset was a behemoth, it probably had its own gravitational pull! The sad truth was that the process was not scalable, and it was clear that immediate improvements were necessary. Here are a three tips that I h...