Data Profiling with Python's Great Expectations
pip install great_expectations
As a Data Quality Analyst, I constantly seek tools to enhance my daily workflow, and Great Expectations is a recent discovery.
Before getting into what this Python library has to offer, let's address a fundamental question:
What is Data Profiling?
In order to effectively tackle any problem, one must first understand it.
Data profiling is the process of examining a dataset to uncover data quality issues such as duplicate records, missing values, and inconsistencies.
In essence, it establishes a dataset's baseline level of quality against the six data quality dimensions: Completeness, Uniqueness, Validity, Timeliness, Accuracy, and Consistency.
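To make two of those dimensions concrete, here is a minimal sketch in plain Python (not the Great Expectations API) that measures completeness and flags duplicates per column. The sample records, the column names (`batch_id`, `product`, `weight_g`), and the helper functions `completeness` and `duplicated_values` are all hypothetical, invented purely for illustration.

```python
from collections import Counter

# Hypothetical sample records; column names and values are made up for illustration.
rows = [
    {"batch_id": "B001", "product": "currants roll", "weight_g": 120},
    {"batch_id": "B002", "product": "currants roll", "weight_g": None},
    {"batch_id": "B002", "product": "currants roll", "weight_g": 118},
    {"batch_id": "B004", "product": None, "weight_g": 121},
]


def completeness(rows, column):
    """Fraction of rows where the column holds a non-missing value."""
    present = sum(1 for row in rows if row.get(column) is not None)
    return present / len(rows)


def duplicated_values(rows, column):
    """Non-missing values that appear more than once in the column."""
    counts = Counter(row[column] for row in rows if row.get(column) is not None)
    return sorted(value for value, n in counts.items() if n > 1)


# Build a small per-column profile covering completeness and uniqueness.
profile = {
    column: {
        "completeness": completeness(rows, column),
        "duplicates": duplicated_values(rows, column),
    }
    for column in ("batch_id", "product", "weight_g")
}

for column, stats in profile.items():
    print(column, stats)
```

A repeated `batch_id` is a uniqueness problem worth investigating, whereas a repeated `product` name is expected, so the same statistic needs to be read differently depending on what each column is supposed to represent.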
Great Expectations Data Profiling Example
Now, picture yourself as a pastry chef whose mission is to ensure that every currants roll emerging from your bakery is nothing short of delightful.
Let's explore how data profiling in Great Expectations helps you check that your rolls meet your standards.