File Types: Excel vs CSV


They open in the same app. That’s where the similarities end.


As someone who works with large amounts of data (and has made a lot of mistakes), I'm writing this for anyone still fuzzy on the fundamental differences between Excel (.xlsx) and Comma-Separated Values (.csv) files.


If you’re expected to automate data flows, please don’t ignore this one. This post is for you!

On the surface, these two look like identical twins:

Tomayto.xlsx
Tomahto.csv

But if you look closely, you’ll notice the first difference.


If you noticed that Tomayto.xlsx is 7× bigger than Tomahto.csv, you’re right!

And here’s the key detail: both files are EMPTY.

So, what's going on? They look the same to us, but the computer sees them very differently.



What the Computer Really Sees


A CSV file is basically plain text:
name,age,city\nKairon,17,Mars
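To make that concrete, here is a minimal Python sketch that parses the sample text above with the standard csv module. The variable names are only for illustration. Notice that every value comes back as a plain string, because a CSV stores no data types.

```python
import csv
import io

# The entire "file": two lines of text separated by a newline.
raw_text = "name,age,city\nKairon,17,Mars\n"

# csv.reader splits each line on commas; it does not infer types.
rows = list(csv.reader(io.StringIO(raw_text)))

header, first_record = rows[0], rows[1]
print(header)        # ['name', 'age', 'city']
print(first_record)  # ['Kairon', '17', 'Mars']  (age is the string '17')
```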

An Excel file, on the other hand, is a ZIP archive containing multiple XML files describing sheets, cells, styles, formulas, metadata, timestamps, fonts, colors…

You can literally rename file.xlsx → file.zip and unzip it:

XML for Excel File
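If you'd rather not rename anything, the same peek inside can be sketched in Python with the standard zipfile module. The helper name list_xlsx_parts is my own, not part of any library.

```python
import zipfile


def list_xlsx_parts(path):
    """Return the names of the files stored inside an .xlsx archive."""
    # An .xlsx is a ZIP container, so zipfile can open it directly
    # without renaming the file first.
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a ZIP archive, so it is not a valid .xlsx")
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

Calling list_xlsx_parts("Tomayto.xlsx") on a real workbook typically returns entries such as [Content_Types].xml and xl/workbook.xml. Pointing it at a .csv raises the ValueError, because a CSV is not an archive at all.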




Encoding: Why Text Isn’t as Simple as It Looks

Encoding is how computers translate human text (letters, accents, symbols) into 1s and 0s.

A CSV file doesn’t record which encoding it was written in. It simply uses whatever the program that saved it chose (usually UTF-8). Excel has to guess the encoding when it opens the file, and sometimes it guesses wrong, which can break special characters.
A classic example: names like Müller suddenly turn into MÃ¼ller the moment you open a CSV in Excel.
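Here is a small Python sketch of what goes wrong, plus one common workaround. It assumes the wrong guess is Windows-1252 (a frequent legacy default on Windows); the helper names write_csv_with_bom and read_csv are mine. Writing a byte order mark (BOM) is a widely used hint for spreadsheet apps, but treat it as a convention rather than a guarantee.

```python
import csv

# UTF-8 bytes for "Müller", misread as Windows-1252, produce the garbling.
original = "Müller"
garbled = original.encode("utf-8").decode("cp1252")  # 'MÃ¼ller'


def write_csv_with_bom(path, rows):
    """Write rows as UTF-8 with a byte order mark (BOM) at the start."""
    # "utf-8-sig" prepends the bytes EF BB BF, which many spreadsheet apps
    # treat as a signal that the file is UTF-8. newline="" is what the csv
    # module documentation recommends for files passed to csv.writer.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        csv.writer(f).writerows(rows)


def read_csv(path, encoding="utf-8-sig"):
    """Read a CSV with an explicit encoding instead of relying on a guess."""
    # "utf-8-sig" strips a leading BOM if present, else behaves like UTF-8.
    with open(path, encoding=encoding, newline="") as f:
        return list(csv.reader(f))
```

The safer habit in pipelines is the second helper: always state the encoding when you read or write, so nothing is left to guess.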


A Lesson I've Learnt the Hard Way

A CSV file can only hold one sheet. 

When you click “Save as CSV” in Excel, only the active sheet gets saved. All other sheets are silently dropped. No warning. No mercy.

Why? Because CSV doesn’t understand sheets. It’s just rows and columns in a straight line: no tabs, no formulas, no metadata. Excel is a workbook. CSV is a single table.

If you need to export an Excel file with multiple sheets to CSV, don’t use “Save As.”

Instead:

  • Export each sheet as its own CSV file, or

  • Combine the sheets into one table first (with a column like sheet_name), then export
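Both options can be sketched in plain Python. This is only an illustration under a few assumptions: the workbook has already been loaded into a dict called sheets that maps each sheet name to its rows (first row is the header), sheet names are safe to use as file names, and the function names are my own. Reading the .xlsx itself needs a spreadsheet library and is not shown here.

```python
import csv
import os


def export_each_sheet(sheets, out_dir):
    """Option 1: write one CSV per sheet and return the paths written."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for sheet_name, rows in sheets.items():
        # Assumes sheet_name contains no characters that are invalid in file names.
        path = os.path.join(out_dir, f"{sheet_name}.csv")
        with open(path, "w", encoding="utf-8", newline="") as f:
            csv.writer(f).writerows(rows)
        paths.append(path)
    return paths


def combine_sheets(sheets):
    """Option 2: stack all sheets into one table with a sheet_name column."""
    combined = []
    header = None
    for sheet_name, rows in sheets.items():
        if not rows:
            continue  # nothing to copy from an empty sheet
        if header is None:
            header = list(rows[0])
            combined.append(["sheet_name", *header])
        elif list(rows[0]) != header:
            # Stacking only makes sense when every sheet has the same columns.
            raise ValueError(f"Sheet {sheet_name!r} has different columns")
        for row in rows[1:]:
            combined.append([sheet_name, *row])
    return combined
```

The combined table from combine_sheets can then be written out as a single CSV with csv.writer, just like the per-sheet files.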



Use Cases

Understand the structure, and the choice makes itself!

Excel for ad-hoc analysis.

Think: Finance reports with conditional formatting and macros
Primary User: Humans

CSV for everything scalable.

Think: Database imports, machine learning datasets, ETL jobs, APIs
Primary User: Machines

Final Thought

Excel and CSV look similar, but they live in completely different layers of the data stack.

One is a user interface.
The other is a data transport format.

Once you internalize that, a lot of data problems suddenly make sense.

And you’ll break fewer pipelines. 

Next time you're tempted to "just save as Excel," ask: Do I need the Ferrari or the bicycle?


