BigQuery Methods for Re-Creating Pandas’ Top EDA Functions | by Tom Ellyatt | Feb, 2024


In this guide, we’ll explore how to re-create key Pandas functions used for EDA, such as describe and corr, in BigQuery.

Image created using DALL-E

Transitioning from BigQuery/SQL to Python can be quite eye-opening, especially in the context of data analysis. I often find myself writing extensive queries to manipulate and analyze data in BigQuery SQL. It’s a powerful language, but it can get pretty heavy.

Now, when I switched to Python, I was surprised by how streamlined certain tasks were. Python’s libraries, like pandas, allow you to perform data manipulations and analyses that would be cumbersome in SQL.

I found a few Pandas functions like DESCRIBE, CORR, and ISNULL().SUM() super useful, and wished they were in BigQuery. This got me exploring other cool EDA functions in pandas and inspired me to write this article. Here, I’m sharing the methods and code I came up with in BigQuery to match some of the best pandas EDA functions.

Let’s get stuck in!

In this article, we’ll take a look at these 13 functions:

  1. Head / Tail
  2. Columns
  3. Dtypes
  4. Nunique
  5. Unique
  6. ISNA / ISNULL()
  7. ISNULL().SUM()
  8. DropNA
  9. Shape
  10. Corr
  11. Nlargest
  12. Sample
  13. Describe
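
To give a flavour of the kind of translation involved, here is a minimal sketch of item 7, ISNULL().SUM(), written as a plain SQL aggregate. It is not the BigQuery code from this article: it uses Python’s built-in sqlite3 module as a stand-in engine so it can run anywhere, and the cars table and its values are made up for illustration. The idea it relies on is standard SQL behaviour that BigQuery shares: COUNT(*) counts every row, while COUNT(column) skips NULLs.

```python
import sqlite3

# Hypothetical toy table (NOT the real mtcars data); None is stored as SQL NULL.
rows = [
    ("car_a", 21.0, 110),
    ("car_b", None, 93),
    ("car_c", 18.7, None),
    ("car_d", None, 105),
]

# Assumption: an in-memory SQLite database stands in for BigQuery here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (model TEXT, mpg REAL, hp INTEGER)")
conn.executemany("INSERT INTO cars VALUES (?, ?, ?)", rows)

# COUNT(*) counts all rows; COUNT(col) ignores NULLs, so the difference is
# the number of NULLs per column, similar to df.isnull().sum() in pandas.
query = """
SELECT
  COUNT(*) - COUNT(model) AS model_nulls,
  COUNT(*) - COUNT(mpg)   AS mpg_nulls,
  COUNT(*) - COUNT(hp)    AS hp_nulls
FROM cars
"""
null_counts = dict(zip(("model", "mpg", "hp"), conn.execute(query).fetchone()))
print(null_counts)  # {'model': 0, 'mpg': 2, 'hp': 1}
conn.close()
```

Listing every column by hand is fine for three columns but gets tedious on wide tables, which is part of why the one-line pandas call feels so convenient.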

Throughout this article, we’ll play around with the popular mtcars dataset. The mtcars dataset is a publicly available built-in dataset in R. It comprises 11 features of 32 cars from the 1974 Motor Trend US magazine.

My image, screenshot taken from RStudio
Panda icon source: Flaticon (link)

When you first look at a dataset, consider ‘Head’ and ‘Tail’ as the front and back pages…
