In this guide, we’ll explore how to re-create key pandas functions used for EDA, such as describe and corr, in BigQuery.
Moving between BigQuery SQL and Python can be quite eye-opening, especially in the context of data analysis. I often find myself writing extensive queries to manipulate and analyze data in BigQuery SQL. It’s a powerful language, but the queries can get heavy.
When I switched to Python, I was surprised by how streamlined certain tasks were. Python’s libraries, like pandas, let you perform data manipulations and analyses that would be cumbersome in SQL.
I found a few pandas functions, like describe(), corr(), and isnull().sum(), super useful, and wished they existed in BigQuery. This got me exploring other handy EDA functions in pandas and inspired me to write this article. Here, I’m sharing the techniques and code I came up with in BigQuery to match some of the best pandas EDA functions.
Let’s get stuck in!
In this article, we’ll take a look at these 13 functions:
- Head / Tail
- Columns
- Dtypes
- Nunique
- Unique
- ISNA / ISNULL()
- ISNULL().SUM()
- DropNA
- Shape
- Corr
- Nlargest
- Sample
- Describe
Throughout this article, we’ll play around with the popular mtcars dataset. The mtcars dataset is a publicly available built-in dataset in R. It comprises 11 features of 32 automobiles from the 1974 Motor Trend US magazine.
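To give a feel for the idea before diving in, here is a minimal sketch of how a few of these checks map onto plain SQL aggregates. It is not the BigQuery code from this article: it assumes an in-memory SQLite table standing in for BigQuery (via Python’s built-in sqlite3 module), uses only five mtcars rows and three of the columns, and deliberately blanks one hp value to create a NULL, since the real dataset has no missing values. The aggregates used here (COUNT, COUNT(DISTINCT ...), MIN, MAX) are standard SQL, so the same SELECT should carry over to BigQuery largely unchanged.

```python
import sqlite3

# Five rows from mtcars (model, mpg, cyl, hp).
# Assumption for illustration only: one hp value is set to None (NULL);
# the real mtcars dataset has no missing values.
rows = [
    ("Mazda RX4", 21.0, 6, 110),
    ("Datsun 710", 22.8, 4, 93),
    ("Hornet 4 Drive", 21.4, 6, None),
    ("Hornet Sportabout", 18.7, 8, 175),
    ("Valiant", 18.1, 6, 105),
]

# SQLite stands in for BigQuery here so the example runs locally.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mtcars (model TEXT, mpg REAL, cyl INTEGER, hp INTEGER)")
conn.executemany("INSERT INTO mtcars VALUES (?, ?, ?, ?)", rows)

query = """
SELECT
  COUNT(*)             AS n_rows,       -- like df.shape[0]
  COUNT(DISTINCT cyl)  AS cyl_nunique,  -- like df['cyl'].nunique()
  COUNT(*) - COUNT(hp) AS hp_nulls,     -- like df['hp'].isnull().sum()
  MIN(mpg)             AS mpg_min,      -- part of df['mpg'].describe()
  MAX(mpg)             AS mpg_max       -- part of df['mpg'].describe()
FROM mtcars
"""
n_rows, cyl_nunique, hp_nulls, mpg_min, mpg_max = conn.execute(query).fetchone()
print(n_rows, cyl_nunique, hp_nulls, mpg_min, mpg_max)
```

The COUNT(*) - COUNT(hp) trick works because COUNT(column) skips NULLs while COUNT(*) counts every row, so the difference is the number of missing values in that column.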
When you first look at a dataset, think of ‘Head’ and ‘Tail’ as the front and back pages…