Introduction
There is a common misconception that writing code for data science involves sitting in a dark room and effortlessly typing Matrix-esque syntax onto a vast
array of monitor screens.
In reality, coding is generally a slow, non-linear, iterative and often frustrating process involving backtracking, second-guessing, Google searches, posting questions on
Stack Overflow and (most commonly in my experience) copying, pasting and modifying old code that you or someone else has
previously written.
This post is a running log of Python syntax that I constantly find myself referring back to, and that has proven a useful reference for my team and other colleagues.
Implementing in Python
Link to my GitHub repo:
Useful-Python-Syntax
Some of the syntax examples covered are:
Scheduling Python functions
Return vs. Print
Loops, % and .format syntax
The 4 inbuilt data structures of Python
if __name__ == '__main__'
Random seeds
Subsetting a DataFrame
Importing multiple files with glob
Working with large data 1: Chunking
Working with large data 2: Random Sampling
Scaling data
Using an API in Python
Lambda functions
Creating DataFrames in a loop
Plotting with Matplotlib