Python for AI/ML

Master Python programming for AI and Machine Learning. Learn NumPy, Pandas, Matplotlib, and essential Python concepts for building ML applications.

Why Python for AI/ML?

Python is the most popular language for AI and Machine Learning thanks to its simple syntax, extensive libraries, and strong community support. Its ecosystem includes powerful libraries and frameworks such as NumPy, Pandas, TensorFlow, and PyTorch.

Examples:

# Python advantages for AI/ML:
- Simple and readable syntax
- Rich ecosystem of ML libraries
- Strong community and resources
- Excellent for prototyping
- Cross-platform compatibility

Key reasons Python dominates AI/ML development

Python Basics for AI/ML

Understanding Python fundamentals is essential before diving into AI/ML. Let's cover the core concepts you'll use frequently.

Examples:

# Variables and data types
name = "AI Model"
accuracy = 0.95
is_trained = True
layers = [128, 64, 32]

# Print information
print(f"Model: {name}, Accuracy: {accuracy * 100}%")

Basic Python variables and string formatting

# Lists and list comprehension
numbers = [1, 2, 3, 4, 5]
squared = [x**2 for x in numbers]
print(squared)  # [1, 4, 9, 16, 25]

# Dictionary for storing model config
config = {
    'learning_rate': 0.001,
    'epochs': 100,
    'batch_size': 32
}

Lists, comprehensions, and dictionaries
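
As a quick illustration of how these basics combine in ML-style code, here is a minimal sketch (the predictions and labels lists are made up for the example) that computes an accuracy score with zip and a comprehension.

# Hypothetical predictions and ground-truth labels
predictions = [1, 0, 1, 1, 0]
labels = [1, 0, 0, 1, 0]

# Count matching pairs, then divide by the total
correct = sum(1 for p, y in zip(predictions, labels) if p == y)
accuracy = correct / len(labels)
print(f"Accuracy: {accuracy:.2f}")  # Accuracy: 0.80

Combining lists, loops, and f-strings in an ML-style calculation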

NumPy for Numerical Computing

NumPy is the foundation of numerical computing in Python. It provides efficient array operations essential for ML.

Examples:

import numpy as np

# Create arrays
arr = np.array([1, 2, 3, 4, 5])
matrix = np.array([[1, 2], [3, 4], [5, 6]])

# Array operations
print(arr * 2)  # [ 2  4  6  8 10]
print(arr.mean())  # 3.0
print(matrix.shape)  # (3, 2)

Creating and manipulating NumPy arrays

# Matrix operations
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Matrix multiplication (np.dot, or equivalently A @ B)
C = np.dot(A, B)
print(C)

# Element-wise operations
D = A + B
print(D)

Matrix operations crucial for neural networks

# Generate random data (useful for testing)
random_data = np.random.randn(100, 5)  # 100 samples, 5 features
random_labels = np.random.randint(0, 2, 100)  # Binary labels

print(f"Data shape: {random_data.shape}")
print(f"Labels shape: {random_labels.shape}")

Generate random data for testing ML models
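
Broadcasting is another NumPy feature you will lean on constantly: operations between arrays of compatible shapes are stretched to match. A minimal sketch, reusing the random_data array from above, standardizes each feature column in one line.

# Column-wise standardization via broadcasting:
# a (100, 5) array minus a (5,) mean, divided by a (5,) std
standardized = (random_data - random_data.mean(axis=0)) / random_data.std(axis=0)

print(standardized.mean(axis=0).round(2))  # approximately all zeros
print(standardized.std(axis=0).round(2))   # approximately all ones

Broadcasting for per-feature standardization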

Pandas for Data Manipulation

Pandas is essential for data manipulation and analysis. It provides DataFrames, which are perfect for handling structured data.

Examples:

import pandas as pd

# Create DataFrame
data = {
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
    'score': [85, 90, 95]
}
df = pd.DataFrame(data)
print(df)

Creating a Pandas DataFrame
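
New columns are easily derived from existing ones. A short sketch, continuing with the df defined above:

# Add a derived column and sort by an existing one
df['passed'] = df['score'] >= 90
df_sorted = df.sort_values('score', ascending=False)
print(df_sorted)

Adding a derived column and sorting a DataFrame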

# Read CSV file
df = pd.read_csv('data.csv')

# Basic exploration
print(df.head())  # First 5 rows
df.info()  # Data types and null counts (prints directly, no print() needed)
print(df.describe())  # Statistical summary

# Select columns
ages = df['age']
subset = df[['name', 'score']]

Loading and exploring data with Pandas

# Data filtering and manipulation
high_scorers = df[df['score'] > 90]

# Group by and aggregate
avg_by_category = df.groupby('category')['score'].mean()

# Handle missing values (choose one strategy)
df_filled = df.fillna(0)  # replace missing values with 0
df_clean = df.dropna()    # or drop rows containing missing values

Filtering, grouping, and handling missing data
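
Once the data is clean, you typically hand it to an ML library as NumPy arrays. A small sketch, assuming hypothetical feature columns and a target column in the DataFrame:

# Split a DataFrame into a feature matrix X and target vector y
# ('feature1', 'feature2', and 'score' are placeholder column names)
X = df[['feature1', 'feature2']].to_numpy()
y = df['score'].to_numpy()

print(X.shape, y.shape)

Converting DataFrame columns into arrays for model training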

Matplotlib for Visualization

Visualization is crucial for understanding data and model performance. Matplotlib is the go-to library for creating plots.

Examples:

import matplotlib.pyplot as plt
import numpy as np

# Line plot
x = np.linspace(0, 10, 100)
y = np.sin(x)

plt.figure(figsize=(10, 6))
plt.plot(x, y, label='sin(x)')
plt.xlabel('X axis')
plt.ylabel('Y axis')
plt.title('Sine Wave')
plt.legend()
plt.grid(True)
plt.show()

Creating a basic line plot

# Scatter plot for data visualization
plt.figure(figsize=(8, 6))
plt.scatter(df['feature1'], df['feature2'], 
            c=df['label'], cmap='viridis', alpha=0.6)
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Feature Relationship')
plt.colorbar(label='Class')
plt.show()

Scatter plot for visualizing feature relationships
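
In practice you will also plot training curves to monitor model performance. A minimal sketch with made-up loss values (in a real project these would come from your training loop):

# Plot training and validation loss per epoch (illustrative values)
epochs = range(1, 11)
train_loss = [0.9, 0.7, 0.55, 0.45, 0.38, 0.33, 0.30, 0.28, 0.27, 0.26]
val_loss = [0.95, 0.75, 0.60, 0.52, 0.48, 0.46, 0.45, 0.45, 0.46, 0.47]

plt.figure(figsize=(8, 5))
plt.plot(epochs, train_loss, label='Training loss')
plt.plot(epochs, val_loss, label='Validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Curves')
plt.legend()
plt.grid(True)
plt.show()

Plotting training curves to monitor model performance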

Functions and Classes

Writing reusable code with functions and classes is essential for building scalable ML projects.

Examples:

# Function for data preprocessing
def preprocess_data(data, scale=True):
    """Preprocess input data"""
    # Remove missing values
    data = data.dropna()
    
    if scale:
        from sklearn.preprocessing import StandardScaler
        scaler = StandardScaler()
        data = scaler.fit_transform(data)
    
    return data

# Use the function
clean_data = preprocess_data(raw_data, scale=True)

Creating reusable preprocessing function

# Class for ML model wrapper
class MLModel:
    def __init__(self, model_type='linear'):
        self.model_type = model_type
        self.model = None
        self.is_trained = False
    
    def train(self, X, y):
        """Train the model"""
        from sklearn.linear_model import LinearRegression
        self.model = LinearRegression()
        self.model.fit(X, y)
        self.is_trained = True
    
    def predict(self, X):
        """Make predictions"""
        if not self.is_trained:
            raise ValueError("Model not trained yet!")
        return self.model.predict(X)

# Usage
ml_model = MLModel()
ml_model.train(X_train, y_train)
predictions = ml_model.predict(X_test)

Creating a class to encapsulate ML model logic
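
The X_train and X_test arrays above are assumed to exist already; here is a minimal sketch of producing them with scikit-learn's train_test_split on synthetic data (the data itself is purely illustrative):

import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic regression data for illustration: 200 samples, 3 features
X = np.random.randn(200, 3)
y = X @ np.array([1.5, -2.0, 0.5]) + np.random.randn(200) * 0.1

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

ml_model = MLModel()
ml_model.train(X_train, y_train)
print(ml_model.predict(X_test)[:5])

Generating a train/test split for the MLModel example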

File I/O and Data Loading

Loading and saving data is a fundamental skill. Python provides multiple ways to work with different file formats.

Examples:

import pandas as pd

# Read CSV
df = pd.read_csv('data.csv')

# Read Excel (requires an engine such as openpyxl)
df = pd.read_excel('data.xlsx')

# Read JSON
df = pd.read_json('data.json')

# Save DataFrame
df.to_csv('output.csv', index=False)

Reading and writing various data formats

# Save Python objects with pickle
import pickle

# Save model
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)

# Load model
with open('model.pkl', 'rb') as f:
    loaded_model = pickle.load(f)

Saving and loading Python objects
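
For plain NumPy arrays (feature matrices, embeddings, and so on), np.save and np.load are a convenient alternative to pickle. A short sketch:

import numpy as np

# Save an array to a .npy file and load it back
features = np.random.randn(100, 5)
np.save('features.npy', features)

loaded = np.load('features.npy')
print(loaded.shape)  # (100, 5)

Saving and loading NumPy arrays with np.save and np.load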

Virtual Environments and Package Management

Managing dependencies and creating isolated environments is crucial for reproducible ML projects.

Examples:

# Create virtual environment
python -m venv ml_env

# Activate (Windows)
ml_env\Scripts\activate

# Activate (Mac/Linux)
source ml_env/bin/activate

Creating and activating virtual environment

# Install packages
pip install numpy pandas scikit-learn matplotlib

# Save dependencies
pip freeze > requirements.txt

# Install from requirements
pip install -r requirements.txt

Managing project dependencies

Quick Reference

Essential Libraries

  • numpy - Numerical computing
  • pandas - Data manipulation
  • matplotlib - Data visualization
  • scikit-learn - ML algorithms

Best Practices

  • ✓ Use virtual environments
  • ✓ Write modular, reusable code
  • ✓ Document your functions
  • ✓ Follow PEP 8 style guide