To provide a detailed step-by-step explanation of Kubernetes in the context of the files you’ve uploaded, I’ll use your project as an example. Here’s an overview and breakdown:


1. What is Kubernetes?

Kubernetes is an open-source container orchestration platform for managing containerized applications across a cluster of machines. It automates deployment, scaling, and operations of application containers.

Key Concepts:

  • Pods: Smallest deployable units, encapsulating one or more containers.
  • Services: Expose a set of Pods as a single, stable network endpoint.
  • Deployments: Declaratively manage replicated Pods, handling rollouts and updates.
  • Jobs: Run Pods until a task completes successfully (e.g., model training).
  • Volumes: Provide storage containers can mount; persistent variants outlive individual Pods.
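
All of these are ordinary API objects you can inspect with kubectl, for example:

kubectl get pods,deployments,services,jobs   # list these object types in the current namespace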

2. Overview of Your Project

Your project involves:

  • Flask API: predict.py serves predictions.
  • Streamlit App: app.py interacts with users to send requests to the API.
  • Model Training: train.py trains and saves a linear regression model.
  • Kubernetes Deployment: Managed using YAML files (deployment.yaml, service.yaml, train-job.yaml) and run_pipeline.sh.

3. Steps to Deploy with Kubernetes

Step 1: Containerize the Application

Kubernetes runs your application as container images. Your Dockerfile builds one that ensures:

  1. The environment is consistent.
  2. Dependencies for predict.py are installed.
  3. The application is runnable.

Example Dockerfile (assumed from context):

# Small Python base image for a lean container
FROM python:3.8-slim
WORKDIR /app
# Copy the project (predict.py, train.py, requirements.txt, ...) into the image
COPY . /app
RUN pip install -r requirements.txt
# Document the port the Flask API listens on
EXPOSE 5000
# Default command; the training Job overrides this with "python train.py"
CMD ["python", "predict.py"]

Step 2: Kubernetes Job for Training

Your run_pipeline.sh creates a Kubernetes Job to train the model.

Key Steps in Training Job:

  • Volume mounts give the container the dataset (dataset.csv) and a writable path for saving model.pkl.
  • The Job manifest is generated inline and applied with kubectl apply -f -, running train.py to completion.

Snippet from run_pipeline.sh:

kubectl apply -f - <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: train-job
spec:
  backoffLimit: 2                  # retry a failed training run at most twice
  template:
    spec:
      restartPolicy: Never         # required for Jobs (the default, Always, is invalid)
      containers:
      - name: train-job
        image: $DOCKER_IMAGE
        command: ["python", "train.py"]
        volumeMounts:
        - name: dataset-volume
          mountPath: /mnt/data     # where train.py reads dataset.csv and writes model.pkl
      volumes:
      - name: dataset-volume
        hostPath:
          path: /mnt/data          # node directory containing dataset.csv
EOF
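
The Job runs train.py to completion. For reference, a minimal version of such a script might look like this (the actual file isn't shown, so the target column name and paths are assumptions):

# train.py -- minimal sketch; assumes a numeric CSV with a "price" target column
import pickle

import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.read_csv("/mnt/data/dataset.csv")     # provided by the hostPath volume mount
X = df.drop(columns=["price"])                # assumed target column name
y = df["price"]

model = LinearRegression().fit(X, y)

with open("/mnt/data/model.pkl", "wb") as f:  # written back to the mounted path
    pickle.dump(model, f)

You can block until training finishes with kubectl wait --for=condition=complete job/train-job --timeout=300s.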

Step 3: API Deployment

After training completes, the Flask API (predict.py) is deployed. The Deployment manifest (deployment.yaml) defines:

  • Number of replicas.
  • Image to use (from Docker Hub).
  • Port configuration.

Deployment YAML Example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask-api-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: flask-api
  template:
    metadata:
      labels:
        app: flask-api
    spec:
      containers:
      - name: flask-api
        image: modeha/flask-api:latest
        ports:
        - containerPort: 5000
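
The container in this Deployment runs predict.py. A minimal sketch of such an API, assuming a JSON body with a "features" list and a model.pkl available inside the container:

# predict.py -- minimal sketch of the Flask API; the route name, payload shape,
# and model path are assumptions, since the actual script isn't shown
import pickle
from flask import Flask, request, jsonify

app = Flask(__name__)

with open("model.pkl", "rb") as f:   # model produced by the training Job
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]      # e.g. [size, bedrooms, age]
    prediction = model.predict([features])[0]
    return jsonify({"prediction": float(prediction)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)             # matches containerPort: 5000

Apply the manifest with kubectl apply -f deployment.yaml and watch the rollout with kubectl rollout status deployment/flask-api-deployment.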

Step 4: Exposing the API

A Kubernetes Service exposes the API internally or externally (e.g., via NodePort).

Service YAML Example:

apiVersion: v1
kind: Service
metadata:
  name: flask-api-service
spec:
  selector:
    app: flask-api
  ports:
  - protocol: TCP
    port: 80
    targetPort: 5000
  type: NodePort
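
After applying the Service, Kubernetes allocates a node port in the 30000-32767 range. You can look it up, or (if you're on Minikube, an assumption here) print a ready-to-use URL:

kubectl get service flask-api-service        # the PORT(S) column shows e.g. 80:3XXXX/TCP
minikube service flask-api-service --url     # Minikube only: prints http://<node-ip>:<node-port>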

Step 5: Using the Streamlit Interface

Your Streamlit app (app.py) sends requests to the API to predict house prices based on user inputs.
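
A minimal sketch of such an app (the input fields, API URL, and response shape are assumptions based on the description above):

# app.py -- minimal sketch of the Streamlit front end
import requests
import streamlit as st

API_URL = "http://localhost:30007/predict"   # placeholder; use your Service's node port

st.title("House Price Prediction")
size = st.number_input("Size (sq ft)", value=1200.0)
bedrooms = st.number_input("Bedrooms", value=3)
age = st.number_input("Age (years)", value=10)

if st.button("Predict"):
    resp = requests.post(API_URL, json={"features": [size, bedrooms, float(age)]})
    st.write(f"Predicted price: {resp.json()['prediction']:.2f}")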


4. Running the Pipeline

  1. Build and Push Docker Image:
    docker build -t modeha/my-app:latest .
    docker push modeha/my-app:latest
    
  2. Run the Pipeline Script (a sketch of the script appears after this list):
    ./run_pipeline.sh my-app

    This script:

    • Frees the required port by killing any process bound to it.
    • Trains the model (train.py) using a Kubernetes Job.
    • Deploys the API and exposes it via the Service.
  3. Access the API via Streamlit:
    • Launch app.py:
      streamlit run app.py
      
    • Input house features and get predictions.
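
As promised above, here is the rough shape run_pipeline.sh likely takes (a sketch under stated assumptions: the app name is the first argument, the API uses port 5000, and the manifests sit alongside the script):

#!/usr/bin/env bash
set -euo pipefail

APP_NAME="${1:?usage: ./run_pipeline.sh <app-name>}"
DOCKER_IMAGE="modeha/${APP_NAME}:latest"

# Free the port the API needs (assumption: 5000; -r is GNU xargs)
lsof -ti :5000 | xargs -r kill || true

# Re-run the training Job and wait for it to finish
kubectl delete job train-job --ignore-not-found
kubectl apply -f train-job.yaml          # or the inline heredoc shown in Step 2
kubectl wait --for=condition=complete job/train-job --timeout=300s

# Deploy the API and expose it
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
kubectl rollout status deployment/flask-api-deployment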

5. Next Steps

  • Scaling: Adjust replicas in your Deployment YAML to scale the API.
  • Monitoring: Use Kubernetes tools like kubectl logs, Prometheus, or Grafana.
  • CI/CD Integration: Automate deployments with Jenkins, GitHub Actions, or other CI/CD tools.
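
Both scaling and log inspection are one-liners, using names from the manifests above:

kubectl scale deployment flask-api-deployment --replicas=4   # scale without editing YAML
kubectl logs -l app=flask-api --tail=50                      # recent logs from all API Pods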