price-aggregator-microservices

🏛️ System Architecture Documentation

Price Aggregator Microservices Architecture

Version: 1.0.0
Last Updated: 2026-02-17
Status: Production-Ready


📐 Architecture Overview

This document describes the architecture, design decisions, and data flow for the Price Aggregator Microservices system. The system follows a microservices architecture pattern with clear service boundaries, internal communication, and a layered security model.


🎯 Design Principles

  1. Separation of Concerns: Each service has a single, well-defined responsibility
  2. Security by Default: Services are not exposed outside the internal network unless necessary
  3. Fault Tolerance: Services can fail independently without bringing down the system
  4. Scalability: Each service can be scaled independently based on load
  5. Observability: Health checks and logging for all services
  6. Infrastructure as Code: Everything defined in version-controlled Docker configurations

🏗️ System Components

High-Level Architecture Diagram

┌──────────────────────────────────────────────────────────────────┐
│                         External Layer                           │
│  ┌────────────┐                                                  │
│  │  Internet  │                                                  │
│  └─────┬──────┘                                                  │
│        │                                                         │
└────────┼─────────────────────────────────────────────────────────┘
         │
    ┌────▼─────┐
    │  User    │
    └────┬─────┘
         │
         │ HTTP
┌────────▼─────────────────────────────────────────────────────────┐
│                      Presentation Layer                          │
│  ┌────────────────────────────────────────────────────┐          │
│  │          Frontend (React + Nginx)                  │          │
│  │          Port: 3000 (Exposed)                      │          │
│  │  - User Interface                                  │          │
│  │  - Product Search                                  │          │
│  │  - Results Display                                 │          │
│  └──────────────────────┬─────────────────────────────┘          │
└─────────────────────────┼────────────────────────────────────────┘
                          │
                          │ HTTP REST
┌─────────────────────────▼────────────────────────────────────────┐
│                      Application Layer                           │
│  ┌────────────────────────────────────────────────────┐          │
│  │       Node.js API Gateway (Express)                │          │
│  │       Port: 5000 (Exposed)                         │          │
│  │  - Authentication (JWT)                            │          │
│  │  - Rate Limiting                                   │          │
│  │  - Request Routing                                 │          │
│  │  - Input Validation                                │          │
│  │  - Service Orchestration                           │          │
│  └──────┬────────────────────────────┬────────────────┘          │
│         │                            │                           │
└─────────┼────────────────────────────┼───────────────────────────┘
          │                            │
          │ Internal HTTP              │ MongoDB Protocol
          │                            │
┌─────────▼────────────────────────────▼───────────────────────────┐
│                      Service Layer                               │
│  ┌────────────────────────────┐    ┌───────────────────────┐     │
│  │  Python Collector (FastAPI)│    │    MongoDB            │     │
│  │  Port: 8000 (Internal)     │    │    Port: 27017        │     │
│  │  - Web Scraping            │    │    (Internal)         │     │
│  │  - Data Collection         │    │    - User Data        │     │
│  │  - Price Aggregation       │    │    - Product Cache    │     │
│  │  - Data Normalization      │    │    - Auth Storage     │     │
│  └────────────┬───────────────┘    └───────────────────────┘     │
│               │                                                  │
│               │ Redis Protocol                                   │
│               │                                                  │
│  ┌────────────▼───────────────┐                                  │
│  │       Redis Cache          │                                  │
│  │       Port: 6379 (Internal)│                                  │
│  │  - Session Storage         │                                  │
│  │  - Query Cache             │                                  │
│  │  - Rate Limit Store        │                                  │
│  └────────────────────────────┘                                  │
└──────────────────────────────────────────────────────────────────┘

🔄 Data Flow

1. Product Search Flow

┌────┐     ┌──────────┐     ┌──────────┐     ┌──────────────┐
│User├────▶│ Frontend ├────▶│ Gateway  ├────▶│   Python     │
└────┘     └──────────┘     └──────────┘     │  Collector   │
                                             └──────┬───────┘
                                                    │
                                                    ▼
                                             ┌──────────────┐
                                             │    Redis     │
                                             │    Cache     │
                                             └──────────────┘

1. User enters search query in Frontend
2. Frontend sends HTTP GET to Gateway (/search?query=laptop)
3. Gateway validates request and checks authentication
4. Gateway forwards request to Python Collector (internal)
5. Python Collector checks Redis cache for existing results
6. On a cache miss, the Collector scrapes product data from the external sources
7. Python Collector normalizes and returns data
8. Gateway adds metadata and returns to Frontend
9. Frontend displays results to user
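
Steps 5 to 7 describe a cache-aside lookup. The following is a minimal sketch of that pattern, written in JavaScript only to match the other snippets in this document (the Collector itself is a Python service). The `createCache` Map that stands in for Redis, the injected `scrapeSources` function, and the `normalize` helper are all hypothetical names, and the key is the plain normalized query rather than a hash, for brevity.

```javascript
// Cache-aside lookup mirroring steps 5-7 of the search flow.
// Assumption: an in-memory Map stands in for Redis; entries carry an expiry.
const SEARCH_TTL_MS = 60 * 60 * 1000; // 1 hour, per the TTL strategy in this document

function createCache(now = () => Date.now()) {
  const store = new Map();
  return {
    get(key) {
      const entry = store.get(key);
      if (!entry) return undefined;
      if (entry.expiresAt <= now()) {
        store.delete(key); // expired entries behave like a miss
        return undefined;
      }
      return entry.value;
    },
    set(key, value, ttlMs) {
      store.set(key, { value, expiresAt: now() + ttlMs });
    },
  };
}

// Hypothetical normalizer: trims names and coerces prices to numbers.
function normalize(rawItems) {
  return rawItems.map((item) => ({
    name: String(item.name).trim(),
    price: Number(item.price),
    source: item.source,
  }));
}

// scrapeSources is injected so the sketch runs without network access.
async function search(query, cache, scrapeSources) {
  const key = `search:${query.trim().toLowerCase()}`;
  const cached = cache.get(key);
  if (cached !== undefined) return { results: cached, cacheHit: true };

  const results = normalize(await scrapeSources(query));
  cache.set(key, results, SEARCH_TTL_MS);
  return { results, cacheHit: false };
}
```

Injecting the scraper keeps the cache logic testable in isolation; a second call with the same normalized query should be served from the cache without invoking the scraper again.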

2. Authentication Flow

┌────┐     ┌──────────┐     ┌──────────┐           ┌──────────┐
│User├────▶│ Frontend ├────▶│ Gateway  ├────▶      │ MongoDB  │
└────┘     └──────────┘     └──────────┘           └──────────┘
                                │
                                ▼
                            ┌─────────┐
                            │  Redis  │
                            │ Session │
                            └─────────┘

Login Flow:
1. User submits credentials via Frontend
2. Frontend POSTs to /auth/login
3. Gateway validates credentials against MongoDB
4. Gateway generates JWT token
5. Gateway stores session in Redis
6. Gateway returns token to Frontend
7. Frontend stores token for subsequent requests
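
Steps 4 to 6 of the login flow can be sketched as follows. This is an illustration under stated assumptions, not the Gateway's actual code: the token is signed by hand with HS256 via Node's built-in `crypto` module to stay dependency-free (a maintained JWT library would normally do this), credential validation is assumed to have already succeeded, and `completeLogin`, `signJwt`, and the Map used as a session store are hypothetical names.

```javascript
const crypto = require('node:crypto');

const SESSION_TTL_SECONDS = 24 * 60 * 60; // 24 hours, per the TTL strategy

const b64url = (value) => Buffer.from(value).toString('base64url');

// Sign an HS256 JWT: base64url(header).base64url(payload).signature
function signJwt(payload, secret) {
  const header = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const body = b64url(JSON.stringify(payload));
  const signature = crypto
    .createHmac('sha256', secret)
    .update(`${header}.${body}`)
    .digest('base64url');
  return `${header}.${body}.${signature}`;
}

// Steps 4-6: generate the token, store the session, return the token.
// sessionStore is a Map standing in for Redis.
function completeLogin(
  userId,
  secret,
  sessionStore,
  nowSeconds = Math.floor(Date.now() / 1000),
) {
  const exp = nowSeconds + SESSION_TTL_SECONDS;
  const token = signJwt({ sub: userId, iat: nowSeconds, exp }, secret);
  sessionStore.set(`session:${userId}`, { token, expiresAt: exp });
  return { token };
}
```

Passing `nowSeconds` explicitly makes the expiry deterministic in tests; in normal use the default reads the system clock.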

3. Data Collection Flow (Internal)

┌──────────────┐     ┌─────────────┐         ┌────────────┐
│   Gateway    ├────▶│   Python    ├────▶    │  External  │
│              │     │  Collector  │         │  E-commerce│
└──────────────┘     └──────┬──────┘         │   Sites    │
                            │                └────────────┘
                            ▼
                     ┌──────────────┐
                     │    Redis     │
                     │    Cache     │
                     └──────────────┘

1. Gateway calls Python Collector endpoint
2. Collector initiates parallel scraping tasks
3. Collector fetches data from multiple sources
4. Data is normalized and deduplicated
5. Results cached in Redis with TTL
6. Aggregated data returned to Gateway
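
The parallel fetch, normalization, and deduplication in steps 2 to 4 might look like the sketch below. It is written in JavaScript for consistency with the document's other snippets, although the Collector is a Python service. The `collect` function, the scraper functions passed to it, and the deduplication rule (same source and case-insensitive name, keeping the lowest price) are assumptions made for illustration.

```javascript
// Hypothetical scrapers: each takes a query and returns a Promise of raw records.
async function collect(query, scrapers) {
  // Steps 2-3: run all scrapers in parallel; one failing source does not abort the rest.
  const settled = await Promise.allSettled(scrapers.map((scrape) => scrape(query)));

  const rawItems = settled
    .filter((result) => result.status === 'fulfilled')
    .flatMap((result) => result.value);

  // Step 4: normalize, then deduplicate on (source, lowercased name), keeping the lowest price.
  const byKey = new Map();
  for (const raw of rawItems) {
    const item = {
      name: String(raw.name).trim(),
      price: Number(raw.price),
      source: raw.source,
    };
    if (!Number.isFinite(item.price)) continue; // drop records without a usable price
    const key = `${item.source}|${item.name.toLowerCase()}`;
    const existing = byKey.get(key);
    if (!existing || item.price < existing.price) byKey.set(key, item);
  }

  const failedSources = settled.filter((result) => result.status === 'rejected').length;
  return { items: [...byKey.values()], failedSources };
}
```

`Promise.allSettled` is used instead of `Promise.all` so that a timeout on one site degrades the result rather than failing the whole request, which matches the fault-tolerance principle stated earlier.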

🌐 Network Architecture

Network Topology

┌──────────────────────────────────────────────────────┐
│           Docker Bridge Network (internal-network)   │
│                 Subnet: 172.28.0.0/16                │
│                                                      │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐            │
│  │ Frontend │  │ Gateway  │  │  Python  │            │
│  │          │  │          │  │Collector │            │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘            │
│       │             │             │                  │
│       └─────────────┼─────────────┘                  │
│                     │                                │
│            ┌────────┼────────┐                       │
│            │                 │                       │
│       ┌────▼────┐       ┌────▼────┐                  │
│       │ MongoDB │       │  Redis  │                  │
│       └─────────┘       └─────────┘                  │
│                                                      │
└──────────────────────────────────────────────────────┘
         │              │
    Port 3000       Port 5000
         │              │
    Exposed to Host Network

Port Exposure Strategy

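| Service          | Internal Port | Exposed Port | Access Level  |
|------------------|---------------|--------------|---------------|
| Frontend         | 3000          | 3000         | Public        |
| Node Gateway     | 5000          | 5000         | Public        |
| Python Collector | 8000          | -            | Internal Only |
| MongoDB          | 27017         | -            | Internal Only |
| Redis            | 6379          | -            | Internal Only |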

Security Rationale:


🔐 Security Architecture

Defense in Depth Layers

┌─────────────────────────────────────────────────────┐
│ Layer 1: Network Isolation                          │
│ - Internal Docker network                           │
│ - No external exposure for backend services         │
└─────────────────────────────────────────────────────┘
                    ▼
┌─────────────────────────────────────────────────────┐
│ Layer 2: Container Security                         │
│ - Non-root users in all containers                  │
│ - Minimal base images (Alpine, Slim)                │
│ - Read-only file systems where possible             │
└─────────────────────────────────────────────────────┘
                    ▼
┌─────────────────────────────────────────────────────┐
│ Layer 3: Application Security                       │
│ - JWT authentication                                │
│ - BCrypt password hashing                           │
│ - Rate limiting                                     │
│ - Input validation                                  │
│ - Helmet.js security headers                        │
└─────────────────────────────────────────────────────┘
                    ▼
┌─────────────────────────────────────────────────────┐
│ Layer 4: Data Security                              │
│ - Environment-based secrets                         │
│ - Encrypted connections                             │
│ - Database authentication                           │
│ - Password-protected Redis                          │
└─────────────────────────────────────────────────────┘

Authentication & Authorization

Request Flow with JWT:

1. Client Request
   ├─▶ Authorization: Bearer <JWT_TOKEN>
   │
2. Gateway Middleware
   ├─▶ Extract token from header
   ├─▶ Verify signature with JWT_SECRET
   ├─▶ Check expiration
   ├─▶ Decode payload
   │
3. Decision
   ├─▶ Valid: Attach user context to request → Continue
   └─▶ Invalid: Return 401 Unauthorized → Reject
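
The middleware steps above can be sketched as a single function. This is a simplified illustration, not the Gateway's actual middleware: it assumes HS256 tokens, verifies them with Node's built-in `crypto` module rather than a JWT library, and returns a plain result object instead of calling Express's `next()`. The `authenticate` name and the shape of the returned object are hypothetical.

```javascript
const crypto = require('node:crypto');

// Returns { status: 200, user } for a valid token, { status: 401, error } otherwise.
function authenticate(headers, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
  // Extract the token from the Authorization header.
  const match = /^Bearer (.+)$/.exec(headers.authorization || '');
  if (!match) return { status: 401, error: 'missing bearer token' };

  const parts = match[1].split('.');
  if (parts.length !== 3) return { status: 401, error: 'malformed token' };
  const [header, body, signature] = parts;

  // Verify the signature with a constant-time comparison.
  const expected = crypto.createHmac('sha256', secret).update(`${header}.${body}`).digest();
  const actual = Buffer.from(signature, 'base64url');
  if (actual.length !== expected.length || !crypto.timingSafeEqual(actual, expected)) {
    return { status: 401, error: 'bad signature' };
  }

  // Decode the payload and check expiration.
  let payload;
  try {
    payload = JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
  } catch {
    return { status: 401, error: 'malformed payload' };
  }
  if (typeof payload.exp !== 'number' || payload.exp <= nowSeconds) {
    return { status: 401, error: 'token expired' };
  }

  // Valid: hand back the user context that would be attached to the request.
  return { status: 200, user: { id: payload.sub } };
}
```

The signature is checked before the payload is parsed, so untrusted JSON is never decoded unless the token was signed with the expected secret.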

Service-to-Service Communication

All internal communication uses container name DNS resolution:

// Gateway → Python Collector
const PYTHON_URL = process.env.PYTHON_SERVICE_URL;
// "http://python-collector:8000"

// Gateway → MongoDB
const MONGO_URI = process.env.MONGO_URI;
// "mongodb://admin:pass@mongodb:27017/db"

// Gateway → Redis
const REDIS_URL = process.env.REDIS_URL;
// "redis://:password@redis:6379/0"
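
Because every connection string comes from the environment, a missing or malformed variable is easiest to diagnose at startup. The sketch below shows one possible fail-fast loader for the three variables named above; `loadServiceConfig` and the shape of its return value are hypothetical and not taken from the Gateway's code.

```javascript
// Fail-fast loader for the three connection settings used by the Gateway.
const REQUIRED = ['PYTHON_SERVICE_URL', 'MONGO_URI', 'REDIS_URL'];

function loadServiceConfig(env = process.env) {
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }

  // new URL() throws on malformed values, so typos surface at startup.
  // The hostname is the container name that Docker's DNS will resolve.
  const hosts = Object.fromEntries(
    REQUIRED.map((name) => [name, new URL(env[name]).hostname]),
  );

  return {
    pythonUrl: env.PYTHON_SERVICE_URL,
    mongoUri: env.MONGO_URI,
    redisUrl: env.REDIS_URL,
    hosts,
  };
}
```

With the example values shown above, the resolved hosts would be the container names `python-collector`, `mongodb`, and `redis`.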

Benefits:


📊 Service Communication Patterns

Synchronous Communication (REST)

Frontend ━━━━━━HTTP━━━━▶ Gateway ━━━━━━HTTP━━━━▶ Python
                          │
                          ├━━━━━MongoDB━━━━▶ Database
                          │
                          └━━━━━Redis━━━━━▶ Cache

Protocol: HTTP/HTTPS REST
Format: JSON
Pattern: Request-Response

Example:

GET /search?query=laptop HTTP/1.1
Host: node-gateway:5000
Authorization: Bearer eyJhbG...

Asynchronous Communication (Future)

For scalability, consider adding:


💾 Data Architecture

Data Storage Strategy

┌────────────────────────────────────────────────────┐
│              MongoDB (Primary Database)            │
├────────────────────────────────────────────────────┤
│ Collections:                                       │
│  - users: User accounts and profiles               │
│  - products: Cached product data (optional)        │
│  - searches: Search history (analytics)            │
│  - sessions: Active user sessions                  │
└────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────┐
│                Redis (Cache Layer)                 │
├────────────────────────────────────────────────────┤
│ Key Patterns:                                      │
│  - search:<query_hash>: Cached search results      │
│  - session:<user_id>: User session data            │
│  - rate_limit:<ip>: API rate limiting              │
│  - product:<id>: Individual product cache          │
│                                                    │
│ TTL Strategy:                                      │
│  - Search results: 1 hour                          │
│  - Sessions: 24 hours                              │
│  - Rate limits: 15 minutes                         │
└────────────────────────────────────────────────────┘
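
The key patterns and TTLs above could be centralized in small helpers, as in the following sketch. Several details are assumptions: the document does not say how `<query_hash>` is computed, so SHA-256 over the trimmed, lowercased query is used here; the request limit is a caller-supplied parameter; and a Map stands in for Redis in the rate limiter.

```javascript
const crypto = require('node:crypto');

// TTLs in seconds, taken from the TTL strategy above.
const TTL = { search: 60 * 60, session: 24 * 60 * 60, rateLimit: 15 * 60 };

// Key builders for the four documented patterns.
const keys = {
  // Assumption: the query hash is SHA-256 of the trimmed, lowercased query.
  search: (query) =>
    `search:${crypto.createHash('sha256').update(query.trim().toLowerCase()).digest('hex')}`,
  session: (userId) => `session:${userId}`,
  rateLimit: (ip) => `rate_limit:${ip}`,
  product: (id) => `product:${id}`,
};

// Fixed-window rate limiter over a Map that stands in for Redis.
// With Redis, the same idea is commonly an INCR plus an EXPIRE on the first hit.
function allowRequest(store, ip, limit, nowSeconds) {
  const key = keys.rateLimit(ip);
  const entry = store.get(key);
  if (!entry || entry.expiresAt <= nowSeconds) {
    // First request in a new 15-minute window.
    store.set(key, { count: 1, expiresAt: nowSeconds + TTL.rateLimit });
    return true;
  }
  entry.count += 1;
  return entry.count <= limit;
}
```

Hashing the normalized query keeps keys a fixed length and makes `"Laptop"` and `" laptop "` share one cache entry.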

Data Persistence

Docker Volumes:

mongodb_data/
  └── Persistent MongoDB data files

mongodb_config/
  └── MongoDB configuration

redis_data/
  └── Redis RDB/AOF persistence

Backup Strategy:


🔄 Deployment Architecture

Container Orchestration

Docker Compose Dependency Graph:

            frontend
                │
                │ depends_on
                ▼
          node-gateway
                │
      ┌─────────┼─────────┐
      │         │         │
 depends_on depends_on depends_on
      │         │         │
      ▼         ▼         ▼
   python-   mongodb    redis
  collector
      │
 depends_on
      │
      ▼
    redis

Health Check Strategy

All services implement health checks:

healthcheck:
  test: [health check command]
  interval: 30s # Check every 30 seconds
  timeout: 3s # Fail if no response in 3s
  retries: 3 # Try 3 times before marking unhealthy
  start_period: 10s # Grace period after container start
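
On the service side, a health check command typically probes a lightweight endpoint. The sketch below shows one possible handler for the Gateway using Node's built-in `http` module; the `/health` path, the response body, and the function names are assumptions, since the document does not specify them.

```javascript
const http = require('node:http');

// Hypothetical handler: answers GET /health with 200 and a small JSON body.
function healthHandler(req, res) {
  if (req.method === 'GET' && req.url === '/health') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ status: 'ok', uptimeSeconds: Math.round(process.uptime()) }));
    return;
  }
  res.writeHead(404, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ error: 'not found' }));
}

// Builds the server without listening, so the caller decides the port,
// e.g. createHealthServer().listen(5000).
function createHealthServer() {
  return http.createServer(healthHandler);
}
```

Keeping the handler free of database or cache calls means it responds well within the 3-second timeout; deeper dependency checks, if wanted, would be a separate readiness probe.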

Benefits:


📈 Scalability Considerations

Horizontal Scaling

Each service can be scaled independently:

# Scale Python collectors for heavy scraping
docker compose up -d --scale python-collector=3

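# Scale gateway for high traffic. Replicas cannot share a fixed host port
# mapping (5000:5000), so publish a port range or route through a load balancer.
docker compose up -d --scale node-gateway=2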

Load Balancing (Future)

                ┌──────────────┐
                │ Load Balancer│
                │  (Nginx)     │
                └──────┬───────┘
                       │
        ┌──────────────┼──────────────┐
        │              │              │
   ┌────▼────┐    ┌────▼────┐    ┌────▼────┐
   │Gateway 1│    │Gateway 2│    │Gateway 3│
   └─────────┘    └─────────┘    └─────────┘

Database Scaling

MongoDB Replica Set (Future):

┌─────────┐       ┌─────────┐       ┌─────────┐
│ Primary │◀─────▶│Secondary│◀─────▶│Secondary│
└─────────┘       └─────────┘       └─────────┘
     │                │                  │
     └────────────────┼──────────────────┘
                      │
                   Clients

🛠️ Technology Decisions

Why Microservices?

| Benefit                | Implementation                       |
|------------------------|--------------------------------------|
| Independent Deployment | Each service has its own container   |
| Technology Diversity   | Node.js for API, Python for scraping |
| Fault Isolation        | One service failure doesn't cascade  |
| Team Autonomy          | Teams can own individual services    |
| Scalability            | Scale services independently         |

Why Docker?

Why This Tech Stack?

| Component | Reason                                                 |
|-----------|--------------------------------------------------------|
| React     | Modern UI, component reusability, large ecosystem      |
| Express   | Mature, middleware-friendly, Node.js ecosystem         |
| FastAPI   | High performance, async support, auto-generated docs   |
| MongoDB   | Flexible schema, JSON-native, good for product data    |
| Redis     | Fast in-memory cache, pub/sub support, simple APIs     |

🔮 Future Enhancements

Phase 2: Production Hardening

Phase 3: Feature Expansion

Phase 4: Scale Optimization


📚 References


📞 Architecture Review

For questions about architecture decisions:


Document Maintained By: DevOps Team
Review Cycle: Quarterly
Last Architecture Review: 2026-02-17