Devaj Mody

Backend & AI engineer building intelligent systems


2024 -

Pursuing my MS in Software Engineering at Rochester Institute of Technology.
Coursework covers training and fine-tuning LLMs and neural network architectures, alongside reading on interpretability, VLAs, and agentic AI.
Working on the Talos Robotics project with vision transformers and ROS/Gazebo simulation.

2022 - 2024

Led a 5-engineer team at PhyFarm shipping AI-powered farm automation to 500+ farms.
Built a custom RL agent in PyTorch for nutrient dosing that reaches target EC/pH 60% faster than PID controllers.
Developed an LLM fine-tuned on 100K agricultural documents for crop diagnostics.

2021 - 2022

Built data pipelines at Purplle processing 500M events/month from 7M+ MAUs.
Built a feature store with BigQuery and Bigtable for user profiles, reducing inference latency 85%.
Migrated monolithic .NET services to Go microservices, cutting cloud costs $15K/month.

2019 - 2021

Architected the AdTech platform at BookMyShow serving 100K ads/second at <5ms p99 latency to 5M+ MAUs.
Deployed an Aerospike cluster with cross-DC replication achieving 99.99% uptime.
Built a priority queue system with Kafka and Node.js for customer support, reducing response time 80%.

bio

I'm a backend and AI engineer building scalable and intelligent systems.
Currently pursuing my MS in Software Engineering at RIT, working on robotics and AI agents.
Previously led engineering at PhyFarm, built ML infrastructure at Purplle, and scaled AdTech at BookMyShow.

projects

Endo Automation is an AI-powered farm automation startup that won Sarah Ramsey Strong Fund recognition. Built a multi-agent system with specialized agents processing real-time sensor telemetry and a RAG pipeline indexing 50K+ agricultural documents.

LangGraph Pinecone Go Kubernetes

Talos Robotics is a vision-based tracking platform for robotic arm control, using vision transformers for real-time object detection and pose estimation. Built a ROS/Gazebo simulation with a React telemetry interface.

PyTorch ROS Gazebo React

Deep Agent System is a LangGraph-based agent framework with planning and long-term memory. Includes tool-use with function calling for web search, code execution, and document retrieval, evaluated across 500+ test scenarios.

LangGraph Claude API Supabase

LLaMA From Scratch is a from-scratch PyTorch implementation of LLaMA 2 with RMSNorm, rotary embeddings, and grouped-query attention. Achieves 3x memory efficiency with gradient checkpointing, plus KV caching and flash attention for inference.

PyTorch Transformers

LoRA/QLoRA Fine-tuning enables parameter-efficient fine-tuning of LLaMA for domain-specific tasks, reducing compute 90%. QLoRA with 4-bit quantization runs 7B models on consumer GPUs (24GB VRAM).

PyTorch LoRA Quantization

Vision-Language Pipeline integrates VLMs for multi-modal understanding. An OCR + LLM pipeline extracts structured data from 10K+ scanned forms at 95% accuracy, and fine-tuned CLIP embeddings improve retrieval 40%.

CLIP OCR VLMs

Agricultural RAG is a retrieval-augmented generation pipeline fine-tuned on 100K agricultural documents, serving 5K+ queries/month at 92% accuracy with multi-step agent workflows for crop diagnostics.

LLaMA Ollama LangChain

RL Nutrient Controller is a custom actor-critic reinforcement learning agent for nutrient dosing, achieving target EC/pH 60% faster than PID controllers, saving $2K per installation.

PyTorch RL Embedded

Algorithmic Trading Platform enables users to design, configure, and backtest HFT strategies. Backtesting engine processes 10M+ historical trades with sub-second simulation times across multiple exchanges.

Python Solidity Web3.py

DeFi Smart Contracts is a suite of Solidity contracts for automated trade execution and profit-sharing deployed on Ethereum and BSC with $500K+ TVL, integrated with Uniswap and PancakeSwap.

Solidity Ethereum BSC

Real-Time Market Data Pipeline aggregates prices from 20+ CEX/DEX sources, handling 50K events/second at <10ms latency for live trading decisions.

Python Streaming

AdTech Platform serves 100K ads/second at <5ms p99 latency to 5M+ MAUs, backed by an Aerospike cluster with cross-DC replication achieving 99.99% uptime. Increased CTR 25% and revenue $3M annually.

Go Aerospike Kafka

ML Feature Store reduces model inference latency 85% (50s to 6s) for real-time pricing decisions. Processes 500M events/month from 7M+ MAUs, enabling an 8% conversion rate improvement.

BigQuery Bigtable

IoT Event Infrastructure is an event-driven system handling 1K concurrent IoT connections at <200ms p99 latency, with a data ingestion pipeline processing 600K sensor readings daily.

Go MQTT InfluxDB

Hardware Abstraction Layer supports 15+ sensor types (UART/I2C/SPI) and industrial PLCs via Modbus RTU over RS485, reducing integration time 70% across deployments.

Modbus RS485 I2C/SPI

Real-Time Control Systems implements control loops with 10ms response time for irrigation and dosing at 99.9% uptime, using Kalman filtering for multi-sensor fusion to improve stability 3x.

Go Embedded Kalman

Self-Service DevOps Platform reduces provisioning time from 2 hours to 3 minutes for 20+ teams. Built on infrastructure as code with automated CI/CD pipelines.

Terraform GitHub Actions

writing

Blog posts and technical writing coming soon...