Devaj Mody

2024 - Present

Pursuing my MS in Software Engineering at Rochester Institute of Technology.
Coursework focuses on training and fine-tuning LLMs and exploring neural network architectures, alongside reading on interpretability, VLAs, and agentic AI.
Working on the Talos Robotics project with vision transformers and ROS/Gazebo simulation.

2022 - 2024

Led a 5-engineer team at PhyFarm shipping AI-powered farm automation to 500+ farms.
Built a custom RL agent in PyTorch for nutrient dosing, 60% faster than PID controllers.
Developed an LLM fine-tuned on 100K agricultural documents for crop diagnostics.

2021 - 2022

Built data pipelines at Purplle processing 500M events/month from 7M+ MAUs.
Built a feature store with BigQuery and Bigtable for user profiles, reducing inference latency 85%.
Migrated monolithic .NET services to Go microservices, cutting cloud costs $15K/month.

2019 - 2021

Architected the AdTech platform at BookMyShow serving 100K ads/second at <5ms p99 latency to 5M+ MAUs.
Deployed an Aerospike cluster with cross-DC replication, achieving 99.99% uptime.
Built a priority queue system with Kafka and Node.js for customer support, reducing response time 80%.

projects

Endo Automation is an AI-powered farm automation startup that won Sarah Ramsey Strong Fund recognition. Built a multi-agent system with specialized agents processing real-time sensor telemetry and a RAG pipeline indexing 50K+ agricultural documents.

LangGraph Pinecone Go Kubernetes

Talos Robotics is a vision-based tracking platform for robotic arm control, using vision transformers for real-time object detection and pose estimation. Built a ROS/Gazebo simulation with a React telemetry interface.

PyTorch ROS Gazebo React

Deep Agent System is a LangGraph-based agent framework with planning and long-term memory. Includes tool-use with function calling for web search, code execution, and document retrieval, evaluated across 500+ test scenarios.

LangGraph Claude API Supabase

LLaMA From Scratch is a from-scratch PyTorch implementation of LLaMA 2 with RMSNorm, rotary embeddings, and grouped-query attention. Achieves 3x memory efficiency with gradient checkpointing, plus KV caching and flash attention for inference.

PyTorch Transformers

LoRA/QLoRA Fine-tuning enables parameter-efficient fine-tuning of LLaMA for domain-specific tasks, reducing compute 90%. QLoRA with 4-bit quantization runs 7B models on consumer GPUs (24GB VRAM).

PyTorch LoRA Quantization

Vision-Language Pipeline integrates VLMs for multi-modal understanding. OCR + LLM pipeline extracts structured data from 10K+ scanned forms at 95% accuracy. Fine-tuned CLIP embeddings improve retrieval 40%.

CLIP OCR VLMs

Agricultural RAG is a retrieval-augmented generation pipeline fine-tuned on 100K agricultural documents, serving 5K+ queries/month at 92% accuracy with multi-step agent workflows for crop diagnostics.

LLaMA Ollama LangChain

RL Nutrient Controller is a custom actor-critic reinforcement learning agent for nutrient dosing, achieving target EC/pH 60% faster than PID controllers, saving $2K per installation.

PyTorch RL Embedded

Algorithmic Trading Platform enables users to design, configure, and backtest HFT strategies. Backtesting engine processes 10M+ historical trades with sub-second simulation times across multiple exchanges.

Python Solidity Web3.py

DeFi Smart Contracts is a suite of Solidity contracts for automated trade execution and profit-sharing deployed on Ethereum and BSC with $500K+ TVL, integrated with Uniswap and PancakeSwap.

Solidity Ethereum BSC

Real-Time Market Data Pipeline aggregates prices from 20+ CEX/DEX sources, handling 50K events/second at <10ms latency for live trading decisions.

Python Streaming

AdTech Platform serves 100K ads/second at <5ms p99 latency to 5M+ MAUs, with an Aerospike cluster and cross-DC replication achieving 99.99% uptime. Increased CTR by 25% and revenue by $3M annually.

Go Aerospike Kafka

ML Feature Store reduces model inference latency 85% (50s to 6s) for real-time pricing decisions. Processes 500M events/month from 7M+ MAUs, enabling an 8% conversion rate improvement.

BigQuery Bigtable

IoT Event Infrastructure is an event-driven system handling 1K concurrent IoT connections at <200ms p99 latency, with a data ingestion pipeline processing 600K sensor readings daily.

Go MQTT InfluxDB

Hardware Abstraction Layer supports 15+ sensor types (UART/I2C/SPI) and industrial PLCs via Modbus RTU over RS485, reducing integration time 70% across deployments.

Modbus RS485 I2C/SPI

Real-Time Control Systems implements control loops with 10ms response time for irrigation and dosing at 99.9% uptime, using Kalman filtering for multi-sensor fusion to improve stability 3x.

Go Embedded Kalman

Self-Service DevOps Platform reduces provisioning time from 2 hours to 3 minutes for 20+ teams, using infrastructure as code with automated CI/CD pipelines.

Terraform GitHub Actions
