cv
General Information
| Full Name | Ronak Haresh Chhatbar |
| Location | Buffalo, NY |
| Email | [email protected] |
| Website | alphapibeta.com |
| GitHub | alphapibeta |
| Languages | English, Hindi, Gujarati, Telugu |
Professional Summary
- Production AI systems engineer working across the full inference stack — GPU optimization, real-time computer vision, and agentic LLM orchestration — from Jetson edge hardware to enterprise cloud. Five years building and shipping production CV and agentic AI systems. Currently architecting core modules of Centific's Verity enterprise vision platform.
Experience
-
Dec 2024 - Present AI Engineer — Verity Multimodal Agentic Platform
Centific
- Owned the inference-to-orchestration stack on Verity, Centific's NVIDIA-accelerated enterprise vision platform, cutting 24/7 facility monitoring costs 33% by replacing continuous human surveillance with autonomous AI-driven incident detection.
- Engineered a real-time CV pipeline processing 50+ concurrent camera streams (GPU-accelerated ingestion, frame-level object detection, chained LLM inference), delivering sub-second end-to-end detection latency.
- Reduced incident response time from 15 to 5 minutes by designing a LangGraph stateful agent graph with conditional routing across specialized sub-agents invoking live sensor queries, maintenance APIs, and compliance report generators.
- Reduced inference infrastructure costs 20% by engineering Verity's model serving layer with TensorRT-LLM — concurrent multi-model execution with GPU memory partitioning across co-deployed vision and language models.
-
May 2023 - Dec 2024 Graduate Research Assistant
Spatial AI & Robotics Lab, University at Buffalo
- Increased visual odometry inference efficiency 33% by developing C++ plugins integrating a custom visual navigation optimizer with a TensorRT backend for real-time pose estimation on resource-constrained platforms.
- Led backend development of robotranking.org, a robotics benchmarking platform adopted by the international robotics research community.
-
Sep 2020 - Aug 2022 Computer Vision Engineer
Tensorgo Technologies, Hyderabad
- Improved model throughput 25% and inference performance 40% via TensorRT and DeepStream mixed-precision tuning on Jetson NX and Nano, sustaining 30–40 FPS under production memory constraints.
- Achieved an 8% accuracy improvement in contactless heart rate estimation (rPPG from video) by training on 20,000+ images across the BP4D+, UBFC-1, and UBFC-2 datasets.
- Increased meeting analytics accuracy 16% by integrating real-time speaker segmentation into the emYt+ compliance platform via an ASR pipeline for enterprise Zoom and Webex.
-
May 2019 - Aug 2020 Machine Learning Engineer
Wavelabs Technologies, Hyderabad
- Delivered sub-2-second threat identification at 30–40 FPS on a resource-constrained Jetson Nano with a weapon detection system integrated with iOS/Android apps and trained on 150,000+ labeled images.
- Generated a 15% revenue increase by developing dynamic pricing algorithms across 10,000+ monthly transactions for financial services clients.
- Reduced resource utilization 35% and deployment time 20% by containerizing model serving with Docker and AWS SageMaker.
Education
-
Jan 2024 M.S. Computer Science — GPA 3.4/4.0
University at Buffalo, The State University of New York
- Coursework: Operating Systems, Analysis of Algorithms, Biometrics Image Analysis, Reinforcement Learning, Computer Vision, High Performance Computing
-
2019 B.E. Computer Science — GPA 3.6/4.0
Jawaharlal Nehru Technological University, Hyderabad
- Coursework: Machine Learning, Cloud Computing, Data Structures & Algorithms, Computer Networks, Probability & Statistics
Projects
-
2024 - Present Self-Hosted Multi-Model AI Stack
- Architected and operate a live production AI stack: SGLang serving a 20B LLM at 63 tok/s (2×GPU tensor-parallel), Gemma 4 vision at 94 tok/s on an RTX 2060, and a NanoOWL OWL-ViT TensorRT engine on Jetson Orin Nano (~300 ms) for open-vocabulary object detection.
- Built a custom Python agentic loop with MCP tool-calling, RAG over Qdrant (BGE embeddings + cross-encoder reranker, <150 ms round-trip), and an end-to-end voice pipeline (ASR→LLM→TTS, first audio in under 4 seconds).
-
2023 - 2024 GPU Computing Research Portfolio
- Hessian Matrix Inversion — 526× GPU speedup over CPU baseline via LU decomposition with cuSOLVER and Python bindings.
- CUDA Performance Profiler — automated Nsight Compute metrics collection with interactive Streamlit dashboard for systematic kernel optimization.
- Convolution Optimization Analysis — published analysis across 18 optimization metrics with Nsight Compute profiling visualization.
Skills
-
GPU Computing & Inference Optimization
- CUDA, TensorRT, TensorRT-LLM, NVIDIA DeepStream, Nsight Compute, Mixed Precision, Triton Inference Server, OpenMP
-
Agentic AI & LLM Systems
- LangGraph, LangChain, Model Context Protocol (MCP), SGLang, vLLM, LiteLLM, RAG Pipelines, Qdrant, LangFuse
-
Computer Vision & Machine Learning
- PyTorch, TensorFlow, OpenCV, ONNX, Real-Time Video Processing, Object Detection, ASR Integration, Biometric Systems
-
Infrastructure & Languages
- Python, C/C++, CUDA, SQL, Docker, Kubernetes, AWS, Jetson (Orin/NX/Nano), Linux, PostgreSQL, Vector Databases