cv | Ronak Haresh Chhatbar

General Information

Full Name	Ronak Haresh Chhatbar
Location	Buffalo, NY
Email	[email protected]
Website	alphapibeta.com
GitHub	alphapibeta
Languages	English, Hindi, Gujarati, Telugu

Professional Summary

Production AI systems engineer working across the full inference stack — GPU optimization, real-time computer vision, and agentic LLM orchestration — from Jetson edge hardware to enterprise cloud. Five years building and shipping production CV and agentic AI systems. Currently architecting core modules of Centific's Verity enterprise vision platform.

Experience

Dec 2024 - Present
AI Engineer — Verity Multimodal Agentic Platform

Centific
- Owned the inference-to-orchestration stack on Verity — Centific's NVIDIA-accelerated enterprise vision platform — cutting 24/7 facility monitoring costs 33% by replacing continuous human surveillance with autonomous AI-driven incident detection.
- Engineered a real-time CV pipeline processing 50+ concurrent camera streams (GPU-accelerated ingestion, frame-level object detection, chained LLM inference), delivering sub-second end-to-end detection latency.
- Reduced incident response time from 15 to 5 minutes by designing a LangGraph stateful agent graph with conditional routing across specialized sub-agents invoking live sensor queries, maintenance APIs, and compliance report generators.
- Reduced inference infrastructure costs 20% by engineering Verity's model serving layer with TensorRT-LLM — concurrent multi-model execution with GPU memory partitioning across co-deployed vision and language models.
May 2023 - Dec 2024
Graduate Research Assistant

Spatial AI & Robotics Lab, University at Buffalo
- Increased visual odometry inference efficiency 33% by developing C++ plugins integrating a custom visual navigation optimizer with a TensorRT backend for real-time pose estimation on resource-constrained platforms.
- Led backend development of robotranking.org, a robotics benchmarking platform adopted by the international robotics research community.
Sep 2020 - Aug 2022
Computer Vision Engineer

Tensorgo Technologies, Hyderabad
- Improved model throughput 25% and inference performance 40% via TensorRT and DeepStream mixed-precision tuning on Jetson NX and Nano — sustaining 30–40 FPS under production memory constraints.
- Achieved an 8% accuracy improvement in contactless heart rate estimation (rPPG from video) by training on 20,000+ images across the BP4D+, UBFC-1, and UBFC-2 datasets.
- Increased meeting analytics accuracy 16% by integrating real-time speaker segmentation into the emYt+ compliance platform via an ASR pipeline for enterprise Zoom and Webex.
May 2019 - Aug 2020
Machine Learning Engineer

Wavelabs Technologies, Hyderabad
- Delivered sub-2-second threat identification at 30–40 FPS on a resource-constrained Jetson Nano by building a weapon detection system integrated with iOS/Android apps and trained on 150,000+ labeled images.
- Generated a 15% revenue increase by developing dynamic pricing algorithms across 10,000+ monthly transactions for financial services clients.
- Reduced resource utilization 35% and deployment time 20% by containerizing model serving with Docker and AWS SageMaker.

Education

Jan 2024
M.S. Computer Science — GPA 3.4/4.0

University at Buffalo, The State University of New York
- Coursework: Operating Systems, Analysis of Algorithms, Biometrics Image Analysis, Reinforcement Learning, Computer Vision, High Performance Computing
2019
B.E. Computer Science — GPA 3.6/4.0

Jawaharlal Nehru Technological University, Hyderabad
- Coursework: Machine Learning, Cloud Computing, Data Structures & Algorithms, Computer Networks, Probability & Statistics

Projects

2024 - Present
Self-Hosted Multi-Model AI Stack
- Architected and operate a live production AI stack — SGLang serving a 20B LLM at 63 tok/s (2×GPU tensor-parallel), Gemma 4 vision at 94 tok/s on RTX 2060, NanoOWL OWL-ViT TRT engine on Jetson Orin Nano (~300ms) for open-vocabulary object detection.
- Built a custom Python agentic loop with MCP tool-calling, RAG over Qdrant (BGE embeddings + cross-encoder reranker, <150ms round-trip), and an end-to-end voice pipeline (ASR→LLM→TTS, first audio in under 4 seconds).
2023 - 2024
GPU Computing Research Portfolio
- Hessian Matrix Inversion — 526× GPU speedup over CPU baseline via LU decomposition with cuSOLVER and Python bindings.
- CUDA Performance Profiler — automated Nsight Compute metrics collection with interactive Streamlit dashboard for systematic kernel optimization.
- Convolution Optimization Analysis — published analysis across 18 optimization metrics with Nsight Compute profiling visualization.

Skills

GPU Computing & Inference Optimization
- CUDA, TensorRT, TensorRT-LLM, NVIDIA DeepStream, Nsight Compute, Mixed Precision, Triton Inference Server, OpenMP
Agentic AI & LLM Systems
- LangGraph, LangChain, Model Context Protocol (MCP), SGLang, vLLM, LiteLLM, RAG Pipelines, Qdrant, LangFuse
Computer Vision & Machine Learning
- PyTorch, TensorFlow, OpenCV, ONNX, Real-Time Video Processing, Object Detection, ASR Integration, Biometric Systems
Infrastructure & Languages
- Python, C/C++, CUDA, SQL, Docker, Kubernetes, AWS, Jetson (Orin/NX/Nano), Linux, PostgreSQL, Vector Databases
