cv

General Information

Full Name Ronak Haresh Chhatbar
Location Buffalo, NY
Email [email protected]
Website alphapibeta.com
GitHub alphapibeta
Languages English, Hindi, Gujarati, Telugu

Professional Summary

  • Production AI systems engineer working across the full inference stack — GPU optimization, real-time computer vision, and agentic LLM orchestration — from Jetson edge hardware to enterprise cloud. Five years building and shipping production CV and agentic AI systems. Currently architecting core modules of Centific's Verity enterprise vision platform.

Experience

  • Dec 2024 - Present
    AI Engineer — Verity Multimodal Agentic Platform
    Centific
    • Owned the inference-to-orchestration stack on Verity — Centific's NVIDIA-accelerated enterprise vision platform — cutting 24/7 facility monitoring costs 33% by replacing continuous human surveillance with autonomous AI-driven incident detection.
    • Engineered a real-time CV pipeline processing 50+ concurrent camera streams (GPU-accelerated ingestion, frame-level object detection, and chained LLM inference), delivering sub-second end-to-end detection latency.
    • Reduced incident response time from 15 to 5 minutes by designing a LangGraph stateful agent graph with conditional routing across specialized sub-agents invoking live sensor queries, maintenance APIs, and compliance report generators.
    • Reduced inference infrastructure costs 20% by engineering Verity's model serving layer with TensorRT-LLM — concurrent multi-model execution with GPU memory partitioning across co-deployed vision and language models.
  • May 2023 - Dec 2024
    Graduate Research Assistant
    Spatial AI & Robotics Lab, University at Buffalo
    • Increased visual odometry inference efficiency 33% by developing C++ plugins integrating a custom visual navigation optimizer with a TensorRT backend for real-time pose estimation on resource-constrained platforms.
    • Led backend development of robotranking.org, a robotics benchmarking platform adopted by the international robotics research community.
  • Sep 2020 - Aug 2022
    Computer Vision Engineer
    Tensorgo Technologies, Hyderabad
    • Improved model throughput 25% and inference performance 40% via TensorRT and DeepStream mixed-precision tuning on Jetson NX and Nano — sustaining 30–40 FPS under production memory constraints.
    • Achieved an 8% accuracy improvement in contactless heart rate estimation (rPPG from video) by training on 20,000+ images across the BP4D+, UBFC-1, and UBFC-2 datasets.
    • Increased meeting analytics accuracy 16% by integrating real-time speaker segmentation into the emYt+ compliance platform via an ASR pipeline for enterprise Zoom and Webex.
  • May 2019 - Aug 2020
    Machine Learning Engineer
    Wavelabs Technologies, Hyderabad
    • Delivered sub-2-second threat identification at 30–40 FPS on a resource-constrained Jetson Nano by building a weapon detection system trained on 150,000+ labeled images and integrated with iOS/Android apps.
    • Generated a 15% revenue increase by developing dynamic pricing algorithms across 10,000+ monthly transactions for financial services clients.
    • Reduced resource utilization 35% and deployment time 20% by containerizing model serving with Docker and AWS SageMaker.

Education

  • Jan 2024
    M.S. Computer Science — GPA 3.4/4.0
    University at Buffalo, The State University of New York
    • Coursework: Operating Systems, Analysis of Algorithms, Biometrics Image Analysis, Reinforcement Learning, Computer Vision, High Performance Computing
  • 2019
    B.E. Computer Science — GPA 3.6/4.0
    Jawaharlal Nehru Technological University, Hyderabad
    • Coursework: Machine Learning, Cloud Computing, Data Structures & Algorithms, Computer Networks, Probability & Statistics

Projects

  • 2024 - Present
    Self-Hosted Multi-Model AI Stack
    • Architected and operate a live production AI stack: SGLang serving a 20B LLM at 63 tok/s (2×GPU tensor-parallel), Gemma 4 vision at 94 tok/s on an RTX 2060, and a NanoOWL OWL-ViT TensorRT engine on Jetson Orin Nano (~300 ms) for open-vocabulary object detection.
    • Built a custom Python agentic loop with MCP tool-calling, RAG over Qdrant (BGE embeddings + cross-encoder reranker, <150 ms round-trip), and an end-to-end voice pipeline (ASR→LLM→TTS, first audio under 4 seconds).
  • 2023 - 2024
    GPU Computing Research Portfolio
    • Hessian Matrix Inversion — 526× GPU speedup over CPU baseline via LU decomposition with cuSOLVER and Python bindings.
    • CUDA Performance Profiler — automated Nsight Compute metrics collection with interactive Streamlit dashboard for systematic kernel optimization.
    • Convolution Optimization Analysis — published analysis across 18 optimization metrics with Nsight Compute profiling visualization.

Skills

  • GPU Computing & Inference Optimization
    • CUDA, TensorRT, TensorRT-LLM, NVIDIA DeepStream, Nsight Compute, Mixed Precision, Triton Inference Server, OpenMP
  • Agentic AI & LLM Systems
    • LangGraph, LangChain, Model Context Protocol (MCP), SGLang, vLLM, LiteLLM, RAG Pipelines, Qdrant, LangFuse
  • Computer Vision & Machine Learning
    • PyTorch, TensorFlow, OpenCV, ONNX, Real-Time Video Processing, Object Detection, ASR Integration, Biometric Systems
  • Infrastructure & Languages
    • Python, C/C++, CUDA, SQL, Docker, Kubernetes, AWS, Jetson (Orin/NX/Nano), Linux, PostgreSQL, Vector Databases