I started out in control engineering rather than ML, and that background still shapes how I think. My CVPR Spotlight paper came out of it: if you treat training a deep network as a feedback-control problem, you can use a PID controller to drive the optimizer. That work was later extended in IEEE TNNLS (impact factor 11.368). I did my Master's at Tsinghua, working on deep learning optimization, pose estimation, and face recognition, with papers at CVPR and in IEEE TNNLS and Pattern Recognition, and a U.S. patent for automated-checkout tracking.
Working at Instacart, TikTok, Meta FAIR, and AiFi taught me that most of what makes production ML succeed or fail happens around the model, not in it. At TikTok, replacing roughly 800 narrow perception models with a single multimodal LLM only worked once INT8 quantization and tensor-parallel serving on Inf2 had cut inference cost in half. At AiFi, we relied on cameras alone, with no shelf sensors, which kept the hardware cheap but put all the pressure on detection and tracking. I'm comfortable across the whole stack: research, large-scale training and inference, deployment with Docker, Kubernetes, and TensorRT (including on edge devices), and the serving and product layers on top. I also review for CVPR, ECCV, ICCV, and IEEE TPAMI.