I'm a Software Engineer at AMD, where I work on optimizing rendering pipelines with modern machine learning solutions in FidelityFX (upscaling, frame generation, ray regeneration, and radiance caching) on proprietary AMD RDNA graphics accelerator architectures.
Previously, I was a research student at the CERN CMS experiment (Next Generation Trigger), where I focused on benchmarking heterogeneous architectures towards 40 MHz processing, efficient heterogeneous ML inference inside the CMSSW framework, and a new tau tagging approach with parallel clustering.
At Intel R&D, I worked on GPU software development (AI/CV) in the AI Graphics Software for DirectX 3D team. At Nokia, I contributed to internal tools for analyzing eNB machine logs and developed an in-house source code management system for SoC hardware solutions.
I'm interested in accelerated heterogeneous computing and applied machine learning. Most of my research focuses on pushing the boundaries of algorithmic optimization across various accelerator architectures and on applying AI to real-time autonomous systems.
Originally presented at the 19th International Conference on Dependability of Computer Systems with a distinction for oral presentation, this research was later refined and published as a post-conference journal article.
Hardware-accelerated analysis algorithm for W boson decays to three charged pions, targeting 40 MHz processing. Interface for efficient machine learning workloads in CMS software, and a concept for a new approach to the reconstruction and identification of hadronically decaying τ leptons at low pT.
Foundations for Framework eXtensions (ffx) is an implementation of the work published as part of "Efficient Data Movement for Machine Learning Inference in Heterogeneous CMS Software", in the form of a header-only library with no CMSSW framework dependencies.
What would your favorite book character look like? Our AI identifies characters from text and visualizes them realistically, bringing even non-adapted works to life for education, literary studies, and book promotion.
High-performance, real-time cone detection system for autonomous Formula Student vehicles, using a YOLO architecture optimized with NVIDIA TensorRT to achieve low-latency, high-accuracy inference on the end device.
Research and evaluation of various SLAM algorithms for autonomous Formula Student vehicles, focusing on real-time performance and accuracy in dynamic environments.
A method for addressing real-time path planning challenges through Delaunay triangulation, random trees, and dense network representations for geometry-driven raceline optimization in Formula Student autonomous systems.
Reference Letters
Recommendations, endorsements, and other relevant references from professors, mentors, and supervisors who have collaborated with me on research, projects, or academic coursework are available upon request.