Islamabad, Pakistan · GMT+5

Muhammad Uzair. RL researcher & AI engineer building multi-agent systems and agentic LLM products.

Two IEEE papers on multi-agent DRL. Most recently a Mitacs Globalink intern at McMaster, working on guided policy optimization. By day, shipping voice and email agents at Adept Tech Solutions.

IEEE publications · Mitacs Globalink 2025 · NUST B.E. Software

01 / About

A short introduction.

I graduated from NUST in 2025 with a B.E. in software engineering. My focus is reinforcement learning research, with two IEEE publications on multi-agent DRL.

Most recently I spent a summer at McMaster University on a Mitacs Globalink internship, working on guided policy optimization. I integrated human advisory signals into PPO training loops, which accelerated convergence over the baseline.

Alongside research, I work as an AI engineer at Adept Tech Solutions in Islamabad, building voice agents on VAPI and Deepgram, RAG retrieval systems with pgvector, and LLM orchestration over FastAPI for enterprise clients.

02 / Peer reviewed

Publications.

  1. [01]

    Multiagent Reinforcement Learning for Joint Spectrum and Energy Optimization in CR-NOMA Enabled Internet of Unmanned Agents

    Saleha Ahmed, Muhammad Uzair, Syed Asad Ullah, et al.

    IEEE Internet of Things Journal · 2025

    A cooperative multi-agent DRL framework for CR-NOMA IoT, in which distributed agents jointly learn spectrum access and power control policies under partial observability.

  2. [02]

    Energy Efficient Uplink Communications for Wireless Powered Networks with EH Diversity: A DRL-driven Strategy

    Saleha Ahmed, Muhammad Uzair, Syed Asad Ullah, et al.

    IEEE International Conference on Communications (ICC) · 2025

    DRL-driven transmit power control for energy-harvesting uplink nodes, evaluated against MRC, SC, and EGC diversity combining schemes under Rayleigh fading.

03 / Selected work

Recent projects.

04 / Timeline

Experience & education.

  1. industry

    Nov 2025 to Present

    AI Engineer · Adept Tech Solutions

    Islamabad, PK

    • Built end-to-end voice AI pipelines on VAPI and Deepgram with under 400 ms transcription latency.
    • Engineered a multi-agent LLM orchestration system over FastAPI microservices and PostgreSQL.
    • Shipped a RAG retrieval layer with MPNet (768-dimensional) embeddings and pgvector.
    • Deployed email intelligence agents with intent detection, processing 500+ messages per day.
  2. research

    Jun 2025 to Aug 2025

    Research Intern, Mitacs Globalink · McMaster University

    Hamilton, ON

    • Fully funded Mitacs internship on guided policy optimization in sequential decision-making.
    • Benchmarked REINFORCE and PPO in Gymnasium PacMan; tuned reward shaping and entropy regularization.
    • Integrated human advisory signals via subjective logic belief modeling, which accelerated convergence over baseline PPO.
  3. research

    Jun 2024 to Sep 2025

    Research Collaborator · Information Processing and Transmission Lab

    Islamabad, PK

    • Benchmarked DDPG, TD3, and PPO for continuous-action transmit power control under stochastic fading.
    • Developed a multi-agent DRL framework for joint spectrum access and power control in CR-NOMA IoT.
    • Analyzed MRC, SC, and EGC diversity combining under Rayleigh fading.
  4. education

    Nov 2021 to Jun 2025

    B.E. Software Engineering · National University of Sciences and Technology

    Islamabad, PK

    • CGPA 3.61 / 4.0.
    • Coursework: Machine Learning, Reinforcement Learning, Large Language Models, Probability & Statistics.
    • 4x FBISE HSSC merit scholarship recipient.

05 / Get in touch

Contact.

available

Open to Master's opportunities in RL, AI, LLMs, and multi-agent systems, and to AI engineering roles.

Email is the fastest way to reach me, and I usually respond within a day. Grab my CV below if you're evaluating me for a role or program.