Akhilesh Raj - Personal Academic Website

Welcome to my personal website!

I am a Ph.D. candidate at Vanderbilt University specializing in high-performance computing (HPC) and reinforcement learning for energy-efficient systems. My research focuses on optimizing the performance and power consumption of HPC systems using AI-driven approaches, including reinforcement learning, for efficient resource management.

I am passionate about advancing AI technologies and their applications in real-time systems, cybersecurity for operational technology (OT) networks, and the development of intelligent infrastructure for future computing systems.

My current work is supported by collaborations with institutions such as Argonne National Laboratory and involves developing energy-efficient computational architectures. In parallel, I work on advanced testbeds for evaluating cybersecurity agents.

On this website, you will find details about my research, publications, talks, and projects. Please feel free to explore and contact me if you have any questions or would like to collaborate.

Research Interests

  • Reinforcement Learning: Applying reinforcement learning techniques to optimize energy consumption and performance in HPC systems.

    Energy efficiency has become an integral aspect of modern computing infrastructure design, impacting the performance, cost, scalability, and durability of production systems. The incorporation of power actuating and sensing capabilities in CPU designs is indicative of this, enabling the deployment of system software that can actively monitor and adjust energy consumption and performance at runtime.

    While reinforcement learning (RL) would seem ideal for designing such energy-efficiency control systems, online training presents challenges: the lack of proper models for setting up an adequate simulated environment, as well as perturbation (noise) and reliability issues if training is deployed on a live system.

    In this paper, we discuss the use of offline RL as an alternative approach for the design of an autonomous CPU power controller, with the goal of improving the energy efficiency of parallel applications at runtime. Offline RL sidesteps the issues of online RL training by leveraging a dataset of state transitions collected from arbitrary policies prior to training.

    Our methodology applies offline RL to a grey-box approach to energy efficiency, combining online application-agnostic performance data (heartbeats) and hardware performance counters to ensure scientific objectives are met with limited performance degradation. Evaluating our method on a variety of compute-bound and memory-bound benchmarks and controlling power on a live system through Intel’s Running Average Power Limit (RAPL), we demonstrate that such an offline-trained agent can reduce energy consumption at a reasonable performance degradation cost.

  • Cybersecurity: Developing advanced testbeds for testing cybersecurity agents and mitigation strategies in OT networks.

    In the ongoing era of the Fourth Industrial Revolution (4IR), where manufacturing and industrial processes are being integrated with information technologies including Artificial Intelligence (AI), advanced testbeds are a must to ensure an error-free, scalable, and gradual transition.

    The network-integrated Operational Technology (OT) used in manufacturing industries also requires advanced testbeds to test and evaluate cybersecurity agents deployed to mitigate adversarial attacks. In this paper, we describe a testbed designed for testing and evaluating attack-resilient network agents, also known as blue agents.

    Using open-source software stacks and network protocols, we develop an easy-to-deploy, OpenStack-based network emulator that closely emulates industrial use cases. We also demonstrate the utility of this testbed by initiating various attack types that lead to OT plant instability.

    The various stages of the experiments, leading to this instability, can be detected and recorded using the measurement tools available within the testbed.

    • Chemical Plant: Demonstration of the RL testbeds.
  • Power Grid: Demonstration of the RL testbeds.
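
The offline RL idea described under Reinforcement Learning above (training only from state transitions logged beforehand, with no interaction with a live system) can be illustrated with a minimal tabular sketch. Everything in it is an assumption made for illustration and not the method from the paper: the three discretized progress states, the three power-cap actions, the toy dynamics and reward, and the use of plain Q-learning over a fixed dataset. Practical offline RL methods typically add safeguards against actions that are poorly covered by the logged data.

```python
import random

# Hypothetical toy setup: 3 discretized progress states, 3 power-cap levels.
N_STATES, N_ACTIONS = 3, 3
GAMMA, ALPHA, EPOCHS = 0.9, 0.1, 200


def collect_logged_transitions(n, seed=0):
    """Stand-in for transitions logged from arbitrary policies on a real system."""
    rng = random.Random(seed)
    data = []
    state = rng.randrange(N_STATES)
    for _ in range(n):
        action = rng.randrange(N_ACTIONS)  # random behavior policy
        # Toy dynamics (assumption): a higher power cap tends to yield more progress.
        next_state = min(N_STATES - 1, max(0, action + rng.choice([-1, 0, 0])))
        # Toy reward (assumption): progress achieved minus a cost for power used.
        reward = float(next_state) - 0.4 * action
        data.append((state, action, reward, next_state))
        state = next_state
    return data


def offline_q_learning(dataset):
    """Repeatedly sweep the fixed dataset; no new data is ever collected."""
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(EPOCHS):
        for s, a, r, s2 in dataset:
            target = r + GAMMA * max(q[s2])
            q[s][a] += ALPHA * (target - q[s][a])
    return q


def greedy_policy(q):
    """Pick the highest-valued power-cap level for each state."""
    return [max(range(N_ACTIONS), key=lambda a: q[s][a]) for s in range(N_STATES)]


dataset = collect_logged_transitions(500)
q_table = offline_q_learning(dataset)
policy = greedy_policy(q_table)
print("Greedy power-cap level per progress state:", policy)
```

The point of the sketch is that training touches only the logged `dataset`: the learned policy can be inspected before it is ever allowed to actuate a power cap on a real machine.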

Work Experience:

Freelance Projects:

  • DropVault Tech: Worked as a freelance prototype engineer developing smart and safe delivery boxes.

Contact Information

Thank you for visiting my website!