Quantum Computing News

Soft Actor Critic Algorithm Enables Quantum Control With RL

Posted on June 28, 2025 by HemaSumanth (7 min read)

In a notable development in quantum technology, reinforcement learning (RL), and in particular the soft actor-critic (SAC) algorithm, has been used to optimise the control of quantum systems and improve the accuracy with which the magnitude of a background magnetic field can be estimated, according to Quantum News. The approach offers a new way to design control schemes, particularly in complex quantum settings where analytical solutions are hard to find.

Precise magnetic field measurement underpins many modern technologies, from materials research to medical imaging. In quantum systems, however, external disturbances such as decoherence, together with incomplete knowledge of the system's parameters, often make optimal sensitivity hard to reach. Researchers, including a group at the University of Ottawa led by Logan W. Cooke and Stefanie Czischek, have tackled these challenges and shown the potential of learned control in quantum systems. Their paper, titled “Reinforcement Learning for Optimal Control of Spin Magnetometers,” demonstrates how effective these techniques are.

Understanding Reinforcement Learning and the Soft Actor-Critic Algorithm

Reinforcement learning is a machine learning framework in which an agent learns to make decisions by interacting with its environment in pursuit of a specific objective. Unlike classical supervised learning, which depends on pre-existing labelled datasets, an RL agent learns by trial and error, guided by a reward function.

An RL setup in quantum control consists of the following essential elements:

  • State (s): Represented by the quantum system’s density matrix (ρ(t)).
  • Control Action (a): A time-dependent signal, such as a microwave or laser pulse, applied to the quantum system to steer its evolution in time.
  • Scalar Reward (r): A feedback signal that indicates how well the selected action worked, usually based on how close the system is to its target state (the fidelity).
  • Policy (π): The agent’s learnt strategy, which maps observed states to actions. The goal is to find the optimal policy (π*) that maximises the expected cumulative reward over time.
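
To make these four elements concrete, the sketch below wires them together for a single driven qubit. It is a toy illustration rather than the setup used in the research: the class name `ToyQubitEnv`, the resonant-drive Hamiltonian, the step count, and the fixed `constant_policy` are all assumptions chosen so the example stays short and checkable.

```python
import numpy as np

# Pauli X matrix for a single qubit (hbar = 1).
SX = np.array([[0, 1], [1, 0]], dtype=complex)


class ToyQubitEnv:
    """Hypothetical toy environment: drive a qubit from |0> towards |1>."""

    def __init__(self, n_steps=10, dt=0.1):
        self.n_steps = n_steps
        self.dt = dt
        self.target = np.array([[0, 0], [0, 1]], dtype=complex)  # |1><1|
        self.reset()

    def reset(self):
        # State s: the density matrix rho(t), starting in |0><0|.
        self.rho = np.array([[1, 0], [0, 0]], dtype=complex)
        self.t = 0
        return self.rho

    def step(self, action):
        # Action a: amplitude of a resonant drive, H = (a / 2) * sigma_x.
        theta = action * self.dt
        # Closed-form propagator exp(-i * (theta / 2) * sigma_x).
        u = np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * SX
        self.rho = u @ self.rho @ u.conj().T
        self.t += 1
        done = self.t >= self.n_steps
        # Reward r: fidelity with the target state, paid at the end of the episode.
        fidelity = float(np.real(np.trace(self.target @ self.rho)))
        reward = fidelity if done else 0.0
        return self.rho, reward, done


def constant_policy(rho):
    # Policy pi: maps a state to an action. This one ignores the state and
    # always returns the amplitude that adds up to a pi-pulse over the episode.
    return np.pi


env = ToyQubitEnv(n_steps=10, dt=0.1)
state = env.reset()
done = False
total_reward = 0.0
while not done:
    state, reward, done = env.step(constant_policy(state))
    total_reward += reward
```

Here the ten steps add up to a rotation by π, so the episode ends with the qubit in the target state and a reward close to 1. An RL agent would replace `constant_policy` with a learned mapping and would have to discover such a pulse from the reward alone.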

The work singles out the soft actor-critic (SAC) algorithm, a model-free reinforcement learning method. “Model-free” means that the SAC agent does not need an explicit analytical model of the quantum system’s behaviour; instead, it learns control strategies directly by interacting with a simulated quantum system and exploring its dynamics. This is especially useful when accurate analytical solutions are difficult to derive for complex quantum systems.
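
As background on how SAC handles bounded, continuous controls, the sketch below shows two ingredients of the standard SAC formulation: sampling an action from a tanh-squashed Gaussian policy together with its log-probability, and the entropy-regularised ("soft") value that trades reward against policy randomness. The function names and the example numbers are illustrative assumptions, and nothing here is taken from the researchers' implementation.

```python
import numpy as np


def sample_squashed_gaussian(mean, log_std, rng):
    """Sample a bounded action a = tanh(u), u ~ N(mean, std^2), with its log-probability."""
    std = np.exp(log_std)
    u = mean + std * rng.standard_normal(mean.shape)
    action = np.tanh(u)
    # Log-density of the unsquashed Gaussian sample u, per dimension.
    log_prob_u = -0.5 * (((u - mean) / std) ** 2 + 2.0 * log_std + np.log(2.0 * np.pi))
    # Change of variables for tanh: subtract log(1 - tanh(u)^2) per dimension.
    # The small constant guards against log(0) when the action saturates.
    log_prob = float(np.sum(log_prob_u - np.log(1.0 - action**2 + 1e-6)))
    return action, log_prob


def soft_value(q_value, log_prob, alpha):
    """Entropy-regularised value estimate: Q(s, a) - alpha * log pi(a | s)."""
    return q_value - alpha * log_prob


rng = np.random.default_rng(0)
mean = np.zeros(3)            # hypothetical policy output for three control channels
log_std = np.full(3, -1.0)
action, log_prob = sample_squashed_gaussian(mean, log_std, rng)
```

Each component of `action` lies strictly between -1 and 1, so it can be rescaled to whatever amplitude range the hardware allows. The temperature `alpha` sets how strongly the agent is rewarded for keeping its policy exploratory.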

The method works well for designing pulse sequences for a spin-based magnetometer, a device that measures magnetic fields by observing how atomic spins behave. The aim is to maximise the accuracy with which the field magnitude is estimated. Because physical control signals are modulated continuously, the problem calls for a framework that handles continuous action spaces, which SAC provides by design. The study frames the control problem in terms of the Quantum Fisher Information, a quantity that measures how sensitive a quantum state is to changes in a parameter and that also extends to multi-parameter estimation.
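
For intuition about the Quantum Fisher Information (QFI), consider the textbook case of a pure state in which the parameter θ is imprinted by a unitary exp(-iθG); the QFI is then four times the variance of the generator G. The sketch below applies this to a single spin-1/2 precessing in a field B for a time T, so that G = γ T S_z. The values of `gamma` and `T` are hypothetical, and the example is far simpler than the open-system problem studied in the paper.

```python
import numpy as np

SZ_HALF = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)  # spin operator S_z, hbar = 1


def qfi_pure_state(psi, generator):
    """QFI of a pure state for a parameter encoded as exp(-i * theta * generator)."""
    psi = psi / np.linalg.norm(psi)
    mean = np.vdot(psi, generator @ psi).real
    mean_sq = np.vdot(psi, generator @ generator @ psi).real
    return 4.0 * (mean_sq - mean**2)


gamma = 2.0  # hypothetical gyromagnetic ratio (arbitrary units)
T = 3.0      # hypothetical sensing time (arbitrary units)
generator = gamma * T * SZ_HALF  # generator of the B-dependent phase

plus_x = np.array([1, 1], dtype=complex) / np.sqrt(2)  # spin pointing along +x
spin_up = np.array([1, 0], dtype=complex)              # spin pointing along +z

qfi_plus_x = qfi_pure_state(plus_x, generator)    # equals (gamma * T)**2
qfi_spin_up = qfi_pure_state(spin_up, generator)  # zero: an S_z eigenstate only picks up a global phase
```

A spin prepared along +x is maximally sensitive, with a QFI of (γT)², while a spin along +z carries no information about B. Through the quantum Cramér-Rao bound, a larger QFI means a smaller achievable variance in the field estimate, which is why it is a natural quantity for a control agent to maximise.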

Advanced Techniques and System Optimisation

Applying RL effectively to quantum control relies on several techniques and computational resources:

  • Generalisation Across Parameters: The trained RL agent generalised well to Hamiltonian parameters it had not seen during training, which shows that it can adapt to different quantum sensor configurations. Its performance was sensitive to the purity of the initial state and to the pulse duration, but this ability to generalise greatly widens the applicability of the approach.
  • Physics-Constrained Reinforcement Learning: The RL problem is formulated with physics-based constraints so that solutions are both optimal and physically realistic. This narrows the space of candidate solutions by accounting for limits such as signal area and bandwidth. A key constraint caps the maximum number of numerical solver steps (N_max) allowed when simulating the quantum state dynamics. This cap favours adiabatic dynamics, meaning slower changes of the system that, in real experiments, are less prone to leakage errors and more robust to time-dependent noise.
  • Reward Shaping and Smoothness Penalties: Carefully designed reward functions guide learning towards smooth, experimentally realistic control signals. Smoothness is enforced by applying a Gaussian convolution filter to the control signals before simulation. Smoother waveforms need fewer solver steps, which shortens simulation times considerably; they also make the evolution of the quantum state easier to interpret and are simpler to implement experimentally. The reward function penalises non-smooth signals and lowers the reward for undesirable incoherent dynamics. Pulse amplitudes are constrained to start and end at zero, reflecting realistic hardware capabilities.
  • Computational Efficiency and Parallelisation: For efficient numerical computing and machine learning, the researchers used tools including the Julia programming language, the DifferentialEquations.jl package, and PyTorch. In addition, JAX, which offers just-in-time compilation and automatic differentiation, made it possible to optimise several RL agents synchronously in parallel on a single GPU. Combined with the N_max constraint, this parallelisation eases processing bottlenecks and makes hyperparameter searches more efficient. According to the study, GPU parallelisation can speed up quantum simulations by up to two orders of magnitude per environment step.
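
The smoothing and reward-shaping ideas above can be sketched in a few lines. The example below filters raw, jagged amplitudes with a Gaussian kernel, pads with zeros so the pulse starts and ends at zero, and subtracts a smoothness penalty from the fidelity. The kernel width, the second-difference penalty, and the `weight` parameter are assumptions made for illustration; the source does not specify the exact filter settings or penalty used.

```python
import numpy as np


def gaussian_kernel(sigma, truncate=3.0):
    """Normalised 1-D Gaussian kernel with standard deviation sigma (in samples)."""
    radius = int(truncate * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def smooth_pulse(raw, sigma=2.0):
    """Low-pass filter raw amplitudes so the pulse starts and ends at zero."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    # Padding each side with more zeros than the kernel radius makes the first
    # and last filtered samples exactly zero. The output is longer than the input.
    padded = np.pad(raw, radius + 1, mode="constant")
    return np.convolve(padded, kernel, mode="same")


def smoothness_penalty(pulse):
    """Mean squared second difference: large for jagged pulses, small for smooth ones."""
    return float(np.mean(np.diff(pulse, n=2) ** 2))


def shaped_reward(fidelity, pulse, weight=0.1):
    """Hypothetical shaped reward: fidelity minus a weighted smoothness penalty."""
    return fidelity - weight * smoothness_penalty(pulse)


rng = np.random.default_rng(0)
raw = rng.uniform(-1.0, 1.0, size=50)  # stand-in for raw agent outputs
pulse = smooth_pulse(raw, sigma=2.0)
```

Filtering before simulation means the agent can output arbitrary values while the simulator only ever sees band-limited waveforms. The penalty then nudges the agent towards raw outputs that are already close to smooth.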

Demonstrated Success Across Quantum Systems

The study validates this constrained RL approach on three widely applicable quantum systems:

  • Multi-level Lambda Systems: For population transfer in these systems, which are common in quantum dots, atoms, and circuit quantum electrodynamics, the RL approach was robust to dissipation and time-dependent noise and achieved nearly two orders of magnitude lower infidelity than previous methods. Notably, unlike several previously proposed strategies, the learnt pulses were physically feasible.
  • Rydberg Gates: Rydberg gates are essential for atomic quantum computers. When optimising them, RL achieved higher fidelities at lower pulse energy, along with a level of noise resilience that earlier methods could not reach. A straightforward implementation of a controlled-Z (CZ) gate, for example, reached a fidelity of 0.9996.
  • Superconducting Transmon Qubits: For qubit reset, the technique found a new, physically feasible waveform called Heaviside-Corrected Gaussian Square (HCGS). It reached a fidelity of 0.9997 under practical bandwidth constraints, which the study describes as an order-of-magnitude improvement over prior work. This waveform enables high-fidelity unconditional reset on existing noisy intermediate-scale quantum devices and also simplifies experimental calibration.

Although the news coverage highlights SAC, the more detailed research paper on physics-constrained RL reports that another RL algorithm, Proximal Policy Optimisation (PPO), proved especially useful for the intricate, constrained quantum control problems it examined. PPO consistently and significantly outperformed other RL algorithms such as DDPG and TD3 in both mean fidelity and convergence speed. Taken together, the results indicate that RL methods as a whole cope well with these difficult control problems.
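
For readers unfamiliar with PPO, its defining ingredient is a clipped surrogate objective that stops a single update from moving the policy too far from the one that collected the data. The sketch below computes that objective for a small batch; the function name and the sample numbers are illustrative, and the clip range of 0.2 is a common default rather than a value reported in the paper.

```python
import numpy as np


def ppo_clipped_objective(ratio, advantage, clip_eps=0.2):
    """PPO clipped surrogate objective, averaged over a batch (to be maximised).

    ratio is pi_new(a | s) / pi_old(a | s); advantage estimates how much
    better the action was than the policy's average behaviour.
    """
    clipped_ratio = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    return float(np.mean(np.minimum(ratio * advantage, clipped_ratio * advantage)))


ratio = np.array([1.5, 0.5])       # new policy is 1.5x and 0.5x as likely as the old one
advantage = np.array([1.0, -1.0])  # first action was good, second was bad
objective = ppo_clipped_objective(ratio, advantage)
```

In the first sample the gain is capped at 1.2 instead of 1.5, so the update has no incentive to push the probability of the good action further. In the second sample the objective takes the more pessimistic clipped value of -0.8.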

Addressing Noise and Future Outlook

The effect of noise on RL agents was also investigated. Multi-step reinforcement learning agents, which use feedback during the learning process, were more robust in challenging noise environments. This matters because real quantum systems are inherently open and noisy, so describing them accurately requires non-unitary dynamics.
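
A standard way to include such non-unitary dynamics in a simulation is the Lindblad master equation. The sketch below integrates it for a single qubit under pure dephasing and shows the coherence decaying over time. The dephasing rate, the fixed-step Runge-Kutta integrator, and the choice of noise channel are assumptions for illustration, not the noise models used in the research.

```python
import numpy as np

SZ = np.array([[1, 0], [0, -1]], dtype=complex)  # Pauli Z


def lindblad_rhs(rho, hamiltonian, jump_ops):
    """Right-hand side of the Lindblad master equation (hbar = 1)."""
    drho = -1j * (hamiltonian @ rho - rho @ hamiltonian)
    for op in jump_ops:
        op_dag_op = op.conj().T @ op
        drho = drho + op @ rho @ op.conj().T - 0.5 * (op_dag_op @ rho + rho @ op_dag_op)
    return drho


def evolve(rho, hamiltonian, jump_ops, t_final, n_steps):
    """Fixed-step fourth-order Runge-Kutta integration of the master equation."""
    dt = t_final / n_steps
    for _ in range(n_steps):
        k1 = lindblad_rhs(rho, hamiltonian, jump_ops)
        k2 = lindblad_rhs(rho + 0.5 * dt * k1, hamiltonian, jump_ops)
        k3 = lindblad_rhs(rho + 0.5 * dt * k2, hamiltonian, jump_ops)
        k4 = lindblad_rhs(rho + dt * k3, hamiltonian, jump_ops)
        rho = rho + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return rho


rate = 0.5                                     # hypothetical dephasing rate
rho0 = 0.5 * np.ones((2, 2), dtype=complex)    # |+x><+x|, maximal coherence
hamiltonian = np.zeros((2, 2), dtype=complex)  # no drive, noise only
rho_t = evolve(rho0, hamiltonian, [np.sqrt(rate) * SZ], t_final=1.0, n_steps=200)
coherence = abs(rho_t[0, 1])  # analytically 0.5 * exp(-2 * rate * t_final)
```

The populations on the diagonal are untouched while the off-diagonal element shrinks, which is the kind of loss of phase information that degrades a sensor. A control agent trained against such a model is rewarded for pulses that reach the target before the coherence is gone.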

A possible drawback of the physics-constrained RL implementation is that, although it greatly improves computational efficiency and solution quality, it may restrict the exploration of control strategies that rely on very fast, non-adiabatic quantum dynamics. Its effectiveness also depends on accurate modelling of the quantum system, so complicated real-world or black-box devices would first need accurate models.

Despite these limitations, the study marks a significant step towards automating and improving quantum control. Planned extensions of the approach include high-dimensional state spaces and multi-qubit systems. Future research priorities include building generalised quantum control policies that adapt to different qubits, studying adaptive constraint mechanisms, and combining RL with Bayesian optimisation. Ultimately, validating these solutions on real quantum hardware will be essential to accelerating the practical development of quantum technologies.
