ErrorChannelV0

Implementation of the ErrorChannelV0 environment.

Author: Jay Shah (@Jayshah25)

Contact: jay.shah@qrlqai.com

License: Apache-2.0

class qrl.env.core.error_channel.ErrorChannelV0(n_qubits=3, faulty_qubits=None, max_steps=10, seed=None, ffmpeg=False)

Bases: QuantumEnv

Multi-qubit error mitigation environment with bit-flip noise.

ErrorChannelV0 is a gymnasium.Env-compatible environment that models a noisy multi-qubit quantum system affected by independent bit-flip error channels. Each qubit may experience noise with a different probability, and the agent’s task is to apply corrective Pauli-X operations to recover the target computational basis state |0…0⟩.

The environment captures a simplified quantum error mitigation scenario, where the agent sequentially selects qubits on which to apply corrections based on observed measurement probabilities.
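The noise model described above can be illustrated classically. The following is a minimal NumPy sketch, not the library's implementation: it computes the measurement distribution obtained when independent bit-flip channels act on |0…0⟩. The helper name bit_flip_distribution and the bit ordering (qubit 0 as the most significant bit of the basis-state index) are assumptions made for illustration only.

```python
import numpy as np


def bit_flip_distribution(n_qubits, faulty_qubits):
    """Distribution over basis states after bit-flip noise on |0...0>.

    Hypothetical helper. Assumes qubit 0 is the most significant bit
    of the basis-state index; the library may order qubits differently.
    """
    probs = np.zeros(2 ** n_qubits, dtype=np.float64)
    for index in range(2 ** n_qubits):
        p = 1.0
        for q in range(n_qubits):
            bit = (index >> (n_qubits - 1 - q)) & 1
            flip = faulty_qubits.get(q, 0.0)  # qubits not listed are noiseless
            # Independent channels: multiply per-qubit flip / no-flip terms.
            p *= flip if bit else 1.0 - flip
        probs[index] = p
    return probs.astype(np.float32)


# Qubit 0 flips with probability 0.2, qubit 2 with probability 0.5.
noisy = bit_flip_distribution(3, {0: 0.2, 2: 0.5})
```

Under these assumptions, the entries of noisy sum to 1 and only basis states that differ from |000⟩ on qubits 0 or 2 receive nonzero probability.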

Key properties

  • Action space: Discrete choice of qubit index on which to apply an X gate.

  • Observation space: Probability distribution over all computational basis states (shape (2**n_qubits,)).

  • Reward: Negative mean-squared error between the corrected distribution and the ideal |0…0⟩ distribution.

  • Termination: Success when perfect correction is achieved, or truncation at max_steps.

Rendering

The render() method visualizes the mitigation process using a side-by-side animation that compares ideal, noisy, and corrected probability distributions, along with a dynamically updated circuit diagram showing applied corrections.

Input Parameters

n_qubits : int

Number of qubits in the system.

faulty_qubits : dict[int, float] or None

Mapping from qubit indices to bit-flip probabilities.

max_steps : int

Maximum number of correction steps per episode.

seed : int or None

Random seed for reproducibility.

ffmpeg : bool

Whether to use FFmpeg to save animations as MP4. If False, animations are saved as GIFs with Pillow.

See also

tutorials/error_channel

Tutorial on multi-qubit error mitigation with bit-flip noise.

get_reward(action)

Apply a correction action and compute the reward.

The selected qubit index is appended to the correction list, the noisy and corrected circuits are evaluated, and the reward is computed as the negative mean-squared error between the corrected and target probability distributions.

Parameters:

action (int) – Index of the qubit on which a Pauli-X correction is applied.

Returns:

Reward value defined as the negative mean-squared error between the corrected probability distribution and the target distribution.

Return type:

float
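As a rough illustration of the reward described for get_reward, the sketch below treats a Pauli-X correction as a permutation of the basis-state distribution and scores the result with a negative mean-squared error. It is a classical stand-in, not the environment's circuit evaluation: the helpers apply_x and negative_mse are hypothetical, and the bit ordering (qubit 0 as the most significant bit) is an assumption.

```python
import numpy as np


def apply_x(probs, qubit, n_qubits):
    """Permute a distribution the way an X gate on `qubit` would.

    Hypothetical helper. Assumes qubit 0 is the most significant bit.
    """
    mask = 1 << (n_qubits - 1 - qubit)
    # X maps |i> to |i XOR mask>, so the new probability at index j
    # is the old probability at index j XOR mask.
    return probs[np.arange(probs.size) ^ mask]


def negative_mse(corrected, target):
    """Negative mean-squared error between two distributions."""
    return -float(np.mean((corrected - target) ** 2))


n_qubits = 2
target = np.zeros(2 ** n_qubits)
target[0] = 1.0  # ideal |00> distribution

# Illustrative noisy distribution: qubit 0 flipped with probability 0.9.
noisy = np.array([0.1, 0.0, 0.9, 0.0])
corrected = apply_x(noisy, qubit=0, n_qubits=n_qubits)

reward_before = negative_mse(noisy, target)
reward_after = negative_mse(corrected, target)
```

In this toy case the correction moves most of the probability mass back to |00⟩, so reward_after is closer to zero than reward_before.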

render(save_path_without_extension=None, interval_ms=600)

Render the error-mitigation process as an animated visualization.

The animation consists of:

  • A bar chart comparing ideal, noisy, and corrected probability distributions for each computational basis state.

  • A dynamically updated ASCII-style circuit diagram showing the applied correction operations.

Parameters:
  • save_path_without_extension (str or None, optional) – Path (without file extension) to save the animation. If provided, the animation is saved using the configured writer (MP4 for FFmpeg or GIF for Pillow). If None, the animation is displayed interactively.

  • interval_ms (int, optional) – Time between animation frames in milliseconds. Default is 600.

Returns:

This method produces a visualization but does not return a value.

Return type:

None

reset(*, seed=None)

Reset the environment to the initial noisy state.

Clears the correction history, resets the step counter, and evaluates the noisy circuit without any corrective operations.

Parameters:

seed (int or None, optional) – Random seed for reproducibility. If provided, reinitializes the internal random number generator.

Returns:

observation – Initial corrected probability distribution over computational basis states (identical to the noisy distribution at reset), with dtype float32.

Return type:

np.ndarray

step(action)

Execute one environment step.

Applies a correction action, updates internal state and history, computes the reward, and checks termination conditions.

Parameters:

action (int) – Index of the qubit on which a Pauli-X correction is applied.

Returns:

  • observation (np.ndarray) – Corrected probability distribution over computational basis states, with dtype float32.

  • reward (float) – Negative mean-squared error between the corrected and target distributions.

  • done (bool) – True if the episode has terminated due to reaching the maximum number of steps or achieving perfect correction.

  • info (dict) – Dictionary containing metadata about the environment, including the mapping of faulty qubits.
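The reset and step contracts documented above suggest a simple interaction loop. The sketch below assumes exactly the return shapes listed in this reference (reset returns only the observation, step returns a 4-tuple); an installed version may differ, for example by following the 5-tuple Gymnasium convention. run_episode and StubEnv are hypothetical names, and StubEnv is only a stand-in so the example runs without the library.

```python
import numpy as np


def run_episode(env, policy, max_iterations=1000):
    """Roll out one episode and return (total_reward, steps).

    Assumes env.reset() returns an observation and env.step(action)
    returns (observation, reward, done, info), as documented above.
    """
    observation = env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done and steps < max_iterations:
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


class StubEnv:
    """Hypothetical stand-in that mimics the documented return shapes."""

    def __init__(self, max_steps=3):
        self.max_steps = max_steps
        self.t = 0

    def reset(self, *, seed=None):
        self.t = 0
        return np.array([1.0, 0.0], dtype=np.float32)

    def step(self, action):
        self.t += 1
        obs = np.array([1.0, 0.0], dtype=np.float32)
        # Fixed reward; the episode ends once max_steps is reached.
        return obs, -0.25, self.t >= self.max_steps, {}


# A trivial policy that always corrects qubit 0.
total_reward, steps = run_episode(StubEnv(), policy=lambda obs: 0)
```

With the real environment, StubEnv() would be replaced by an ErrorChannelV0 instance, and the policy would map the observed probability distribution to a qubit index.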