ExpressibilityV0

Implementation of ExpressibilityV0 environment

Author: Jay Shah (@Jayshah25)

License: Apache-2.0

class qrl.env.core.expressibility.ExpressibilityV0(n_qubits=4, max_blocks=12, max_steps=20, n_pairs_eval=120, bins=50, lambda_depth=0.002, lambda_2q=0.002, terminate_bonus=0.1, device_name='default.qubit', seed=None, allow_all_to_all=False, ffmpeg=False)

Bases: QuantumEnv

Parameterized circuit expressibility optimization environment.

ExpressibilityV0 is a gymnasium.Env-compatible environment that models the construction of parameterized quantum circuits with high expressibility. In the context of variational quantum algorithms, expressibility measures how uniformly the states an ansatz can prepare cover the Hilbert space, judged by comparing the ansatz's distribution of pairwise state fidelities with that of Haar-random states.
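The comparison with the Haar distribution can be illustrated with a small standard-library sketch. It is not the environment's own code: the helper names (`haar_bin_probs`, `histogram`, `kl_to_haar`) are hypothetical, and the fidelity samples are synthetic rather than produced by a simulated circuit. It relies on the known Haar fidelity density P(F) = (N - 1)(1 - F)^(N - 2) for Hilbert-space dimension N = 2^n_qubits, under which fidelities between Haar-random states follow a Beta(1, N - 1) distribution.

```python
import math
import random


def haar_bin_probs(n_qubits, bins):
    """Haar probability mass in each of `bins` equal-width bins on [0, 1].

    The Haar density (N - 1) * (1 - F)**(N - 2) has the closed-form CDF
    1 - (1 - F)**(N - 1), so each bin mass is a difference of two powers.
    """
    dim = 2 ** n_qubits
    edges = [i / bins for i in range(bins + 1)]
    return [(1 - a) ** (dim - 1) - (1 - b) ** (dim - 1)
            for a, b in zip(edges[:-1], edges[1:])]


def histogram(fidelities, bins):
    """Normalized histogram of fidelities in [0, 1]."""
    counts = [0] * bins
    for f in fidelities:
        idx = min(int(f * bins), bins - 1)  # F == 1.0 goes in the last bin
        counts[idx] += 1
    return [c / len(fidelities) for c in counts]


def kl_to_haar(fidelities, n_qubits, bins=50):
    """KL divergence from the empirical fidelity histogram to the Haar one."""
    p = histogram(fidelities, bins)
    q = haar_bin_probs(n_qubits, bins)
    # Empty empirical bins contribute zero to the sum.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


rng = random.Random(0)
dim = 2 ** 4

# Synthetic fidelities of a highly expressive 4-qubit ansatz (Haar-like).
haar_like = [rng.betavariate(1, dim - 1) for _ in range(5000)]
# Synthetic fidelities of an empty circuit: every state pair is identical.
idle = [1.0] * 120

kl_haar_like = kl_to_haar(haar_like, n_qubits=4)
kl_idle = kl_to_haar(idle, n_qubits=4)
print(f"Haar-like: {kl_haar_like:.4f}, idle: {kl_idle:.2f}")
```

A lower divergence indicates higher expressibility: the Haar-like samples give a value close to zero, while the idle circuit, whose fidelities all sit at 1, gives a large one.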

The agent incrementally builds a circuit by adding or removing predefined rotation and entangling blocks, or by explicitly terminating construction. Rewards encourage circuits whose fidelity distribution closely matches the Haar distribution, while penalizing excessive circuit depth and two-qubit gate usage.

Key properties

- Action space: Discrete set of architectural edits (add/remove blocks or terminate construction).
- Observation space: Vector of circuit statistics summarizing depth, parameter count, entanglement, and recent expressibility estimates (shape (7,)).
- Reward: Negative KL divergence to the Haar distribution with regularization penalties for depth and two-qubit gates.
- Termination: Explicit termination by the agent or truncation at max_steps.
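One plausible way the reward terms combine is sketched below. The exact formula is an assumption, since the description only states that the reward is a negative KL divergence with depth and two-qubit penalties; the function name `sketch_reward` and its arguments `kl`, `depth`, `n_two_qubit`, and `terminated` are hypothetical. The default weights mirror the constructor defaults.

```python
def sketch_reward(kl, depth, n_two_qubit, terminated,
                  lambda_depth=0.002, lambda_2q=0.002, terminate_bonus=0.1):
    """Assumed reward shape: -KL minus linear penalties, plus an optional bonus.

    kl          -- KL divergence of the fidelity histogram to the Haar one
    depth       -- circuit depth (assumed to be counted in blocks or layers)
    n_two_qubit -- number of two-qubit gates in the circuit
    terminated  -- True if the agent explicitly ended construction
    """
    reward = -kl - lambda_depth * depth - lambda_2q * n_two_qubit
    if terminated:
        reward += terminate_bonus  # assumed to apply only on explicit termination
    return reward


# A circuit with KL 0.1, depth 5 and 3 two-qubit gates, still under construction.
r_running = sketch_reward(0.1, depth=5, n_two_qubit=3, terminated=False)
# The same circuit when the agent chooses to terminate.
r_final = sketch_reward(0.1, depth=5, n_two_qubit=3, terminated=True)
```

With penalty weights this small, the KL term dominates unless the circuit becomes very deep, so under this assumed form the penalties act mainly as a tie-breaker between circuits of similar expressibility.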

Rendering

The render() method visualizes expressibility optimization via a two-panel animation showing the circuit’s fidelity distribution compared to the Haar-random distribution alongside a block-level diagram of the evolving circuit architecture.

Input Parameters

n_qubits (int): Number of qubits in the circuit.
max_blocks (int): Maximum number of blocks allowed in the circuit.
max_steps (int): Maximum number of construction steps per episode.
n_pairs_eval (int): Number of random state pairs used to estimate expressibility.
bins (int): Number of histogram bins for fidelity distributions.
lambda_depth (float): Penalty weight for circuit depth.
lambda_2q (float): Penalty weight for two-qubit gate usage.
terminate_bonus (float): Bonus reward for explicit termination.
device_name (str): PennyLane device backend used for simulation.
seed (int or None): Random seed for reproducibility.
allow_all_to_all (bool): Whether to allow all-to-all entangling blocks.
ffmpeg (bool): Whether to use FFmpeg when saving animations.
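
How max_blocks, max_steps, and terminate_bonus shape an episode can be sketched with a generic gymnasium-style loop. `StubCircuitEnv` is a hypothetical stand-in so the example runs without the library; its action ids, observation, and reward are invented for illustration and do not reflect the real environment's encoding. `run_episode` assumes only the standard gymnasium `reset`/`step` signatures.

```python
class StubCircuitEnv:
    """Hypothetical stand-in exposing a gymnasium-style reset/step interface."""

    # Invented action ids, for illustration only.
    ADD, REMOVE, TERMINATE = 0, 1, 2

    def __init__(self, max_blocks=12, max_steps=20, terminate_bonus=0.1):
        self.max_blocks = max_blocks
        self.max_steps = max_steps
        self.terminate_bonus = terminate_bonus
        self.n_blocks = 0
        self.steps = 0

    def reset(self, seed=None):
        self.n_blocks = 0
        self.steps = 0
        return [0.0], {}

    def step(self, action):
        self.steps += 1
        if action == self.ADD:
            self.n_blocks = min(self.n_blocks + 1, self.max_blocks)
        elif action == self.REMOVE:
            self.n_blocks = max(self.n_blocks - 1, 0)
        terminated = action == self.TERMINATE
        # Truncation applies only when the agent has not terminated itself.
        truncated = (not terminated) and self.steps >= self.max_steps
        reward = self.terminate_bonus if terminated else 0.0
        return [float(self.n_blocks)], reward, terminated, truncated, {}


def run_episode(env, policy, seed=None):
    """Run one episode; return (total_reward, steps, terminated, truncated)."""
    obs, info = env.reset(seed=seed)
    total, steps = 0.0, 0
    while True:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total += reward
        steps += 1
        if terminated or truncated:
            return total, steps, terminated, truncated


env = StubCircuitEnv(max_blocks=12, max_steps=20, terminate_bonus=0.1)
# A policy that only ever adds blocks runs until truncation at max_steps.
always_add = run_episode(env, lambda obs: StubCircuitEnv.ADD)
# A policy that terminates immediately collects only the termination bonus.
stop_now = run_episode(env, lambda obs: StubCircuitEnv.TERMINATE)
```

Assuming ExpressibilityV0 follows the same signatures, as its gymnasium.Env compatibility suggests, the same loop should accept an instance such as ExpressibilityV0(n_qubits=4, max_steps=20, seed=0) imported from qrl.env.core.expressibility, with a policy that samples from env.action_space.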