bppy.gym package
Submodules
bppy.gym.bp_action_space module
- class bppy.gym.bp_action_space.BPActionSpace(event_list)
Bases: Discrete
Defines a custom action space for BPrograms, which inherits from the Discrete space class. The actions correspond to selectable events in the BProgram, and this class helps to manage and validate those actions within the environment.
Attributes
- event_list : list
A list of possible events in the BProgram.
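The relationship between a discrete action index and an event can be sketched as follows. This is a standard-library stand-in, not bppy's implementation: the class name EventActionSpace and its methods contains and to_event are hypothetical, and events are plain strings here for brevity.

```python
class EventActionSpace:
    """Hypothetical stand-in for a discrete action space over events."""

    def __init__(self, event_list):
        self.event_list = list(event_list)
        self.n = len(self.event_list)  # number of discrete actions

    def contains(self, action):
        # A valid action is an integer index into event_list.
        return isinstance(action, int) and 0 <= action < self.n

    def to_event(self, action):
        # Translate an action index back into the event it stands for.
        if not self.contains(action):
            raise ValueError(f"invalid action index: {action!r}")
        return self.event_list[action]


space = EventActionSpace(["HOT", "COLD"])
selected = space.to_event(1)
```

Each index in range(n) identifies exactly one event, which is what lets a reinforcement learning agent choose events by integer action.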
bppy.gym.bp_env module
- class bppy.gym.bp_env.BPEnv(bprogram_generator, action_list, observation_space=None, action_space=None, reward_function=None)
Bases: Env
Custom environment that adheres to OpenAI’s Gym interface for simulating the behavior of a BProgram. This class is useful for integrating BPrograms into reinforcement learning algorithms.
Attributes
- action_space : BPActionSpace
Defines the space of possible actions, i.e., events.
- observation_space : BPObservationSpace
Defines the space of possible observations.
- reward_function : function, optional
A custom function to compute the reward.
- reset(seed=None, options=None)
Resets the environment to its initial state.
Parameters
- seed : int, optional
A seed for the random number generator.
- options : dict, optional
Additional options for resetting the environment.
Returns
- observation : object
The initial state of the environment.
- info : dict
Not used for this environment.
- step(action)
Executes the given action and returns the resulting observation, reward, done flag, truncated flag, and additional information.
Parameters
- action : int
An index representing the event to be executed.
Returns
- observation : object
The state of the environment after executing the action.
- reward : float
The reward obtained by executing the action.
- done : bool
Whether the episode has ended.
- truncated : bool
Not used for this environment.
- info : dict
Additional information for debugging.
- action_space
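The reset and step return values documented above imply the usual Gym-style interaction loop. The sketch below uses a hypothetical stub, StubEventEnv, in place of BPEnv so that it runs with the standard library alone. Only the shapes of the return tuples follow the documentation; the stub's observations, rewards, and termination rule are invented for illustration.

```python
import random


class StubEventEnv:
    """Hypothetical stand-in that mimics the reset/step signatures above."""

    def __init__(self, n_actions, horizon):
        self.n_actions = n_actions
        self.horizon = horizon
        self.steps = 0

    def reset(self, seed=None, options=None):
        self.steps = 0
        return 0, {}  # (observation, info)

    def step(self, action):
        self.steps += 1
        observation = self.steps                # placeholder observation
        reward = 1.0 if action == 0 else 0.0    # placeholder reward rule
        done = self.steps >= self.horizon       # placeholder termination
        return observation, reward, done, False, {}


def run_episode(env, rng):
    """Drive one episode with uniformly random actions."""
    observation, info = env.reset()
    total_reward, done, truncated = 0.0, False, False
    while not (done or truncated):
        action = rng.randrange(env.n_actions)
        observation, reward, done, truncated, info = env.step(action)
        total_reward += reward
    return total_reward, env.steps


total, steps = run_episode(StubEventEnv(n_actions=2, horizon=5), random.Random(0))
```

With a real BPEnv, the random policy would presumably be replaced by an agent that maps observations to event indices.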
bppy.gym.bp_observation_space module
- class bppy.gym.bp_observation_space.BPObservationSpace(nvec: NDArray[np.integer[Any]] | list[int], dtype: str | type[np.integer[Any]] = <class 'numpy.int64'>, seed: int | np.random.Generator | None = None)
Bases: MultiDiscrete
A base class used to represent a BProgram-based observation space. This is an abstract class that requires subclasses to implement the bp_state_to_gym_space method.
- bp_state_to_gym_space(bthreads_states)
Abstract method that transforms the BProgram’s state, received as a list of the bthreads’ current statements, to a Gym space representation.
- from_jsonable(sample_n)
Convert a JSONable data type to a batch of samples from this space.
- property np_random
Lazily seed the PRNG since this is expensive and only needed if sampling from this space.
Because seed() is not guaranteed to set _np_random for particular seeds, a check after seed() sets a new random number generator.
- sample()
Generates a single random sample from this space.
Parameters
- mask : tuple of np.ndarray, optional
An optional mask for multi-discrete spaces. Expects a tuple with an np.ndarray mask in the position of each action, with shape (n,), where n is the number of actions, and dtype=np.int8. Only entries whose mask value is 1 can be sampled; if all mask values for an action are 0, the default action 0 is sampled.
Returns
An np.ndarray of shape space.shape.
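The masking rule described above can be illustrated with a small standalone function. This sketch is an assumption-laden illustration, not the library's code: sample_masked is a hypothetical name, and it uses plain lists and the standard random module instead of np.ndarray masks and the space's own generator.

```python
import random


def sample_masked(nvec, mask, rng):
    """Sample one value per component, honoring a per-component 0/1 mask."""
    sample = []
    for n, component_mask in zip(nvec, mask):
        # Only indices whose mask value is 1 may be sampled.
        allowed = [i for i in range(n) if component_mask[i] == 1]
        # If every mask value is 0, fall back to the default action 0.
        sample.append(rng.choice(allowed) if allowed else 0)
    return sample


rng = random.Random(0)
# First component: only index 1 is allowed. Second component: nothing is allowed.
result = sample_masked([3, 2], ([0, 1, 0], [0, 0]), rng)
```

Here the first component can only yield 1, and the second falls back to 0 because its mask is all zeros.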
- property shape
Has a stricter type than gym.Space: never None.
bppy.gym.simple_bp_observation_space module
- class bppy.gym.simple_bp_observation_space.SimpleBPObservationSpace(bprogram_generator, event_list, dtype=<class 'numpy.int64'>, seed=None)
Bases: BPObservationSpace
Defines a simple observation space for BPrograms that inherits from BPObservationSpace. This class is responsible for mapping the state of the BProgram’s bthreads to a Gym-compatible observation space.
Attributes
- bprogram_generator : callable
A function that generates an instance of the BProgram.
- event_list : list
A list of possible events in the BProgram.
- bp_state_to_gym_space(bthreads_states)
Converts the state of the BProgram’s bthreads to the Gym-compatible observation space.
Parameters
- bthreads_states : list
A list of the bthreads’ current statements.
Returns
- gym_space : np.ndarray
An array representing the BProgram’s state as a Gym-compatible observation.
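One way such a mapping could work is to assign each bthread's known statements consecutive integers and emit one integer per bthread, which fits a MultiDiscrete-style space. The sketch below rests on that assumption and does not claim to reproduce how SimpleBPObservationSpace encodes states. The helper names build_statement_index and encode_state are hypothetical, statements are plain strings, and the result is a list rather than an np.ndarray.

```python
def build_statement_index(known_statements_per_bthread):
    """Map each bthread's known statements to consecutive integers."""
    return [
        {statement: i for i, statement in enumerate(statements)}
        for statements in known_statements_per_bthread
    ]


def encode_state(bthreads_states, statement_index):
    """Return one integer per bthread for its current statement."""
    return [
        index[statement]
        for statement, index in zip(bthreads_states, statement_index)
    ]


# Statements are plain strings here; real statements would need a hashable key.
known = [["request HOT", "done"], ["request COLD", "done"]]
index = build_statement_index(known)
nvec = [len(table) for table in index]  # size of each discrete component
observation = encode_state(["done", "request COLD"], index)
```

Under this scheme, nvec plays the role of the MultiDiscrete dimensions: each component's size is the number of statements known for the corresponding bthread.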