bppy.gym package

Submodules

bppy.gym.bp_action_space module

class bppy.gym.bp_action_space.BPActionSpace(event_list)[source]

Bases: Discrete

Defines a custom action space for BPrograms, which inherits from the Discrete space class. The actions correspond to selectable events in the BProgram, and this class helps to manage and validate those actions within the environment.

Attributes

event_list (list): A list of possible events in the BProgram.

contains(x)[source]

Checks whether the provided value corresponds to a valid action.

Parameters

x (int, np.generic, or np.ndarray): The value to check for validity.

Returns

valid (bool): True if the value is a valid action, False otherwise.

sample()[source]

Randomly samples an action from the set of possible actions.

Returns

action (int): The index of the randomly selected action.
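
The relationship between action indices and events can be illustrated with a small stand-in. This is a minimal sketch in plain Python, not the bppy implementation: ToyActionSpace, its seed argument, and the event names are hypothetical, and the real BPActionSpace may apply additional checks (for example, whether an event is currently selectable).

```python
import random


class ToyActionSpace:
    """Hypothetical stand-in for a Discrete-style space over a fixed event list."""

    def __init__(self, event_list, seed=None):
        self.event_list = list(event_list)
        self.n = len(self.event_list)      # number of discrete actions
        self._rng = random.Random(seed)    # private PRNG used by sample()

    def contains(self, x):
        # An action is valid when it is an integer index into event_list.
        # bool is excluded explicitly because it is a subclass of int.
        return isinstance(x, int) and not isinstance(x, bool) and 0 <= x < self.n

    def sample(self):
        # Uniformly pick an index, not the event itself.
        return self._rng.randrange(self.n)


space = ToyActionSpace(["HOT", "COLD", "IDLE"], seed=0)
action = space.sample()               # an integer index
event = space.event_list[action]      # the event that index refers to
```

The point of the sketch is that sample() and contains() operate on integer indices, and the event list is what turns an index back into an event.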

bppy.gym.bp_env module

class bppy.gym.bp_env.BPEnv(bprogram_generator, action_list, observation_space=None, action_space=None, reward_function=None)[source]

Bases: Env

Custom environment that adheres to OpenAI’s Gym interface for simulating the behavior of a BProgram. This class is useful for integrating BPrograms into reinforcement learning algorithms.

Attributes

action_space (BPActionSpace): Defines the space of possible actions, i.e., events.
observation_space (BPObservationSpace): Defines the space of possible observations.
reward_function (function, optional): A custom function to compute the reward.

close()[source]: Closes the environment, releasing any resources.

get_state()[source]: Returns the current state of the environment.

render(mode='human')[source]: Not implemented for this environment.

reset(seed=None, options=None)[source]

Resets the environment to its initial state.

Parameters

seed (int, optional): A seed for the random number generator.
options (dict, optional): Additional options for resetting the environment.

Returns

observation (object): The initial state of the environment.
info (dict): Not used for this environment.

step(action)[source]

Executes the given action and returns the resulting observation, reward, done flag, truncated flag, and additional information.

Parameters

action (int): An index representing the event to be executed.

Returns

observation (object): The state of the environment after executing the action.
reward (float): The reward obtained by executing the action.
done (bool): Whether the episode has ended.
truncated (bool): Not used for this environment.
info (dict): Additional information for debugging.
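
The reset/step contract described above can be exercised with a short episode loop. The following is a sketch under stated assumptions: ToyEnv is a hypothetical stand-in that only mimics the documented return shapes (a two-element tuple from reset, a five-element tuple from step), and its reward rule and horizon are invented for illustration. It does not use bppy or build a BProgram.

```python
class ToyEnv:
    """Hypothetical environment with the same reset/step return shapes."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0
        return [self.t], {}                        # (observation, info)

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 0 else 0.0       # invented reward rule
        done = self.t >= self.horizon
        return [self.t], reward, done, False, {}   # truncated is always False


def run_episode(env, policy):
    """Run one episode and return the accumulated reward."""
    observation, info = env.reset(seed=0)
    total = 0.0
    done = truncated = False
    while not (done or truncated):
        action = policy(observation)
        observation, reward, done, truncated, info = env.step(action)
        total += reward
    return total


total_reward = run_episode(ToyEnv(horizon=3), policy=lambda obs: 0)
```

The same run_episode loop should apply to any environment that follows this interface, since it relies only on the tuple shapes listed under Returns.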

bppy.gym.bp_observation_space module

class bppy.gym.bp_observation_space.BPObservationSpace(nvec: NDArray[np.integer[Any]] | list[int], dtype: str | type[np.integer[Any]] = <class 'numpy.int64'>, seed: int | np.random.Generator | None = None)[source]

Bases: MultiDiscrete

A base class used to represent a BProgram-based observation space. This is an abstract class that requires subclasses to implement the bp_state_to_gym_space method.

bp_state_to_gym_space(bthreads_states)[source]: Abstract method that transforms the BProgram’s state, received as a list of the bthreads’ current statements, into a Gym space representation.

contains(x)[source]: Return boolean specifying if x is a valid member of this space.

from_jsonable(sample_n)[source]: Convert a JSONable data type to a batch of samples from this space.

property np_random

Lazily seed the PRNG since this is expensive and only needed if sampling from this space.

Because seed() is not guaranteed to set _np_random for particular seeds, a check is added after seed() to set a new random number generator.

sample()[source]

Generates a single random sample from this space.

Args:

mask: An optional mask for multi-discrete sampling. It expects a tuple with one np.ndarray mask in the position of each action, each of shape (n,), where n is the number of actions, and dtype=np.int8. Only entries with mask value 1 can be sampled; if all mask values for an action are 0, the default action 0 is sampled.

Returns:

An np.ndarray of shape space.shape
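
The masking rule in the Args description can be sketched in plain Python. This is an illustration of the rule only: sample_masked is a hypothetical helper, it returns a list rather than an np.ndarray, and the documented signature above shows sample() without a mask parameter, so whether bppy's method accepts one is not confirmed here.

```python
import random


def sample_masked(nvec, mask=None, rng=None):
    """Sample one value per component of a multi-discrete space.

    nvec: number of choices for each component.
    mask: optional sequence with one 0/1 sequence per component.
    """
    rng = rng or random.Random()
    sample = []
    for i, n in enumerate(nvec):
        if mask is None:
            sample.append(rng.randrange(n))      # unmasked: uniform over 0..n-1
            continue
        allowed = [a for a in range(n) if mask[i][a] == 1]
        # If every entry is masked out, fall back to the default action 0.
        sample.append(rng.choice(allowed) if allowed else 0)
    return sample


# First component: only index 2 is allowed. Second: fully masked, so 0.
result = sample_masked([3, 2], mask=([0, 0, 1], [0, 0]), rng=random.Random(0))
```

With this mask the result is deterministic regardless of the seed: the first component has a single allowed value and the second falls back to 0.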

seed(seed=None)[source]: Seed the PRNG of this space and possibly the PRNGs of subspaces.

property shape: Has stricter type than gym.Space - never None.

to_jsonable(sample_n)[source]: Convert a batch of samples from this space to a JSONable data type.

bppy.gym.simple_bp_observation_space module

class bppy.gym.simple_bp_observation_space.SimpleBPObservationSpace(bprogram_generator, event_list, dtype=<class 'numpy.int64'>, seed=None)[source]

Bases: BPObservationSpace

Defines a simple observation space for BPrograms that inherits from BPObservationSpace. This class is responsible for mapping the state of the BProgram’s bthreads to a Gym-compatible observation space.

Attributes

bprogram_generator (callable): A function that generates an instance of the BProgram.
event_list (list): A list of possible events in the BProgram.

bp_state_to_gym_space(bthreads_states)[source]

Converts the state of the BProgram’s bthreads to the Gym-compatible observation space.

Parameters

bthreads_states (list): A list representing the bthreads’ current statements.

Returns

gym_space (np.ndarray): An array representing the Gym-compatible observation space.
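
One way to picture such a conversion is to give every bthread a table of its possible states and encode the current state as its index in that table. The sketch below assumes exactly that and is not taken from the bppy source: encode_states, state_tables, and the string state labels are hypothetical, and a plain list stands in for the np.ndarray that the real method returns.

```python
def encode_states(bthreads_states, state_tables):
    """Map each bthread's current state to its index in that bthread's table."""
    return [table.index(state)
            for state, table in zip(bthreads_states, state_tables)]


# Hypothetical state tables: one list of possible states per bthread.
state_tables = [
    ["request_hot", "wait_hot", "finished"],   # bthread 0 has 3 states
    ["request_cold", "finished"],              # bthread 1 has 2 states
]

# The per-bthread table sizes play the role of nvec in a MultiDiscrete space.
nvec = [len(table) for table in state_tables]

observation = encode_states(["wait_hot", "request_cold"], state_tables)
```

Under this assumption each entry of the observation is bounded by the matching entry of nvec, which is the property a MultiDiscrete-based space requires of its members.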

compute_bthread_state_space()[source]
