Is it possible in OpenAI Gym to keep part of the state hidden and expose only some variables to the player?

Question

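I want to create a game in which the player's actions and their rewards/consequences have a lead time. Because of that, I do not want to share the full observation with the player, but I still need to hold on to it because it is needed later. Is there a way to do this? If I create a variable in init and update it, that variable is visible to every instance of the game, so the player already knows more than I want it to.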

Answer

A rough example of what you need, built on CartPole, looks like this:

import gym
from gym.utils import seeding
import numpy as np


class myEnv(gym.Env):
    def __init__(self,*args,**kwargs):
        """
            Define all the necessary stuff here
        """
        self.env = gym.make('CartPole-v1') # add stuff here to define game params
        self.action_space = self.env.action_space
        self.observation_space = self.env.observation_space
        self.past_actions = []
        self.delay = 2  # to have a delay of two timesteps

    def reset(self):
        """
            Define the reset
        """
        self.past_actions = []              # drop actions still queued from the previous episode
        self.observation = self.env.reset()
        return self.observation

    def step(self,action):
        """
            Add the delay of actions here
        """
        self.past_actions.append(action)    # to keep track of actions
        # placeholder reward, done and info for the first `delay` timesteps
        reward = 0.0; done = False; info = {}
        if len(self.past_actions) > self.delay:
            present_action = self.past_actions.pop(0)
            # change observation,reward,done,info
            # according to the action 'delay' timesteps ago
            # (assumes the classic Gym API, where step() returns four values)
            self.observation,reward,done,info = self.env.step(present_action)
        return self.observation,reward,done,info

    def seed(self,seed=0):
        """
            Define seed method here
        """
        self.np_random,seed = seeding.np_random(seed)
        return self.env.seed(seed=seed)

    def render(self,mode="human",**kwargs):
        """
            Define rendering method here
        """
        # forward the arguments this method actually receives
        return self.env.render(mode=mode,**kwargs)
    
    def close(self):
        """
            Define close method here
        """
        return self.env.close()
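
The same idea can be shown without CartPole. The following is a minimal sketch, not part of the original answer and not a Gym class: DelayedEffectEnv, the "stock" variable, and the reward rule are all hypothetical names chosen for illustration. It assumes the hidden state can simply live in attributes of the environment instance, and that the player only ever receives what reset() and step() return. Note that the leading underscore is only a Python naming convention, so the hiding relies on the agent code never reaching into the environment object.

```python
from collections import deque


class DelayedEffectEnv:
    """Hypothetical toy environment: an action takes effect `delay` steps later.

    The queue of pending actions is hidden state. It is stored per instance
    and never returned to the player.
    """

    def __init__(self, delay=2):
        self.delay = delay
        self._stock = 0            # internal value, exposed only via _observe()
        self._pending = deque()    # hidden: actions that have not been applied yet

    def _observe(self):
        # Only this dictionary reaches the player.
        return {"stock": self._stock}

    def reset(self):
        self._stock = 0
        self._pending.clear()
        return self._observe()

    def step(self, action):
        self._pending.append(action)
        reward = 0.0
        if len(self._pending) > self.delay:
            # Apply the action that was chosen `delay` steps ago.
            applied = self._pending.popleft()
            self._stock += applied
            reward = float(applied)
        done = False
        info = {}                  # deliberately empty so nothing hidden leaks
        return self._observe(), reward, done, info


env = DelayedEffectEnv(delay=2)
first_obs = env.reset()
trajectory = [env.step(a) for a in (5, 3, 1, 0)]
```

In this sketch the first two steps return a reward of zero and an unchanged observation, because the corresponding actions are still waiting in the hidden queue. Each instance of the class has its own queue, so a second environment created with DelayedEffectEnv() does not share it.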