Sure! The code is dependent on a bunch of procedures in other files, but I can show you the part that is producing the error:
def solve(h, num_samp, bmdp, belief, state):
    action_dist = dist.Categorical(ps=Variable(torch.FloatTensor([1/len(bmdp.pomdp.actions) for _ in bmdp.pomdp.actions])),vs=list(range(len(bmdp.pomdp.actions))))
    qs = [bmdp.reward_func(belief, a) for a in bmdp.pomdp.actions]
    for i in range(len(qs)):
        marg = pyro.infer.Marginal(pyro.infer.Importance(sample_reward, num_samples=num_samp))
        exp_val = 1/num_samp*sum([marg(h-1,bmdp,belief,action_dist,state) for _ in range(num_samp)])
        qs[i] += exp_val
    return qs[np.argmax(np.array([q[1] for q in qs]))][0]
The call to marg above is what triggers the error. For reference, here is the sample_reward function:
def sample_reward(h, bmdp, belief, action_dist, state):
    if h == 0:
        return 0
    act_ind = pyro.sample('act', action_dist)[0]
    # print(act_ind)
    act = bmdp.pomdp.actions[act_ind]
    new_bel, state = bmdp.bel_sampler(belief, act, state)
    return bmdp.reward_func(belief, act) + (bmdp.pomdp.disc*sample_reward(h-1,bmdp,new_bel,action_dist,state) if state != 'complete' else 0)
Thanks for your help!
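For anyone reading along without the rest of the code base, the recursion in sample_reward can be sketched with no Pyro or torch dependency. This is only an illustration of the discounted-rollout structure, not the actual model: rollout_return, mc_estimate, reward_fn, step_fn and the toy environment are all hypothetical names, and a plain random.Random stands in for the pyro.sample call.

```python
import random

def rollout_return(h, disc, actions, reward_fn, step_fn, state, rng):
    """Discounted return of one random rollout of at most h steps."""
    if h == 0:
        return 0.0
    act = rng.choice(actions)             # uniform action choice (stand-in for action_dist)
    new_state = step_fn(state, act, rng)  # hypothetical transition sampler
    reward = reward_fn(state, act)        # reward uses the pre-transition state
    if new_state == 'complete':
        return reward                     # terminal: no future reward
    return reward + disc * rollout_return(
        h - 1, disc, actions, reward_fn, step_fn, new_state, rng)

def mc_estimate(num_samp, *args):
    """Average num_samp independent rollouts (plain Monte Carlo)."""
    return sum(rollout_return(*args) for _ in range(num_samp)) / num_samp

# Toy environment (assumption): reward 1 per step, never terminates.
rng = random.Random(0)
actions = ['left', 'right']
reward_fn = lambda s, a: 1.0
step_fn = lambda s, a, rng: s + 1

value = rollout_return(3, 0.5, actions, reward_fn, step_fn, 0, rng)
print(value)  # 1 + 0.5 + 0.25 = 1.75
```

With a constant reward the result does not depend on the sampled actions, which makes the recursion easy to check by hand before putting the inference machinery back in.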
The forum formatting may mangle the indentation in solve; I have the proper indents in my actual code.
After some investigation, I noticed that the marginal seems to want its inputs as torch Variables wrapping FloatTensors. In fact, I can’t seem to get these programs to run otherwise. Does everything that I pass to the marginal have to be a Variable wrapping a Tensor? I could make that change, but it would require substantially restructuring earlier code, so I want to be sure it is necessary before diving in. Also, how should I wrap things that aren’t floats, ints, or arrays, i.e. objects?
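One generic Python option for the last question, offered as an untested assumption rather than anything confirmed against Marginal: instead of wrapping objects, bind them into the function beforehand with functools.partial, so that only numeric arguments are passed at call time. ToyEnv and model below are hypothetical stand-ins for an object like bmdp and a function like sample_reward.

```python
from functools import partial

class ToyEnv:
    """Hypothetical stand-in for a plain Python object such as bmdp."""
    def __init__(self, disc):
        self.disc = disc

def model(h, x, env):
    """Toy function: discount x by env.disc once per step, h times."""
    return x * env.disc ** h

# Bind the object once; callers then pass only numeric arguments.
bound_model = partial(model, env=ToyEnv(disc=0.5))
result = bound_model(2, 8.0)  # 8.0 * 0.5**2 = 2.0
```

Whether this sidesteps the Variable requirement depends on how the inference wrapper treats its arguments, which is not verified here.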