Hierarchical ordinal model - for loop vs. plate

I have a set of users who rate a number of items. I have the features of the items as well as their rating, and I’d like to find the preference of each user for the identified features. Also, I would like to find an aggregated preferences. For this, I did as follows:

  1. I defined a vector w_star as the aggregated preferences (follows a Dirichlet dist.)
  2. I defined a vector w_i for each user, and assume that it is around w_star
  3. I used the cutting plane idea in the ordinal regression to compute the cutting planes as well as the preferences w.r.t to features.
  4. I ordered the input features X and the associated rating y according to the users, and the indices are stored in another variable intrvl. So, for example, the features and the rating of the first user is stored in indices intrvl[0]:intrvl[1].

Here is the model I developed in Numpyro by using for loop:

def PyroOrdinalHierarchical(X, y, nYlevels, intrvl, feature_no, user_no):

 w_star = numpyro.sample('w_star', dist.Dirichlet(.01 * np.ones(feature_no)))
gamma_star = numpyro.sample("gamma_star", dist.Gamma(0.01, 0.01))

for i in range(user_no):
    c_y = numpyro.sample("c_y" + str(i), dist.TransformedDistribution(dist.Normal(0, 1).expand([nYlevels - 1]),
                                                    dist.transforms.OrderedTransform()))
    numpyro.sample('c_y_smp'+str(i), dist.Normal(0,1), obs=c_y)

    w = numpyro.sample("w" + str(i), dist.Dirichlet(gamma_star*w_star))
    
   
    idx = range(intrvl[i],intrvl[i+1])

    eta =   X[idx] @ w
    numpyro.sample('Y'+ str(i), dist.OrderedLogistic(eta, c_y), obs=y[idx])

The model is very slow, I guess mainly because I used for loop and not numpyro.plate. But I do not know if I can develop this model by using plate since I need to have access to the index of the user.

Any help is highly appreciated!

1 Like

Hi @majidmohammadi, I think you can use mask to make sure that your arrays have consistent shapes. For example,

mask = idx_to_mask(intrvl)  # shape user_no x max(intrvl[1:] - intrvl[:-1]) =: user_no x m
X_reshape = pad_and_reshape(X)  # shape user_no x m x feature_no
y_reshape = pad_and_reshape(X)  # shape user_no x m 
with plate("user_no", user_no, dim=-2):
    c_y = ...
    w = ...
    with plate("m", m, dim=-1):
        eta = X_reshape @ w
        sample("Y", OrderedLogistic(eta, c_y).mask(mask), obs=y_reshape)

The idea is to get the max number of indices used for every users, then reshape X and y to that format (for each user, padding by 0 if the actual number of indices are less than that max number).

2 Likes

Thanks a lot for the quick reply @fehiepsi! I am not sure I fully understood the code you provided!

First of all, I did not get what ‘m’ is! Is it the overall number of data in X? if so, as far as I could understand the code your provided multiply all the data to w and then mask out the data points that belong to other uses. If it is the case, is it efficient to conduct the heavy matrix to vector multiplication for all data?

Further, if the above understanding is correct, why do we need to have another plate over ‘m’? can we just say that ‘eta’ as a vector should be used in the sampling of orderedLogistic?

Would also appreciate it if you could kindly explain further how to implement idx_to-mask.

m is the maximum number of indices for a user. Please see the comment in the first line of code for how to compute it. This is just a bit of preprocessing your data. You can just use the for loop for simplicity, kind of:
m = max(intrvl[1:]-intrvl[:-1])
mask = zeros(num_users, m)
for each i in num_users:
mask[i, :intrvl[i+1]-intrvl[i]] = True

There might be bugs in my interpretation but I hope you can get the idea.

@fehiepsi Thanks a lot! I got the idea on how to implement it

There is still a minor problem I could not find a workaround for. Specifically, the following line runs into an error:

because X_reshape is of size (user_no,m,feature_no), but w is of size (user_no, 1, feature_no). And I run into the following error:

TypeError: dot_general requires contracting dimensions to have the same shape, got [feature_no] and [1].

I need to change the size of w to (user_no,feature_no,1) and that should resolve the problem. I did the following, but does not work:

w = numpyro.sample("w", dist.Dirichlet(gamma_star*w_star), sample_shape=(feature_no,1))

Is there anyway to specific the size of w?

I think you can do: eta = (X_reshape * w).sum(-1) instead of using @.

1 Like

@fehiepsi Thanks a lot! it works now :slight_smile: