I’ve been looking into horseshoe regression for model selection.
In my case, I have several equations, and for each of them I would like to independently select (i.e., shrink towards) the best model.
For instance, one of the equations could be (in the linear case):
Blood_pressure = a1 * BMI + a2 * Physical_inactivity + a3 * Smoking
In this case I’m quite confident about the signs that a1–a3 should have, so I might use an informative prior that places most of its mass on positive values, such as an inverse-gamma prior.
However, I also want to include possible second-order interaction terms, so the fullest model would be:
Blood_pressure = a1 * BMI + a2 * Physical_inactivity + a3 * Smoking + a4 * BMI * Physical_inactivity + a5 * BMI * Smoking + a6 * Physical_inactivity * Smoking.
For these interaction terms (a4-a6), I would like to use horseshoe priors that are centered at zero.
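To make the shrinkage behaviour concrete, here is a minimal numpy sketch of draws from a horseshoe prior, beta_j = tau * lambda_j * z_j with lambda_j ~ HalfCauchy(0, 1) and z_j ~ N(0, 1). The global scale tau = 0.1 is an illustrative choice, not a recommendation; the point is the characteristic spike-at-zero plus heavy tails that lets truly large interaction effects escape shrinkage.

```python
import numpy as np

rng = np.random.default_rng(0)

def horseshoe_draws(n, tau=0.1, rng=rng):
    """Draw n samples from a horseshoe prior:
    beta_j = tau * lambda_j * z_j,  lambda_j ~ HalfCauchy(0, 1),  z_j ~ N(0, 1).
    tau is the global shrinkage scale (illustrative value here)."""
    lam = np.abs(rng.standard_cauchy(n))   # local shrinkage scales (half-Cauchy)
    z = rng.standard_normal(n)
    return tau * lam * z

beta = horseshoe_draws(100_000)
# A large share of the prior mass sits very close to zero (strong shrinkage),
# yet the half-Cauchy tails occasionally produce very large coefficients.
print(np.mean(np.abs(beta) < 0.05))   # sizeable fraction essentially at zero
print(np.max(np.abs(beta)))           # heavy tails: some draws are huge
```

This is exactly the property that makes the horseshoe attractive for your a4–a6: irrelevant interactions are pulled hard towards zero while genuine ones are left nearly unshrunk.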
First: Is this a good way of conducting model selection and selecting the interaction terms?
Second: Should I simply use horseshoe priors for all parameters? It would be nice if I could somehow encode my confidence in positive values.
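For context on the sign question, one option (stronger than "mostly positive", since it forbids negative values outright) is a normal prior truncated to the positive half-line. The sketch below samples such a prior by simple rejection; the Normal(1, 0.5) hyperparameters are placeholders I chose for illustration, not values from the question.

```python
import numpy as np

rng = np.random.default_rng(7)

# Illustrative sign-informative prior for a1 (the BMI effect):
# Normal(1, 0.5) truncated to positive values, sampled by rejection.
# Hyperparameters are hypothetical, purely for demonstration.
raw = rng.normal(loc=1.0, scale=0.5, size=50_000)
draws = raw[raw > 0.0]        # keep only positive draws -> truncated normal

print(draws.min())            # strictly positive: the sign constraint is hard
print(draws.mean())           # near 1, pulled slightly upward by truncation
```

If you want to express confidence in the sign without ruling out negative values entirely, a non-truncated normal with a positive mean (so only a small prior probability below zero) is the softer alternative.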
Third: I would like to combine these equations later on into a system of equations. In that case, it would make sense to drop any terms whose parameters have been heavily shrunk. Is there any criterion I could use to exclude a term from subsequent analyses (e.g., the 91% credible interval lies within [-0.01, 0.01], or something similar)? Or is this dropping of terms against the spirit of Bayesian model selection?