Good questions! If you're new to Bayesian methods in general, I recommend reading the SVI tutorials first to get a basic understanding.
What is the point of defining 2 sets of parameters (in model and guide)?
by parameters, I assume you mean the parameters of the priors (correct me if I'm wrong). the Gaussian distributions in the model are our priors p(z). the analogous distributions in the guide are our approximating distributions q(z), and their parameters are the ones that get learned. we sample from q(z) when running VI to calculate the ELBO (see below).
Why is the parameter in model drawn from a constant distribution
not sure what you mean by this - we initialize the parameters for the Gaussian priors, then use those priors and place them over the parameters of our neural net module.
Why do we need an argument for the guide function if it's never referenced during training and inference?
good catch, it actually isn't used in the guide in this case, but Pyro requires the model and the guide to have the same call signature, since SVI passes the same arguments to both.
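in general, in SVI, you are trying to minimize the KL divergence between your approximating distribution q(z) and the true [unknown] posterior p(z | x). that's the purpose of the guide. since the posterior is unknown you can't compute that KL directly, so you maximize the ELBO instead, which is equivalent. your model describes how the latents relate to the data you observe: the priors p(z) plus the likelihood p(x | z).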
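if it helps to see why maximizing the ELBO is the same as minimizing that KL, here's a rough numerical check in plain Python. it assumes a toy conjugate model (z ~ Normal(0, 1), each x ~ Normal(z, 1)) where the posterior happens to be known exactly, which is not the case for a real neural net. the function names and data are made up for illustration:

```python
import math


def elbo(m, s, xs):
    """Closed-form ELBO for q(z) = Normal(m, s) under the toy model."""
    log_2pi = math.log(2 * math.pi)
    # E_q[log p(z)] with prior z ~ Normal(0, 1)
    e_log_prior = -0.5 * log_2pi - 0.5 * (m**2 + s**2)
    # E_q[log p(x | z)] summed over observations, x ~ Normal(z, 1)
    e_log_lik = sum(-0.5 * log_2pi - 0.5 * ((x - m) ** 2 + s**2) for x in xs)
    # entropy of q, i.e. -E_q[log q(z)]
    entropy = 0.5 * math.log(2 * math.pi * math.e * s**2)
    return e_log_prior + e_log_lik + entropy


def kl_to_posterior(m, s, xs):
    """KL(q || p(z | x)), using the exact Gaussian posterior of the toy model."""
    post_var = 1.0 / (len(xs) + 1)
    post_mean = sum(xs) / (len(xs) + 1)
    return (0.5 * math.log(post_var / s**2)
            + (s**2 + (m - post_mean) ** 2) / (2 * post_var)
            - 0.5)


xs = [1.8, 2.1, 2.4, 1.9, 2.0]  # hypothetical observations
exact = (sum(xs) / (len(xs) + 1), math.sqrt(1.0 / (len(xs) + 1)))
candidates = [(0.0, 1.0), (1.0, 0.5), exact]  # (mean, std) choices for q

elbos = [elbo(m, s, xs) for m, s in candidates]
kls = [kl_to_posterior(m, s, xs) for m, s in candidates]
# ELBO + KL is the log evidence log p(x), which does not depend on q
totals = [e + k for e, k in zip(elbos, kls)]

for (m, s), e, k, t in zip(candidates, elbos, kls, totals):
    print(f"q=Normal({m:.2f}, {s:.2f})  ELBO={e:.4f}  KL={k:.4f}  sum={t:.4f}")
```

the sum column is the same for every q, so any increase in the ELBO is exactly matched by a decrease in the KL to the true posterior. the KL hits zero when q equals the posterior.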