---
title: "Gaussian VCMoE Tutorial"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Gaussian VCMoE Tutorial}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, setup, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4.5,
  message = FALSE,
  warning = FALSE
)
```

This tutorial gives a compact Gaussian workflow for using `VCMoE`: simulation, fitting, diagnostics, coefficient plots, analytic simultaneous confidence bands, and bootstrap inference.

## Installation from GitHub

Install the package from GitHub with `remotes`:

```{r install, eval=FALSE}
install.packages("remotes")
remotes::install_github("qc-zhao/VCMoE")
```

Then load the package:

```{r packages}
library(VCMoE)
library(ggplot2)
```

## Gaussian model

Simulate a small Gaussian data set with two latent components. The returned object contains the observed data and the true coefficient functions.

```{r gaussian-simulate}
set.seed(1)

sim <- simulate_vcmoe_gaussian(
  n = 180,
  k = 2,
  seed = 1,
  separation = 1.6,
  scenario = "well_separated"
)

head(sim$data)
```

The model formula has two parts:

- `y ~ z1` is the component-specific expert mean model;
- `| x1` is the gating model for component probabilities.

The varying coordinate is supplied through `u = "u"`.

```{r gaussian-fit}
fit <- vcmoe_fit(
  y ~ z1 | x1,
  data = sim$data,
  u = "u",
  family = "gaussian",
  k = 2,
  bandwidth = 0.35,
  u_grid = seq(0.15, 0.85, length.out = 4),
  control = list(maxit = 60, n_starts = 1, seed = 2, warn_ambiguous = FALSE)
)

fit
```

Expert coefficients are returned as an array indexed by grid point, component, and term.

```{r gaussian-coefficients}
expert_coef <- coef(fit, "expert")
dim(expert_coef)
expert_coef[, , "z1"]
```

Predictions can be requested as marginal means, posterior component probabilities, or component-specific means.

```{r gaussian-predictions}
head(predict(fit, type = "mean"))
head(predict(fit, type = "posterior"))
head(predict(fit, type = "component"))
```

## Diagnostics and basic plots

Always inspect diagnostics before interpreting coefficient functions.

```{r gaussian-diagnostics}
diagnostics <- vcmoe_diagnostics(fit)
diagnostics[, c("u", "converged", "ambiguous", "posterior_entropy", "effective_n")]
```

`plot_coefficients()` and `plot_posterior()` provide quick visual checks.

```{r gaussian-coefficient-plot}
plot_coefficients(fit, "expert")
```

```{r gaussian-posterior-plot}
plot_posterior(fit)
```

## Analytic simultaneous confidence bands

`vcmoe_confband()` computes diagnostic-gated analytic-style Epanechnikov path bands. The returned object contains an interval table and grid-level diagnostics. Rows with `status != "ok"` are blocked because the local fit is too weak for the interval to be interpreted.

```{r gaussian-scb}
band <- vcmoe_confband(
  fit,
  data = sim$data,
  level = 0.95,
  type = "simultaneous",
  coefficient_set = "expert",
  strict = FALSE
)

band

head(band$intervals[, c(
  "u", "component", "term", "block",
  "estimate", "lower", "upper",
  "status", "block_reason"
)])
```

For coefficient-function plots, use the local-linear intercept rows. The slope rows describe local derivative terms and are not the coefficient functions themselves.

```{r gaussian-scb-plot}
scb_plot <- subset(
  band$intervals,
  coefficient_set == "expert" & block == "intercept" & status == "ok"
)

ggplot(scb_plot, aes(x = u, y = estimate, color = component, fill = component)) +
  geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.18, color = NA) +
  geom_line(linewidth = 0.8) +
  facet_wrap(~ term, scales = "free_y") +
  labs(
    x = "u",
    y = "coefficient",
    color = "component",
    fill = "component"
  ) +
  theme_minimal(base_size = 12)
```

## Bootstrap inference

Parametric bootstrap inference is also available.
Each bootstrap replicate simulates a new response from the fitted mixture, refits the same VCMoE model, and aligns bootstrap component labels back to the reference fit.

```{r gaussian-bootstrap}
boot <- vcmoe_bootstrap(
  fit,
  data = sim$data,
  B = 6,
  seed = 5,
  min_successful = 2,
  control = list(maxit = 40, n_starts = 1, warn_ambiguous = FALSE)
)

boot

head(confint(boot, parm = "expert", type = "simultaneous"))
```

`plot_inference()` visualizes bootstrap intervals directly. Here we request simultaneous bootstrap bands for the coefficient paths.

```{r gaussian-bootstrap-plot}
plot_inference(
  boot,
  coefficient_set = "expert",
  type = "simultaneous",
  level = 0.95
)
```

## Optional bandwidth selection

For real analyses, bandwidth should usually be selected rather than fixed by hand. The selector uses held-out predictive log-likelihood and returns a final refit by default.

```{r bandwidth, eval=FALSE}
selection <- vcmoe_select_bandwidth(
  y ~ z1 | x1,
  data = sim$data,
  u = "u",
  family = "gaussian",
  k = 2,
  bandwidth_grid = c(0.25, 0.35, 0.45),
  folds = 3,
  u_grid = seq(0.15, 0.85, length.out = 4),
  control = list(maxit = 60, n_starts = 1, seed = 3),
  seed = 4
)

selection
selection$best_bandwidth
```
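
To make the idea of selecting a bandwidth by held-out predictive log-likelihood concrete, here is a minimal base-R sketch. It is not the `VCMoE` implementation: it assumes a single-component local-constant (Nadaraya-Watson) mean with an Epanechnikov kernel and a constant Gaussian error variance. The helper names `epanechnikov()`, `local_mean()`, and `heldout_loglik()`, and the toy data, are hypothetical and exist only for this illustration.

```r
# Illustrative sketch only: NOT the VCMoE selector.
# Scores each candidate bandwidth by K-fold held-out Gaussian log-likelihood
# for a simple local-constant (Nadaraya-Watson) mean estimate.

# Epanechnikov kernel: 0.75 * (1 - t^2) on [-1, 1], zero elsewhere.
epanechnikov <- function(t) ifelse(abs(t) <= 1, 0.75 * (1 - t^2), 0)

# Kernel-weighted mean of y at each point in u_new, using training pairs (u, y).
local_mean <- function(u_new, u, y, h) {
  vapply(u_new, function(u0) {
    w <- epanechnikov((u - u0) / h)
    if (sum(w) <= 0) return(mean(y))  # no neighbours in the window: global mean
    sum(w * y) / sum(w)
  }, numeric(1))
}

# Sum of held-out Gaussian log-densities over all folds for bandwidth h.
heldout_loglik <- function(u, y, h, fold_id) {
  total <- 0
  for (f in sort(unique(fold_id))) {
    train <- fold_id != f
    test <- !train
    # Residual scale estimated on the training folds only.
    mu_train <- local_mean(u[train], u[train], y[train], h)
    sigma <- sqrt(mean((y[train] - mu_train)^2))
    # Predict the held-out fold and score it.
    mu_test <- local_mean(u[test], u[train], y[train], h)
    total <- total + sum(dnorm(y[test], mean = mu_test, sd = sigma, log = TRUE))
  }
  total
}

# Toy data (hypothetical): a smooth mean in u plus Gaussian noise.
set.seed(4)
n <- 150
u <- runif(n)
y <- sin(2 * pi * u) + rnorm(n, sd = 0.3)
fold_id <- sample(rep(1:3, length.out = n))

bandwidth_grid <- c(0.25, 0.35, 0.45)
scores <- vapply(
  bandwidth_grid,
  function(h) heldout_loglik(u, y, h, fold_id),
  numeric(1)
)
names(scores) <- bandwidth_grid

scores
best_bandwidth <- bandwidth_grid[which.max(scores)]
best_bandwidth
```

The bandwidth with the largest held-out log-likelihood is kept. A mixture model would replace the single Gaussian density with the fitted mixture density, but the fold loop and the scoring rule would follow the same pattern.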