glmnetUI is a graphical user interface for the R glmnet
package, which fits regularized generalized linear models via the Lasso,
ridge, and elastic-net penalties. It offers three purpose modes—general
predictive modeling, real-estate appraisal, and market-area analysis—and
guides the user through data import, model configuration,
cross-validated fitting, coefficient paths, and downloadable reports.
This article documents glmnetUI’s data-format requirements, modeling
workflow, output displays, and complete feature reference.
The Lasso (Least Absolute Shrinkage and Selection Operator) was introduced by Robert Tibshirani in 1996. The name is a deliberate play on the cowboy’s lasso, a rope that constrains and selects. Tibshirani chose it because the method literally “lassos” the coefficients, constraining their absolute values and pulling some to exactly zero, thereby selecting which variables remain in the model.
Tibshirani’s insight was to add an \(L_1\) penalty (sum of absolute values of coefficients) to the least squares objective. Unlike the \(L_2\) penalty of ridge regression (Hoerl & Kennard, 1970), which shrinks coefficients toward zero but never reaches it, the \(L_1\) penalty produces solutions where many coefficients are exactly zero. This makes the Lasso both a regularization method and a variable selection method.
Ridge regression was proposed by Arthur Hoerl and Robert Kennard in 1970. It adds an \(L_2\) penalty (sum of squared coefficients), which shrinks all coefficients toward zero but keeps every variable in the model. Ridge is effective when predictors are correlated (multicollinearity) but does not simplify the model.
The elastic net, introduced by Hui Zou and Trevor Hastie in 2005, combines both penalties. The mixing parameter \(\alpha\) controls the blend: \(\alpha = 1\) is pure Lasso, \(\alpha = 0\) is pure ridge, and values in between give a compromise. The elastic net overcomes a limitation of the Lasso: when predictors are highly correlated, the Lasso tends to select one and ignore the rest, whereas the elastic net tends to keep or drop correlated predictors as a group.
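For reference, the two penalties and the mixing parameter described above can be written as a single objective. The form below is the Gaussian-family objective as stated in the glmnet documentation; the symbols \(N\) (number of observations), \(\beta_0\) (intercept), \(\beta\) (coefficient vector), and \(\lambda \ge 0\) (penalty strength) are notation introduced here for illustration:

\[
\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i-\beta_0-x_i^{\top}\beta\bigr)^2
\;+\;\lambda\Bigl[\tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2+\alpha\,\lVert\beta\rVert_1\Bigr]
\]

With \(\alpha = 1\) only the \(L_1\) term remains (Lasso); with \(\alpha = 0\) only the \(L_2\) term remains (ridge).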
The glmnet R package was developed by Jerome Friedman, Trevor Hastie, and Robert Tibshirani at Stanford University. It implements elastic net regularization for generalized linear models using an extremely efficient coordinate descent algorithm (Friedman et al. 2010), since extended to Cox proportional-hazards models (Simon et al. 2011) and to all generalized linear model families (Tay et al. 2023). The package handles gaussian (linear), binomial (logistic), poisson, and other GLM families.
Key features of glmnet include:
glmnetUI, earthUI, and mgcvUI are companion applications that use different modeling engines. The following table summarizes the key differences:
MARS (Multivariate Adaptive Regression Splines) was introduced by Jerome Friedman in 1991. It builds piecewise linear models by adaptively selecting hinge functions \(\max(0, x - k)\) and their knot positions from the data. The R implementation is the earth package by Stephen Milborrow. MARS excels at automatically discovering nonlinear relationships and interactions without requiring the user to specify them. Its g-functions (grouped basis terms per variable) provide highly interpretable partial effect curves.
GAMs (Generalized Additive Models) were formalized by Hastie and Tibshirani (1990). The R implementation, the mgcv package by Simon Wood, provides penalized regression splines with automatic smoothness selection via REML or GCV. GAMs produce smooth partial effect curves that are intuitive to visualize and explain. When earthUI exports knot locations to mgcvUI, the earth-derived knots serve as starting points for GAM smooth terms, combining MARS’s adaptive knot placement with GAM’s smooth estimation.
For real estate appraisal and similar applications requiring interpretable, defensible models:
glmnetUI is a graphical user interface for the R glmnet package. It runs as a local Shiny application: there is no login, no server, and no accounts. You launch it from R, import a dataset (CSV or Excel), configure your model, and fit it interactively.
The application provides a complete workflow: data import, variable configuration, model fitting, diagnostic plots, variable importance, model equations, and downloadable reports in Word, PDF, or HTML format.
Elastic net regression combines two regularization techniques:
When you launch glmnetUI, a Purpose radio button at the top of the sidebar lets you choose one of three modes:
In all three modes, the core modeling engine is identical — you are always fitting an elastic net (glmnet) model. The purpose setting controls which additional tools and interface elements are available.
When either For Appraisal or Market Area Analysis is selected, glmnetUI activates several features designed for real estate analysis:
- Each column can be designated as contract_date, dom, concessions, latitude, longitude, living_area, lot_size, actual_age, effective_age, area, site_dimensions, or display_only. These designations control how the column is handled during fitting and output.
- When a column is designated contract_date and an Effective Date is provided, glmnetUI computes a sale_age column (days between sale date and effective date) and substitutes it as a predictor.

To use glmnetUI:

- Install the package with install.packages("glmnetUI") or install from source.
- Run glmnetUI::glmnetUI() in R. The app opens in your web browser on port 7879. You can also access the app directly by navigating to http://localhost:7879 in your browser. The app remembers your last-used purpose mode and restores it automatically.
- Optionally, import an earthUI result .rds file to use earth’s hinge basis functions with glmnet’s regularization. Skip this step if you do not have an earthUI result.

Settings are automatically persisted in your browser’s local storage and restored when you reload the same input file.
For real estate appraisal and market analysis workflows, your input data typically comes from a Multiple Listing Service (MLS) export. This chapter describes the expected file structure and the columns that glmnetUI can use.
glmnetUI accepts CSV and Excel
(.xlsx, .xls) files. On import, column names
are automatically converted to snake_case — for example,
“Living SqFt” becomes living_sqft, “Contract Date” becomes
contract_date, and “Sale Price” becomes
sale_price. This normalization ensures consistent column
references throughout the workflow. The CSV separator and decimal mark
used during import are determined by the locale settings (see Chapter 3,
“Locale & Regional Settings”).
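The column-name normalization described above can be approximated in a few lines of base R. This is a minimal sketch, not glmnetUI's actual implementation: `to_snake_case()` is a hypothetical helper, and its rules (lowercase everything, turn each run of non-alphanumeric characters into one underscore) are assumptions chosen to reproduce the three examples in the text.

```r
# Hypothetical helper; glmnetUI's real normalization rules may differ.
to_snake_case <- function(x) {
  x <- tolower(trimws(x))            # lowercase, drop surrounding whitespace
  x <- gsub("[^a-z0-9]+", "_", x)    # runs of other characters -> one underscore
  gsub("^_+|_+$", "", x)             # strip leading/trailing underscores
}

headers <- c("Living SqFt", "Contract Date", "Sale Price")
print(to_snake_case(headers))
# "living_sqft" "contract_date" "sale_price"
```

Because the sketch lowercases before splitting, mixed-case tokens such as "SqFt" stay together as `sqft`, matching the example above.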
Your data file should be a flat table with one row per property and one column per attribute. The first row of the file must contain column headers.
While glmnetUI works with any set of columns, the full appraisal workflow benefits from having the following columns:
Spreadsheet column names can be in any language. Only the “special” type names are in English, so that glmnetUI can give those columns special treatment. All other column names appear as given in the regression models, graphs, and, for appraisals, the output reports.
Not all columns are required. glmnetUI adapts — if a column is missing, the corresponding feature is simply omitted. However, for real estate pricing models certain columns are highly recommended to achieve acceptable fit:
Sale Age — the number of days between the
contract sale date and the effective date of the appraisal or analysis.
If multi-year sales history is being used, especially for periods over 5
years, sale_age often plays a central role in estimating
the sale price.
Living Area — also goes by names such as “Living Sqft,” “GLA” (gross living area) and so on. This is another leading determinant of sale price.
Total Bath Count — the total number of full, quarter, half, and 3/4 bathrooms. For example, two full baths and one half-bath would be a value of 2.5.
Garage Bays or Garage Area — the number of garage spaces or the garage square footage.
Lot Size — the land area of the property, typically in square feet or acres.
Longitude, Latitude, and if available Area ID. These location variables help the model account for geographic price variation.
glmnetUI identifies columns by their special type designation, not by their column name. You can name your columns anything you like in the MLS export — what matters is that you assign the correct special type in the Variable Configuration table (Chapter 6).
For example, your MLS might export living area as “GLA”, “Living
SqFt”, “liv_area”, or “gross_living_area”. After import (where it
becomes snake_case), you simply designate it as living_area
in the Special dropdown. glmnetUI will then use it for per-SF residual
calculations regardless of its original name.
In Appraisal mode, row 1 must be the subject property. All remaining rows are comparable sales. The subject row is excluded from model fitting. After fitting, the model still generates predictions for the subject row.
In Market Area Analysis mode, placing the subject in row 1 is optional.
In General mode, there is no special row handling — all rows are treated equally.
General Purpose mode is the default when you launch glmnetUI. It provides the complete elastic net regression workflow for any dataset — not just real estate. You can use glmnetUI for scientific data, financial analysis, engineering studies, or any regression problem.
In General mode, the interface omits the real estate–specific features (special columns, sale age, coordinate rounding, RCA). The sidebar is streamlined to focus on variable selection, parameter configuration, model fitting, and export.
A Skip first row checkbox appears below the Purpose selector (in General and Market modes). When checked, row 1 is excluded from model fitting. This is useful when row 1 contains a target or reference observation that should not influence the model. In Appraisal mode, row 1 (the subject property) is always excluded automatically.
Settings (predictor selections, model parameters, interactions) are saved separately for each combination of input file and purpose mode. Switching between General, Appraisal, and Market modes preserves each mode’s settings independently.
The sidebar is organized into numbered, collapsible sections that guide you from data import through export:
1. Import Data — File upload accepting CSV and Excel files. For Excel files with multiple sheets, a sheet selector appears. Column names are automatically converted to snake_case.
2. Import from earthUI (optional) — Import an earthUI result file to use earth’s hinge basis functions with glmnet’s elastic net regularization. The browse button is styled to match the Section 1 file upload. This step is optional — skip it if you do not have an earthUI result to import.
3. Project Output Folder — A text field specifying
where downloads are saved (defaults to ~/Downloads).
4. Variable Configuration — Target variable selector, predictor table with checkboxes for Include, Force, and Sign. The Special column (for designating contract dates, coordinates, etc.) appears only in Appraisal and Market modes. See Chapter 6 for full details.
5. glmnet Call Parameters — All model configuration: alpha, lambda, family, standardize, sign constraints, relaxed lasso, interaction matrix, and advanced parameters (lambda.min.ratio, nlambda, CV loss metric, convergence threshold, max iterations, intercept). See Chapter 7 for the complete parameter reference.
6. Fit Glmnet Model — The button that runs the model.
7. Download Output — Exports predictions, residuals, CQA scores, and per-variable contributions as an Excel file. Available in all purpose modes.
8. Download Report — Generates a formatted report (Word, PDF, or HTML) saved to the output folder. See Chapter 13. Steps 8–9 (RCA Adjustments and Sales Grid) appear only in Appraisal mode; in General mode the report step is numbered 8.
Section 2 of the sidebar — Import from earthUI — lets you import an earthUI result file. This allows glmnetUI to use earth’s hinge basis functions (piecewise linear terms) with glmnet’s elastic net regularization, combining earth’s automatic nonlinearity detection with glmnet’s variable selection and coefficient constraints.
The browse button is styled to match the Section 1 file upload. This step is entirely optional — if you do not have an earthUI result file, simply skip Section 2 and proceed to Section 3.
After fitting, the main panel provides the following tabs:
glmnetUI automatically saves your configuration to the browser’s local storage, keyed by the input filename. When you reload the same file, all settings are restored: target selection, predictor checkboxes, data types, expected signs, glmnet parameters, and interaction matrix. The last-used purpose mode is also persisted globally and restored when the app is relaunched. Three options are available:
A Save current as default button saves all current settings as the global default.
Click the moon/sun icon in the upper-right corner to toggle between light and dark themes. The theme preference is saved in local storage and persists across sessions. All UI elements (tables, plots, cards, buttons) adapt to the selected theme via CSS variables.
glmnetUI supports international number and CSV formatting conventions through a country-based locale system. The Settings dropdown in the title bar provides Country and Paper selectors for 31 supported countries. Each preset configures:
- CSV separator: comma (,) for US/UK/Japan or semicolon (;) for most of Europe, where the comma is used as a decimal mark.
- Decimal mark: period (.) or comma (,).

Click Save as my default to store your locale preferences globally. These defaults apply to all future sessions regardless of which data file you load.
When you select For Appraisal as the Purpose, glmnetUI configures itself for single-property valuation. All features described in Chapter 3 remain available; this chapter covers only the appraisal-specific additions.
In appraisal mode, row 1 of your dataset is the subject property and all remaining rows are comparable sales. Your input file must be organized accordingly (see Chapter 2). The subject’s sale price can be left blank or set to any value — glmnetUI automatically treats it as NA during fitting.
After importing, the Data Preview tab splits into two sections: Subject Property (row 1) and Comparable Sales (rows 2+). Row 1 is always excluded from model fitting. After fitting, the model still generates predictions for the subject row.
In appraisal and market modes, an Effective Date field appears in the Variable Configuration section (defaulting to today’s date). If you designate a column as contract_date in the Special column dropdown, glmnetUI computes a sale_age column: the whole number of days between each sale’s contract date and the effective date. This column is added as a predictor.
When the Effective Date changes, sale_age is
automatically recomputed.
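The sale_age computation can be sketched with base R date arithmetic. This is an illustration under stated assumptions, not glmnetUI's own code: the data frame `comps`, its dates, and the sign convention (effective date minus contract date, so older sales get larger values) are all assumed here.

```r
# Assumed convention: sale_age = effective date - contract date, in whole days.
effective_date <- as.Date("2024-06-30")

comps <- data.frame(
  contract_date = as.Date(c("2024-06-01", "2023-06-30", "2024-06-30"))
)

# Subtracting Date objects yields a difftime in days; coerce to integer.
comps$sale_age <- as.integer(effective_date - comps$contract_date)

print(comps)
# sale_age is 29, 366, and 0 (the middle interval spans 29 Feb 2024)
```

Changing `effective_date` and re-running the subtraction mirrors the automatic recomputation described above.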
In appraisal and market modes, a Special dropdown appears for each predictor in the Variable Configuration table. See Chapter 6 for the complete reference of special types and their effects.
The Calculate RCA Adjustments & Download button (sidebar section 8, visible only in appraisal mode after fitting) computes market-derived adjustments for each comparable relative to the subject. The full RCA workflow is described in Chapter 11.
When you select Market Area Analysis as the Purpose, glmnetUI provides the same real estate–specific features as appraisal mode (special columns, sale age, coordinate rounding) but is oriented toward analyzing a group of properties rather than valuing a single subject.
Market Area Analysis mode is appropriate when you are:
Section 4 of the sidebar — Variable Configuration — is where you choose which columns participate in the model and how they are treated.
The Target (response) variable dropdown at the top of Section 4 lists every column in your dataset. Select one column as the response variable (e.g., sale price). The target column is automatically excluded from the predictor list.
Below the target selector, a table lists every remaining column with the following fields:
| Column | Description |
|---|---|
| Variable | Column name |
| Type | Data type dropdown: numeric, integer, character, logical, factor, Date, POSIXct |
| Inc | Checkbox — include this column as a predictor in the model |
| Force | Checkbox — force into model (penalty factor = 0, never dropped by lasso) |
| Special | Dropdown (appraisal/market only) — see Special Column Types Reference below |
| Sign | Expected coefficient sign: positive, negative, or either |
| NAs | Count of missing values |
glmnetUI automatically detects data types on import. Numeric,
integer, logical, factor, and date columns are recognized. Character
columns that look like dates (common date format patterns) are
classified as Date.
You can override any detection by changing the Type dropdown. Changing types affects how the column is encoded in the model matrix.
For each predictor, the Sign dropdown specifies whether you expect its coefficient to be positive, negative, or either direction. After fitting:
When the Enforce Sign Constraints checkbox is
enabled in the glmnet parameters (Chapter 7), the model is forced to
produce coefficients matching the expected signs. This is implemented
via upper.limits and lower.limits in
glmnet.
In appraisal and market modes, the Special dropdown provides the following options:
Weighting:

- weight: Observation weight column (available in all modes; only one allowed; rows with weight = 0 are excluded from fitting)

Date & Time Types:

- contract_date: Triggers automatic sale_age computation from the Effective Date
- listing_date: Used as a fallback for computing Days on Market
- dom: Identifies the Days on Market column

Monetary Types:

- concessions: Identifies sale concessions (seller credits, buyer incentives, etc.)

Size & Location Types:

- latitude: Values automatically rounded to 3 decimal places
- longitude: Same rounding treatment as latitude
- area: Market area or neighborhood identifier (e.g., area_id, neighborhood, market_area)
- living_area: Enables per-square-foot residual calculations (residual_sf and cqa_sf)
- lot_size: Site size column
- site_dimensions: Grouped with lot size (e.g., “75x120”)

Age Types:

- actual_age: Property age column
- effective_age: Effective property age

Display Types:

- display_only: Column is included in Excel exports but excluded from model fitting entirely. Use this for address fields, MLS numbers, parcel IDs, or other reference data. Multiple columns can have this designation.

Section 5 of the sidebar, glmnet Call Parameters, provides access to all configuration options for the elastic net model. Each parameter has a blue help icon (?) with a tooltip explanation.
The alpha parameter controls the type of regularization:
| Alpha Value | Type | Behavior |
|---|---|---|
| 0 | Ridge | Shrinks all coefficients proportionally; never sets any to zero |
| 1 | Lasso | Can set coefficients exactly to zero (variable selection) |
| Between 0 and 1 | Elastic Net | Blend of ridge and lasso |
Three alpha selection methods are available:
Lambda controls the strength of regularization:
| Method | Description |
|---|---|
| Cross-validation | Recommended. Tests many lambda values using k-fold CV. |
| Manual | Enter a specific lambda value if you have a reason to. |
When using cross-validation, two lambda choices are available:
The Number of CV Folds parameter controls how the data is split for cross-validation (default: 10 folds).
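The cross-validated lambda search corresponds to `cv.glmnet()` in the glmnet package, which reports both `lambda.min` and `lambda.1se`. The sketch below uses simulated data and assumed settings (`alpha = 0.5`, 10 folds); how glmnetUI wires its sidebar controls to this call is not shown in this article, so treat the mapping as an assumption. The call is skipped if glmnet is not installed.

```r
if (requireNamespace("glmnet", quietly = TRUE)) {
  set.seed(42)
  n <- 150
  # Simulated predictors x1..x5; only x1 and x2 affect the response.
  x <- matrix(rnorm(n * 5), n, 5, dimnames = list(NULL, paste0("x", 1:5)))
  y <- 2 * x[, 1] - x[, 2] + rnorm(n)

  # k-fold cross-validation over glmnet's automatic lambda sequence.
  cv_fit <- glmnet::cv.glmnet(x, y, alpha = 0.5, nfolds = 10)

  # lambda.min: lowest mean CV error.
  # lambda.1se: largest lambda whose error is within one SE of that minimum.
  print(c(lambda.min = cv_fit$lambda.min, lambda.1se = cv_fit$lambda.1se))
}
```

Because `lambda.1se` is never smaller than `lambda.min`, it gives an equally or more heavily regularized (often sparser) model.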
Choose the distribution family for your response variable:
| Family | Use Case |
|---|---|
| gaussian | Continuous responses (e.g., sale price). Most common. |
| binomial | Binary outcomes (e.g., sold/not sold). |
| poisson | Count data (e.g., number of sales). |
When the Standardize option is checked (the default), all predictors are scaled to have mean 0 and standard deviation 1 before fitting. This ensures the penalty treats all predictors equally regardless of their original scale. Coefficients are returned on the original scale. This option should usually be left on.
When Enforce Sign Constraints is enabled, coefficients are constrained to match the expected signs set in the variable table (Chapter 6):
This is implemented via the upper.limits and
lower.limits parameters in glmnet().
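One way to translate expected signs into `lower.limits` and `upper.limits` is sketched below. The `expected_sign` vector and its variable names are hypothetical, and the mapping (positive gives a lower limit of 0, negative gives an upper limit of 0) is an assumption about how such constraints are typically built, not a copy of glmnetUI's code. The glmnet call runs only if the package is installed.

```r
# Hypothetical expected signs, one per predictor column.
expected_sign <- c(living_area = "positive",
                   actual_age  = "negative",
                   lot_size    = "either")

# positive -> coefficient >= 0; negative -> coefficient <= 0; either -> unbounded
lower_limits <- unname(ifelse(expected_sign == "positive", 0, -Inf))
upper_limits <- unname(ifelse(expected_sign == "negative", 0,  Inf))

if (requireNamespace("glmnet", quietly = TRUE)) {
  set.seed(1)
  n <- 200
  x <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, names(expected_sign)))
  # Simulated data where actual_age has the "wrong" (positive) true effect.
  y <- 3 * x[, "living_area"] + 2 * x[, "actual_age"] + rnorm(n)

  fit <- glmnet::glmnet(x, y, alpha = 0.5,
                        lower.limits = lower_limits,
                        upper.limits = upper_limits)

  # With the constraint in place, this coefficient should never exceed 0.
  print(max(as.matrix(fit$beta)["actual_age", ]))
}
```

A constrained coefficient that would otherwise take the opposite sign is held at the boundary instead.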
Regular lasso uses the same penalty for variable selection AND coefficient estimation, which can over-shrink important coefficients toward zero. Relaxed lasso separates these steps:
The Gamma slider controls the degree of relaxation:
| Gamma | Effect |
|---|---|
| 0 | Fully relaxed (OLS refit, no shrinkage at all) |
| 1 | No relaxation (same as regular glmnet) |
| Between | Blends relaxed and penalized fits |
With cross-validation, gamma is selected automatically, together with lambda, to minimize the cross-validated error.
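The effect of gamma in the table above can be illustrated as a weighted average of two coefficient vectors. This is a simplified sketch: `blend_relaxed()` and the example coefficients are hypothetical, and blending coefficients directly is an assumption that matches blending linear predictors only for a linear (Gaussian) model.

```r
# Hypothetical illustration of gamma as a blend weight.
# gamma = 1 keeps the penalized fit; gamma = 0 keeps the unpenalized refit.
blend_relaxed <- function(beta_penalized, beta_refit, gamma) {
  stopifnot(gamma >= 0, gamma <= 1)
  gamma * beta_penalized + (1 - gamma) * beta_refit
}

beta_penalized <- c(living_area = 80,  lot_size = 1.5)  # shrunken coefficients
beta_refit     <- c(living_area = 100, lot_size = 2.5)  # refit on selected variables

print(blend_relaxed(beta_penalized, beta_refit, gamma = 0.5))
# living_area 90, lot_size 2
```

Only the variables selected by the penalized fit appear in both vectors, so relaxation changes coefficient sizes, not which variables are kept.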
An interactive upper-triangular matrix lets you control which
variable pairs are allowed to interact. Each cell contains a checkbox —
checked means the interaction term x1:x2 is included in the
model matrix.
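In base R, an interaction term of the form x1:x2 enters a model matrix as the product of the two columns. The sketch below shows this with `model.matrix()`; the data frame and column names are made up, and whether glmnetUI builds its matrix this way internally is an assumption.

```r
# Made-up example data with two numeric predictors.
df <- data.frame(living_area = c(1200, 1800, 2400),
                 lot_size    = c(5000, 7500, 6000))

# Main effects plus the single checked interaction.
mm <- model.matrix(~ living_area + lot_size + living_area:lot_size, data = df)

print(colnames(mm))
# "(Intercept)" "living_area" "lot_size" "living_area:lot_size"
```

For two numeric variables the interaction column is simply their elementwise product; for factors, `model.matrix()` expands it across the dummy columns.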
Interaction selections are saved to browser localStorage per input file and restored when you reload the same file.
A collapsible “Advanced” section at the bottom of glmnet Call Parameters exposes additional settings. These use sensible defaults but are visible so all model settings can be documented for court or audit.
Three options control how parameters are initialized when a file is loaded:
Click Save current as default to store all current settings as the global default for future files.
Section 6 of the sidebar contains the Fit Glmnet Model button. Clicking it runs the model with your current configuration.
When you click Fit, glmnetUI:
- Applies any enabled sign constraints as upper.limits and lower.limits on coefficients.
- Calls cv.glmnet() to find the optimal lambda. If relaxed lasso is enabled, calls cv.glmnet(..., relax = TRUE).

A status message below the Fit button shows the number of observations, excluded rows, and selected lambda.
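A fit of this kind can be sketched directly with glmnet. The example below is an assumption-laden illustration rather than glmnetUI's internal code: the data are simulated, row 1 plays the role of the excluded subject, and a forced variable is represented by a penalty factor of 0 as described in the Variable Configuration table. It runs only if glmnet is installed.

```r
if (requireNamespace("glmnet", quietly = TRUE)) {
  set.seed(7)
  n <- 121                                    # row 1 = subject, rows 2+ = comps
  x <- cbind(sale_age    = runif(n, 0, 720),
             living_area = runif(n, 900, 3000),
             lot_size    = runif(n, 4000, 12000))
  y <- 50000 - 20 * x[, "sale_age"] + 150 * x[, "living_area"] +
    rnorm(n, sd = 10000)

  # Fit on the comparable sales only; sale_age is forced (penalty factor 0).
  cv_fit <- glmnet::cv.glmnet(x[-1, ], y[-1],
                              alpha = 1,
                              penalty.factor = c(0, 1, 1))

  # The excluded subject row still receives a prediction.
  subject_pred <- predict(cv_fit, newx = x[1, , drop = FALSE], s = "lambda.min")
  print(subject_pred)
}
```

Keeping `drop = FALSE` preserves the one-row matrix shape that `predict()` expects for `newx`.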
Shows the imported data as an interactive DataTable. In appraisal/market modes, the preview is split into two tables: Subject Property (row 1) and Comparable Sales (rows 2+).
Displays the fitted model equation rendered in LaTeX via MathJax. Shows all non-zero coefficients with proper mathematical formatting:
A heatmap matrix showing Pearson correlations among all numeric predictors and the response variable. Available immediately after data import — no model fitting required.
Model fit statistics displayed as cards:
In appraisal/market modes, additional cards show:
An overfitting warning appears when training \(R^2\) exceeds CV \(R^2\) by more than 0.1.
Below the cards, a coefficient table with sign warnings duplicates the Coefficients tab for convenience.
A table of all non-zero coefficients showing:
Standardized coefficient magnitude: \(|\beta| \times \text{sd}(x)\) for each predictor, aggregated across dummy columns for factor variables. Two display modes:
- When the plotly package is installed: Horizontal bar chart with hover tooltips showing exact importance, coefficient, and relative percentage.

Below the chart, a DataTable shows Variable, Importance, Coefficient, and Relative %.
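The importance measure \(|\beta| \times \text{sd}(x)\) and the relative percentage can be reproduced by hand for numeric predictors. The coefficients and data below are invented for illustration, and the aggregation across factor dummy columns mentioned above is not shown.

```r
# Invented coefficients (original scale) and predictor data.
coefs <- c(living_area = 100, lot_size = 2, actual_age = -500)
x <- data.frame(living_area = c(1000, 1500, 2000),
                lot_size    = c(5000, 6000, 7000),
                actual_age  = c(10, 20, 30))

sds        <- sapply(x, sd)                     # 500, 1000, 10
importance <- abs(coefs[names(sds)]) * sds      # |beta| * sd(x)
relative   <- 100 * importance / sum(importance)

print(data.frame(Variable    = names(importance),
                 Importance  = unname(importance),
                 Coefficient = unname(coefs[names(sds)]),
                 Relative    = round(unname(relative), 1)))
```

Here lot_size has a small coefficient but a large spread, while actual_age has a large coefficient but a small spread, which is why the measure multiplies the two.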
Per-variable partial effect plots showing the contribution of each predictor to the model’s predictions. Select a variable from the dropdown.
For numeric predictors: A scatter plot of the variable’s x-values vs. its contribution (\(\beta \times x\)), with a red line segment and a slope label. The slope label shows the marginal effect per unit (e.g., “+1,234.56/unit”), with adaptive units for small-range variables like latitude/longitude (e.g., “+0.12/0.001”). The subtitle shows the intercept (basis) value.
For factor/interaction predictors: A histogram showing the distribution of contribution values across observations.
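For a Gaussian model, the per-variable contributions \(\beta \times x\) and the intercept add up to the prediction. The sketch below checks this with made-up numbers; the intercept, coefficients, and data are assumptions for illustration only.

```r
# Made-up fitted values for a two-predictor linear model.
intercept <- 50000
beta <- c(living_area = 150, sale_age = -20)
x <- data.frame(living_area = c(1500, 2000),
                sale_age    = c(30, 365))

# Multiply each column by its coefficient: one contribution per cell.
contrib <- sweep(as.matrix(x[names(beta)]), 2, beta, `*`)

# Intercept plus the row sum of contributions gives the prediction.
pred <- intercept + rowSums(contrib)

print(contrib)
print(pred)
# predictions: 274400 and 342700
```

The slope shown on each numeric plot corresponds to the coefficient itself, since the contribution is linear in x.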
Variance decomposition table showing for each predictor:
Includes an intercept row and a TOTAL MODEL row. Interaction terms (containing “:”) are shown as separate groups.
Four diagnostic plots with large fonts (base size 16pt), 15 axis tick marks, and comma-formatted labels (no scientific notation):
Coefficient Path — Shows how all coefficients change as \(\log(\lambda)\) increases. Lines are colored by variable. Helps visualize the regularization path and variable selection.
CV Error — Cross-validation error as a function of \(\log(\lambda)\). Points show mean CV error; error bars show \(\pm 1\) standard error. Dashed vertical lines mark \(\lambda_{\min}\) (blue) and \(\lambda_{1\text{se}}\) (green).
Actual vs Predicted — Scatter plot of observed vs. predicted values. Points should cluster around the 45-degree dashed reference line. Wider scatter indicates more prediction error.
Residuals vs Fitted — Scatter plot of residuals vs. fitted values. Should show random scatter around the zero line (dashed). Patterns suggest model misspecification (e.g., non-linearity, heteroscedasticity).
Each plot has a Download PNG button for saving at 150 DPI.
Configure report fields (appraiser name, property address, report date, file number) and export to Word (.docx), PDF, or HTML. See Chapter 13.
After fitting, the Download Output button (sidebar section 7, appraisal/market modes) exports an Excel file with predictions and diagnostics.
A white checkmark appears on the download button after successful completion.
CQA ranks each row’s residual against all others on a 0–10 scale:
In appraisal/market modes, rows are sorted by
residual_sf descending when a living_area
column is designated.
The RCA (Reconciliation by Comparable Adjustment) workflow is available in Appraisal mode only, after fitting the model and downloading the output (Step 7).
Click the Calculate RCA Adjustments & Download button in sidebar section 8. Choose:
A white checkmark appears on the button after successful computation.
The Sales Comparison Grid is available in Appraisal mode only, after computing RCA adjustments (Step 8). It generates a formatted Excel workbook suitable for inclusion in appraisal reports.
Clicking the Sales Grid button in sidebar section 9 opens a modal dialog listing all comparable sales from the RCA output. Comps are split into two groups:
Select up to 30 comps, then click Generate Sales Grid to create the workbook.
The generated workbook contains up to 10 sheets, each holding 3 comps side by side:
Each sheet is protected with a password to prevent accidental edits, but the residual feature value cells are explicitly unlocked. This allows appraisers to enter values for features not in the model while preserving the formula-driven adjustment calculations.
Three formats are available:
Reports are generated using Quarto when available, with a fallback to rmarkdown.
Reports include all tab content:
All axis labels and color legends use comma-formatted numbers (no scientific notation).
A white checkmark appears on the Download Report button after successful generation.
glmnetUI and earthUI are companion tools for regularized regression modeling. They share the same data format, special column types, RCA workflow, and demo datasets, but use different modeling engines.
Both tools provide:
Use glmnetUI when you want a simple, interpretable linear model with automatic variable selection. Elastic net is particularly well-suited when you have many predictors and want the model to select the most important ones.
Use earthUI when you expect non-linear relationships (e.g., the effect of living area on price changes at different sizes) or need hinge functions to capture threshold effects. Earth models can also capture higher-order interactions (degree 2 and 3).
Use both to compare linear vs. non-linear models on the same data. If both produce similar \(R^2\) values, the simpler glmnet model may be preferred for interpretability.
The demo dataset is shared with earthUI. If earthUI is installed, load it with:
demo_file <- system.file("extdata", "Appraisal_1.csv", package = "earthUI")

The file contains 1,502 residential sales (plus 1 subject property in row 1) from a simulated MLS export. The data represent single-family home sales in a multi-area market with a range of property sizes, ages, and locations.
This is not real data, but is based on a realistic neighborhood in Northern California. All identification information has been altered or removed.
1. Launch the app with glmnetUI::glmnetUI()
2. Import Appraisal_1.csv via the file upload
3. Select sale_price as the target
4. Include predictors such as sale_age, living_sqft, baths_total, lot_size, area_id (as factor), age, latitude, longitude, garage_spaces

glmnetUI runs on macOS, Windows, and Linux. RStudio Desktop (2023.06+) is strongly recommended because it bundles the pandoc needed for reports.
PDF reports require a LaTeX installation, which TinyTeX provides:

tinytex::install_tinytex()

macOS: Fewest issues. Install TinyTeX for PDF:
tinytex::install_tinytex()
Windows: Works well with RStudio. Corporate/locked-down machines may have temp directory restrictions — the app will warn in the console but continue without settings persistence.
Linux: May need system libraries:
sudo apt install libcurl4-openssl-dev libssl-dev libxml2-dev libsqlite3-dev libfontconfig1-dev
| Missing Component | Behavior |
|---|---|
| No LaTeX | PDF option hidden. HTML and Word still available. |
| No internet / fonts fail | System sans-serif used. Console message logged. |
| No RSQLite / read-only filesystem | Settings don’t persist. App runs normally. |
| Temp directory not writable | Report generation fails with clear error. Set TMPDIR to a writable location. |
“PDF option not available”: Run
tinytex::install_tinytex() in R, restart app.
“Settings will not persist”: Check permissions on
the user data directory (macOS:
~/Library/Application Support/R/glmnetUI/, Linux:
~/.local/share/R/glmnetUI/, Windows:
%APPDATA%/R/data/glmnetUI/).
Port 7879 already in use: Run
lsof -ti:7879 | xargs kill (macOS/Linux) or
Stop-Process -Id (Get-NetTCPConnection -LocalPort 7879).OwningProcess
(Windows PowerShell).