Feature/avalonia UI by willwade · Pull Request #4 · AACTools/VoiceGarden-SAPI

willwade · 2026-06-26T10:06:52Z

No description provided.

Comprehensive plan for replacing Installer.exe + SherpaOnnxConfig.exe + EngineConfig.exe with a single Avalonia UI application. Covers: architecture, UI layout, CLI interface, registry schema, build pipeline changes, migration strategy, and critical compatibility requirements (HKLM tokens, DLL version preservation, Grid3/System.Speech compatibility, duplicate voice prevention, branding config).

New VoiceGarden.UI Avalonia app replacing the C++ Installer.exe.

Phase 1 includes:
- Main window with adapter install/uninstall (64-bit + 32-bit)
- Engine checkboxes for all cloud engines (Azure, OpenAI, ElevenLabs, Google, Polly, Cartesia, Deepgram) with inline API key fields
- SherpaOnnx enable checkbox + model count
- Advanced section (collapsible) with Edge/Narrator toggles + log level
- Branding.json loading (showEdgeVoices, showNarratorVoices, etc.)
- CLI mode: install, uninstall, status commands
- Registry save/load for all engine settings
- Azure key pre-fill from legacy registry location
- COM registration via elevated regsvr32

Architecture:
- Models/BrandingConfig.cs: Feature flags + engine definitions
- Services/ComRegistrationService.cs: regsvr32, IsRegistered, IsInstalled
- Services/RegistryService.cs: HKCU/HKLM read/write helpers
- ViewModels/MainViewModel.cs: MVVM with CommunityToolkit
- CliDispatcher.cs: Headless CLI dispatch
- App.axaml: Fluent theme, light mode
- MainWindow.axaml: Card-based layout with sections

Tested: CLI 'status' command correctly shows registration state.

VoiceConfigViewModel:
- Fetch voices from any engine via DotNetTtsWrapper
- Search/filter voice list
- Select/unselect voices
- Promote selected to HKLM (elevated)
- Unpromote (remove from HKLM)
- Validate credentials with fallback to synthesis test
- Track installed vs not-installed status

VoiceConfigView.axaml:
- Engine dropdown + API key + region fields
- Validate button with result display
- Fetch Voices button
- Filterable voice list with checkboxes
- Install/Uninstall selected buttons
- Status bar with counts

MainViewModel:
- Navigation between main view and voice config (IsVoiceConfigVisible)
- Pre-fills engine/key from first enabled cloud engine
- Back to Main button

CLI commands:
- voices: list voices for an engine (--engine, --key, --region, --json)
- validate: validate credentials
- promote: install voice to HKLM
- promoted: list all promoted voices
- unpromote: remove from HKLM

VoicePromotionService:
- Direct HKLM token creation (when elevated)
- Fallback to EngineConfig.exe (when not elevated)
- List promoted voices
- Azure backward compatibility (saves to legacy registry)
- Unpromote

Tested: status, help, validate azure (OK 556 voices), promoted (5 tokens)

SherpaModelService:
- Load catalog from merged_models.json (sidecar/embedded)
- Scan installed models (find .onnx, tokens.txt, espeak-ng-data)
- Download models with progress reporting
- Promote models to HKLM (batch promote-all)
- Unpromote models from HKLM
- Track promoted status

SherpaModelsViewModel:
- Catalog browser with language/search filter
- Show installed/downloaded/promoted status per model
- Download selected with progress bars
- 'Install All to SAPI' button (promote-all)
- Open models folder
- Rescan button

SherpaModelsView.axaml:
- Filterable model list with checkboxes
- Download progress indicators
- Status badges (Downloaded, Installed)
- Action buttons: Download Selected, Install All, Open Folder

MainViewModel navigation:
- Three-panel system: Main ↔ VoiceConfig ↔ SherpaModels
- IsMainViewVisible computed property
- BackToMain resets all sub-views
- LoadCatalogCommand auto-fires on Sherpa panel open

CLI commands:
- models list: show installed models
- models download <id>: download from catalog
- models promote-all: promote all to HKLM
- models rescan: refresh installed status

Tested: models list (2 found), models rescan (2 found, both promoted), promoted (5 tokens: 2 Sherpa + 3 Cloud)

- Localization framework: Loc.cs helper + Strings.resx embedded resource with 22 default strings (AppName, AdapterInstallation, VoiceEngines, etc.)
- Strings.Designer.cs auto-generated accessor class
- Updated CLI help text to include models commands
- Ready for translation: just add Strings.fr.resx etc.

All 4 phases of the Avalonia UI redesign plan are now implemented:
1. Core Shell: Main window, adapter install/uninstall, branding
2. Voice Configuration: DotNetTtsWrapper voice listing + HKLM promotion
3. SherpaOnnx Integration: Catalog, download, promote-all
4. Polish: Localization, complete CLI, error handling

CLI commands tested: status, --help, validate azure (OK 556 voices), models list (2 found), models rescan (2 found), promoted (5 tokens), promote/unpromote/voices

…alog loading

1. All 20 cloud engines in the checkbox grid (was 7)
2. Scrollable WrapPanel grid of engine checkboxes (maxHeight=200)
3. Three-panel navigation: Main → Configure Credentials → Configure Voices
   - Configure Credentials: shows API key/region fields per enabled engine
   - Configure Voices: voice listing + promotion (existing)
4. Advanced hidden by default
5. Aligned grid columns in adapter section (MinWidth on buttons)
6. merged_models.json included in publish output
7. SherpaModelsView now loads full catalog (1336+ models from merged_models.json)

- LoadCatalogAsync now deserializes as Dictionary<string, CatalogModel> (merged_models.json is keyed by model ID, not an array)
- Fixed CatalogLanguage field names: language_name, country (was Iso Code, Display)
- Fixed language display to use LanguageName/Country
- Added fallback path for catalog during development
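
As a language-neutral illustration of the "keyed by model ID, not an array" point, here is a minimal Python sketch (the app itself is C#). The field names `language_name` and `country` come from the note above; the enclosing `language` list, the sample entries, and the `load_catalog` helper are assumptions made up for this example and may not match the real merged_models.json.

```python
import json

# Hypothetical sample: an object keyed by model ID, not an array.
# The "language" list and all values are invented for illustration.
SAMPLE = """
{
  "example-model-a": {"language": [{"language_name": "English", "country": "US"}]},
  "example-model-b": {"language": [{"language_name": "French", "country": ""}]}
}
"""


def load_catalog(text):
    """Parse a catalog keyed by model ID into {model_id: [language label, ...]}."""
    raw = json.loads(text)
    if not isinstance(raw, dict):
        # Deserializing as a list is the mistake the fix above describes.
        raise ValueError("catalog must be a JSON object keyed by model ID")
    catalog = {}
    for model_id, entry in raw.items():
        labels = []
        for lang in entry.get("language", []):
            name = lang.get("language_name", "Unknown language")
            country = lang.get("country", "")
            labels.append(f"{name} ({country})" if country else name)
        catalog[model_id] = labels
    return catalog
```

Rejecting a top-level array outright makes a schema mismatch fail loudly instead of silently yielding an empty catalog.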

Replaced raw JSON parsing with DotNetTtsWrapper.SherpaOnnxTtsClient.GetVoicesAsync(), which returns unified TtsVoice objects with proper BCP-47 language codes. The DotNetTtsWrapper bug (resource name mismatch + snake_case field mapping) is fixed at the source. Now shows 1334 models with correct languages (English, Chinese, etc.) instead of 'Unknown language'.

DotNetTtsWrapper was stuck on 1.1.6 from NuGet.org, which had the embedded resource bug (only 2 voices). Now uses 1.3.0 from the local artifacts feed, which has the resource name fix + MMS language field mapping fix. Returns all 1334 voices with proper languages.

Row 0: Engine + Key + Validate
Row 1: Action buttons (Fetch, Install, Uninstall) - StackPanel
Row 2: Search + Select All/None
Row 3: Voice list (* fills remaining space)
Row 4: Status bar

Buttons no longer overlap with the voice list.

About:
- Shows app version, DotNetTtsWrapper version, engine list, GitHub URL
- Opens as a panel with Back button

Preview Voice:
- Each voice in the voice config list has a ▶ button
- Synthesizes 'Hello, my name is <voice>' via DotNetTtsWrapper
- Saves to temp WAV, plays with system default player
- Works for all cloud engines (Azure, OpenAI, etc.)

Design.DataContext in VoiceConfigView and SherpaModelsView caused runtime crash 'Unable to resolve type vm:VoiceConfigViewModel' when the views were loaded inside MainWindow's panel navigation. Removed Design.DataContext from both user controls — they get their DataContext from the parent window's binding (DataContext={Binding VoiceConfig}). Verified: app launches and runs without crash.

The Preview button's Command binding used inline type resolution that crashed at runtime. Replaced with:
- Button Tag stores voice Id
- Code-behind PreviewVoice_Click handler looks up the voice and calls the PreviewVoiceCommand
- No runtime XAML type resolution needed

Use System.Media.SoundPlayer (PlaySync on background thread) instead of Process.Start which opened Windows Media Player. Audio plays directly, no external app window, temp file cleaned up after.

…alias

1. Register 32/64-bit buttons now properly refresh after elevated regsvr32; the code was using lowercase field names instead of the generated ObservableProperty names.
2. Download progress: _catalog was empty after the switch to GetVoicesAsync; it now loads the catalog for download URLs. Progress bar + status text visible during download.
3. SherpaOnnx preview: ▶ button on each downloaded model row. Synthesizes 'Hello, this is a X voice' via DotNetTtsWrapper SherpaOnnx client, plays via SoundPlayer.
4. en-US alias checkbox: 'Add en-US alias' option visible next to 'Install to SAPI' button. Passed to PromoteAll for compat aliases.

1. PromoteAll: when not admin, tries direct first, then relaunches self elevated via 'models promote-all' CLI command with UAC prompt. Shows 'Requesting admin privileges...' then 'Models installed (elevated)'.
2. Suppressed CA1416 (platform compatibility) and MVVMTK0034 (field reference) warnings; this is a Windows-only app, so these are noise. Build now has 0 warnings (was 126).

The C++ Installer.exe creates its own Add/Remove Programs entry via HKCU\...\Uninstall\NaturalVoiceSAPIAdapter. The MSI also creates one. This results in two entries in Add/Remove Programs. Fix: skip AddUninstallRegistryKey() when running from Program Files (MSI-managed install). Only create the ARP entry when running as a standalone portable exe.

Build pipeline:
- Step 3.6: Build VoiceGarden.UI (Avalonia, self-contained single-file)
- Step 7: Stage VoiceGarden.UI.exe at payload root alongside Installer.exe
- MSI now contains both Installer.exe (C++) and VoiceGarden.UI.exe (Avalonia)

Fixes:
- Register/unregister: 500ms delay after regsvr32 for registry flush
- Register: fallback to exe directory if DLL not in Program Files
- RunElevated: 60s timeout (was 30s), catch UAC cancel (error 1223)
- SherpaOnnx PromoteAll: tries direct first, relaunches elevated if not admin
- Download progress: loads catalog for URLs (was empty after GetVoicesAsync switch)
- Preview SherpaOnnx: ▶ button synthesizes via DotNetTtsWrapper SherpaOnnx client
- en-US alias checkbox: passes AddEnUsAlias to PromoteAll
- CLI voices --json: progress text goes to stderr (was polluting stdout JSON)
- NoWarn CA1416, MVVMTK0034: 0 build warnings (was 126)

Gitignore:
- Added engine-config/, voicegarden-ui/, nuget.exe

Complete rename across all 402 tracked files:
- Directory: NaturalVoiceSAPIAdapter/ → VoiceGardenSAPIAdapter/
- Directory: NaturalVoiceSAPIAdapter.Net/ → VoiceGardenSAPIAdapter.Net/
- Solution: NaturalVoiceSAPIAdapter.sln → VoiceGardenSAPIAdapter.sln
- DLL: NaturalVoiceSAPIAdapter.dll → VoiceGardenSAPIAdapter.dll
- Registry: SOFTWARE\NaturalVoiceSAPIAdapter → SOFTWARE\VoiceGardenSAPIAdapter
- Config key: NaturalVoiceConfig → VoiceGardenConfig
- Token enum: NaturalVoiceEnumerator → VoiceGardenEnumerator
- SAPI attribute: NaturalVoiceType → VoiceGardenType
- Log path: %LOCALAPPDATA%\NaturalVoiceSAPIAdapter → VoiceGardenSAPIAdapter
- All .cpp, .h, .idl, .cs, .ps1, .wxs, .json files updated

Also: VoiceGarden.UI.exe is now the main app
- WiX shortcuts point to VoiceGarden.UI.exe (not Installer.exe)
- SetupLauncher auto-launches VoiceGarden.UI.exe after install
- VoiceGarden.UI.exe in curated MSI payload

CLSID GUIDs unchanged for COM backward compat.

VoiceGarden.UI.exe is now the sole configuration app. Installer.exe (C++ Win32) is no longer built, staged, or included in the MSI.

Changes:
- build-release-local.ps1: Removed Installer (Win32) build step
- build-release-local.ps1: Removed Installer.exe/InstallPlanRunner.exe staging
- create-setup-payload.ps1: Removed from rootAllow list
- SetupLauncher: Removed Installer.exe fallback, only launches VoiceGarden.UI.exe
- SetupLauncher: Updated error message to reference VoiceGarden.UI.exe

Installer source code remains in repo/ for reference but is not built. The MSI now contains only VoiceGarden.UI.exe as the main app.

- Moved Installer/ source to archive/ (no longer built or deployed)
- Removed from VoiceGardenSAPIAdapter.sln
- Removed pages-plan-builder.yml (references non-existent web-plan-builder)
- Moved samples/ to archive/samples/ (install plan templates, no longer used)
- Untracked lib/ (NuGet packages cache, should not be in repo)
- Added lib/ to .gitignore
- Cleaned stale build directories

Final repo structure:
- .github/workflows/msbuild.yml: CI
- VoiceGarden.UI/: Avalonia config app (main UI)
- VoiceGardenSAPIAdapter/: C++ SAPI adapter DLL
- VoiceGardenSAPIAdapter.Net/: .NET adapter
- SherpaOnnxConfig/: Model manager CLI helper
- EngineConfig/: Cloud engine CLI helper
- Setup/ + SetupLauncher/: WiX MSI + bootstrapper
- scripts/: Build scripts
- SherpaOnnx/: SherpaOnnx integration
- AzureSpeechSDKShim/ + Arm64XForwarder/: Native helpers
- include/: Third-party headers
- config/: Branding
- docs/: Documentation
- archive/: Old code (Installer, tests, scripts)

…ress

1. IsRegistered: 32-bit now checks WOW6432Node (was checking native hive, always returning same result as 64-bit)
2. Register/Unregister 32-bit: uses SysWOW64\regsvr32.exe (was using 64-bit regsvr32 for 32-bit DLL)
3. Download progress: shows %, file size, model name in both the row status and the panel status bar. User can see e.g. 'Downloading piper-en-alan-low (64MB)... 45%'
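
The 32-bit/64-bit split in items 1 and 2 can be sketched as two small path helpers. This is an illustrative Python sketch, not the app's C# code; it assumes a 64-bit process on 64-bit Windows, and the function names, the placeholder CLSID, and the exact key being checked are assumptions, since the commit only says "WOW6432Node".

```python
import ntpath


def regsvr32_path(windows_dir, dll_is_32bit):
    """Pick the regsvr32.exe matching the DLL's bitness.

    On 64-bit Windows the 32-bit tools live in SysWOW64 and the
    64-bit ones in System32, despite what the names suggest.
    """
    subdir = "SysWOW64" if dll_is_32bit else "System32"
    return ntpath.join(windows_dir, subdir, "regsvr32.exe")


def clsid_key(clsid, dll_is_32bit):
    """Registry path (under HKLM) where a COM class registration would appear.

    32-bit registrations are redirected under WOW6432Node, so checking
    only the native path gives the same answer for both bitnesses.
    """
    root = "SOFTWARE\\Classes"
    if dll_is_32bit:
        root += "\\WOW6432Node"
    return root + "\\CLSID\\" + clsid


# Placeholder CLSID for illustration only; the adapter's real CLSID is not shown here.
EXAMPLE_CLSID = "{00000000-0000-0000-0000-000000000000}"
```

For example, `regsvr32_path(r"C:\Windows", True)` yields the SysWOW64 copy, which is the one that can load a 32-bit DLL.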

- HttpClient.GetAsync now uses ResponseHeadersRead so progress fires during actual download (was buffering entire response first)
- Throttle reports to 4/sec with MB done/total (e.g. '45% (30/64MB)')
- Added IsDownloading flag for clean UI binding (progress bar hidden when not downloading, status text still shown for failed/downloaded)
- Progress bar widened to 120px, status text shown next to it
- 30-min HttpClient timeout (was default 100s, too short for big models)
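
The throttling idea (at most 4 reports per second, formatted as percent plus MB done/total) can be sketched as follows. This is a hedged Python illustration rather than the app's C# code; `ThrottledProgress`, `format_progress`, and the injected `clock` parameter are hypothetical names, and the rounding used by the real UI is not stated in the commit.

```python
import time

MB = 1024 * 1024


def format_progress(done_bytes, total_bytes):
    """Format as '<percent>% (<done>/<total>MB)'; fall back to MB only if total is unknown."""
    if total_bytes <= 0:
        return f"{done_bytes // MB}MB"
    percent = done_bytes * 100 // total_bytes
    return f"{percent}% ({done_bytes // MB}/{total_bytes // MB}MB)"


class ThrottledProgress:
    """Forward progress to `report` at most `max_per_second` times per second."""

    def __init__(self, total_bytes, report, max_per_second=4, clock=time.monotonic):
        self._total = total_bytes
        self._report = report
        self._min_interval = 1.0 / max_per_second
        self._clock = clock
        self._last = None

    def update(self, done_bytes):
        now = self._clock()
        # Always let the final update through so the UI ends at 100%.
        finished = self._total > 0 and done_bytes >= self._total
        due = self._last is None or now - self._last >= self._min_interval
        if finished or due:
            self._last = now
            self._report(format_progress(done_bytes, self._total))


def demo():
    """Drive the reporter with a fake clock so the result is deterministic."""
    ticks = iter([0.0, 0.1, 0.3, 0.31])
    seen = []
    progress = ThrottledProgress(64 * MB, seen.append, clock=lambda: next(ticks))
    for done in (1 * MB, 2 * MB, 32 * MB, 64 * MB):
        progress.update(done)
    return seen
```

In `demo()`, the update at 0.1s is dropped because it arrives within 250ms of the previous report, while the final update is reported even though it arrives only 10ms after the third.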

Root cause: download completed but 7-Zip extraction never ran (or was aborted), leaving a 319MB .tar.bz2 with no .onnx file inside. When the user clicked 'Install to SAPI', ScanInstalledModels found the directory but no model.onnx, so PromoteAll returned (0, 0) and there was nothing to install.

Fixes:
- ScanInstalledModels now calls TryExtractArchives() on every model dir before looking for .onnx. Self-heals any download that finished but was never extracted. Also cleans up leftover .tar/.tar.bz2 once the .onnx is present.
- Download path uses the same RunSevenZip helper with a 2-min timeout per stage (was unbounded WaitForExit, which could hang forever).
- PromoteAll now distinguishes 'nothing found' from 'failed': when both promoted and failed are 0, the status explains no downloaded models were found and suggests Rescan.
- 7-Zip missing gives an actionable error pointing to 7-zip.org.

Root cause: DownloadModelAsync created the model directory before the HTTP request. On 404, the empty folder was left behind. Rescan() then saw the folder and marked the model as IsDownloaded=true (it only checked directory existence, not an actual .onnx file). The final 'Download complete' status also overwrote the failure message.

Fixes:
- Directory is now created only AFTER a successful HTTP response
- On non-2xx, the partial/empty directory is cleaned up
- RefreshInstalled filters to models with a ModelPath (.onnx present)
- DownloadSelected tracks ok/fail counts and reports a combined summary instead of blindly saying 'Download complete'
- Per-row DownloadStatus still shows 'Failed: HTTP 404 NotFound' so the user can see exactly which model failed and why
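
The core of the rescan fix is that a model counts as installed only when an .onnx file is actually present, not merely when its directory exists. A minimal Python sketch of that check follows; the helper names and the summary wording are hypothetical, and the real C# code may select the model file differently.

```python
import tempfile
from pathlib import Path


def find_model_file(model_dir):
    """Return the first .onnx file in the directory, or None if there is none."""
    candidates = sorted(Path(model_dir).glob("*.onnx"))
    return candidates[0] if candidates else None


def refresh_installed(models_root):
    """Map model ID -> model path, skipping directories without an .onnx file."""
    installed = {}
    for entry in sorted(Path(models_root).iterdir()):
        if not entry.is_dir():
            continue
        model_path = find_model_file(entry)
        if model_path is not None:
            installed[entry.name] = model_path
    return installed


def summarize(ok, failed):
    """Combined summary instead of an unconditional 'Download complete' (wording is illustrative)."""
    if failed == 0:
        return f"Downloaded {ok} model(s)"
    return f"Downloaded {ok} model(s), {failed} failed"


def demo():
    """One real model plus one empty folder left behind by a failed download."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "good").mkdir()
        (root / "good" / "model.onnx").touch()
        (root / "empty").mkdir()
        return sorted(refresh_installed(root))
```

Here `demo()` reports only the directory that contains a model file, so the leftover empty folder no longer shows up as downloaded.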

MMS models (mms_hyw, mms_eng, etc.) have URLs pointing to HuggingFace directories (e.g. /resolve/main/hyw), not .tar.bz2 archives. The old code tried to download the directory URL directly → 404.

DownloadModelAsync now routes based on URL pattern:
- .tar.bz2 / .tar → archive download + 7-Zip extract (kokoro, piper, etc.)
- No extension → HuggingFace directory: downloads model.onnx (with progress) + tokens.txt + optional lexicon.txt individually

Verified: hyw/model.onnx = 109MB, hyw/tokens.txt = 452 bytes, both 200 OK.
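
The routing rule above can be sketched in a few lines. This Python version is illustrative only (the app is C#): the function names, the `"unknown"` fallback, and the example.com URLs are assumptions, while the file names model.onnx, tokens.txt, and lexicon.txt come from the commit message.

```python
import posixpath
from urllib.parse import urlparse


def classify_download(url):
    """Decide how to fetch a model from the shape of the last URL segment."""
    path = urlparse(url).path.rstrip("/")
    name = posixpath.basename(path)
    if name.endswith((".tar.bz2", ".tar")):
        return "archive"        # download, then extract
    if "." not in name:
        return "hf-directory"   # fetch individual files from the directory
    return "unknown"            # assumption: anything else is not handled here


def directory_files(base_url):
    """Files to fetch for a directory-style model, as (url, required) pairs."""
    base = base_url.rstrip("/")
    return [
        (f"{base}/model.onnx", True),
        (f"{base}/tokens.txt", True),
        (f"{base}/lexicon.txt", False),  # optional: a 404 here is not an error
    ]
```

Marking lexicon.txt as optional lets the caller treat a missing lexicon as normal while still failing the download when model.onnx or tokens.txt is absent.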

1. Install to SAPI now uses .reg file + 'reg import' elevated instead of relaunching the 116MB single-file exe (which was slow and fragile). PromoteAllElevated() generates a .reg file from the installed models and imports it via reg.exe with UAC. Much faster and reliable.
2. Preview audio fixed in DotNetTtsWrapper: GetModelConfigurationAsync now detects voices.bin and vocoder.onnx alongside model.onnx when given explicit file paths. Kokoro models require voices.bin; without it, SherpaOnnx silently produces no audio.
3. New cleanup script: scripts/cleanup-voices.ps1
   - Removes all Sherpa-*, Cloud-*, NaturalVoice-* SAPI voice tokens
   - Cleans up VoiceGarden/NaturalVoice registry keys (HKCU+HKLM)
   - Keeps built-in Microsoft voices (TTS_MS_*)
   - Supports -DryRun to preview changes
   - Checks both native and WOW6432Node registry hives
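
To make the .reg approach in item 1 concrete, here is a small Python sketch of generating .reg text (the app does this in C#). The .reg syntax itself is standard: a version header, bracketed key paths, quoted string values with backslashes doubled, and `dword:` values as 8 hex digits. The token name `Sherpa-example`, the value names `ModelPath` and `ModelType`, and the helper names are hypothetical, since the actual values written by PromoteAllElevated are not listed in this commit.

```python
def reg_escape(text):
    """Escape backslashes and double quotes for a quoted .reg string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_reg_text(keys):
    """Render {key_path: {value_name: str | int}} as .reg file text."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key_path, values in keys.items():
        lines.append(f"[{key_path}]")
        for name, value in values.items():
            if isinstance(value, int):
                lines.append(f'"{reg_escape(name)}"=dword:{value:08x}')
            else:
                lines.append(f'"{reg_escape(name)}"="{reg_escape(str(value))}"')
        lines.append("")  # blank line between keys
    return "\r\n".join(lines)


def demo():
    """Hypothetical token under the standard SAPI tokens key."""
    key = "HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Speech\\Tokens\\Sherpa-example"
    return build_reg_text({
        key: {"ModelPath": "C:\\Models\\example\\model.onnx", "ModelType": 2},
    })
```

The resulting text would then be saved to a file and passed to `reg import <file>` from an elevated process, so only the small reg.exe invocation needs the UAC prompt.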

1. Cleanup script now covers HKCU voice tokens (was only checking HKLM and WOW6432Node). Old Sherpa/Cloud tokens in HKCU were the source of duplicate voices. Also cleans up eSpeak tokens in WOW6432Node.
2. Model type detection: ScanInstalledModels now detects Kokoro (has voices.bin), Matcha (has vocoder.onnx), vs VITS (default). Sets SherpaOnnxModelType accordingly (0=VITS, 1=Matcha, 2=Kokoro).
3. PromoteSherpaModel + AppendModelToReg now write SherpaOnnxVoices (voices.bin path) and SherpaOnnxLexicon for Kokoro models. Without these, Kokoro voices failed to synthesize with 'Not a model using characters as modeling unit' error.
4. PromoteAllElevated writes .reg file to C:\ProgramData\ instead of user temp, because the elevated process runs as a different admin user that can't read the current user's temp directory.
5. End-to-end test verified: kokoro-en-en-19 (87KB), piper-en-amy-low (77KB), mms_eng (71KB) all produce audio via System.Speech COM.
6. Also added test-sherpa-e2e.ps1 for automated testing.
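
The detection rule in item 2 is a simple marker-file check. A Python sketch under stated assumptions: the numeric codes (0=VITS, 1=Matcha, 2=Kokoro) and marker files come from the commit message, while the function name and the order of the checks when both markers are present are assumptions.

```python
import tempfile
from pathlib import Path

VITS, MATCHA, KOKORO = 0, 1, 2


def detect_model_type(model_dir):
    """Classify a model directory by the marker files it contains."""
    directory = Path(model_dir)
    if (directory / "voices.bin").is_file():
        return KOKORO   # assumption: Kokoro is checked first
    if (directory / "vocoder.onnx").is_file():
        return MATCHA
    return VITS         # default when no marker file is present


def demo():
    """Build three throwaway model directories and classify each."""
    layouts = {"kokoro": "voices.bin", "matcha": "vocoder.onnx", "vits": None}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name, marker in layouts.items():
            model_dir = root / name
            model_dir.mkdir()
            (model_dir / "model.onnx").touch()
            if marker is not None:
                (model_dir / marker).touch()
        return {name: detect_model_type(root / name) for name in layouts}
```

Keeping VITS as the fallback means older Piper-style models with only model.onnx and tokens.txt need no extra metadata.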

Major changes:
- Add build-voice-garden-ui job (builds Avalonia app, clones DotNetTtsWrapper from GitHub since v1.3.0 not yet on NuGet.org)
- Remove all Installer.exe / InstallPlanRunner.exe references (Installer moved to archive/, replaced by VoiceGarden.UI.exe)
- Remove dead test scripts (test-installer-cli.ps1, run-sherpa-smoke-test.ps1 no longer exist)
- Fix action versions: upload-artifact v6→v4, download-artifact v7→v4 (v6/v7 don't exist)
- Update setup payload to stage VoiceGarden.UI.exe instead of Installer.exe
- Update release ZIP to include VoiceGarden.UI.exe
- Fix branch triggers: add 'main' and 'feature/**' patterns
- Add PublishReadyToRun=false for VoiceGarden.UI (SherpaOnnxConfig OOMs under R2R on .NET 10 SDK)
- Clean up the utilities job: remove Installer build step, remove samples/install-plans copy, remove InstallPlanRunner copy

1. Restore lib/{x86,x64,arm64}/ OpenSSL + Detours .lib files (were untracked in cleanup commit, but C++ adapter needs them for linking: libcrypto.lib, libssl.lib, detours.lib)
2. Remove hardcoded local path from VoiceGarden.UI/nuget.config (CI clones DotNetTtsWrapper separately and adds the source)
3. Fix verify-sherpa-integration.ps1: remove Installer.rc check (Installer moved to archive/)
4. Use absolute path for dotnet nuget add source in CI

The relative path was being resolved against AppData instead of the workspace, causing NU1301 on the runner.

willwade added 30 commits June 25, 2026 17:12

Fix: preview voice plays without opening Windows Media Player

cbb77bc

Remove debug output file from tracking

e61678b

willwade added 3 commits June 26, 2026 10:42

Fix CI: use absolute path for DotNetTtsWrapper local NuGet source

800b31d

Fix CI: build DotNetTtsWrapper before packing (NU5026)

4004ef5

willwade merged commit ecb910d into feature-installer-policy-and-setup Jun 26, 2026
13 checks passed

willwade merged 33 commits into feature-installer-policy-and-setup from feature/avalonia-ui
