Use GPU for translation model inference and compute BLEU/TER locally with torchmetrics #709

Open: Irozuku wants to merge 2 commits into develop from fix/translation-gpu-and-torchmetrics

Irozuku (Collaborator) commented on Jun 19, 2026
Summary

Fixes translation models not using the GPU during inference, and replaces the deprecated evaluate library with torchmetrics for the BLEU and TER translation metrics (CHRF already used torchmetrics). evaluate.load(...) fetches the metric script from a remote endpoint (Hugging Face Hub) on every call; switching to torchmetrics computes the scores fully locally, with no network dependency.


Type of Change

Check all that apply like this [x]:

  • Backend change
  • Frontend change
  • CI / Workflow change
  • Build / Packaging change
  • Bug fix
  • Documentation

Changes (by file)

  • DashAI/back/models/hugging_face/base_opus_mt_transformer.py: predict() now moves the model to CUDA when device == "gpu" (else CPU) and sets eval() mode before inference. Previously the model stayed on CPU after init/load, so inputs were copied to a CPU device and inference never used the GPU even when selected.
  • DashAI/back/models/hugging_face/m2m100_transformer.py: same GPU placement fix in predict().
  • DashAI/back/models/hugging_face/nllb_transformer.py: same GPU placement fix in predict().
  • DashAI/back/models/hugging_face/t5_small_transformer.py: same GPU placement fix in predict().
  • DashAI/back/metrics/translation/bleu.py: replaced evaluate.load("bleu") with torchmetrics.text.bleu.BLEUScore; references wrapped as list of lists. Import of prepare_to_metric switched to translation_metric.
  • DashAI/back/metrics/translation/ter.py: replaced evaluate.load("ter") with torchmetrics.text.ter.TranslationEditRate. Import of prepare_to_metric switched to translation_metric.
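The GPU placement change in the four `predict()` methods follows roughly the pattern below. This is an illustrative sketch, not the DashAI code: `ToyTranslator` and `resolve_device` are hypothetical names, a small `nn.Linear` stands in for the real translation model, and the fallback to CPU when CUDA is unavailable is an assumption added for the example.

```python
import torch
from torch import nn


def resolve_device(device: str) -> torch.device:
    """Map the "gpu"/"cpu" setting to a torch device (assumed CPU fallback)."""
    if device == "gpu" and torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")


class ToyTranslator:
    """Hypothetical stand-in for a translation model wrapper."""

    def __init__(self, device: str = "cpu") -> None:
        self.device = device
        self.model = nn.Linear(4, 2)  # placeholder for the real seq2seq model

    def predict(self, inputs: torch.Tensor) -> torch.Tensor:
        target = resolve_device(self.device)
        self.model.to(target)  # move the weights to the selected device
        self.model.eval()  # disable dropout and other training-only behaviour
        with torch.no_grad():
            # Inputs must live on the same device as the weights.
            return self.model(inputs.to(target)).cpu()


translator = ToyTranslator(device="gpu")
output = translator.predict(torch.ones(3, 4))
print(output.shape, next(translator.model.parameters()).device)
```

Moving the model inside `predict()` rather than only at construction time means a model that was just initialised or loaded from disk is still placed correctly before inference.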

Testing

  1. Train a translation model (e.g. OpusMtEnEsTransformer) with GPU selected as the device.
  2. With the model on GPU, confirm prediction runs on CUDA (e.g. nvidia-smi shows utilization / next(model.model.parameters()).device is cuda) and is faster than CPU.
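  3. Verify the run's results table shows BLEU, TER, and CHRF values. BLEU and CHRF should fall in the 0–1 range; TER usually does too, but it can exceed 1 when a hypothesis needs more edits than the reference has words.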
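Step 2 can also be checked programmatically. The sketch below assumes the wrapper exposes the underlying `torch.nn.Module` (as `model.model` in the step above); `parameter_device_types` is a hypothetical helper, demonstrated here on a plain `nn.Linear`.

```python
import torch
from torch import nn


def parameter_device_types(module: nn.Module) -> set[str]:
    """Return the device types ("cpu", "cuda", ...) that hold the module's parameters."""
    return {param.device.type for param in module.parameters()}


layer = nn.Linear(4, 2)
print(parameter_device_types(layer))  # a freshly built module normally lives on the CPU

if torch.cuda.is_available():
    layer.to("cuda")
    print(parameter_device_types(layer))
```

A set with more than one entry would indicate a partially moved model, which is worth flagging during review.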

Irozuku added the bug (Something isn't working) and back (Backend work) labels on Jun 19, 2026
Irozuku force-pushed the fix/translation-gpu-and-torchmetrics branch from 639a877 to 24d1347 on June 19, 2026, 14:12

Labels

back (Backend work), bug (Something isn't working)