Hi Team,
Thanks a lot for this.
A few questions:
1) Does the speedup apply only to GPU inference, or is CPU inference faster as well?
2) Could an inference example for T5/BART summarization from Hugging Face be provided, for instance in a Colab notebook? That would make it easier to adopt.
Sorry if this is a bit of a stretch to request. I appreciate you reading this.