From Zero to SOTA: How to fine-tune BERT using Hugging Face for world-class training performance
By Tim Santos

We would like to demonstrate how Graphcore accelerates time-to-value in AI development: through the speed of our own hardware, the Intelligence Processing Unit (IPU), and through faster development iterations enabled by easy-to-use, easy-to-integrate examples. BERT is one of today’s most widely used natural language processing models and one of the most frequently requested by Graphcore customers. Transformer models such as BERT are essential for building robust solutions in industries undergoing massive AI transformation, such as legal, banking and finance, and healthcare. Our engineers have implemented and optimized BERT-Large for our IPU systems, demonstrating state-of-the-art results using industry-standard machine learning training schemes.

In this demo, we will show you how to access IPUs through Spell’s cloud MLOps platform, walk you through our BERT fine-tuning notebook tutorial using the SQuADv1 dataset, and then run a question-answering inference task using the Hugging Face Inference API. We will also give an overview of how you can train even more transformer models faster with the Hugging Face Optimum toolkit.
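To give a flavour of the question-answering step described above, here is a minimal sketch using the Hugging Face `transformers` pipeline. The checkpoint name is a publicly available SQuAD-fine-tuned BERT-Large used as a stand-in for whatever model the fine-tuning notebook produces; the example question and context are illustrative only.

```python
from transformers import pipeline

# Load a question-answering pipeline. The checkpoint below is a public
# SQuAD-fine-tuned BERT-Large; it stands in for the model produced by the
# fine-tuning notebook (assumption: any SQuAD-style QA checkpoint works here).
qa = pipeline(
    "question-answering",
    model="bert-large-uncased-whole-word-masking-finetuned-squad",
)

# Illustrative context passage and question.
context = (
    "Graphcore's Intelligence Processing Unit (IPU) is a processor designed "
    "specifically for machine intelligence workloads such as BERT."
)

result = qa(question="What does IPU stand for?", context=context)
print(result["answer"], result["score"])
```

The same question-answering call can also be made remotely through the hosted Hugging Face Inference API once a fine-tuned checkpoint has been pushed to the Hub.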
