In my previous blog about TextGenie, I mentioned the issues I faced while collecting text data from scratch, and using paraphrases generated by T5 (Text-to-Text Transfer Transformer) as one of the methods to augment text data. Having seen the model in action, let’s get our hands dirty with the training process 😉
If you wish to follow along throughout, you can find the training notebook here on my GitHub repo.
Tip: If you do not have a GPU, I suggest using Google Colaboratory for training the model.
Before proceeding, let’s get all the required packages handy using:
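The exact install command is not shown in this excerpt; assuming the Hugging Face stack typically used for T5 fine-tuning, a setup might look like this (the package list is my assumption, not from the original post):

```shell
# Assumed packages for T5 fine-tuning — the original command is truncated
pip install transformers sentencepiece torch
```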
Often while developing Natural Language Processing models, we find it difficult to find relevant data, and harder still to find it in large quantities.
Previously, while developing our Intent Classifier, we used the CLINC150 dataset, which has 100 samples for each of its 150 classes. But what if we needed even more samples? Another similar scenario was when I was working on a contextual assistant with Rasa. While creating the training data from scratch, I’d have to imagine different samples for each intent or ask my friends for some help. …
Welcome to the sequel of my blog where I talked about training an Intent Classifier. Now that we have our intent classifier trained, what next? How do we use it? It must be deployed somewhere to be used, right? We shall choose Heroku as our platform and see the intent classifier live.
Before we begin, you can find all the code on my GitHub repo if you wish to follow along with me.
We shall keep the following folder structure for ease:
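The original layout is not reproduced in this excerpt; a minimal layout for a Python web app on Heroku (file names here are illustrative, not the author’s) often looks like:

```
intent-classifier-app/
├── app.py            # web app exposing a prediction endpoint
├── model/            # saved classifier artifacts
├── requirements.txt  # Python dependencies Heroku installs at build time
└── Procfile          # tells Heroku how to start the web process
```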
Being a fanboy of NLP, I always used to wonder how Google Assistant or Alexa understands me when I ask it to do something. That led to the question: could I make my machine understand me too? The answer was intent classification.
Intent classification is a part of Natural Language Understanding, where the machine learning/deep learning algorithm learns to classify a given phrase on the basis of the ones it has been trained on.
Let’s take a fun example; I’m making an assistant like Alexa.
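To make the idea concrete, here is a minimal sketch of an intent classifier (a toy illustration with made-up phrases, not the author’s CLINC150 model) using TF-IDF features and logistic regression from scikit-learn:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny hypothetical training set: phrase -> intent label
phrases = [
    "play some music", "put on a song", "I want to hear a track",
    "what's the weather like", "is it going to rain today", "weather forecast please",
    "set an alarm for 7 am", "wake me up at six", "create an alarm tomorrow",
]
intents = [
    "play_music", "play_music", "play_music",
    "get_weather", "get_weather", "get_weather",
    "set_alarm", "set_alarm", "set_alarm",
]

# TF-IDF turns each phrase into a sparse word-weight vector;
# logistic regression learns a decision boundary per intent.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(phrases, intents)

print(clf.predict(["play a song"])[0])  # → play_music
```

In a real assistant you would train on far more phrases per intent, but the pipeline stays the same: vectorize the text, fit a classifier, predict the intent of new phrases.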
I’ve always been a big fan of Natural Language Processing. Since I love machines, I’ve always wanted to find ways to communicate with them too.
Isn’t it cool when you ask your machine something and it answers back? 😍
Before we talk about anything, how about we begin with a friendly example? When you receive an email, the provider automatically places it into the inbox or spam folder. Almost all the time, emails are correctly placed in their corresponding folders, while sometimes even the emails we wanted to see in our inbox are marked as spam 😕. But who does this job for us? 🤔
Machine Learning is the magician in the background here!
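Under the hood, a spam filter of this kind is just a text classifier. Here is a toy sketch of the idea (made-up emails and a simple naive-Bayes-style word count, purely for illustration):

```python
import math
from collections import Counter

# Tiny hypothetical training data
spam = ["win money now", "free money offer", "claim your free prize"]
ham = ["meeting at noon", "project update attached", "lunch tomorrow at noon"]

def word_counts(docs):
    counts = Counter()
    for doc in docs:
        counts.update(doc.split())
    return counts

spam_counts, ham_counts = word_counts(spam), word_counts(ham)
vocab = set(spam_counts) | set(ham_counts)

def log_likelihood(text, counts, class_docs, total_docs):
    # log P(class) + sum of log P(word | class), with add-one smoothing
    total_words = sum(counts.values())
    score = math.log(class_docs / total_docs)
    for word in text.split():
        score += math.log((counts[word] + 1) / (total_words + len(vocab)))
    return score

def classify(text):
    spam_score = log_likelihood(text, spam_counts, len(spam), 6)
    ham_score = log_likelihood(text, ham_counts, len(ham), 6)
    return "spam" if spam_score > ham_score else "ham"

print(classify("free money"))    # → spam
print(classify("noon meeting"))  # → ham
```

A production filter uses far richer features and more data, but the principle is the same: score the message under each class and pick the likelier one.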
Data is one of the strongest pillars of building a deep learning model: the more precise the data, the better the model learns. To train a deep learning model, you need lots of data, which you can either create yourself or source from the public datasets available across the internet, such as MS COCO, ImageNet, and Open Images.
Sometimes these datasets come in one format while the model you want to train expects another. Converting the data into a usable form becomes a hassle when the dataset is very large. …
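As an illustration of the kind of conversion involved (a hypothetical snippet, not from the original post): MS COCO stores a bounding box as [x_min, y_min, width, height] in pixels, while YOLO-style labels expect [x_center, y_center, width, height] normalized by the image dimensions:

```python
def coco_to_yolo(bbox, img_w, img_h):
    """Convert a COCO [x_min, y_min, w, h] box (in pixels) to a
    YOLO [x_center, y_center, w, h] box normalized to [0, 1]."""
    x_min, y_min, w, h = bbox
    return [
        (x_min + w / 2) / img_w,  # normalized x of box center
        (y_min + h / 2) / img_h,  # normalized y of box center
        w / img_w,                # normalized width
        h / img_h,                # normalized height
    ]

# A 100x50 box with its top-left corner at (200, 150) in a 640x480 image
print(coco_to_yolo([200, 150, 100, 50], 640, 480))
```

Running such a conversion over millions of annotations is exactly where the "hassle" above comes from, which is why tooling that automates it is worth building.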