Step by Step Intent Recognition With BERT

Use BERT for intent recognition

What is BERT?

Bidirectional Encoder Representations from Transformers (BERT) is a technique for NLP (Natural Language Processing) pre-training developed by Google. BERT was created and published in 2018 by Jacob Devlin and his colleagues from Google.

BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. BERT is conceptually simple and empirically powerful.

BERT won the Best Long Paper Award at the 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). On October 25, 2019, Google Search announced that they had started applying BERT models for English language search queries within the US. On December 9, 2019, it was reported that BERT had been adopted by Google Search for over 70 languages.

Model architecture

BERT is a multi-layer bidirectional Transformer encoder. Two model sizes are introduced in the paper. The paper denotes the number of layers (i.e., Transformer blocks) as L, the hidden size as H, and the number of self-attention heads as A.

BERT Base – L=12, H=768, A=12, Total Parameters=110M
BERT Large – L=24, H=1024, A=16, Total Parameters=340M
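
To connect this notation with the checkpoint used later in this article, here is a minimal sketch (the bert_config.json path is the configuration file path declared further down) that reads L, H and A back out of the configuration file:

In [ ]:
import json

#path assumed from the configuration-file step later in this article
bert_config_file = '/home/jupyter-thakur/xv-shared-folders/training/input/uncased_L-12_H-768_A-12/bert_config.json'

with open(bert_config_file) as f:
    config = json.load(f)

#L = number of Transformer blocks, H = hidden size, A = self-attention heads
print("L =", config["num_hidden_layers"])    #12 for BERT-Base
print("H =", config["hidden_size"])          #768 for BERT-Base
print("A =", config["num_attention_heads"])  #12 for BERT-Base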

Data preparation

Import modules

In [1]:
import pandas as pd
import numpy as np
In [2]:
!ls /home/jupyter-thakur/xv-shared-folders/training/input/intent-recognition
test.csv  train.csv  valid.csv

Declare data folder path

In [3]:
inputFolder = '/home/jupyter-thakur/xv-shared-folders/training/input/intent-recognition/'
inputFolder
Out[3]:
'/home/jupyter-thakur/xv-shared-folders/training/input/intent-recognition/'

Read csv files

pandas.read_csv() reads a comma-separated values (CSV) file into a DataFrame.

In [4]:
train = pd.read_csv(inputFolder + "train.csv")
valid = pd.read_csv(inputFolder + "valid.csv")
test = pd.read_csv(inputFolder + "test.csv")
In [5]:
print(f"train: {train.shape} \n{train.head()}" )
print(f"\nvalid: {valid.shape} \n{valid.head()}" )
print(f"\ntest: {test.shape} \n{test.head()}" )
train: (13084, 2) 
                                                text         intent
0   listen to westbam alumb allergic on google music      PlayMusic
1         add step to me to the 50 clásicos playlist  AddToPlaylist
2  i give this current textbook a rating value of...       RateBook
3               play the song little robin redbreast      PlayMusic
4  please add iris dement to my playlist this is ...  AddToPlaylist

valid: (700, 2) 
                                                text         intent
0  i d like to have this track onto my classical ...  AddToPlaylist
1          add the album to my flow español playlist  AddToPlaylist
2      add digging now to my young at heart playlist  AddToPlaylist
3  add this song by too poetic to my piano ballad...  AddToPlaylist
4           add this album to old school death metal  AddToPlaylist

test: (700, 2) 
                                                text          intent
0  add sabrina salerno to the grime instrumentals...   AddToPlaylist
1  i want to bring four people to a place that s ...  BookRestaurant
2  put lindsey cardinale into my hillary clinton ...   AddToPlaylist
3                will it snow in mt on june 13  2038      GetWeather
4     play signe anderson chant music that is newest       PlayMusic

Concatenate the train and valid dataframes

In [6]:
train = pd.concat([train, valid], ignore_index = True)
#print(train.shape)
In [7]:
print(f"train: {train.shape} \n{train.head()}" )
train: (13784, 2) 
                                                text         intent
0   listen to westbam alumb allergic on google music      PlayMusic
1         add step to me to the 50 clásicos playlist  AddToPlaylist
2  i give this current textbook a rating value of...       RateBook
3               play the song little robin redbreast      PlayMusic
4  please add iris dement to my playlist this is ...  AddToPlaylist

Get the unique intents and their counts

pandas.unique(values) - Hash table-based unique. Uniques are returned in order of appearance. This does NOT sort.

In [8]:
#print the unique intents
train.intent.unique()
Out[8]:
array(['PlayMusic', 'AddToPlaylist', 'RateBook', 'SearchScreeningEvent',
       'BookRestaurant', 'GetWeather', 'SearchCreativeWork'], dtype=object)

DataFrame.value_counts(subset=None, normalize=False, sort=True, ascending=False) - Return a Series containing counts of unique rows in the DataFrame.

In [9]:
#print the count of intent
train.intent.value_counts()
Out[9]:
PlayMusic               2014
GetWeather              1996
BookRestaurant          1981
RateBook                1976
SearchScreeningEvent    1952
SearchCreativeWork      1947
AddToPlaylist           1918
Name: intent, dtype: int64

Visualization

In [10]:
import seaborn as sns
import matplotlib.pyplot as plt

Plot intents

Countplot

In [11]:
sns.set()
plt.figure(figsize = (12, 8))
chart = sns.countplot(x = 'intent', data = train, palette='Set1')
chart.set_xticklabels(chart.get_xticklabels(), rotation = 30, horizontalalignment='right', fontweight='light', fontsize='medium')

chart.set_title('Intent Distribution', fontsize = 18)
chart.set_xlabel('Intents', fontsize = 14)
chart.set_ylabel('Counts', fontsize = 14)
plt.show()

Pie chart

In [12]:
plt.figure(figsize = (12, 8))
data = train.intent.value_counts()

explode = (0.1, 0, 0, 0, 0, 0, 0)

ax  = data.plot.pie(autopct = '%1.1f%%', labels = data.index, explode = explode, fontsize = 14)
ax.set_title('Intent Distribution', fontsize = 18)
plt.axis('off')
ax.legend(labels = data.index, loc = "upper left", fontsize = 14, fancybox = True, 
          labelspacing = 1, framealpha = 1, shadow=True, borderpad=1)
plt.show()

Load BERT model

We can download the BERT model and all related files from the links below.

We are going to use the 12/768 (BERT-Base) model, which can be downloaded from this link: https://storage.googleapis.com/bert_models/2020_02_20/uncased_L-12_H-768_A-12.zip

BERT-Base, Uncased has 12 layers, 768 hidden units, 12 attention heads, and 110M parameters. Uncased means that the text has been lowercased before WordPiece tokenization, e.g., John Smith becomes john smith. The Uncased model also strips out any accent markers. Cased means that the true case and accent markers are preserved. Typically, the Uncased model is better unless you know that case information is important for your task.

We can download all 24 BERT model variants from here: https://github.com/google-research/bert

We can also get the models from TensorFlow Hub: https://tfhub.dev/google/collections/bert/1

In [13]:
!ls /home/jupyter-thakur/xv-shared-folders/training/input/uncased_L-12_H-768_A-12
bert_config.json		     bert_model.ckpt.index  vocab.txt
bert_model.ckpt.data-00000-of-00001  bert_model.ckpt.meta

View bert_config.json

In [14]:
!cat /home/jupyter-thakur/xv-shared-folders/training/input/uncased_L-12_H-768_A-12/bert_config.json
{
  "attention_probs_dropout_prob": 0.1,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "type_vocab_size": 2,
  "vocab_size": 30522
}

Declare BERT configuration file paths

In [15]:
import os
In [16]:
modelInputFolder = '/home/jupyter-thakur/xv-shared-folders/training/input/'

bert_model_name="uncased_L-12_H-768_A-12"

bert_ckpt_dir = os.path.join(modelInputFolder, bert_model_name)
bert_ckpt_file = os.path.join(bert_ckpt_dir, "bert_model.ckpt")
bert_config_file = os.path.join(bert_ckpt_dir, "bert_config.json")

print(bert_ckpt_dir)
print(bert_ckpt_file)
print(bert_config_file)
/home/jupyter-thakur/xv-shared-folders/training/input/uncased_L-12_H-768_A-12
/home/jupyter-thakur/xv-shared-folders/training/input/uncased_L-12_H-768_A-12/bert_model.ckpt
/home/jupyter-thakur/xv-shared-folders/training/input/uncased_L-12_H-768_A-12/bert_config.json

Tokenization and vocabulary

View vocab.txt file content with the head command

In [17]:
!head ~/xv-shared-folders/training/input/uncased_L-12_H-768_A-12/vocab.txt
[PAD]
[unused0]
[unused1]
[unused2]
[unused3]
[unused4]
[unused5]
[unused6]
[unused7]
[unused8]

Declare vocab.txt file path

In [ ]:
vocab_file = os.path.join(bert_ckpt_dir, "vocab.txt")
print(vocab_file)

Tokenize text with FullTokenizer

Import FullTokenizer from bert_tokenization

In [21]:
from bert.tokenization.bert_tokenization import FullTokenizer

Tokenization is the process of dividing text into pieces such as words, keywords, phrases, symbols and other elements. These pieces are called tokens.

In [22]:
tokenizer = FullTokenizer(vocab_file)
print(tokenizer)
<bert.tokenization.bert_tokenization.FullTokenizer object at 0x7fd8734df810>

tokenizer.convert_tokens_to_ids converts a sequence of tokens into a sequence of ids (integers), using the tokenizer's vocabulary.

In [23]:
tokens = tokenizer.tokenize("Hello, How are you?")
token_ids = tokenizer.convert_tokens_to_ids(tokens)

print(tokens)
print(token_ids)
['hello', ',', 'how', 'are', 'you', '?']
[7592, 1010, 2129, 2024, 2017, 1029]

Get intent classes

In [24]:
classes = train.intent.unique().tolist()
print(classes)
['PlayMusic', 'AddToPlaylist', 'RateBook', 'SearchScreeningEvent', 'BookRestaurant', 'GetWeather', 'SearchCreativeWork']

Preprocess data

Create class

In [25]:
class IntentDataManager:
    
    def __init__(self):
        print("IntentDataManager class is called")
        
        pass
    
    pass
#class IntentDataManager:

data = IntentDataManager()
print(data)
IntentDataManager class is called
<__main__.IntentDataManager object at 0x7fd8730b2ad0>

Add input parameters to the constructor

Pass train, test, tokenizer, classes and max_seq_len as inputs to the IntentDataManager class

In [26]:
max_seq_len = 192
In [27]:
class IntentDataManager:
    
    def __init__(self, train, test, tokenizer: FullTokenizer, classes, max_seq_len):
        
        #declare tokenizer and classes as a class members
        self.tokenizer = tokenizer
        self.classes = classes
        
        print(f"train shape: {train.shape}")
        print(f"test shape: {test.shape}")
        print(f"tokenizer: {self.tokenizer}")
        print(f"intent_classes: {self.classes}")
        print(f"max_seq_len: {max_seq_len}")
        
        pass
    
    pass
#class IntentDataManager:

data = IntentDataManager(train, test, tokenizer, classes, max_seq_len)
print(data)
train shape: (13784, 2)
test shape: (700, 2)
tokenizer: <bert.tokenization.bert_tokenization.FullTokenizer object at 0x7fd8734df810>
intent_classes: ['PlayMusic', 'AddToPlaylist', 'RateBook', 'SearchScreeningEvent', 'BookRestaurant', 'GetWeather', 'SearchCreativeWork']
max_seq_len: 192
<__main__.IntentDataManager object at 0x7fd8730bb3d0>

Sort train and test data by length of text

Compute the length of each text in the train data

The str.len() function is used to compute the length of each element in the Series/Index.

In [28]:
len_of_text = train['text'].str.len()
print(len_of_text)
0        48
1        42
2        71
3        36
4        52
         ..
13779    41
13780    65
13781    50
13782    60
13783    48
Name: text, Length: 13784, dtype: int64

The pandas sort_values() function sorts a Series or DataFrame in ascending or descending order of the passed column.

In [29]:
#sort text by length
sorted_indexes = len_of_text.sort_values()
print(sorted_indexes)
7713       8
10016      8
9189      10
3195      10
5452      10
        ... 
10951    136
1290     141
2674     149
2603     150
7258     186
Name: text, Length: 13784, dtype: int64
In [30]:
#get the row indexes ordered by text length
sorted_indexes = len_of_text.sort_values().index
print(sorted_indexes)
Int64Index([ 7713, 10016,  9189,  3195,  5452,  4137,  2222, 12186, 10600,
             2869,
            ...
             3324,  5670,  9992, 11405,  9919, 10951,  1290,  2674,  2603,
             7258],
           dtype='int64', length=13784)
In [31]:
#pass dataframe as input in lambda function 
#sort values by length of text index
#and return the new index of sorted_indexes 
sort_by_length_text = lambda input_df: input_df.reindex( 
    input_df['text'].str.len().sort_values().index 
)
print(sort_by_length_text)
<function <lambda> at 0x7fd8734dcd40>
In [32]:
class IntentDataManager:
    
    def __init__(self, train, test, tokenizer: FullTokenizer, classes, max_seq_len):
        
        #declare tokenizer and classes as a class members
        self.tokenizer = tokenizer
        self.classes = classes
        
        '''
        print(f"train shape: {train.shape}")
        print(f"test shape: {test.shape}")
        print(f"tokenizer: {self.tokenizer}")
        print(f"intent_classes: {self.classes}")
        print(f"max_seq_len: {max_seq_len}")
        '''
        
        #sort train and test data by length of text
        train, test = map(sort_by_length_text, [train, test])
        
        print(f"train shape: {train.shape} \n\n {train.head()}")
        print(f"\n\ntest shape: {test.shape} \n\n {test.head()}")
        
        
        pass
    
    pass
#class IntentDataManager:

data = IntentDataManager(train, test, tokenizer, classes, max_seq_len)
print(data)
train shape: (13784, 2) 

              text              intent
7713     play pop           PlayMusic
10016    play eve           PlayMusic
9189   play zvooq           PlayMusic
3195   fimd glory  SearchCreativeWork
5452   play zvooq           PlayMusic


test shape: (700, 2) 

                   text                intent
319     find heat wave  SearchScreeningEvent
139   play jawad ahmad             PlayMusic
395   find movie times  SearchScreeningEvent
296   find movie times  SearchScreeningEvent
629  play the insoc ep             PlayMusic
<__main__.IntentDataManager object at 0x7fd8730bb050>

Preprocess train and test data

To use a pre-trained BERT model, we need to convert the input data into an appropriate format so that each sentence can be sent to the pre-trained model to obtain the corresponding embedding.

BERT is pre-trained with two training tasks:

1. Masked Language Modeling: to predict tokens that have been randomly masked out of the input.
2. Next Sentence Prediction: to determine if the second sentence naturally follows the first sentence.

The [CLS] and [SEP] Tokens:

For a classification task such as our intent recognition, a single vector representing the whole input sentence needs to be fed to a classifier. In BERT, the hidden state of the first token is taken to represent the whole sentence. To achieve this, an additional token has to be added manually to the beginning of the input sentence. The token [CLS] is chosen for this purpose.

In the "next sentence prediction" task, we need to inform the model where does the first sentence end, and where does the second sentence begin. Hence, another artificial token, [SEP], is introduced. If we are trying to train a classifier, each input sample will contain only one sentence (or a single text input). In that case, the [SEP] token will be added to the end of the input text.

In summary, to preprocess the input text data, the first thing we will have to do is to add the [CLS] token at the beginning, and the [SEP] token at the end of each input text.
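
As a quick illustration, here is a minimal sketch (using the tokenizer built above; the sentence is one of the training examples) of what the added tokens look like:

In [ ]:
#illustrative sketch: wrap a single sentence with [CLS] and [SEP]
text = "play the song little robin redbreast"

tokens = ["[CLS]"] + tokenizer.tokenize(text) + ["[SEP]"]
token_ids = tokenizer.convert_tokens_to_ids(tokens)

#in the uncased vocabulary the ids of [CLS] and [SEP] are 101 and 102
print(tokens)
print(token_ids)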

In [33]:
class IntentDataManager:
    
    def __init__(self, train, test, tokenizer: FullTokenizer, classes, max_seq_len):
        
        #declare tokenizer and classes as a class members
        self.tokenizer = tokenizer
        self.classes = classes
        self.max_seq_len = 0
        
         
        #sort train and test data by length of text
        train, test = map(sort_by_length_text, [train, test])
        
     
        #call preprocessData function
        (train_X, train_y), (test_X, test_y)  =  map(self.preprocessData, [train, test])
        
        print(f"train_X shape: {train_X.shape}")
        print(f"train_y shape: {train_y.shape}")
        print(f"\ntrain_X: \n{train_X[:5]}")
        print(f"\ntrain_y: \n{train_y[:5]}")

        print(f"test_X shape: {test_X.shape}")
        print(f"test_y shape: {test_y.shape}")
        print(f"\ntest_X: \n{test_X[:5]}")
        print(f"\ntest_y: \n{test_y[:5]}")
        
        
        pass
        

    def preprocessData(self, df):
        
        x, y = [], []

        for idx, row in df[:5].iterrows():
            text = row['text']
            label = row['intent']
            
            #convert text to tokens
            tokens = self.tokenizer.tokenize(text)
            tokens = ["[CLS]"] + tokens + ["[SEP]"] 
          
            print(f"tokens = {tokens}")
            
            #convert tokens to ids
            token_ids = self.tokenizer.convert_tokens_to_ids(tokens)
            print(f"token_ids = {token_ids}")
    
            #append tokens_ids to x
            x.append(token_ids)

            
            #get maximum sequence length
            self.max_seq_len = max(self.max_seq_len, len(token_ids))
            print(f"max_seq_len = {self.max_seq_len}")
            
            
            print(f"classes = {self.classes}")
            print(f"label = {label}")
            
            
            #get index of class label
            class_label_index = self.classes.index(label)
            print(f"class_label_index = {class_label_index}")
            
            #append index of class label to y
            y.append(class_label_index)
          

            pass
        
        
        arrX = np.array(x)
        arrY = np.array(y)
        print(f"\narrX = {arrX}")
        print(f"\narrY = {arrY}")
        
        return arrX, arrY
            
            
        pass
        
        
    
    pass
#class IntentDataManager:

data = IntentDataManager(train, test, tokenizer, classes, max_seq_len)
print(data)
tokens = ['[CLS]', 'play', 'pop', '[SEP]']
token_ids = [101, 2377, 3769, 102]
max_seq_len = 4
classes = ['PlayMusic', 'AddToPlaylist', 'RateBook', 'SearchScreeningEvent', 'BookRestaurant', 'GetWeather', 'SearchCreativeWork']
label = PlayMusic
class_label_index = 0
tokens = ['[CLS]', 'play', 'eve', '[SEP]']
token_ids = [101, 2377, 6574, 102]
max_seq_len = 4
classes = ['PlayMusic', 'AddToPlaylist', 'RateBook', 'SearchScreeningEvent', 'BookRestaurant', 'GetWeather', 'SearchCreativeWork']
label = PlayMusic
class_label_index = 0
tokens = ['[CLS]', 'play', 'z', '##vo', '##o', '##q', '[SEP]']
token_ids = [101, 2377, 1062, 6767, 2080, 4160, 102]
max_seq_len = 7
classes = ['PlayMusic', 'AddToPlaylist', 'RateBook', 'SearchScreeningEvent', 'BookRestaurant', 'GetWeather', 'SearchCreativeWork']
label = PlayMusic
class_label_index = 0
tokens = ['[CLS]', 'fi', '##md', 'glory', '[SEP]']
token_ids = [101, 10882, 26876, 8294, 102]
max_seq_len = 7
classes = ['PlayMusic', 'AddToPlaylist', 'RateBook', 'SearchScreeningEvent', 'BookRestaurant', 'GetWeather', 'SearchCreativeWork']
label = SearchCreativeWork
class_label_index = 6
tokens = ['[CLS]', 'play', 'z', '##vo', '##o', '##q', '[SEP]']
token_ids = [101, 2377, 1062, 6767, 2080, 4160, 102]
max_seq_len = 7
classes = ['PlayMusic', 'AddToPlaylist', 'RateBook', 'SearchScreeningEvent', 'BookRestaurant', 'GetWeather', 'SearchCreativeWork']
label = PlayMusic
class_label_index = 0

arrX = [list([101, 2377, 3769, 102]) list([101, 2377, 6574, 102])
 list([101, 2377, 1062, 6767, 2080, 4160, 102])
 list([101, 10882, 26876, 8294, 102])
 list([101, 2377, 1062, 6767, 2080, 4160, 102])]

arrY = [0 0 0 6 0]
tokens = ['[CLS]', 'find', 'heat', 'wave', '[SEP]']
token_ids = [101, 2424, 3684, 4400, 102]
max_seq_len = 7
classes = ['PlayMusic', 'AddToPlaylist', 'RateBook', 'SearchScreeningEvent', 'BookRestaurant', 'GetWeather', 'SearchCreativeWork']
label = SearchScreeningEvent
class_label_index = 3
tokens = ['[CLS]', 'play', 'jaw', '##ad', 'ahmad', '[SEP]']
token_ids = [101, 2377, 5730, 4215, 10781, 102]
max_seq_len = 7
classes = ['PlayMusic', 'AddToPlaylist', 'RateBook', 'SearchScreeningEvent', 'BookRestaurant', 'GetWeather', 'SearchCreativeWork']
label = PlayMusic
class_label_index = 0
tokens = ['[CLS]', 'find', 'movie', 'times', '[SEP]']
token_ids = [101, 2424, 3185, 2335, 102]
max_seq_len = 7
classes = ['PlayMusic', 'AddToPlaylist', 'RateBook', 'SearchScreeningEvent', 'BookRestaurant', 'GetWeather', 'SearchCreativeWork']
label = SearchScreeningEvent
class_label_index = 3
tokens = ['[CLS]', 'find', 'movie', 'times', '[SEP]']
token_ids = [101, 2424, 3185, 2335, 102]
max_seq_len = 7
classes = ['PlayMusic', 'AddToPlaylist', 'RateBook', 'SearchScreeningEvent', 'BookRestaurant', 'GetWeather', 'SearchCreativeWork']
label = SearchScreeningEvent
class_label_index = 3
tokens = ['[CLS]', 'play', 'the', 'ins', '##oc', 'ep', '[SEP]']
token_ids = [101, 2377, 1996, 16021, 10085, 4958, 102]
max_seq_len = 7
classes = ['PlayMusic', 'AddToPlaylist', 'RateBook', 'SearchScreeningEvent', 'BookRestaurant', 'GetWeather', 'SearchCreativeWork']
label = PlayMusic
class_label_index = 0

arrX = [list([101, 2424, 3684, 4400, 102])
 list([101, 2377, 5730, 4215, 10781, 102])
 list([101, 2424, 3185, 2335, 102]) list([101, 2424, 3185, 2335, 102])
 list([101, 2377, 1996, 16021, 10085, 4958, 102])]

arrY = [3 0 3 3 0]
train_X shape: (5,)
train_y shape: (5,)

train_X: 
[list([101, 2377, 3769, 102]) list([101, 2377, 6574, 102])
 list([101, 2377, 1062, 6767, 2080, 4160, 102])
 list([101, 10882, 26876, 8294, 102])
 list([101, 2377, 1062, 6767, 2080, 4160, 102])]

train_y: 
[0 0 0 6 0]
test_X shape: (5,)
test_y shape: (5,)

test_X: 
[list([101, 2424, 3684, 4400, 102])
 list([101, 2377, 5730, 4215, 10781, 102])
 list([101, 2424, 3185, 2335, 102]) list([101, 2424, 3185, 2335, 102])
 list([101, 2377, 1996, 16021, 10085, 4958, 102])]

test_y: 
[3 0 3 3 0]
<__main__.IntentDataManager object at 0x7fd8730ddf10>

Add padding to the sequence

The BERT model receives a fixed-length sequence as input. If sentences are shorter than the maximum length, we have to pad them with zeros to make up the length.
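
For reference, the same padding idea can also be expressed with the Keras pad_sequences helper; this is a minimal sketch using two token id sequences taken from the outputs above, while the class below keeps the explicit loop:

In [ ]:
from tensorflow.keras.preprocessing.sequence import pad_sequences

#two token id sequences of different lengths ("play pop" and "play zvooq")
seqs = [[101, 2377, 3769, 102],
        [101, 2377, 1062, 6767, 2080, 4160, 102]]

#pad with zeros at the end ('post') up to a fixed length of 7
padded = pad_sequences(seqs, maxlen = 7, padding = 'post', value = 0)
print(padded)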

In [34]:
class IntentDataManager:
    
    def __init__(self, train, test, tokenizer: FullTokenizer, classes, max_seq_len):
        
        #declare tokenizer and classes as a class members
        self.tokenizer = tokenizer
        self.classes = classes
        self.max_seq_len = 0
        
         
        #sort train and test data by length of text
        train, test = map(sort_by_length_text, [train, test])
        
     
        #call preprocessData function
        (train_X, train_y), (test_X, test_y)  =  map(self.preprocessData, [train, test])
        
        '''
        print(f"train_X shape: {train_X.shape}")
        print(f"train_y shape: {train_y.shape}")
        print(f"\ntrain_X: \n{train_X[:5]}")
        print(f"\ntrain_y: \n{train_y[:5]}")

        print(f"test_X shape: {test_X.shape}")
        print(f"test_y shape: {test_y.shape}")
        print(f"\ntest_X: \n{test_X[:5]}")
        print(f"\ntest_y: \n{test_y[:5]}")
        
        '''
        

        print(f"\nmax_seq_len = {self.max_seq_len}")
        
        #pad x and y to max_seq_len
        train_X = self.padSequences(train_X)
        test_X = self.padSequences(test_X)
                
        print(f"\ntrain_X: \n{train_X[:5]}")
        print(f"\ntest_X: \n{test_X[:5]}")
        
        pass
        

    def preprocessData(self, df):
        
        x, y = [], []

        for idx, row in df[:5].iterrows():
            text = row['text']
            label = row['intent']
            
            #convert text to tokens
            tokens = self.tokenizer.tokenize(text)
            tokens = ["[CLS]"] + tokens + ["[SEP]"] 
          
           
            #convert tokens to ids
            token_ids = self.tokenizer.convert_tokens_to_ids(tokens)
    
            #append tokens_ids to x
            x.append(token_ids)

            
            #get maximum sequence length
            self.max_seq_len = max(self.max_seq_len, len(token_ids))
            
           
            #get index of class label
            class_label_index = self.classes.index(label)
            
            #append index of class label to y
            y.append(class_label_index)
          

            pass
        
        
        arrX = np.array(x)
        arrY = np.array(y)
        
        return arrX, arrY
        pass
        
    def padSequences(self, arr):
        
        #print("arr", arr)
        newArr = []
        for item in arr:
            
            #print("item", item)
            #calculate the shortfall of sequence length
            shortfall = self.max_seq_len - len(item)
            
            #add zero to shortfall
            item = item + [0] * (shortfall)
            #print(newItem)
            
            newArr.append(item)
            pass
        
        return np.array(newArr)
        pass
       
    
    pass
#class IntentDataManager:

data = IntentDataManager(train, test, tokenizer, classes, max_seq_len)
print(data)
max_seq_len = 7

train_X: 
[[  101  2377  3769   102     0     0     0]
 [  101  2377  6574   102     0     0     0]
 [  101  2377  1062  6767  2080  4160   102]
 [  101 10882 26876  8294   102     0     0]
 [  101  2377  1062  6767  2080  4160   102]]

test_X: 
[[  101  2424  3684  4400   102     0     0]
 [  101  2377  5730  4215 10781   102     0]
 [  101  2424  3185  2335   102     0     0]
 [  101  2424  3185  2335   102     0     0]
 [  101  2377  1996 16021 10085  4958   102]]
<__main__.IntentDataManager object at 0x7fd8730da210>

Full class code

In [35]:
class IntentDataManager:
    
    def __init__(self, train, test, tokenizer: FullTokenizer, classes, max_seq_len):
        
        #declare tokenizer and classes as a class members
        self.tokenizer = tokenizer
        self.classes = classes
        self.max_seq_len = 0
        
         
        #sort train and test data by length of text
        train, test = map(sort_by_length_text, [train, test])
     
        #call preprocessData function
        (self.train_X, self.train_y), (self.test_X, self.test_y)  =  map(self.preprocessData, [train, test])
        
        self.max_seq_len = min(self.max_seq_len, max_seq_len)
        self.train_X, self.test_X = map(self.padSequences, [self.train_X, self.test_X])
        
        pass
        

    def preprocessData(self, df):
        
        x, y = [], []

        for idx, row in df.iterrows():
            text = row['text']
            label = row['intent']
            
            #convert text to tokens
            tokens = self.tokenizer.tokenize(text)
            tokens = ["[CLS]"] + tokens + ["[SEP]"] 
          
           
            #convert tokens to ids
            token_ids = self.tokenizer.convert_tokens_to_ids(tokens)
    
            #append tokens_ids to x
            x.append(token_ids)

            
            #get maximum sequence length
            self.max_seq_len = max(self.max_seq_len, len(token_ids))
            
           
            #get index of class label
            class_label_index = self.classes.index(label)
            
            #append index of class label to y
            y.append(class_label_index)
          

            pass
        
        
        arrX = np.array(x)
        arrY = np.array(y)
        
        return arrX, arrY
        pass
    
        
    def padSequences(self, arr):
        
        #print("arr", arr)
        newArr = []
        for item in arr:
            #print("item", item)
            #calculate the shortfall of sequence length
            shortfall = self.max_seq_len - len(item)
            
            #add zero to shortfall
            #zerosToAdd = np.zeros(shortfall, dtype = np.int32)
            #newItem = np.append(item, zerosToAdd)
            item = item + [0] * (shortfall)
            #print(newItem)
            
            newArr.append(np.array(item))
            pass
        
        return np.array(newArr)
        pass
       
    
    pass
#class IntentDataManager:

data = IntentDataManager(train, test, tokenizer, classes, max_seq_len)
#print(data)
In [36]:
data.train_X.shape
Out[36]:
(13784, 38)
In [37]:
print(data.train_X[0])
[ 101 2377 3769  102    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0]
In [38]:
print(data.test_X[0])
[ 101 2424 3684 4400  102    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0]

Create custom model

In [39]:
import tensorflow as tf
from bert.loader import StockBertConfig, map_stock_config_to_params
from bert import BertModelLayer
from tensorflow import keras
from bert.loader import load_stock_weights

Create a function

In [40]:
#create customModel function
def customModel():
    print("Custom model")

    pass

model = customModel()
print(model)
Custom model
None

Read bert configuration file

In [41]:
#create custom_model function
def customModel(max_seq_len, 
                        bert_config_file, 
                        bert_ckpt_file):
    
    #read config file with special reader tf.io.gfile.GFile
    with tf.io.gfile.GFile(bert_config_file, "r") as reader:
        #read data as json string
        customConfig = StockBertConfig.from_json_string(reader.read())
        print(f"customConfig = {customConfig}")
        
        #load all params for our model
        #If a param is not in customConfig, the default value is used
        bert_params = map_stock_config_to_params(customConfig)
        print(f"\nbert_params = {bert_params}")
        
        #print(f"\nbert_params.adapter_size = {bert_params.adapter_size}")
        bert_params.adapter_size = None
        print(f"\nbert_params.adapter_size = {bert_params.adapter_size}")
            
        pass
    
    pass

model = customModel(data.max_seq_len, 
                            bert_config_file, 
                            bert_ckpt_file)
print(model)
customConfig = {'attention_probs_dropout_prob': 0.1, 'hidden_act': 'gelu', 'hidden_dropout_prob': 0.1, 'hidden_size': 768, 'initializer_range': 0.02, 'intermediate_size': 3072, 'max_position_embeddings': 512, 'num_attention_heads': 12, 'num_hidden_layers': 12, 'type_vocab_size': 2, 'vocab_size': 30522, 'ln_type': None, 'embedding_size': None}

bert_params = {'initializer_range': 0.02, 'max_position_embeddings': 512, 'hidden_size': 768, 'embedding_size': None, 'project_embeddings_with_bias': True, 'vocab_size': 30522, 'use_token_type': True, 'use_position_embeddings': True, 'token_type_vocab_size': 2, 'hidden_dropout': 0.1, 'extra_tokens_vocab_size': None, 'project_position_embeddings': True, 'mask_zero': False, 'adapter_size': None, 'adapter_activation': 'gelu', 'adapter_init_scale': 0.001, 'num_heads': 12, 'size_per_head': None, 'query_activation': None, 'key_activation': None, 'value_activation': None, 'attention_dropout': 0.1, 'negative_infinity': -10000.0, 'intermediate_size': 3072, 'intermediate_activation': 'gelu', 'num_layers': 12, 'out_layer_ndxs': None, 'shared_layer': False}

bert_params.adapter_size = None

bert_layer = <bert.model.BertModelLayer object at 0x7fd8730c2790>
None

Create bert model layer from bert configuration parameters

In [ ]:
#create custom_model function
def customModel(max_seq_len, 
                        bert_config_file, 
                        bert_ckpt_file):
    
    #read config file with special reader tf.io.gfile.GFile
    with tf.io.gfile.GFile(bert_config_file, "r") as reader:
        #read data as json string
        customConfig = StockBertConfig.from_json_string(reader.read())
        print(f"customConfig = {customConfig}")
        
        #load all params for our model
        #If a param is not in customConfig, the default value is used
        bert_params = map_stock_config_to_params(customConfig)
        print(f"\nbert_params = {bert_params}")
        
        #print(f"\nbert_params.adapter_size = {bert_params.adapter_size}")
        bert_params.adapter_size = None
        print(f"\nbert_params.adapter_size = {bert_params.adapter_size}")
            
        #create bert layer
        bert_layer = BertModelLayer.from_params(bert_params, name="bert_layer")
        print(f"\nbert_layer = {bert_layer}")
        
        pass
    
    pass

model = customModel(data.max_seq_len, 
                            bert_config_file, 
                            bert_ckpt_file)
print(model)

Create the Keras model and add input and output layers

In [42]:
#create customModel
def customModel(max_seq_len, 
                        bert_config_file, 
                        bert_ckpt_file):
    
    #create input layer
    input_layer = keras.layers.Input(
                          shape=(max_seq_len, ), 
                          dtype='int32', 
                          name="input_layer")
      

    
    #read config file with special reader tf.io.gfile.GFile
    with tf.io.gfile.GFile(bert_config_file, "r") as reader:
        #read data as json string
        customConfig = StockBertConfig.from_json_string(reader.read())
        print(f"customConfig = {customConfig}")
        
        #load all params for our model
        #If a param is not in customConfig, the default value is used
        bert_params = map_stock_config_to_params(customConfig)
        print(f"\nbert_params = {bert_params}")
        
        #print(f"\nbert_params.adapter_size = {bert_params.adapter_size}")
        bert_params.adapter_size = None
        print(f"\nbert_params.adapter_size = {bert_params.adapter_size}")
            
        #create bert layer
        bert_layer = BertModelLayer.from_params(bert_params, name="bert_layer")
        print(f"\nbert_layer = {bert_layer}")
        
        pass
    
    
    #process input through bert_layer
    bert_output = bert_layer(input_layer)
    print(f"bert shape = {bert_output.shape}")

    
    #create model with all layers
    custom_model = keras.Model(inputs = input_layer, outputs = bert_output)
    custom_model.build(input_shape = (None, max_seq_len))

    #load weights
    load_stock_weights(bert_layer, bert_ckpt_file)
    
    return custom_model
    pass

    
    pass

model = customModel(data.max_seq_len, 
                            bert_config_file, 
                            bert_ckpt_file)
print(model)
customConfig = {'attention_probs_dropout_prob': 0.1, 'hidden_act': 'gelu', 'hidden_dropout_prob': 0.1, 'hidden_size': 768, 'initializer_range': 0.02, 'intermediate_size': 3072, 'max_position_embeddings': 512, 'num_attention_heads': 12, 'num_hidden_layers': 12, 'type_vocab_size': 2, 'vocab_size': 30522, 'ln_type': None, 'embedding_size': None}

bert_params = {'initializer_range': 0.02, 'max_position_embeddings': 512, 'hidden_size': 768, 'embedding_size': None, 'project_embeddings_with_bias': True, 'vocab_size': 30522, 'use_token_type': True, 'use_position_embeddings': True, 'token_type_vocab_size': 2, 'hidden_dropout': 0.1, 'extra_tokens_vocab_size': None, 'project_position_embeddings': True, 'mask_zero': False, 'adapter_size': None, 'adapter_activation': 'gelu', 'adapter_init_scale': 0.001, 'num_heads': 12, 'size_per_head': None, 'query_activation': None, 'key_activation': None, 'value_activation': None, 'attention_dropout': 0.1, 'negative_infinity': -10000.0, 'intermediate_size': 3072, 'intermediate_activation': 'gelu', 'num_layers': 12, 'out_layer_ndxs': None, 'shared_layer': False}

bert_params.adapter_size = None

bert_layer = <bert.model.BertModelLayer object at 0x7fd872de5e10>
bert shape = (None, 38, 768)
Done loading 196 BERT weights from: /home/jupyter-thakur/xv-shared-folders/training/input/uncased_L-12_H-768_A-12/bert_model.ckpt into <bert.model.BertModelLayer object at 0x7fd872de5e10> (prefix:bert_layer). Count of weights not found in the checkpoint was: [0]. Count of weights with mismatched shape: [0]
Unused weights from checkpoint: 
	bert/embeddings/token_type_embeddings
	bert/pooler/dense/bias
	bert/pooler/dense/kernel
	cls/predictions/output_bias
	cls/predictions/transform/LayerNorm/beta
	cls/predictions/transform/LayerNorm/gamma
	cls/predictions/transform/dense/bias
	cls/predictions/transform/dense/kernel
	cls/seq_relationship/output_bias
	cls/seq_relationship/output_weights
<tensorflow.python.keras.engine.training.Model object at 0x7fd8730a5b90>

Add hidden layers to the model

On top of the BERT output we take the hidden state of the [CLS] token (the first position of the sequence), apply dropout, pass it through a dense layer with tanh activation, apply dropout again, and finish with a softmax layer over the intent classes.

In [43]:
max_seq_len = 192
In [44]:
#create customModel
def customModel(max_seq_len, 
                        bert_config_file, 
                        bert_ckpt_file):
    
    #create input layer
    input_layer = keras.layers.Input(
                          shape=(max_seq_len, ), 
                          dtype='int32', 
                          name="input_layer")
      

    
    #read config file with special reader tf.io.gfile.GFile
    with tf.io.gfile.GFile(bert_config_file, "r") as reader:
        #read data as json string
        customConfig = StockBertConfig.from_json_string(reader.read())
        print(f"customConfig = {customConfig}")
        
        #load all params for our model
        #If a param is not in customConfig, the default value is used
        bert_params = map_stock_config_to_params(customConfig)
        print(f"\nbert_params = {bert_params}")
        
        #print(f"\nbert_params.adapter_size = {bert_params.adapter_size}")
        bert_params.adapter_size = None
        print(f"\nbert_params.adapter_size = {bert_params.adapter_size}")
            
        #create bert layer
        bert_layer = BertModelLayer.from_params(bert_params, name="bert_layer")
        print(f"\nbert_layer = {bert_layer}")
        
        pass
    
     
    #process input through bert_layer
    bert_output = bert_layer(input_layer)
    print(f"bert shape = {bert_output.shape}")
    
    #take the [CLS] token output (the first position of the sequence)
    hidden_output1 = keras.layers.Lambda(lambda seq: seq[:, 0, :])(bert_output)
    print(f"hidden_output1 = {hidden_output1.shape}")
    
    #dropout layer 1
    dropout_1 = keras.layers.Dropout(0.5)(hidden_output1)
    print(f"dropout_output1 = {hidden_output1.shape}")

    #add hidden layer2
    hidden_output2 = keras.layers.Dense(units=768, activation="tanh")(dropout_1)
    #print(f"hidden_output2 = {hidden_output2.shape}")
    
    #dropout layer 2
    dropout_2 = keras.layers.Dropout(0.5)(hidden_output2)
    print(f"dropout_output2 = {hidden_output2.shape}")
    
    final_output = keras.layers.Dense(units=len(classes), activation="softmax")(dropout_2)
    print(f"final_output = {final_output.shape}")

    
    #create model with all layers
    model = keras.Model(inputs = input_layer, outputs = final_output)
    model.build(input_shape = (None, max_seq_len))
    
    load_stock_weights(bert_layer, bert_ckpt_file)
    
    return model
    pass

    
    pass

model = customModel(data.max_seq_len, 
                            bert_config_file, 
                            bert_ckpt_file)
print(model)
customConfig = {'attention_probs_dropout_prob': 0.1, 'hidden_act': 'gelu', 'hidden_dropout_prob': 0.1, 'hidden_size': 768, 'initializer_range': 0.02, 'intermediate_size': 3072, 'max_position_embeddings': 512, 'num_attention_heads': 12, 'num_hidden_layers': 12, 'type_vocab_size': 2, 'vocab_size': 30522, 'ln_type': None, 'embedding_size': None}

bert_params = {'initializer_range': 0.02, 'max_position_embeddings': 512, 'hidden_size': 768, 'embedding_size': None, 'project_embeddings_with_bias': True, 'vocab_size': 30522, 'use_token_type': True, 'use_position_embeddings': True, 'token_type_vocab_size': 2, 'hidden_dropout': 0.1, 'extra_tokens_vocab_size': None, 'project_position_embeddings': True, 'mask_zero': False, 'adapter_size': None, 'adapter_activation': 'gelu', 'adapter_init_scale': 0.001, 'num_heads': 12, 'size_per_head': None, 'query_activation': None, 'key_activation': None, 'value_activation': None, 'attention_dropout': 0.1, 'negative_infinity': -10000.0, 'intermediate_size': 3072, 'intermediate_activation': 'gelu', 'num_layers': 12, 'out_layer_ndxs': None, 'shared_layer': False}

bert_params.adapter_size = None

bert_layer = <bert.model.BertModelLayer object at 0x7fd841fc9fd0>
bert shape = (None, 38, 768)
hidden_output1 = (None, 768)
dropout_output1 = (None, 768)
dropout_output2 = (None, 768)
final_output = (None, 7)
Done loading 196 BERT weights from: /home/jupyter-thakur/xv-shared-folders/training/input/uncased_L-12_H-768_A-12/bert_model.ckpt into <bert.model.BertModelLayer object at 0x7fd841fc9fd0> (prefix:bert_layer_1). Count of weights not found in the checkpoint was: [0]. Count of weights with mismatched shape: [0]
Unused weights from checkpoint: 
	bert/embeddings/token_type_embeddings
	bert/pooler/dense/bias
	bert/pooler/dense/kernel
	cls/predictions/output_bias
	cls/predictions/transform/LayerNorm/beta
	cls/predictions/transform/LayerNorm/gamma
	cls/predictions/transform/dense/bias
	cls/predictions/transform/dense/kernel
	cls/seq_relationship/output_bias
	cls/seq_relationship/output_weights
<tensorflow.python.keras.engine.training.Model object at 0x7fd8404b6490>

Full custom model

In [45]:
#create customModel
def customModel(max_seq_len, 
                        bert_config_file, 
                        bert_ckpt_file):
    
    #create input layer
    input_layer = keras.layers.Input(
                          shape=(max_seq_len, ), 
                          dtype='int32', 
                          name="input_layer")
      

    
    #read config file with special reader tf.io.gfile.GFile
    with tf.io.gfile.GFile(bert_config_file, "r") as reader:
        #read data as json string
        customConfig = StockBertConfig.from_json_string(reader.read())
        
        #load all params for our model
        #If a param is not in customConfig, the default value is used
        bert_params = map_stock_config_to_params(customConfig)
        
        #print(f"\nbert_params.adapter_size = {bert_params.adapter_size}")
        bert_params.adapter_size = None
            
        #create bert layer
        bert_layer = BertModelLayer.from_params(bert_params, name="bert_layer")
        
        pass
    
     
    #process input through bert_layer
    bert_output = bert_layer(input_layer)
    
    #take the [CLS] token output (the first position of the sequence)
    hidden_output1 = keras.layers.Lambda(lambda seq: seq[:, 0, :])(bert_output)
    
    #dropout layer 1
    dropout_1 = keras.layers.Dropout(0.5)(hidden_output1)

    #add hidden layer2
    hidden_output2 = keras.layers.Dense(units=768, activation="tanh")(dropout_1)
    
    #dropout layer 2
    dropout_2 = keras.layers.Dropout(0.5)(hidden_output2)
    
    final_output = keras.layers.Dense(units=len(classes), activation="softmax")(dropout_2)

    
    #create model with all layers
    model = keras.Model(inputs = input_layer, outputs = final_output)
    model.build(input_shape = (None, max_seq_len))
    
    load_stock_weights(bert_layer, bert_ckpt_file)
    
    return model
    pass

    
    pass

Call model

In [46]:
model = customModel(data.max_seq_len, bert_config_file, bert_ckpt_file)
print(model)
Done loading 196 BERT weights from: /home/jupyter-thakur/xv-shared-folders/training/input/uncased_L-12_H-768_A-12/bert_model.ckpt into <bert.model.BertModelLayer object at 0x7fd8730e97d0> (prefix:bert_layer_2). Count of weights not found in the checkpoint was: [0]. Count of weights with mismatched shape: [0]
Unused weights from checkpoint: 
	bert/embeddings/token_type_embeddings
	bert/pooler/dense/bias
	bert/pooler/dense/kernel
	cls/predictions/output_bias
	cls/predictions/transform/LayerNorm/beta
	cls/predictions/transform/LayerNorm/gamma
	cls/predictions/transform/dense/bias
	cls/predictions/transform/dense/kernel
	cls/seq_relationship/output_bias
	cls/seq_relationship/output_weights
<tensorflow.python.keras.engine.training.Model object at 0x7fd82e8dfe50>

Model summary

In [47]:
model.summary()
Model: "model_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_layer (InputLayer)     [(None, 38)]              0         
_________________________________________________________________
bert_layer (BertModelLayer)  (None, 38, 768)           108890112 
_________________________________________________________________
lambda_1 (Lambda)            (None, 768)               0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 768)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 768)               590592    
_________________________________________________________________
dropout_3 (Dropout)          (None, 768)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 7)                 5383      
=================================================================
Total params: 109,486,087
Trainable params: 109,486,087
Non-trainable params: 0
_________________________________________________________________
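
As a sanity check on the classification head, the parameter counts of the two Dense layers in the summary can be reproduced by hand (a small sketch, assuming the shapes shown above):

In [ ]:
#dense layer mapping 768 -> 768: weight matrix plus biases
print(768 * 768 + 768)            #590592, matches dense_2

#final dense layer mapping 768 -> 7 intent classes
print(768 * 7 + 7)                #5383, matches dense_3

#total parameters = BERT layer + the two dense layers
print(108890112 + 590592 + 5383)  #109486087, matches Total params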

Plot model

In [52]:
from tensorflow.keras.utils import plot_model
In [53]:
plot_model(model, to_file='bert_model.png')
Out[53]:
(model architecture diagram saved to bert_model.png)

Compile model

In [48]:
model.compile(
    optimizer=keras.optimizers.Adam(1e-5),
    
    # Loss function to minimize
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
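    #note: the final Dense layer already applies softmax, so from_logits=False would
    #match it exactly; with from_logits=True the probabilities are treated as logits,
    #which is why the loss values reported below stay near 1.17 instead of approaching 0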
    
    # List of metrics to monitor
    metrics=[keras.metrics.SparseCategoricalAccuracy(name="acc")]
)

Train the model

The fit() function trains the model by slicing the data into batches of size batch_size and repeatedly iterating over the entire dataset for a given number of epochs. With validation_split = 0.1, the last 10% of the samples are held out for validation.

In [49]:
x = data.train_X
print(x[0])
[ 101 2377 3769  102    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0]
In [50]:
y = data.train_y
print(y[:5])
[0 0 0 6 0]
In [51]:
history = model.fit(
  x, 
  y,
  validation_split = 0.1,
  batch_size = 16,
  shuffle = True,
  epochs = 5
)
Epoch 1/5
776/776 [==============================] - 1577s 2s/step - loss: 1.3077 - acc: 0.8722 - val_loss: 1.1887 - val_acc: 0.9775
Epoch 2/5
776/776 [==============================] - 1574s 2s/step - loss: 1.1848 - acc: 0.9820 - val_loss: 1.1691 - val_acc: 0.9964
Epoch 3/5
776/776 [==============================] - 1572s 2s/step - loss: 1.1805 - acc: 0.9856 - val_loss: 1.1698 - val_acc: 0.9956
Epoch 4/5
776/776 [==============================] - 1573s 2s/step - loss: 1.1783 - acc: 0.9877 - val_loss: 1.1684 - val_acc: 0.9971
Epoch 5/5
776/776 [==============================] - 1573s 2s/step - loss: 1.1763 - acc: 0.9895 - val_loss: 1.1715 - val_acc: 0.9935

Print history

In [56]:
history.history
Out[56]:
{'loss': [1.3076690435409546,
  1.184829592704773,
  1.180479884147644,
  1.1782810688018799,
  1.176334023475647],
 'acc': [0.8722289204597473,
  0.9820233583450317,
  0.9855703115463257,
  0.9876662492752075,
  0.9895203709602356],
 'val_loss': [1.18868088722229,
  1.1690980195999146,
  1.169758677482605,
  1.1684192419052124,
  1.1715320348739624],
 'val_acc': [0.9775199294090271,
  0.9963741898536682,
  0.9956490397453308,
  0.9970993399620056,
  0.9934735298156738]}

Visualize loss and accuracy

Plot loss during training

In [64]:
plt.figure(figsize = (10, 6))
plt.plot(history.history['loss'], label='train')
plt.plot(history.history['val_loss'], label='validation')
plt.legend(['train', 'validation'])
plt.title('Loss during training')
plt.show();

Plot accuracy during training

In [65]:
plt.figure(figsize = (10, 6))
plt.plot(history.history['acc'], label = 'train')
plt.plot(history.history['val_acc'], label = 'validation')
plt.legend(['train', 'validation'])
plt.title('Accuracy during training')
plt.show();

Evaluate the model

Evaluate the model on the train and test data with evaluate()

In [67]:
train_loss, train_accuracy = model.evaluate(data.train_X, data.train_y)
test_loss, test_accuracy = model.evaluate(data.test_X, data.test_y, batch_size = 16)
print("train_accuracy:", train_accuracy)
print("test_accuracy:", test_accuracy)
431/431 [==============================] - 458s 1s/step - loss: 1.1750 - acc: 0.9904
44/44 [==============================] - 23s 523ms/step - loss: 1.1911 - acc: 0.9743
train_accuracy: 0.990351140499115
test_accuracy: 0.9742857217788696

Predict

In [74]:
#predict test data
y_pred = model.predict(data.test_X).argmax(axis = -1)
y_pred.shape
Out[74]:
(700,)
In [76]:
y_pred[:10]
Out[76]:
array([6, 0, 3, 3, 6, 0, 0, 0, 1, 5])
In [77]:
for label in y_pred[:10]:
    print(classes[label])
    pass
SearchCreativeWork
PlayMusic
SearchScreeningEvent
SearchScreeningEvent
SearchCreativeWork
PlayMusic
PlayMusic
PlayMusic
AddToPlaylist
GetWeather

Compute classification report

In [78]:
from sklearn.metrics import classification_report
In [79]:
print(classification_report(data.test_y, y_pred, target_names = classes))
                      precision    recall  f1-score   support

           PlayMusic       1.00      0.93      0.96        86
       AddToPlaylist       0.99      1.00      1.00       124
            RateBook       1.00      1.00      1.00        80
SearchScreeningEvent       1.00      0.90      0.95       107
      BookRestaurant       0.99      1.00      0.99        92
          GetWeather       1.00      0.99      1.00       104
  SearchCreativeWork       0.87      1.00      0.93       107

            accuracy                           0.97       700
           macro avg       0.98      0.97      0.98       700
        weighted avg       0.98      0.97      0.97       700

Predict intent with new sentences

In [82]:
sentences = [
    "Play party song",
    "Dance song",
    "How is weather today"
]

#tokenize sentences
tokens = map(tokenizer.tokenize, sentences)

#add [CLS] and [SEP] Tokens  
tokens = map(lambda token: ["[CLS]"] + token + ["[SEP]"], tokens)

#convert each tokens to ids
token_ids = list(map(tokenizer.convert_tokens_to_ids, tokens))

#add padding
token_ids = map(lambda tids: tids + [0] * (data.max_seq_len-len(tids)), token_ids)
token_ids = np.array(list(token_ids))

#predict
predictions = model.predict(token_ids).argmax(axis = -1)

for text, label in zip(sentences, predictions):
    print("Text:", text, "\nIntent:", classes[label])
    print()
Text: Play party song 
Intent: PlayMusic

Text: Dance song 
Intent: SearchCreativeWork

Text: How is weather today 
Intent: GetWeather
