Topic covered
Introduction to Natural Language Processing


Let Us Revise

Both human languages and computer languages have syntax, semantics and a specific structure.
However, human languages also have morphology and context sensitivity, and they convey intent even when mispronounced, whereas computer languages have no morphology and no context sensitivity.
NLP models mostly use text normalization to get to the intent of a message.
Text normalization is the process of reducing the variations in a text's word forms to a common form when the variations mean the same thing.
Text normalization uses the following steps: sentence segmentation, tokenization, removing punctuation and stop words, case normalization, and stemming or lemmatization.
Sentence segmentation is the process of dividing the whole text into smaller components, i.e. individual sentences.
The whole collection of text from all the documents being processed is called the corpus.
Tokenization is the process of splitting individual sentences into smaller units called tokens (a word, a phrase, a number or a symbol).
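The normalization steps listed above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the stop-word list, the function names and the suffix-stripping rule are assumptions made for this example, and a real project would normally use a proper stemmer or lemmatizer from an NLP library.

```python
import re

# Hypothetical, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "the", "is", "are", "and", "of", "to", "in"}

def segment_sentences(text):
    """Sentence segmentation: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def tokenize(sentence):
    """Tokenization: words/numbers and single punctuation marks become tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def naive_stem(word):
    """Toy suffix stripping that stands in for a real stemmer or lemmatizer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def normalize(text):
    """Run the steps in order and return one list of tokens per sentence."""
    normalized = []
    for sentence in segment_sentences(text):
        tokens = tokenize(sentence)
        # Remove punctuation tokens, then stop words (compared case-insensitively).
        tokens = [t for t in tokens if re.fullmatch(r"\w+", t)]
        tokens = [t for t in tokens if t.lower() not in STOP_WORDS]
        # Case normalization.
        tokens = [t.lower() for t in tokens]
        # Stemming (toy version).
        normalized.append([naive_stem(t) for t in tokens])
    return normalized

print(normalize("The cats are playing. Dogs barked loudly!"))
```

Here "cats" and "playing" are reduced to "cat" and "play", which shows how different word forms collapse to a common form.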
TF-IDF (Term Frequency-Inverse Document Frequency) is a statistical measure that evaluates how relevant a word is to a document in a collection of documents.
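As a rough sketch, TF-IDF can be computed by hand as shown below. The example assumes one common textbook variant: term frequency is the raw count of the word in the document, and the inverse document frequency is the base-10 logarithm of (number of documents / number of documents containing the word). Libraries often use other weightings, and the function names and the small corpus are made up for illustration.

```python
import math

def term_frequency(term, doc_tokens):
    """Raw count of the term in one tokenized document (assumed TF variant)."""
    return doc_tokens.count(term)

def inverse_document_frequency(term, corpus):
    """log10(total documents / documents containing the term); 0 if absent."""
    containing = sum(1 for doc in corpus if term in doc)
    if containing == 0:
        return 0.0
    return math.log10(len(corpus) / containing)

def tf_idf(term, doc_tokens, corpus):
    """Relevance of the term to one document within the corpus."""
    return term_frequency(term, doc_tokens) * inverse_document_frequency(term, corpus)

# Hypothetical corpus of three already-normalized documents.
corpus = [
    ["nlp", "models", "use", "text", "normalization"],
    ["text", "normalization", "reduces", "word", "variations"],
    ["tokens", "are", "small", "units", "of", "text"],
]

for term, doc in [("text", corpus[0]), ("normalization", corpus[0]), ("tokens", corpus[2])]:
    print(term, round(tf_idf(term, doc, corpus), 4))
```

A word such as "text" that occurs in every document scores 0, while a word that occurs in only one document, such as "tokens", scores highest. This matches the idea that rare words tell us more about a particular document.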


Test your Knowledge
Chapter-based MCQs

MCQs Test

Chapter-based Questions and Answers

Questions with Answers on Natural Language Processing (Courtesy - CBSE)

TF-IDF (Term Frequency-Inverse Document Frequency) based Questions

Competency-based Questions: Evaluation

1. An AI model has been developed to filter spam mail. In spam detection, it is acceptable if a spam mail remains undetected (false negative), but it is a problem if we miss a critical mail because it is classified as spam (false positive). In this situation, false positives should be as low as possible.
Thus, here _______ is more vital than recall.

(a) Accuracy
(b) F1 score
(c) Precision
(d) Confusion matrix


Answer - (c) Precision

2. An AI model has been developed to detect credit card fraud. The aim is not to miss any fraudulent transaction. Therefore, we want false negatives to be as low as possible. In such situations, we can compromise on precision, but ____ should be high.

(a) Accuracy
(b) Recall
(c) F1 score
(d) Confusion matrix


Answer - (b) Recall
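To see why the two questions above favour different metrics, it can help to compute precision and recall from confusion matrix counts. The sketch below uses the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN)); the counts for the spam filter and the fraud detector are invented purely for illustration.

```python
def precision(tp, fp):
    """Share of positive predictions that were correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Share of actual positives that were found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Hypothetical spam filter: very few false positives, many false negatives.
spam = {"tp": 90, "fp": 2, "fn": 30}
# Hypothetical fraud detector: very few false negatives, many false positives.
fraud = {"tp": 48, "fp": 40, "fn": 2}

for name, c in [("spam filter", spam), ("fraud detector", fraud)]:
    print(
        name,
        "precision =", round(precision(c["tp"], c["fp"]), 3),
        "recall =", round(recall(c["tp"], c["fn"]), 3),
        "F1 =", round(f1_score(c["tp"], c["fp"], c["fn"]), 3),
    )
```

With these made-up counts, the spam filter has high precision because few genuine mails are wrongly flagged, while the fraud detector has high recall because few frauds are missed.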