Topic covered
Introduction to Natural Language Processing

Chapter Summary Notes
  • Let Us Revise

    1. Both human languages and computer languages have syntax, semantics and a specific structure.
    2. Human languages, however, have morphology and context sensitivity, and convey intent even when mispronounced, whereas computer languages have no morphology and no context sensitivity.
    3. NLP models mostly use text normalization to get to the intent of a message.
    4. Text normalization is the process of reducing the variations in a text's word forms to a common form when the variations mean the same thing.
    5. Text normalization uses the following steps: sentence segmentation, tokenization, removing punctuation and stop words, case normalization, and stemming or lemmatisation.
    6. Sentence segmentation is the process of dividing the whole text into smaller components, i.e. individual sentences.
    7. The whole collection of text from all the documents being processed is called the corpus.
    8. Tokenization is the process of splitting individual sentences into smaller units called tokens (a word, a phrase, a number or a symbol).
    9. TF-IDF (Term Frequency-Inverse Document Frequency) is a statistical measure that evaluates how relevant a word is to a document in a collection of documents.
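The revision points above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the stop-word list is a tiny made-up sample, sentence segmentation is a naive split on end punctuation, stemming and lemmatisation are left out, and TF-IDF uses one common variant (term frequency times log base 10 of the inverse document frequency). The function names `segment_sentences`, `normalize` and `tf_idf` are hypothetical helpers, not part of any library.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "are", "and", "of", "to", "in", "on"}


def segment_sentences(text):
    """Naive sentence segmentation: split after '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def normalize(sentence):
    """Case-normalize, tokenize, then drop punctuation and stop words."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf(documents):
    """Return one {term: score} dict per tokenized document.

    TF  = count of the term in the document / tokens in the document
    IDF = log10(number of documents / documents containing the term)
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    scores = []
    for tokens in documents:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log10(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


text = "The cat sat on the mat. The dog sat on the log!"
docs = [normalize(s) for s in segment_sentences(text)]
print(docs)
for row in tf_idf(docs):
    print(row)
```

In this sketch each sentence is treated as one document. The word "sat" appears in both, so its IDF is log10(2/2) = 0 and its TF-IDF score is zero, while words that appear in only one sentence get a positive score.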






Test your Knowledge

Competency based Questions -- Evaluation

1. An AI model has been developed to filter spam mail. In spam detection, it is acceptable if a spam mail remains undetected (false negative), but what if we miss a critical mail because it is classified as spam (false positive)? In this situation, false positives should be as low as possible.
Thus, here _______ is more vital than recall.
  1. Accuracy
  2. F1 score
  3. Precision
  4. Confusion matrix

Answer - (3) Precision
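To see why false positives drive this choice, here is a minimal sketch of the standard precision formula, TP / (TP + FP). The counts are hypothetical and chosen only for illustration; `precision` is a helper defined here, not a library function.

```python
def precision(true_positives, false_positives):
    """Precision = TP / (TP + FP): of the mails flagged as spam, the share that really were spam."""
    flagged = true_positives + false_positives
    return true_positives / flagged if flagged else 0.0


# Hypothetical spam filter: 90 spam mails correctly flagged,
# 10 genuine mails wrongly flagged as spam (false positives).
print(precision(true_positives=90, false_positives=10))  # 0.9

# With no false positives, precision reaches its maximum.
print(precision(true_positives=90, false_positives=0))   # 1.0
```

False negatives do not appear in the formula at all, which is why a filter that lets some spam through can still have high precision.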


2. An AI model has been developed to detect credit card fraud. The aim is not to miss any fraudulent transaction, so we want false negatives to be as low as possible. In such situations, we can compromise on precision, but ____ should be high.
  1. Accuracy
  2. Recall
  3. F1 score
  4. Confusion matrix

Answer - (2) Recall
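The link between false negatives and recall can be sketched with the standard formula, TP / (TP + FN). The counts below are hypothetical, and `recall` is a helper defined here for illustration only.

```python
def recall(true_positives, false_negatives):
    """Recall = TP / (TP + FN): of the actual fraud cases, the share the model caught."""
    actual_positives = true_positives + false_negatives
    return true_positives / actual_positives if actual_positives else 0.0


# Hypothetical fraud detector: 80 fraudulent transactions caught,
# 20 fraudulent transactions missed (false negatives).
print(recall(true_positives=80, false_negatives=20))  # 0.8

# Missing fewer frauds raises recall, whatever the number of false alarms.
print(recall(true_positives=95, false_negatives=5))   # 0.95
```

False positives do not enter the formula, so a detector can raise many false alarms (low precision) and still score high on recall.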