# NLP platforms and toolkits

We'll be using NLTK, which has its origins in computational linguistics.
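As a quick, hedged sketch of what working with NLTK looks like, here is a minimal tokenization example (the sample sentence is invented, and the `punkt` resource name can vary across NLTK versions):

```python
import nltk
from nltk.tokenize import word_tokenize

# The Punkt tokenizer models ship separately from the library; download once.
nltk.download("punkt")

# Invented sample sentence for illustration.
text = "NLTK has its origins in computational linguistics."
print(word_tokenize(text))
# ['NLTK', 'has', 'its', 'origins', 'in', 'computational', 'linguistics', '.']
```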
## Principles of NLTK

* Simplicity
* Consistency
* Extensibility
* Modularity

NLTK is not necessarily SOTA (state-of-the-art) or optimized for runtime performance.
## Other toolkits

* Scikit-learn - a general ML library; a lot of NLP tasks can be modeled as ML tasks (see the sketch after this list)
* Gensim - word embeddings and "topic modeling for humans"
* spaCy - "Industrial-strength Natural Language Processing"
* Textacy - built on top of spaCy
* TextBlob - "Simplified Text Processing", built on top of NLTK
* Deep learning: TensorFlow, PyTorch
* Chatbot development: RASA
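A minimal, hedged sketch of the idea that NLP tasks can be modeled as ML tasks, using scikit-learn as a bag-of-words spam classifier (the toy texts and labels are invented for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data, invented for illustration.
texts = ["free money now", "win a prize today", "meeting at noon", "lunch tomorrow?"]
labels = ["spam", "spam", "ham", "ham"]

# Bag-of-words features feeding a Naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

print(model.predict(["free prize", "see you at the meeting"]))
# e.g. ['spam' 'ham']
```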
# Introduction to evaluation

Before we build our system we should think about how we're going to **measure its performance**: **how do we know when we have done a good job?**
### In a classification problem

In a spam filter we have a binary (yes/no) problem: is the message spam or not?
Accuracy = Correct Predictions / Total Number of Predictions

Accuracy can be misleading on unbalanced data: if one class forms a large majority, a classifier that always predicts that class scores high accuracy without learning anything.
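A hedged numerical sketch of that trap (the 95/5 split is an invented example):

```python
# Invented example: 95 ham messages and 5 spam messages.
y_true = ["ham"] * 95 + ["spam"] * 5

# A degenerate "classifier" that always predicts the majority class.
y_pred = ["ham"] * 100

correct = sum(t == p for t, p in zip(y_true, y_pred))
print(correct / len(y_true))  # 0.95 -- high accuracy, yet every spam message is missed
```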
#### Another example classification problem

Predict whether a CT scan shows a tumor or not.
Tumors are rare events, so our classes are unbalanced. The cost of missing a tumor is much higher than that of a false positive, so **accuracy is not a good metric**.

This is where confusion matrices come into play: a confusion matrix compares predicted values with actual values (the ground truth).

![[Pasted image 20260125181434.png]]
Accuracy = (TP + TN) / (TP + TN + FP + FN)

Recall = TP / (TP + FN)

Precision = TP / (TP + FP)
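A minimal sketch computing these three metrics by hand from a pair of label lists (the labels are invented; 1 = tumor, 0 = no tumor):

```python
# Invented ground-truth and predicted labels.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 1, 0, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives
fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall = tp / (tp + fn)      # fraction of actual tumors we caught
precision = tp / (tp + fp)   # fraction of flagged scans that really had tumors
print(accuracy, recall, precision)  # 0.8 0.666... 0.666...
```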
**Recall** and **Precision** better illustrate performance on unbalanced datasets, and they give us a baseline against which we can compare future iterations of our system built with different algorithms or approaches.

![[Pasted image 20260125184209.png]]