Introduction to NLP

📜 Course Description

This course provides an introduction to natural language processing (NLP), a subfield of artificial intelligence concerned with the interaction between computers and human language. The course will cover the fundamental concepts and techniques of NLP, including text pre-processing, morphology, syntax, semantics, text classification, topic modeling, and word embeddings.

♾️ Learning Goals

By the end of this course, students will be able to:

  • Understand key concepts, models, and challenges in natural language processing

  • Implement and apply fundamental algorithms in NLP

  • Evaluate and use software systems for various NLP tasks

  • Understand current approaches, datasets, and systems for various NLP tasks

  • Build NLP models for various applications

📚 Textbook

  • Jurafsky, D., & Martin, J. H. (2019). Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition (3rd ed.). Upper Saddle River, NJ: Pearson Education.

  • https://web.stanford.edu/~jurafsky/slp3/

Reference Books

  • Strunk Jr., W., & White, E. B. (2000). The elements of style (4th ed.). New York: Allyn and Bacon.

  • Gelbukh, A. (Ed.). (2018). Computational linguistics and intelligent text processing: 18th international conference, CICLing 2017, New Delhi, India, February 19-25, 2017, revised selected papers (Vol. 10573). Cham: Springer International Publishing.

  • Mitchell, T. M. (1997). Machine learning (1st ed.). New York: McGraw-Hill Science/Engineering/Math.

  • Bird, S., Klein, E., & Loper, E. (2009). Natural language processing with Python (the NLTK Book). Sebastopol, CA: O'Reilly Media.

🏆 Grading

  • Participation: 10%

  • Midterm: 30%

  • Term Project: 60%

🧠 Term Project

  • For the term project, students will choose a real-world dataset and build a natural language processing system that performs a clearly defined task on it.

  • The project will be presented in the form of a poster at the end of the semester.

🎲 Whole Game of NLP

In this course, we apply the concept of “teaching the whole game” to natural language processing (NLP). Just as newcomers to baseball start by playing the game before they know every rule, students should be able to start working with NLP from a general sense of the field and learn more rules and details as they progress.

Unlike many NLP courses that focus only on the technical aspects of the field, we take a practical approach to NLP. We show how to use fundamental concepts and techniques of NLP, including text pre-processing, morphology, syntax, semantics, text classification, topic modeling, and word embeddings, to solve real-world problems.

We begin by introducing students to a complete, working, and very usable NLP system that uses simple and expressive tools. Then, we gradually introduce more advanced concepts, showing students how to apply them to real-world datasets.
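To give a rough sense of what a small but complete NLP system can look like, here is a minimal sketch of an end-to-end text classifier. It is not the system used in the course: it is an illustrative bag-of-words Naive Bayes classifier written with the Python standard library only, and the function names (`tokenize`, `train`, `predict`) and the toy sentences are assumptions made up for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Pre-processing: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Collect the counts needed by a multinomial Naive Bayes classifier."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of documents
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability under the model."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Log prior plus add-one (Laplace) smoothed log likelihoods.
        score = math.log(label_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, invented for illustration only.
TOY_DATA = [
    ("I loved this movie, it was wonderful", "pos"),
    ("What a great and enjoyable film", "pos"),
    ("I hated this movie, it was terrible", "neg"),
    ("What a boring and awful film", "neg"),
]

model = train(TOY_DATA)
print(predict(model, "a wonderful and great film"))
print(predict(model, "a terrible and boring movie"))
```

Even this tiny sketch touches several topics from the course description: pre-processing (`tokenize`), a simple statistical model (`train`), and text classification (`predict`). A realistic system would need far more data and a proper evaluation on held-out examples.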

This approach has several advantages. It makes NLP more accessible and understandable to students, and it allows them to immediately start using NLP to solve their own problems. Furthermore, it helps students learn the whole game of NLP, not just individual concepts or techniques. By covering both the theory and practice of NLP, students will be well-prepared for further study or practical applications in the field.

Harvard Professor David Perkins’s book, Making Learning Whole, popularized the idea of “teaching the whole game.”

🗓️ Table of Contents