Descriptive statistics

INFO 555: Applied Natural Language Processing

Most web data today consists of unstructured text. This course will cover the fundamental knowledge necessary to organize such texts, search them in a meaningful way, and extract relevant information from them. This course will teach natural language processing through the design and development of end-to-end natural language understanding applications, including sentiment analysis (e.g., is this review positive or negative?), information extraction (e.g., extracting named entities and their relations from text), and question answering (retrieving exact answers to natural language questions such as "What is the capital of France?" from large document collections). We will use several natural language processing toolkits, such as NLTK and Stanford's CoreNLP. The main programming language used in the course will be Python, but code written in Java or Scala will be accepted as well. Graduate-level requirements include implementing more complex, state-of-the-art algorithms for the three proposed projects. This will require additional reading of conference papers and journal articles.

Course Credits
3

INFO 523: Data Mining and Discovery

This course will introduce students to the concepts and techniques of data mining for knowledge discovery. It includes methods developed in the fields of statistics, large-scale data analytics, machine learning, pattern recognition, database technology and artificial intelligence for automatic or semi-automatic analysis of large quantities of data to extract previously unknown interesting patterns. Topics include understanding varieties of data, data preprocessing, classification, association and correlation rule analysis, cluster analysis, outlier detection, and data mining trends and research frontiers. We will use software packages for data mining, explaining the underlying algorithms and their use and limitations. The course includes laboratory exercises, with data mining case studies using data from many different resources such as social networks, linguistics, geo-spatial applications, marketing and/or psychology.

Course Credits
3

INFO 521: Introduction to Machine Learning

Machine learning describes the development of algorithms that can modify their internal parameters (i.e., "learn") to recognize patterns and make decisions based on example data. These examples can be provided by a human, or they can be gathered automatically as part of the learning algorithm itself. This course will introduce the fundamentals of machine learning, describe how to implement several practical methods for pattern recognition, feature selection, clustering, and decision making for reward maximization, and provide a foundation for the development of new machine learning algorithms.

Course Credits
3