Showing posts with label nlp. Show all posts

Introduction to Information Retrieval Review

Introduction to Information Retrieval

I am a big fan of the authors' 1999 book on Statistical Natural Language Processing, and I was thrilled when I found this new book online; just search for "Information Retrieval" on Google.
In these two books, they describe the theory behind a vast toolbox which can be used to construct new tools/products for the Internet. Now I can go back to them when the need arises.
For starters, I appreciate the detailed theoretical explanations of topics that I could not find in other texts, and the references to related work are especially helpful. One of the other books I read was Information Retrieval by Grossman, which is an older book with a more condensed style than this one. Grossman's discussion of clustering was more high-level and referenced a few more papers that I found useful. That increased my interest in reading through the chapters here, which offer greater detail.
Before I felt I could place each topic in its appropriate context, I had to spend six months reading both books, playing with code and finding software packages, searching the research literature, reading papers and other books, and then cycling back to the books. Here are some suggestions for things I'd like to see:
1. A set of recommended programming tools: some books on Perl, such as "Advanced Perl Programming" by Simon Cozens (O'Reilly; see the chapter "Natural Language Tools", pages 149-171), give you a very "quick & dirty" introduction to maybe 20-30% of the concepts in these two books, along with ways to implement and play around with them. Although Perl has many natural language processing tools, the Cozens book cuts to the chase, explains which are the best tools, and shows you how to use them. I think knowing such shortcuts aids in learning how to apply and improve on the techniques. Even complex and sophisticated topics are more likely to make it out into the real world if they are easy to play with.
2. More data/examples on what does/doesn't work with end users: numbers, graphs, and charts are all good stuff. I always appreciate it when the authors reference quantitative comparisons, real-world products, and the history of the Internet. One of the reasons I had to consult the research literature was to broaden my understanding of quantitative comparisons between different techniques involving end users, which were typically done in the context of studies of complete systems that users could try out.
Thanks,
-Sri

Class-tested and coherent, this groundbreaking new textbook teaches web-era information retrieval, including web search and the related areas of text classification and text clustering from basic concepts. Written from a computer science perspective by three leading experts in the field, it gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Although originally designed as the primary text for a graduate or advanced undergraduate course in information retrieval, the book will also create a buzz for researchers and professionals alike.

Natural Language Processing for Online Applications: Text Retrieval, Extraction, and Categorization (Natural Language Processing, 5) Review

Natural Language Processing for Online Applications: Text Retrieval, Extraction, and Categorization (Natural Language Processing, 5)

Work on natural language processing has been going on for at least
thirty years. In the past, most natural language processing (NLP)
applications were in the research realm. The rapid
increase in computer processing power and disk storage capacity
has moved NLP from research into the area of applied science.
This gives NLP the feel of a new and vibrant area. Progress is
being made rapidly, but the research literature can be difficult
for someone who has no experience with NLP. Simply learning the
terminology can be time consuming.
This book by Jackson and Moulinier provides an excellent overview
of several sub-areas of NLP applied to natural language text.
Both Jackson and Moulinier have been involved in implementing
NLP applications in a commercial context, so there is
some concentration on applying NLP in real applications, rather
than artificial contexts like the Message Understanding Conference
data set.
I purchased this book for its coverage of Information Extraction.
Along with this book I read a number of papers from the research
literature. One I found particularly interesting was
a paper on the FASTUS NLP system developed by researchers at
SRI. I was very happy to see that FASTUS and finite automata
approaches were covered in more detail in this book.
For my purposes I would have liked a book that moved from
an overview of various NLP applications to more implementation
detail. For example, while I think that I understand
pushdown automata from working on parsers for compilers,
I don't fully understand them in the NLP context. This book
did not go into enough detail to make this clear.
I cannot really offer this lack of detail as a criticism,
since I don't believe that it was the authors' intention to
provide this level of detail. Their objective is to provide
a detailed overview, and I think that they succeeded in doing
this. A book that provided this overview along with
details on implementation would be much longer, perhaps the
size of Manning and Schutze's excellent book "Foundations of
Statistical Natural Language Processing" (Manning and Schutze provide
a great deal of detail, but they do not cover information
extraction).

Foundations of Statistical Natural Language Processing Review

Foundations of Statistical Natural Language Processing

This is the best book I've ever read on computational linguistics. It should be ideal both for linguists who want to learn about statistical language processing and for those building language applications who want to learn about linguistics. This book isn't even published yet and it's already my most heavily used reference book, joining gems such as Cormen, Leiserson and Rivest's algorithms book, Quirk et al.'s English Grammar, and Andrew Gelman's Bayesian statistics book (three excellent companions to this book, by the way).
The book is written more like a computer science or math book in that it starts absolutely from scratch, but moves quickly and assumes a sophisticated reader. The first one hundred or so pages provide background in probability, information theory and linguistics.
This book covers (almost) every current trend in NLP from a statistical perspective: syntactic tagging, sense disambiguation, parsing, lexical subcategorization, Hidden Markov Models, and probabilistic context-free grammars. It also covers machine translation and information retrieval in later chapters.
It covers all the statistical techniques used in NLP, from Bayes' law through to maximum entropy modeling, clustering, nearest neighbors, decision trees, and much more.
What you won't find is information on applications to higher-level discourse and dialogue phenomena like pronoun resolution or speech act classification.

Statistical approaches to processing natural language text have become dominant in recent years. This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear. The book contains all the theory and algorithms needed for building NLP tools. It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations. The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications.
