Understanding Language Structure for AI: Speech Acts & Natural Language Processing - Prof., Study notes of Computer Science

An introduction to speech acts and natural language processing from the perspective of artificial intelligence. It covers the history of communication as a set of speech acts, the distinction between formal and natural languages, the structure of natural languages through grammars and classes of languages, and the process of parsing sentences. The document also discusses the challenges and advantages of top-down and bottom-up parsing, as well as chart parsing.

Typology: Study notes

Pre 2010

Uploaded on 07/30/2009

koofers-user-3wi
Artificial Intelligence Programming
Natural Language Processing
Chris Brooks
Department of Computer Science
University of San Francisco
Speech Acts

Since the 1950s (Wittgenstein), communication has been seen as a set of speech acts.
  Communication is a form of action.
  Acts include: query, inform, request, acknowledge, promise.
An agent has a goal that it needs to accomplish, and selects speech acts that help it to accomplish that goal.

Department of Computer Science — University of San Francisco – p.1/

Language

A language consists of a (possibly infinite) set of strings.
These strings are constructed through the concatenation of terminal symbols.
We’ll distinguish between formal languages and natural languages.
Formal languages have strict mathematical definitions.
  We can say unambiguously whether a string is a legal utterance in that language.
  SQL, first-order logic, Java, and Python are all formal languages.
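Because a formal language has a strict definition, membership can be decided mechanically. As a small illustration, the sketch below asks Python's own parser whether a string is a legal Python utterance. The helper name `is_legal_python` is ours and does not come from the lecture.

```python
import ast

def is_legal_python(source):
    """Return True if `source` is a syntactically legal Python program."""
    try:
        ast.parse(source)      # raises SyntaxError on an illegal string
        return True
    except SyntaxError:
        return False

print(is_legal_python("x = (1 + 2) * 3"))   # True
print(is_legal_python("x = (1 + 2 * 3"))    # False
```

No comparable yes/no oracle exists for English, which is the contrast the slide draws between formal and natural languages.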

Natural Language

Natural languages do not have a strict mathematical definition.
They have evolved through a community of usage.
  English, Chinese, Japanese, Spanish, French, etc.
Structure can be specified:
  Prescriptively: What are the “correct” rules of the language?
  Descriptively: How is the language actually used in practice?
We’ll attempt to treat natural languages as formal languages, even though the match is inexact.

Example Lexicon

Noun -> cat | dog | bunny | fish
InTransVerb -> sit | sleep | eat
TransVerb -> is
Adjective -> happy | sad | tired
Adverb -> happily | quietly
Gerund -> sleeping
Article -> the | a | an
Conjunction -> and | or | but
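One way to hold this lexicon in a program is a dictionary from part of speech to words. This is only a sketch: the names `LEXICON` and `categories` are illustrative choices, not part of the lecture.

```python
# Lexicon from the slide: part of speech -> words.
LEXICON = {
    "Noun": ["cat", "dog", "bunny", "fish"],
    "InTransVerb": ["sit", "sleep", "eat"],
    "TransVerb": ["is"],
    "Adjective": ["happy", "sad", "tired"],
    "Adverb": ["happily", "quietly"],
    "Gerund": ["sleeping"],
    "Article": ["the", "a", "an"],
    "Conjunction": ["and", "or", "but"],
}

def categories(word):
    """Return every part of speech the lexicon assigns to `word`."""
    w = word.lower()
    return [pos for pos, words in LEXICON.items() if w in words]

print(categories("The"))       # ['Article']
print(categories("sleeping"))  # ['Gerund']
print(categories("purple"))    # []
```

Returning a list rather than a single tag leaves room for words that belong to more than one category.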

Example Grammar

S -> NP VP | S Conjunction S
NP -> Noun | Article Noun
VP -> InTransVerb | TransVerb Adjective | InTransVerb Adverb | InTransVerb Gerund
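To see which strings this grammar and lexicon construct, one can expand S by picking rules at random. The sketch below assumes a dictionary encoding of the rules. The names `RULES` and `generate`, and the `max_depth` cutoff that stops S -> S Conjunction S from recursing forever, are our additions.

```python
import random

# Grammar and lexicon from the slides: symbol -> list of alternatives.
RULES = {
    "S": [["NP", "VP"], ["S", "Conjunction", "S"]],
    "NP": [["Noun"], ["Article", "Noun"]],
    "VP": [["InTransVerb"], ["TransVerb", "Adjective"],
           ["InTransVerb", "Adverb"], ["InTransVerb", "Gerund"]],
    "Noun": [["cat"], ["dog"], ["bunny"], ["fish"]],
    "InTransVerb": [["sit"], ["sleep"], ["eat"]],
    "TransVerb": [["is"]],
    "Adjective": [["happy"], ["sad"], ["tired"]],
    "Adverb": [["happily"], ["quietly"]],
    "Gerund": [["sleeping"]],
    "Article": [["the"], ["a"], ["an"]],
    "Conjunction": [["and"], ["or"], ["but"]],
}

def generate(symbol, rng, depth=0, max_depth=3):
    """Expand `symbol` into a list of words, choosing rules at random."""
    if symbol not in RULES:              # terminal symbol: an actual word
        return [symbol]
    options = RULES[symbol]
    if symbol == "S" and depth >= max_depth:
        options = options[:1]            # deep enough: force S -> NP VP
    words = []
    for part in rng.choice(options):
        words.extend(generate(part, rng, depth + 1, max_depth))
    return words

rng = random.Random(0)                   # fixed seed for repeatable runs
for _ in range(3):
    print(" ".join(generate("S", rng)))
```

Strings such as "an dog sleep sleeping" are derivable from these rules, which shows how inexact the match between the toy formal language and English is.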

Classes of languages

We can characterize languages (and grammars) in terms of the strings that can be constructed from them.
Regular languages contain rules of the form A → b | A → Bb
  Equivalent to regular expressions or finite state automata.
  Can’t represent (for example) balanced opening and closing parentheses.
Context-free languages contain rules of the form A → b | A → XY (one nonterminal on the left, anything on the righthand side)
  The syntax of most programming languages is specified by a context-free grammar.
  Natural languages are assumed to be context free.
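The balanced-parentheses example can be made concrete. The context-free grammar S -> ( S ) S | empty generates exactly the balanced strings, and a short recursive recognizer follows it rule for rule. The sketch compares it with one simple regular expression. That comparison is an illustration with one particular pattern, not a proof that no regular expression works. The function names are ours.

```python
import re

def match_s(text, i=0):
    """Match S -> '(' S ')' S | empty at position i; return the index after S."""
    if i < len(text) and text[i] == "(":
        j = match_s(text, i + 1)              # inner S
        if j < len(text) and text[j] == ")":
            return match_s(text, j + 1)       # trailing S
    return i                                  # empty alternative

def balanced(text):
    """True if the whole string is derivable from S."""
    return match_s(text) == len(text)

# For comparison: some '(' followed by some ')'.
# This pattern checks the shape but cannot count nesting depth.
shape = re.compile(r"\(*\)*")

for s in ["(())", "()()", "(()"]:
    print(s, balanced(s), bool(shape.fullmatch(s)))
```

The pattern accepts the unbalanced "(()" and rejects the balanced "()()", while the grammar-based recognizer classifies all three strings correctly.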

Classes of languages

Context-sensitive languages contain rules of the form ABC → AQC (the righthand side must contain at least as many symbols as the left)
  Some natural languages have context-sensitive constructs.
Recursively enumerable languages allow unrestricted rules.
  They are equivalent to Turing machines.
We’ll focus on context-free grammars.

Example

“The cat is sleeping”

S
  NP
    Article  "The"
    Noun  "cat"
  VP
    InTransVerb  "is"
    Gerund  "sleeping"
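A parse tree like this one can be stored as nested tuples of the form (label, child, child, ...), with plain strings at the leaves. This encoding and the helper names `leaves` and `show` are assumptions made for illustration.

```python
# The parse tree for "The cat is sleeping" as nested tuples.
TREE = ("S",
        ("NP", ("Article", "The"), ("Noun", "cat")),
        ("VP", ("InTransVerb", "is"), ("Gerund", "sleeping")))

def leaves(tree):
    """Return the words at the leaves of the tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:
        words.extend(leaves(child))
    return words

def show(tree, indent=0):
    """Return the tree as an indented multi-line string."""
    if isinstance(tree, str):
        return " " * indent + repr(tree)
    lines = [" " * indent + tree[0]]
    lines += [show(child, indent + 2) for child in tree[1:]]
    return "\n".join(lines)

print(show(TREE))
print(" ".join(leaves(TREE)))   # The cat is sleeping
```

Reading the leaves left to right recovers the original sentence, which is a quick sanity check on any parse.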

Parsing as Search

Parsing can be thought of as search.
  Our search space is the space of all possible parse trees.
We can either start with the top of the tree and build down, or with the leaves and build up.

Example

[S: ?]
[S: [NP: ?] [VP: ?]]
[S: [Noun: ?] [VP: ?]] - dead end, backtrack.
[S: [[Article: ?] [Noun: ?]] [VP: ?]]
[S: [[Article: The] [Noun: ?]] [VP: ?]]
[S: [[Article: The] [Noun: cat]] [VP: ?]]
[S: [[Article: The] [Noun: cat]] [VP: [Verb: ?]]] - dead end, backtrack.
[S: [[Article: The] [Noun: cat]] [VP: [[TransVerb: ?] [Adv: ?]]]] - dead end, backtrack.
[S: [[Article: The] [Noun: cat]] [VP: [[InTransVerb: ?] [Gerund: ?]]]]
[S: [[Article: The] [Noun: cat]] [VP: [[InTransVerb: is] [Gerund: ?]]]]
[S: [[Article: The] [Noun: cat]] [VP: [[InTransVerb: is] [Gerund: sleeping]]]]
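The top-down search traced on this slide can be sketched as a backtracking recursive-descent parser. Several things here are assumptions rather than the lecture's code. The function names are ours. "is" is added to InTransVerb so that the slide's example parses. The left-recursive rule S -> S Conjunction S is left out because this simple scheme cannot handle it.

```python
GRAMMAR = {
    "S": [["NP", "VP"]],          # S -> S Conjunction S omitted (left-recursive)
    "NP": [["Noun"], ["Article", "Noun"]],
    "VP": [["InTransVerb"], ["TransVerb", "Adjective"],
           ["InTransVerb", "Adverb"], ["InTransVerb", "Gerund"]],
}
LEXICON = {
    "Noun": {"cat", "dog", "bunny", "fish"},
    "InTransVerb": {"sit", "sleep", "eat", "is"},   # "is" added: assumption
    "TransVerb": {"is"},
    "Adjective": {"happy", "sad", "tired"},
    "Adverb": {"happily", "quietly"},
    "Gerund": {"sleeping"},
    "Article": {"the", "a", "an"},
}

def parse(symbol, words, i):
    """Yield (tree, next_index) for every way `symbol` can match words[i:]."""
    if symbol in LEXICON:
        if i < len(words) and words[i] in LEXICON[symbol]:
            yield (symbol, words[i]), i + 1
        return
    for rhs in GRAMMAR[symbol]:   # try rules in order; a failed rule is a dead end
        yield from match_sequence(rhs, words, i, (symbol,))

def match_sequence(rhs, words, i, partial):
    """Match the symbols of `rhs` one after another, extending `partial`."""
    if not rhs:
        yield partial, i
        return
    for subtree, j in parse(rhs[0], words, i):
        yield from match_sequence(rhs[1:], words, j, partial + (subtree,))

def parse_sentence(sentence):
    """Return the first parse that covers every word, or None."""
    words = sentence.lower().split()
    for tree, end in parse("S", words, 0):
        if end == len(words):
            return tree
    return None

print(parse_sentence("The cat is sleeping"))
```

Generators make the backtracking implicit: when an alternative yields nothing, control falls through to the next rule, just as the trace backs out of the NP -> Noun dead end.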

Top-down parsing

Top-down parsing has two significant weaknesses:
  Doesn’t exploit sentence structure at upper levels of the parse tree.
    Can wind up doing unnecessary search.
  Can’t easily deal with left-recursive rules, such as S → S Conj S.
    Can wind up infinitely re-expanding this rule, as in DFS.
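The left-recursion problem is easy to reproduce. The sketch below always expands the leftmost symbol with its first rule, as a naive depth-first parser would, and stops after a fixed number of steps. The step limit and the function name are ours.

```python
# The left-recursive rule is listed first, so depth-first search tries it first.
RULES = {"S": [["S", "Conj", "S"], ["NP", "VP"]]}

def leftmost_expansions(rules, start, steps):
    """Expand the leftmost symbol with its first rule, `steps` times."""
    form = [start]
    history = [form]
    for _ in range(steps):
        head = form[0]
        if head not in rules:      # leftmost symbol cannot be expanded: stop
            break
        form = rules[head][0] + form[1:]
        history.append(form)
    return history

for form in leftmost_expansions(RULES, "S", 3):
    print(" ".join(form))
# S
# S Conj S
# S Conj S Conj S
# S Conj S Conj S Conj S
```

The leftmost symbol is S after every step, so the search never reaches a word of the input and, without the step limit, would never stop.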

Example

Init: 'The cat is sleeping'
Succ: [[Art 'cat is sleeping'], ['the' Noun 'is sleeping'], ['the cat' InTransVerb 'sleeping']]
S1: [Art 'cat is sleeping']
Succ: [[Art Noun 'is sleeping'], [Art 'cat' InTransVerb 'sleeping'], [Art 'cat is' Gerund]]
S2: [Art Noun 'is sleeping']
Succ: [[NP 'is sleeping'], [Art Noun InTransVerb 'sleeping'], [Art Noun 'is' Gerund]]
S3: [NP 'is sleeping']
Succ: [[NP InTransVerb 'sleeping'], [NP 'is' Gerund]]
S4: [NP InTransVerb 'sleeping']
Succ: [NP InTransVerb Gerund]
S5: [NP InTransVerb Gerund]
Succ: [NP VP]
S6: [NP VP]
Succ: [S]
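The bottom-up search on this slide can be sketched as a breadth-first search over sequences of words and symbols. A successor replaces one matching right-hand side with its left-hand side. The encoding and function names are ours. The lexical rules cover only the four words of the example, and "is" is tagged InTransVerb as in the slide's trace.

```python
from collections import deque

RULES = [  # (left-hand side, right-hand side)
    ("S", ("NP", "VP")),
    ("S", ("S", "Conjunction", "S")),
    ("NP", ("Noun",)),
    ("NP", ("Article", "Noun")),
    ("VP", ("InTransVerb",)),
    ("VP", ("TransVerb", "Adjective")),
    ("VP", ("InTransVerb", "Adverb")),
    ("VP", ("InTransVerb", "Gerund")),
    ("Article", ("the",)),
    ("Noun", ("cat",)),
    ("InTransVerb", ("is",)),      # assumption: follows the slide's trace
    ("Gerund", ("sleeping",)),
]

def successors(state):
    """Yield each state made by replacing one right-hand side with its left-hand side."""
    for lhs, rhs in RULES:
        n = len(rhs)
        for i in range(len(state) - n + 1):
            if state[i:i + n] == rhs:
                yield state[:i] + (lhs,) + state[i + n:]

def bottom_up_parse(sentence):
    """Search from the words up to ('S',); return the path of states, or None."""
    start = tuple(sentence.lower().split())
    frontier = deque([[start]])
    seen = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == ("S",):
            return path
        for nxt in successors(path[-1]):
            if nxt not in seen:    # skip states already reached another way
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None

for state in bottom_up_parse("The cat is sleeping"):
    print(" ".join(state))
```

The `seen` set keeps the search from revisiting a state reached by applying the same reductions in a different order.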

Bottom-up parsing

While everything went fine in this simple example, there can be problems:
  Words might match multiple parts of speech.
  The same right-hand side can match many left-hand sides.
  Partial parses that could never lead to a complete sentence get expanded.
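The first problem can be illustrated with a small, hypothetical lexicon in which some words carry more than one tag. The lecture's own lexicon lists "fish" only as a noun and "sleep" only as a verb, so the extra tags below are invented for the example. Every combination of tags is a separate starting point that a bottom-up parser may have to explore.

```python
from itertools import product

# Hypothetical ambiguous lexicon: word -> every tag it could carry.
TAGS = {
    "the": ["Article"],
    "fish": ["Noun", "InTransVerb"],
    "sleep": ["Noun", "InTransVerb"],
}

def tag_sequences(sentence):
    """Return every way of assigning one tag to each word."""
    words = sentence.lower().split()
    return [list(tags) for tags in product(*(TAGS[w] for w in words))]

for seq in tag_sequences("the fish sleep"):
    print(seq)
```

Two ambiguous words already give four candidate tag sequences, and the count multiplies with each further ambiguous word.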
