CIS 115

Searching & Information Retrieval

kst8r

K-State's new social network!

Edgar F. Codd

Image Source: Wikipedia

Relational Database

Image Source: Wikipedia

Relational Database

userID     Name     Birthday       Major
russfeld   Russell  May 10th       Comp. Sci
johnsmith  John     June 1         Info. Sci
miriamc    Miriam   February 2nd   Info. Sys
gameguy    Jayson   Dec 26         computersci
sharpie    Reily    Dec. 18th      IS

Why Normalize Data?

Relational Database

userID     Name     Birthday    Major
russfeld   Russell  May 10th    Comp. Sci
johnsmith  John     June 1      Info. Sci
smiller    Sheila   Sept. 26    Soft. Engg.
gameguy    Jayson   Dec 26      computersci
sharpie    Reily    Dec. 18th   IS

Related Tables

userID     Name     Birthday    Major
russfeld   Russell  May 10th    1
johnsmith  John     June 1      3
smiller    Sheila   Sept. 26    2
gameguy    Jayson   Dec 26      1
sharpie    Reily    Dec. 18th   3


majorID  Major                  Abbreviation
1        Computer Science       CS
2        Software Engineering   SE
3        Information Systems    IS
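
The related tables above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not the course's own code: the table names (users, majors), the column types, and the helper users_in_major are assumptions made for illustration. It shows that once the major is stored as a majorID, a single join finds every Computer Science student, no matter how the major was originally typed.

```python
import sqlite3

# In-memory database; table and column names are assumptions that mirror the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE majors (
        majorID      INTEGER PRIMARY KEY,
        major        TEXT NOT NULL,
        abbreviation TEXT NOT NULL
    );
    CREATE TABLE users (
        userID   TEXT PRIMARY KEY,
        name     TEXT NOT NULL,
        birthday TEXT,
        majorID  INTEGER REFERENCES majors(majorID)
    );
""")
conn.executemany("INSERT INTO majors VALUES (?, ?, ?)", [
    (1, "Computer Science", "CS"),
    (2, "Software Engineering", "SE"),
    (3, "Information Systems", "IS"),
])
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", [
    ("russfeld", "Russell", "May 10th", 1),
    ("johnsmith", "John", "June 1", 3),
    ("smiller", "Sheila", "Sept. 26", 2),
    ("gameguy", "Jayson", "Dec 26", 1),
    ("sharpie", "Reily", "Dec. 18th", 3),
])

def users_in_major(conn, abbreviation):
    """Return the userIDs of everyone in the given major, sorted by userID."""
    rows = conn.execute(
        "SELECT u.userID FROM users u "
        "JOIN majors m ON u.majorID = m.majorID "
        "WHERE m.abbreviation = ? ORDER BY u.userID",
        (abbreviation,),
    )
    return [row[0] for row in rows]

print(users_in_major(conn, "CS"))  # ['gameguy', 'russfeld']
```

With the original single table, the same question would need to match "Comp. Sci" and "computersci" by hand.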

Storing Phone Numbers

userID     Name     Phone1     Phone2
johnsmith  John     555-1234
smiller    Sheila   555-5134
gameguy    Jayson   555-1235   555-5134
sharpie    Reily    555-5134

Many-to-One?

userID Name
johnsmith John
smiller Sheila
gameguy Jayson
sharpie Reily


Phone      user1       user2
555-1234   johnsmith
555-5134   smiller     gameguy    ?sharpie?
555-1235   gameguy

One-to-Many

userID Name
johnsmith John
smiller Sheila
gameguy Jayson
sharpie Reily


userID phone
johnsmith 555-1234
smiller 555-5134
gameguy 555-1235
gameguy 555-5134
sharpie 555-5134
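
As a rough sketch of the one-to-many layout above, again using Python's built-in sqlite3 module: the table name phones, the composite primary key, and the two helper functions are assumptions for illustration, not part of the slides. Each row holds exactly one user and one phone number, so a user can have any number of phones and a phone can be shared by any number of users without adding columns.

```python
import sqlite3

# In-memory database; table names and helpers are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        userID TEXT PRIMARY KEY,
        name   TEXT NOT NULL
    );
    CREATE TABLE phones (
        userID TEXT NOT NULL REFERENCES users(userID),
        phone  TEXT NOT NULL,
        PRIMARY KEY (userID, phone)  -- one row per (user, phone) pair
    );
""")
conn.executemany("INSERT INTO users VALUES (?, ?)", [
    ("johnsmith", "John"),
    ("smiller", "Sheila"),
    ("gameguy", "Jayson"),
    ("sharpie", "Reily"),
])
conn.executemany("INSERT INTO phones VALUES (?, ?)", [
    ("johnsmith", "555-1234"),
    ("smiller", "555-5134"),
    ("gameguy", "555-1235"),
    ("gameguy", "555-5134"),
    ("sharpie", "555-5134"),
])

def phones_for(conn, user_id):
    """Return every phone number stored for one user, sorted."""
    rows = conn.execute(
        "SELECT phone FROM phones WHERE userID = ? ORDER BY phone", (user_id,)
    )
    return [row[0] for row in rows]

def users_sharing(conn, phone):
    """Return every userID that lists the given phone number, sorted."""
    rows = conn.execute(
        "SELECT userID FROM phones WHERE phone = ? ORDER BY userID", (phone,)
    )
    return [row[0] for row in rows]

print(phones_for(conn, "gameguy"))      # ['555-1235', '555-5134']
print(users_sharing(conn, "555-5134"))  # ['gameguy', 'sharpie', 'smiller']
```

The third user on 555-5134, which did not fit in the user1/user2 columns, is just one more row here.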

The World Wide Web

Image Source: Wikipedia

Our World Wide Web

Image Source: 9 Algorithms that Changed the Future by John MacCormick

Indexing

Image Source: 9 Algorithms that Changed the Future by John MacCormick

Word Location

Image Source: 9 Algorithms that Changed the Future by John MacCormick

Algorithm

INPUT: a two-word phrase of form,  "Word1 Word2"
OUTPUT: an  AnswerList,  a list of the numbers of the Web pages that
contain the phrase

ALGORITHM:
(0.  The  AnswerList  starts with nothing saved in it.)

1. Extract from the table the list of  Page#-Position#  pairs for  Word1.  Call it  List1.

2. Extract from the table the list of  Page#-Position#  pairs for  Word2.  Call it  List2.

3. For each  page#-pos#  pair in List1,
      search  List2  to see if there is a pair,  page#-(pos#+1).
        (that is, the page# is the same and pos# differs by +1)
      if yes, then include  page#  in the  AnswerList.
      if no, then ignore this pair.

4. Announce all the page numbers in  AnswerList
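
The steps above can be sketched in Python. This is one possible reading of the algorithm, not the course's reference solution: the function names build_index and phrase_search and the three sample pages are made up for illustration, and positions are counted from 1. List2 is stored as a set so that the search in step 3 is a fast membership test.

```python
def build_index(pages):
    """Build the word-location table: word -> list of (page#, position#) pairs."""
    index = {}
    for page_number, text in pages.items():
        for position, word in enumerate(text.lower().split(), start=1):
            index.setdefault(word, []).append((page_number, position))
    return index

def phrase_search(index, word1, word2):
    """Return the page numbers that contain the phrase 'word1 word2'."""
    answer_list = []                              # step 0: start with nothing
    list1 = index.get(word1.lower(), [])          # step 1: pairs for Word1
    list2 = set(index.get(word2.lower(), []))     # step 2: pairs for Word2
    for page, pos in list1:                       # step 3: look for page#-(pos#+1)
        if (page, pos + 1) in list2 and page not in answer_list:
            answer_list.append(page)
    return answer_list                            # step 4: announce the pages

# Hypothetical sample pages, keyed by page number.
pages = {
    1: "the cat sat on the mat",
    2: "the dog stood on the mat",
    3: "the cat stood while a dog sat",
}
index = build_index(pages)

print(phrase_search(index, "cat", "sat"))  # [1]
print(phrase_search(index, "the", "mat"))  # [1, 2]
```

Page 3 contains both "cat" and "sat", but not next to each other, so it is left out of the first answer.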

Nearness

Metawords

Image Source: 9 Algorithms that Changed the Future by John MacCormick

Ranking

AltaVista

Image Source: Stanford

Hyperlinks

Image Source: 9 Algorithms that Changed the Future by John MacCormick

Who Wins?

Image Source: 9 Algorithms that Changed the Future by John MacCormick

Authority

Image Source: 9 Algorithms that Changed the Future by John MacCormick

Cycles

Image Source: 9 Algorithms that Changed the Future by John MacCormick

Random Surfer

Image Source: 9 Algorithms that Changed the Future by John MacCormick

Page Rank Example

Image Source: Ian Rogers - Princeton

Page Rank Example

Image Source: Ian Rogers - Princeton

Simulate, Don't Calculate!

Image Source: 9 Algorithms that Changed the Future by John MacCormick

Find all paths with length < 5, calculate percentage of times each one appears (count appearances / total paths)
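
The random surfer idea can be simulated in a few lines of Python. This is a minimal sketch under stated assumptions: the four-page link graph is invented for illustration, the 15% restart probability is a commonly quoted value and not something these slides state, and the function name simulate_random_surfer is hypothetical. The surfer follows a random link most of the time, occasionally jumps to a random page, and each page's score is the fraction of steps spent on it.

```python
import random

def simulate_random_surfer(links, steps=100_000, restart_probability=0.15, seed=0):
    """Estimate each page's rank as the fraction of steps a random surfer spends on it."""
    rng = random.Random(seed)          # fixed seed so repeated runs agree
    pages = sorted(links)
    visits = {page: 0 for page in pages}
    current = rng.choice(pages)
    for _ in range(steps):
        visits[current] += 1
        outgoing = links[current]
        # Restart at a random page occasionally, or when the page has no links.
        if not outgoing or rng.random() < restart_probability:
            current = rng.choice(pages)
        else:
            current = rng.choice(outgoing)
    return {page: count / steps for page, count in visits.items()}

# Hypothetical web: A, B and D all link to C; C links back to A and B.
links = {
    "A": ["C"],
    "B": ["C"],
    "C": ["A", "B"],
    "D": ["C"],
}
ranks = simulate_random_surfer(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this graph C collects links from every other page and should come out on top, while D, which nothing links to, is only reached by restarts and should come out last.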

Problems?

What is Web 2.0?

The Long Tail

Image Source: Wikipedia

Web 1.0               Web 2.0
Static web pages      Dynamic pages
Content from few      Content from many
Local software        Web software
Local storage         Web storage
Read only             Write / Collaborate
Text only             Multimedia
Individual thoughts   Collective thoughts
Proprietary           Open / Shared

Source: New Tools Workshop

Assignments

Blog 6: Computing Science & Mathematics

As we embark on reading the last textbook, it should be obvious by now that mathematics is a core aspect of Computing Science. In fact, most Computing Science departments and disciplines have a direct foundation in the study of mathematics. Harold Abelson sums it up well: “Mathematics provides a framework for dealing precisely with notions of ‘what is’. Computation provides a framework for dealing precisely with notions of ‘how to’.” For this article, write about how Computing Science and Mathematics are related. (Food for thought: read up on Theoretical Computer Science and Applied Mathematics.) Things you can talk about: