WEB SEARCH ENGINES

Academic year 2021/2022 / Subject code 31101042

SUPPLEMENTARY BIBLIOGRAPHY


Topic 1. Characteristics of information search on the WWW
On the structure of the WWW:
- Kleinberg, JM. Hubs, authorities, and communities, ACM computing surveys 1999.
http://www.cs.brown.edu/memex/ACMCSHT/10/10.html
- A Borodin, GO Roberts, JS Rosenthal, P. Tsaparas. Finding authorities and hubs from link structures on the World Wide Web. Proc. WWW 2001.
http://www10.org/cdrom/papers/314/
On the typology of web searches:
- Rose, D. and Levinson, D. Understanding User Goals in Web Search. WWW 2004.
http://wwwconf.ecs.soton.ac.uk/archive/00000537/01/p13-rose.pdf
On browsing versus querying:
- Marti A. Hearst. Next Generation Web Search: Setting Our Sites. IEEE Data Engineering Bulletin, 2002.
http://www.sims.berkeley.edu/hearst/papers/data-engineering
- A. Peñas, F. Verdejo, J. Gonzalo, 2002. Terminology Retrieval: towards a synergy between thesaurus and free text searching. Advances in Artificial Intelligence - IBERAMIA 2002, LNAI 2527.
http://nlp.uned.es/pergamus/pubs/iberamia2002.pdf

Topic 2. Basic architecture of a search engine.
On crawling:
- J Cho, H Garcia-Molina, L Page. Efficient Crawling Through URL Ordering, WWW 1998.
- Allan Heydon and Marc Najork. Mercator: A Scalable, Extensible Web Crawler. In Proceedings of World Wide Web Conference, 1999, pages 219-229.
On hardware support:
- L. A. Barroso, J. Dean, U. Hoelzle. Web search for a planet: the Google cluster architecture. IEEE 2003.

Topic 3. Pre-Google search engines: content-based retrieval.
- D Hiemstra. Using Language Models for Information Retrieval. CTIT Ph.D. Thesis, 2001.
- G Salton, A Wong, CS Yang. A Vector Space Model for Automatic Indexing. Comm. ACM, 1975.
- N Fuhr. Probabilistic Models in Information Retrieval. The Computer Journal, 1992.

Topic 4. Current (general-purpose) search engines: authority-based retrieval.
References:
- M Hollander. Google's PageRank Algorithm to Better Internet Searching. TR UMN.
- Brin, S. and Page, L. The Anatomy of a Large-Scale Hypertextual Web Search Engine. WWW 1998.
- CHQ Ding, X He, P Husbands, H Zha, HD Simon. PageRank, HITS and a unified framework for link analysis. SIGIR 2002.
- TH Haveliwala. Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search. IEEE Transactions on Knowledge and Data Engineering, 2003.

Topic 5. Advanced topics.
- Guha, R. and Garg, A. Disambiguating People in Search. Proc. WWW 2004.
- S Lawrence. Context in Web Search. IEEE Data Engineering Bulletin, 2000.
- J Sivic, A Zisserman. Video Google: A text retrieval approach to object matching in videos. ICCV 2003.
- SK Bhavnani, CK Bichakjian, TM Johnson, RJ Little. Strategy Hubs: Next-Generation Domain Portals with Search Procedures. Proc. ACM Conference on Human Factors in Computing Systems, 2003, ACM Press NY, USA.
- T Berners-Lee, J Hendler, O Lassila. The semantic Web. Scientific American, 2001.
- J Heflin, J Hendler. A Portrait of the Semantic Web in Action. IEEE Intelligent Systems, 2001.
- S Eissen, B Stein. Analysis of Clustering Algorithms for Web-Based Search. Springer-Verlag, 2002.
- J. Cigarrán, A. Peñas, J. Gonzalo, F. Verdejo, 2005. Automatic selection of noun phrases as document descriptors in an FCA-based Information Retrieval system. ICFCA 2005. Springer LNCS 3403.
- Search Engines: Technology, Society, and Business. Online course materials:
http://www.sims.berkeley.edu/courses/is141/f05/schedule.html