The World Wide Web as Linguistic Corpus

Chapter Summary

Corpus linguists increasingly use the World Wide Web as a corpus for conducting linguistic analyses. The Web, however, is a very different kind of corpus: we do not know, for instance, precisely how large it is or what kinds of texts are on it. In this chapter, we evaluate the Web as a linguistic corpus, providing estimates of its size and composition. In addition, we conduct a series of sample analyses of the Web. These demonstrate that while commonly available search engines have definite limitations, they can in a matter of seconds retrieve extremely large volumes of data relevant to a corpus analysis. They also provide frequency information that, while not entirely accurate, is suggestive of how frequently particular words and grammatical constructions occur.


