How do I download NLTK data?


Updated answer: NLTK works well with Python 2.7. I had 3.2; I uninstalled 3.2 and installed 2.7, and now it works!

I have installed NLTK and tried to download NLTK Data, following the instructions on this site:

I downloaded NLTK, installed it, and then tried to run the following code:

>>> import nltk
>>> nltk.download()

It gave me an error message like the one below:

Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
AttributeError: 'module' object has no attribute 'download'
 Directory of C:\Python32\Lib\site-packages

I tried both nltk.download() and nltk.downloader(); both gave me error messages.

Then I used help(nltk) to inspect the package; it shows the following info:


    app (package)
    ccg (package)
    chat (package)
    chunk (package)
    classify (package)
    cluster (package)
    corpus (package)
    draw (package)
    examples (package)
    inference (package)
    metrics (package)
    misc (package)
    model (package)
    parse (package)
    sem (package)
    stem (package)
    tag (package)
    test (package)
    tokenize (package)


I do see Downloader there, so I am not sure why it does not work. I am using Python 3.2.2 on Windows Vista.


To download a particular dataset/model, use the nltk.download() function, e.g. if you are looking to download the punkt sentence tokenizer, use:

$ python3
>>> import nltk
>>> nltk.download('punkt')
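
If the download call sits in a script that runs repeatedly, it can help to skip packages that are already present. This is a minimal sketch, assuming `nltk.data.find` raises `LookupError` for a missing resource (as the traceback further down suggests) and that `tokenizers/punkt` is the resource path for the `punkt` package. The helper name `ensure_resource` and its `find`/`download` parameters are hypothetical, added only so the logic can be tried without NLTK or a network connection:

```python
def ensure_resource(resource_path, package_id, find=None, download=None):
    """Download package_id only when resource_path cannot be found.

    Returns True if a download was triggered, False otherwise.
    """
    if find is None or download is None:
        import nltk  # imported lazily; only needed for the real lookups
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(resource_path)
    except LookupError:
        # Resource is missing, so fetch the package that provides it.
        download(package_id)
        return True
    return False

# Typical call (needs NLTK installed and network access):
# ensure_resource("tokenizers/punkt", "punkt")
```

Passing stand-in callables for `find` and `download` is a design choice for testing; in normal use both are left at their defaults.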

If you’re unsure of which data/model you need, you can start out with the basic list of data + models with:

>>> import nltk
>>> nltk.download('popular')

It will download a list of “popular” resources. These include:

<collection id="popular" name="Popular packages">
      <item ref="cmudict" />
      <item ref="gazetteers" />
      <item ref="genesis" />
      <item ref="gutenberg" />
      <item ref="inaugural" />
      <item ref="movie_reviews" />
      <item ref="names" />
      <item ref="shakespeare" />
      <item ref="stopwords" />
      <item ref="treebank" />
      <item ref="twitter_samples" />
      <item ref="omw" />
      <item ref="wordnet" />
      <item ref="wordnet_ic" />
      <item ref="words" />
      <item ref="maxent_ne_chunker" />
      <item ref="punkt" />
      <item ref="snowball_data" />
      <item ref="averaged_perceptron_tagger" />
</collection>
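
To see which package ids a collection like the one above contains, the XML can be read with the standard library. This is only a sketch: `COLLECTION_XML` is an abbreviated copy with three of the items, and `collection_items` is a hypothetical helper name, not part of NLTK.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the collection shown above (only three items kept).
COLLECTION_XML = """
<collection id="popular" name="Popular packages">
  <item ref="stopwords" />
  <item ref="wordnet" />
  <item ref="punkt" />
</collection>
"""

def collection_items(xml_text):
    """Return (collection id, list of package ids) for one collection element."""
    root = ET.fromstring(xml_text)
    return root.get("id"), [item.get("ref") for item in root.findall("item")]

collection_id, package_ids = collection_items(COLLECTION_XML)
print(collection_id, package_ids)
```

Each `ref` value is an identifier that can be given to the downloader on its own if the whole collection is not needed.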


In case anyone is trying to avoid errors when downloading larger datasets from NLTK:

$ rm /Users/<your_username>/nltk_data/corpora/
$ rm -r /Users/<your_username>/nltk_data/corpora/panlex_lite
$ python

>>> import nltk
>>> dler = nltk.downloader.Downloader()
>>> dler._update_index()
>>> dler._status_cache['panlex_lite'] = 'installed' # Trick the index to treat panlex_lite as it's already installed.


From v3.2.5, NLTK has a more informative error message when nltk_data resource is not found, e.g.:

>>> from nltk import word_tokenize
>>> word_tokenize('x')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/alvas/git/nltk/nltk/tokenize/", line 128, in word_tokenize
    sentences = [text] if preserve_line else sent_tokenize(text, language)
  File "/Users/alvas/git/nltk/nltk/tokenize/", line 94, in sent_tokenize
    tokenizer = load('tokenizers/punkt/{0}.pickle'.format(language))
  File "/Users/alvas/git/nltk/nltk/", line 820, in load
    opened_resource = _open(resource_url)
  File "/Users/alvas/git/nltk/nltk/", line 938, in _open
    return find(path_, path + ['']).open()
  File "/Users/alvas/git/nltk/nltk/", line 659, in find
    raise LookupError(resource_not_found)
  Resource punkt not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('punkt')

  Searched in:
    - '/Users/alvas/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - ''



Passing 'all' to nltk.download() will download all the data, so there is no need to download packages individually.

Install pip: run in a terminal: sudo easy_install pip

Install NumPy (optional): run: sudo pip install -U numpy

Install NLTK: run: sudo pip install -U nltk

Test the installation: run: python

then type: import nltk

To download the corpus:

run: python -m nltk.downloader all
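
When several Python installations are present, the plain `python` command may not be the one that pip installed NLTK into. One way around that, sketched below, is to build the same command from `sys.executable`. The helper name `downloader_command` is hypothetical, and passing several ids at once is an assumption about the command-line downloader.

```python
import sys

def downloader_command(*package_ids):
    """Build the argv list for `python -m nltk.downloader <ids>`."""
    if not package_ids:
        raise ValueError("pass at least one package or collection id")
    # sys.executable is the interpreter running this script, so the data is
    # fetched by the same Python that this script itself runs under.
    return [sys.executable, "-m", "nltk.downloader", *package_ids]

# To actually run it (needs NLTK installed and network access):
# import subprocess
# subprocess.run(downloader_command("all"), check=True)
```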

Do not name your file nltk.py. I used the same code in a file that I named nltk, and got the same error as you. I changed the file name and it went well.
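
A quick way to check a project folder for a file that would be imported in place of the real package is sketched below. `shadowing_files` is a hypothetical helper, and the check only looks at the given directory, on the assumption that this is where the script is run from.

```python
import os
import tempfile

def shadowing_files(directory, module_name="nltk"):
    """List files in `directory` that would shadow `module_name` on import."""
    suspects = {module_name + ".py", module_name + ".pyc"}
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name in suspects
    )

# Demonstration in a throwaway directory with one offending file.
with tempfile.TemporaryDirectory() as workdir:
    open(os.path.join(workdir, "nltk.py"), "w").close()
    open(os.path.join(workdir, "analysis.py"), "w").close()
    found = [os.path.basename(path) for path in shadowing_files(workdir)]
    print(found)
```

In a real project, call `shadowing_files(os.getcwd())` and rename or delete whatever it reports.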

This worked for me:

nltk.set_proxy('http://user:<password>@<proxy_host>:8080')

Please try:

import nltk
nltk.download()

After running this you get something like this:

NLTK Downloader
   d) Download   l) List    u) Update   c) Config   h) Help   q) Quit

Then press d, as follows:

Downloader> d all

On completion you will see the following message; then press q at the prompt:
Done downloading collection all

You can’t have a saved Python file called nltk.py, because the interpreter imports that file instead of the actual NLTK package.

Change the name of the file that the Python shell is reading from and try what you were doing originally:

import nltk and then nltk.download()

It’s very simple:

  1. Open PyScripter or any editor.
  2. Create a Python file.
  3. Write the code below in it and run it.
import nltk
nltk.download()
  4. A pop-up window will appear; click Download.


I had a similar issue. Check whether you are behind a proxy.

If so, set up the proxy before downloading:

nltk.set_proxy('', ('USERNAME', 'PASSWORD'))
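
Credentials that contain characters such as `@` or `:` can break a proxy URL if pasted in directly. The sketch below percent-encodes them first; `proxy_url` is a hypothetical helper, `proxy.example.com` is a placeholder host, and whether the resulting URL suits a particular proxy is an assumption to verify locally.

```python
from urllib.parse import quote

def proxy_url(host, port, user=None, password=None, scheme="http"):
    """Build a proxy URL, percent-encoding the credentials."""
    if user is None:
        return f"{scheme}://{host}:{port}"
    credentials = quote(user, safe="")
    if password is not None:
        credentials += ":" + quote(password, safe="")
    return f"{scheme}://{credentials}@{host}:{port}"

url = proxy_url("proxy.example.com", 8080, "user", "p@ss:word")
print(url)

# With NLTK installed, the URL would then be passed on before downloading:
# import nltk
# nltk.set_proxy(url)
```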

If you are running a really old version of nltk, then there is indeed no download module available (reference)

Try this:

import nltk

As per the reference, anything after 0.9.5 should be fine
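
The version comparison can be scripted. This is a rough sketch, assuming `nltk.__version__` holds the installed version string; `version_tuple` and `is_after` are hypothetical helpers, and the parsing simply keeps the first three numeric components.

```python
import re

def version_tuple(version):
    """Turn a version string such as '3.2.5' into a tuple of ints."""
    # Only the first three numeric components are kept.
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])

def is_after(version, baseline=(0, 9, 5)):
    """True when `version` is strictly newer than `baseline`."""
    return version_tuple(version) > baseline

# With NLTK installed:
# import nltk
# print(nltk.__version__, is_after(nltk.__version__))
```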

You should add Python to your PATH during installation. After installation, open a command prompt and run the command pip install nltk.
Then go to IDLE, open a new file, save it, and open it.
Type the following:
import nltk
nltk.download()

Try downloading the zip files manually, then unzip them and save them in your Python folder, such as C:\ProgramData\Anaconda3\nltk_data
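
The unzip step can also be done from Python. The sketch below assumes the usual layout of one subfolder per category (for example `corpora`) inside `nltk_data`; `install_package_zip` is a hypothetical helper, and the demo archive is a stand-in, not a real NLTK package.

```python
import os
import tempfile
import zipfile

def install_package_zip(zip_path, data_dir, category):
    """Extract a package zip into <data_dir>/<category>/ and return that path."""
    target = os.path.join(data_dir, category)
    os.makedirs(target, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target

# Demonstration with a tiny stand-in archive in a throwaway directory.
with tempfile.TemporaryDirectory() as workdir:
    zip_path = os.path.join(workdir, "demo_corpus.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("demo_corpus/README", "placeholder corpus file")
    data_dir = os.path.join(workdir, "nltk_data")
    install_package_zip(zip_path, data_dir, "corpora")
    extracted = os.path.exists(
        os.path.join(data_dir, "corpora", "demo_corpus", "README")
    )
    print(extracted)
```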

If you had already saved a file named nltk.py and then renamed it, check whether the old file still exists. If it does, delete it and run your file again; it should work!

Just do this:

import nltk
nltk.download()

Then you will be shown a pop-up asking what to download; select ‘all’. It will take some time because of its size, but it will finish eventually.

If you are using Google Colab, you can set the download directory to "/content/nltkdata".

After running that, you will be asked to select from a list:

NLTK Downloader
d) Download   l) List    u) Update   c) Config   h) Help   q) Quit
Downloader> d

Here you have to enter d, since you want to download.
After that you will be asked to enter the identifier that you want to download. You can see the list of available identifiers with the l command, or if you want all of them, just enter ‘all’ in the input box.
Then you will see something like:

Downloading collection 'all'
       | Downloading package abc to /content/nltkdata...
       |   Unzipping corpora/
       | Downloading package alpino to /content/nltkdata...
       |   Unzipping corpora/
       | Downloading package biocreative_ppi to /content/nltkdata...
       |   Unzipping corpora/
       | Downloading package brown to /content/nltkdata...
       |   Unzipping corpora/
       | Downloading package brown_tei to /content/nltkdata...
       |   Unzipping corpora/
       | Downloading package cess_cat to /content/nltkdata...
       |   Unzipping corpora/
       |   Unzipping models/
       | Downloading package mwa_ppdb to /content/nltkdata...
       |   Unzipping misc/
     Done downloading collection all

    d) Download   l) List    u) Update   c) Config   h) Help   q) Quit
Downloader> q

Finally, enter q to quit.

You may try:

>>> import nltk
>>> nltk.download_shell()
Downloader> d
  Identifier> *name of the package*

Happy NLP’ing.

The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.