This tutorial describes part of the How To called
"Download a large, custom set of records from NCBI."
I'll describe two approaches to download records
directly from your browser.
The first approach uses a text query
and the second uses Batch Entrez
with a list of UIDs, or unique identifiers,
such as accession numbers,
gi numbers,
or gene IDs.
Let's use a text query to find mouse mRNAs
that have the word chemokine
in the title of the record.
I'll enter mouse[organism] AND chemokine[title] in the Search box.
I could select a specific database, but let's use Entrez's global query.
Click Search.
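If you ever want to assemble the same kind of fielded query in a script, here is a minimal sketch. The helper name build_query is made up for illustration and is not part of any NCBI tool; the field tags are just the ones typed into the Search box above.

```python
def build_query(terms):
    """Join (value, field) pairs into one fielded query string using AND."""
    return " AND ".join(f"{value}[{field}]" for value, field in terms)


# The same two terms entered in the Search box.
query = build_query([("mouse", "organism"), ("chemokine", "title")])
print(query)
```

Keeping each term as a separate pair makes it easy to add or drop a field without retyping the whole query.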
Since we're looking for mRNAs
we could pick databases such as Nucleotide,
EST,
the database of Expressed Sequence Tags,
Gene
or UniGene.
Let's view the records in Nucleotide.
The tabs represent preset filters
that are applied to all searches in this database.
You can set different filters by logging in to your My NCBI account.
I'm going to select the RefSeq or Reference Sequence tab
to get a non-redundant set of sequences.
Clicking the pushpin image on the tab
locks in that search term --
see how the search term gets added to the query.
Now when I select the mRNA tab,
I have the set of mouse mRNAs from RefSeq
with chemokine in the title.
Unless I also lock in the mRNA filter,
the retrieval in the next step
will include non-mRNA records as well,
so I'll click on the pushpin.
Use the Display menu
to select the format that you want to download,
for example, FASTA,
or a list of gi numbers.
I'll select FASTA.
Choosing Send to,
then File
allows you to save a multi-FASTA file
of all eighty-nine records.
If you wanted to display all eighty-nine
on the web page,
you would first have to set Show to 100,
then choose Send to,
then Text.
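Once the file is saved, a quick way to confirm that it holds the number of records you expected is to count the header lines, since each FASTA record starts with a line beginning with ">". This is only a sketch: the sample lines below are made-up placeholders, not real NCBI records, and you would pass in an open file of your own instead.

```python
def count_fasta_records(lines):
    """Count FASTA records: every record begins with a '>' header line."""
    return sum(1 for line in lines if line.startswith(">"))


# Made-up stand-in for the contents of a downloaded multi-FASTA file.
sample = [
    ">placeholder_1 hypothetical record\n",
    "ATGCATGC\n",
    ">placeholder_2 hypothetical record\n",
    "GGCCTTAA\n",
    "TTAA\n",
]
print(count_fasta_records(sample))
```

An open file handle can be passed in place of the list, because the function only iterates over lines.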
The second approach uses Batch Entrez
and requires a text file of unique identifiers,
one per line.
You can create such a list from our web pages,
as just described,
for many, but not all, Entrez databases.
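If the list comes from somewhere else, it may need tidying before you upload it. Here is a minimal sketch that trims whitespace, drops blank lines and duplicates, and puts one identifier on each line. The function name and the identifiers are made up for illustration.

```python
def clean_uid_list(raw_text):
    """Return unique, non-empty identifiers in their original order."""
    seen = set()
    uids = []
    for line in raw_text.splitlines():
        uid = line.strip()
        if uid and uid not in seen:
            seen.add(uid)
            uids.append(uid)
    return uids


# Made-up identifiers with a blank line, stray spaces and a duplicate.
raw = "12345\n\n 67890 \n12345\n"
uids = clean_uid_list(raw)

# One identifier per line, ready to be written to a text file.
file_body = "\n".join(uids) + "\n"
print(file_body, end="")
```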
Let's say a colleague sent you a list
of protein gi numbers
for which you'd like the FASTA sequences.
Go to the Batch Entrez page
and select the correct database,
Protein in this case.
Browse to your local file of identifiers,
and click Retrieve.
You can then change Display to FASTA,
and Send to a local file.
If you want to automate retrieval of large
numbers of records,
use the E-Utilities described in this How To.
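To give a rough idea of what that looks like, here is a sketch that only builds the request URLs for a search and a FASTA fetch; it does not send anything. The base address and the parameter names (db, term, retmax, id, rettype, retmode) are written as I understand the E-Utilities interface, so check them against the E-Utilities documentation before relying on them. The helper names and the identifiers are made up.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=100):
    """Build a search URL for a database and a fielded text query."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_fasta_url(db, uids):
    """Build a fetch URL that asks for FASTA text for a list of UIDs."""
    params = {
        "db": db,
        "id": ",".join(uids),
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


# The text query from the first approach.
search = esearch_url("nucleotide", "mouse[organism] AND chemokine[title]")

# Made-up protein identifiers, as in the Batch Entrez example.
fetch = efetch_fasta_url("protein", ["12345", "67890"])

print(search)
print(fetch)
```

Building the URL separately from sending it makes it easy to paste the result into a browser and check what comes back before automating anything.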