The purpose of this video is to help you submit metadata with a new MG-RAST job
submission using the metadata template file provided on the upload page.
The template can also be used to update the metadata of existing completed jobs,
although that is not discussed in this video.
The template is occasionally revised, so please download a new copy every time
you submit a job.
For the same reason you may notice minor differences between the current template
version
and what is depicted in this video.
Let's look at the template.
It is an Excel spreadsheet containing several sheets and numerous fields on
each sheet.
Note that the text and cells have been enlarged and reformatted for purposes of
this demonstration.
Do not change the names of sheets and only enter text in blank cells.
Some fields are free form,
but most require either a very specific format or a term exactly chosen from
a specific list.
This will be checked by a validation program in MG-RAST,
but as you will see, there are detailed instructions to help you.
Notice the tabs for each sheet.
After the sheet called readme,
there are sheets named project and sample
and multiple library sheets
and after that the rest begin with the letters EP,
which signifies environmental package.
All jobs require the project and sample sheets.
Library information is also always required,
but different sequencing techniques require different library metadata,
so use the library metagenome sheet
for shotgun sequencing
or the library MIMARKS survey sheet for amplicon sequencing.
Finally you must choose an environmental package
even if you do not supply detailed environmental information.
In this video
we discuss only the minimal requirements for metadata, but more metadata always
increases the potential of post-annotation analysis.
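As a rough illustration of the sheet requirements just described, here is a
minimal Python sketch that reports which required groups of sheets have not been
filled in. The sheet names and the "ep " prefix used here are assumptions for
illustration only; check the tabs in the template you downloaded for the exact
spelling. This is not the MG-RAST validator.

```python
# Sheet names below are assumptions for illustration; the template's own
# tab names are authoritative.
REQUIRED_SHEETS = {"project", "sample"}
LIBRARY_SHEETS = {"library metagenome", "library mimarks-survey"}
EP_PREFIX = "ep "


def missing_sheet_groups(filled_sheets):
    """Return a list of problems given the names of sheets that contain data."""
    filled = {name.strip().lower() for name in filled_sheets}
    problems = []
    # The project and sample sheets are always required.
    for name in sorted(REQUIRED_SHEETS - filled):
        problems.append(f"missing required sheet: {name}")
    # At least one of the two library sheets must be filled in.
    if not filled & LIBRARY_SHEETS:
        problems.append("no library sheet filled in")
    # At least one environmental package sheet must be filled in.
    if not any(name.startswith(EP_PREFIX) for name in filled):
        problems.append("no environmental package sheet filled in")
    return problems
```

For example, calling missing_sheet_groups(["project", "sample", "library metagenome", "ep water"])
returns an empty list under these assumed names.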
Now we move to the project sheet.
In this sheet and all the others
field labels occupy the first row
and required fields are indicated in red.
The second row gives further explanation of each field, including
format instructions.
On the project sheet complete exactly one row to organize your job
submission under a project heading within MG-RAST.
The project name field here in the spreadsheet is identical with step two
of data submission on the upload web page
shown here briefly.
You can use the name of an existing project of your own but you may not use
a project name already used by someone else.
Note that the project id field is provided for your use and will not be
used by MG-RAST.
Now we quickly complete the remaining required fields of the project sheet
which are straightforward.
On the sample sheet, now shown, you must fill in one or more rows reflecting your
experimental design and sequence files.
Look at the second row carefully because it explains what values or formats are
permitted for your entries.
You should complete one row for each experimental sample represented by the
sequence files of your job submission.
Give a distinct name to each sample with the sample name field.
Later you will exactly copy these sample names to the library sheet
to identify the origin of each submitted sequence file.
Sample names must also be copied to environmental package sheets.
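Because sample names must be distinct and must be copied exactly, a quick check
before uploading can catch mismatches. The following Python sketch assumes you
have already read the sample names from each sheet into plain lists; the function
names and the example names are hypothetical.

```python
from collections import Counter


def duplicate_names(sample_names):
    """Return sample names that appear more than once on the sample sheet."""
    counts = Counter(sample_names)
    return sorted(name for name, count in counts.items() if count > 1)


def unknown_sample_names(sample_names, referenced_names):
    """Return names used on another sheet that do not exactly match a sample name.

    The comparison is exact on purpose: differences in case or stray
    spaces count as mismatches.
    """
    known = set(sample_names)
    return [name for name in referenced_names if name not in known]
```

For instance, unknown_sample_names(["lake_surface_1"], ["lake_surface_1", "Lake_Surface_1 "])
flags the second entry, because its capitalization and trailing space differ from
the name on the sample sheet.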
As before, an id field is provided for your use
and will not be used by MG-RAST.
Latitude, longitude, country, and location are easy to complete but again note that
your entries must conform to the format details in the second row.
In particular use only positive and negative numbers for longitude and
latitude,
not directions like north south east or west.
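If your field notes record coordinates with compass directions, they need to be
converted to signed numbers first. This is a minimal Python sketch, assuming
coordinates written as a decimal number followed by a single letter such as
"41.7 N"; the function name is hypothetical, and the second row of the template
remains the authority on the accepted format.

```python
import re

# Assumed input form: a non-negative decimal number followed by N, S, E, or W.
_COORD = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([NSEW])\s*$", re.IGNORECASE)


def to_signed_degrees(text):
    """Convert a value like '41.7 N' or '87.9 W' to a signed decimal number."""
    match = _COORD.match(text)
    if match is None:
        raise ValueError(f"unrecognized coordinate: {text!r}")
    value = float(match.group(1))
    direction = match.group(2).upper()
    # By convention, south latitudes and west longitudes are negative.
    return -value if direction in ("S", "W") else value
```

With this sketch, "41.7 N" becomes 41.7 and "87.9 W" becomes -87.9.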
For date, time, and time zone
make sure to use the indicated format.
For time zone, please note that conventional abbreviations like GMT
or EST will not be accepted.
Instead, use the coordinated universal time system,
abbreviated UTC.
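A simple pre-check can catch time zone abbreviations before the spreadsheet is
uploaded. The pattern in this Python sketch is an assumption: it accepts "UTC"
optionally followed by a signed hour offset, such as "UTC-6" or "UTC+05:30". The
exact format MG-RAST accepts is the one stated in the second row of the template,
so adjust the pattern to match it.

```python
import re

# Assumed pattern: "UTC", optionally followed by a signed offset in hours
# with optional minutes. The template's second row is authoritative.
_UTC = re.compile(r"^UTC(?:[+-]\d{1,2}(?::\d{2})?)?$")


def looks_like_utc(value):
    """Return True if the value matches the assumed UTC-style pattern."""
    return _UTC.match(value.strip()) is not None
```

Under this assumed pattern, looks_like_utc("UTC-6") is true, while
looks_like_utc("EST") and looks_like_utc("GMT") are false.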
To the right there are several other required fields.
For biome, feature, and material you must search the web page indicated in the
second row.
The environment ontology web site now shown provides a systematic vocabulary
of appropriate terms.
In the left side bar notice the top-level categories biome, feature, and
material.
From each category find one term appropriate to your data
and copy the preferred name exactly as it appears on the website.
Lastly, specify an appropriate environmental package for your data.
Use one of the EP names from the second row
exactly as it appears.
Next you will use one of the two library sheets to provide technical
information for your sequencing run or runs.
Use the library metagenome sheet for shotgun sequencing
and the library MIMARKS survey sheet for amplicon sequencing.
The required fields are identical in both versions of the library sheet.
We will look at the library MIMARKS survey sheet.
You should complete one row per metagenome.
If your sequence files are not multiplexed then you will have one row per
sequence file.
Give a descriptive name to each metagenome and specify the experimental sample
it originates from
using a sample name that you previously assigned
on the sample sheet.
Specify the sequence file in which reads for the metagenome are recorded.
The file examplereads.fna has already been uploaded to MG-RAST
for purposes of this demonstration.
On this library sheet
always enter MIMARKS survey for the investigation type field.
On the library metagenome sheet
you would enter metagenome in the corresponding field.
For sequencing method choose one of the alternatives in row two.
Finally locate the sheet or sheets for the environmental package or packages
previously chosen.
On each relevant EP sheet
enter the exact sample name or names,
one per row, as specified before,
that belong with that environmental package.
In this demonstration we have only one sample
and it uses the environmental package water,
which happens to be the last sheet.
We are entering minimal information here,
but we urge you to complete as many fields as possible.
The spreadsheet is now done,
but for demonstration we introduced a common mistake
to show what happens when the validator fails.
EST does not conform to the required time zone format,
so this change will make the spreadsheet fail validation.
Notice the following steps.
We save the spreadsheet locally;
we upload it to our inbox;
we attempt to select it;
but it does fail. Notice that the validator provides a detailed
explanation.
We return to Excel to fix the problem.
We again save the local version.
Before uploading again, we first delete the old version from our inbox,
and then we proceed as before,
first uploading the spreadsheet,
and then selecting it.
This time the spreadsheet passes the validator,
and the message from MG-RAST indicates that the metadata spreadsheet
will apply to a new job rather than an existing job.
Now we can continue through the rest of the job submission process.
The information in this video is also available in print on the MG-RAST
blog page
addressing metadata.
You can also download a completed example spreadsheet from that page.