Using IGV through GenePattern - V2

The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated datasets. It supports a wide variety of data types, including sequence alignments, microarrays, and genomic annotations. The GenePattern IGV module is available on the GenePattern server or can be downloaded from the GenePattern repository. (Note that IGV is also available outside of GenePattern from the IGV website, broadinstitute.org/igv.) Launching IGV from within GenePattern allows you to pass your GenePattern result files directly to IGV and to include IGV in your GenePattern pipelines. In this tutorial we will briefly demonstrate the use of IGV from GenePattern and some of the basic functions of IGV.
The GenePattern IGV module launches the same application that is available from the IGV website. If you use both the IGV client (launched from the IGV website or from your desktop) and GenePattern, this means you are using the same version of IGV, complete with your home directory, saved genomes, and other saved IGV preferences. For all users, this means you get the latest version each time you run IGV, whether you launch it from GenePattern or from the IGV client.
As previously mentioned, launching IGV from GenePattern allows you to pass your GenePattern result files directly to IGV in the same way you would use a result file as an input file for any other module. For instance, on servers where IGV is installed, the output from a run of CopyNumberDivideByNormals now lists IGV as a next option in the dropdown for the .cn result file. IGV also supports many other common GenePattern file formats, such as GCT and BAM. A full list of supported file formats and their descriptions is available through the module documentation and on the IGV website. For this tutorial we will use the output of a recently run CopyNumberDivideByNormals job as input to the IGV module.
We’ll now select the genome for our data. If we had recently opened IGV with the desired genome, or if hg19 were the genome for our data, we could leave this parameter unselected: IGV launches with the last genome used or, if this is the first launch, with hg19. For this tutorial we will select hg16. We may also specify a locus of interest if we know ahead of time where we want to look, either by naming a gene of interest, such as EGFR, or by entering a locus in the format indicated by the caption below. For this tutorial we will leave this field blank. Once we’ve finished our selections, we can run the module.
When the module has finished processing, a “Launch” button will appear. Click it to launch IGV with your data and parameters loaded. Your browser may present a security warning; if so, simply click OK, Run, or Yes to allow IGV to continue. You may also choose not to see this message again. As indicated on the Job Status page, you will now be prompted for your username and password so that IGV can access your data on the GenePattern server. You may also choose to save this password to avoid entering it manually in the future.
As we selected, IGV launches with hg16 loaded and by default displays chromosome 1. To zoom out to the whole genome, click the home icon in the toolbar at the top. From here we can zoom in by double-clicking on a location of interest or by using the slide bar in the upper right. To move across the cytoband once zoomed in, we can click and drag in the main panel, or click and drag the red box on the cytoband.
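
The locus parameter of the module and the IGV search box both take a short region string. As a rough illustration only, the sketch below checks such a string before it is submitted. It assumes the common `chr:start-end` form with optional thousands separators, or a bare chromosome name; the exact format the module expects is the one shown with the parameter itself. The names `Locus` and `parse_locus` are hypothetical, and the coordinates in the usage lines are arbitrary placeholders rather than the position of any particular gene.

```python
import re
from typing import NamedTuple, Optional


class Locus(NamedTuple):
    """A chromosome, optionally narrowed to a start-end range."""
    chrom: str
    start: Optional[int]
    end: Optional[int]


# Assumed format: "chr7" or "chr7:55,000,000-55,200,000".
_LOCUS_RE = re.compile(r"^(chr[0-9A-Za-z_]+)(?::([\d,]+)-([\d,]+))?$")


def parse_locus(text: str) -> Locus:
    """Parse a coordinate-style locus string, or raise ValueError.

    Gene names (for example "EGFR") are not coordinates, so they are
    rejected here; IGV itself resolves gene names when you search.
    """
    match = _LOCUS_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a coordinate locus: {text!r}")
    chrom, start, end = match.groups()
    if start is None:
        # Whole-chromosome form such as "chr1".
        return Locus(chrom, None, None)
    start_pos = int(start.replace(",", ""))
    end_pos = int(end.replace(",", ""))
    if start_pos > end_pos:
        raise ValueError(f"start is after end in {text!r}")
    return Locus(chrom, start_pos, end_pos)


whole_chromosome = parse_locus("chr1")
region = parse_locus("chr7:55,000,000-55,200,000")
```

Leaving the locus field blank, as this tutorial does, skips this step entirely and lets IGV open on its default view.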
If we had a specific locus of interest, we could click on that area in the cytoband or type the locus in the search box. We can also search for a gene of interest by typing its name in the search box.
To load additional data sets, such as a sample information file, we select “Load from File” in the File menu and select our files. Here we will load a sample information file to fill out the attributes column. Note that this is also how you would load any other datasets associated with your data and saved on your local or networked drive. The colored blocks now displayed represent the attribute values for each sample. For each attribute, each unique value is assigned a color, so a quick scan shows you the distribution of attribute values. To display an attribute value, hover over a colored block. For instance, mousing over the Tumor_Type column shows us that all of the tracks have a tumor type of GBM. We can easily see that this applies to all of the tracks because the attribute column for Tumor_Type is all the same color. If the tracks had different values for Tumor_Type, they would be different colors, as with the Name attribute.
Similarly, in the data panel, additional information about the data points, such as location, data scale, and value, is displayed as you mouse over them. By default we are viewing our data as a heatmap; we could, however, choose to view the data as a scatter plot or bar chart. One way to do so for all tracks is to right-click on an attribute shared by all tracks, such as data type in this example. From the menu displayed you can choose bar chart as the new graph type, as well as many other options, such as renaming tracks or setting track heights, which can be useful when you have hundreds or thousands of tracks displayed. It’s good to remember that in IGV, right-clicking always provides easy access to methods for manipulating the display of your data.
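
The sample information file loaded earlier is an ordinary text table, so it can be prepared with a few lines of script. The sketch below is a minimal example, assuming the layout is tab-delimited with a header row and the sample (track) name in the first column; check the IGV documentation for the authoritative format. The function name `build_sample_info` and the sample names are hypothetical, while the `Tumor_Type` and `Name` attributes and the GBM value come from the example in this tutorial.

```python
import csv
import io


def build_sample_info(samples, attributes):
    """Return tab-delimited text: one header row, then one row per sample.

    `samples` maps a sample name to a dict of attribute values.
    Missing attributes are written as empty cells.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Assumed layout: first column is the sample name, the rest are attributes.
    writer.writerow(["Sample", *attributes])
    for sample_name, values in samples.items():
        writer.writerow([sample_name, *(values.get(a, "") for a in attributes)])
    return buffer.getvalue()


# Hypothetical samples; every track shares the same tumor type, as in the tutorial.
sample_info_text = build_sample_info(
    {
        "sample_A": {"Tumor_Type": "GBM", "Name": "A"},
        "sample_B": {"Tumor_Type": "GBM", "Name": "B"},
    },
    ["Tumor_Type", "Name"],
)
```

Because both hypothetical samples carry the same `Tumor_Type` value, IGV would draw that attribute column in a single color, while the `Name` column would show one color per sample.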
IGV also provides the ability to define a region of interest. Let’s zoom out to get a bigger picture and then click the Define Region button in the toolbar. Notice that the cursor in the plot is now a crosshair. Use it to mark the start and the end of the region of interest. The viewer annotates this region of interest by adding a red section to the region bar. Clicking on the red bar displays a menu with options for sorting, zooming in on, and editing the description of that region of interest. For instance, we can sort by amplification or by deletion. Notice that sorting affects the whole data panel and the track order, but is based on the values in the region of interest.
One last feature to note is the ability to copy and paste a GenePattern result file URL as input for IGV. This feature became available with the releases of GenePattern 3.3.2 and IGV 2.0. To use this feature, simply right-click and copy the URL of a job result in GenePattern, then in IGV select “File>Load from URL…” and paste the GenePattern result file URL into the provided field. As with the previous method of sending data from GenePattern to IGV, you’ll be prompted for your username and password (the ones used to run the GenePattern job). Your data will now be displayed in IGV.
As originally stated, this tutorial merely scratches the surface of IGV’s capabilities, but hopefully it has provided you with some basic tools for beginning to harness the power of IGV for viewing your data through GenePattern. More information about GenePattern can be found at genepattern.org. More information about IGV can be found at broadinstitute.org/igv.