SUBA is a tool for investigating subcellular localisation in Arabidopsis. It unifies disparate datasets and provides a web-accessible interface for constructing user-defined queries, making it a one-stop shop for protein localisation in this model plant.
The SUBA database, established and maintained by the Centre, tracks new data on the subcellular location of proteins in the model plant Arabidopsis. It is rapidly becoming the central reference point for researchers asking where a protein resides in plant cells. Just this month Thomson ISI recognised the high citation rate of the original SUBA paper by selecting it for ISI's Fast Moving Fronts for November. An interview with the first author, former PEB researcher Dr Joshua Heazlewood, is now on the ISI website, or you can download the PDF.
Subcellular localisation information can contribute to our understanding of protein function, protein redundancy and biological inter-relationships. While a variety of technologies are currently employed to determine the subcellular location of proteins, much of this information is not available in an integrated manner. To get a clearer picture of our experimental data and to understand subcellular partitioning more generally, we have brought together various data sources to build SUBA. The database has a web-accessible interface that allows advanced combinatorial queries to be run on the data it contains.
A PDF tutorial explaining how to use SUBA is available here: SUBA tutorial.
SUBA houses large scale proteomic and GFP localisation sets from cellular compartments of Arabidopsis. It also contains precompiled bioinformatic predictions for protein subcellular localisations. SUBA was last updated in June 2009 and is based on the TAIR8 genome annotation release.
SUBA query construction:
Easy query
The first row of pull down menus provides access to the majority of experimental localisation data. Using the 3 pull down menus, a query can be built to investigate, for example, plastid localised proteins by mass spectrometry. To access such a set of proteins:
More complex query
Use the Boolean linker buttons 'AND' or 'OR' to build more complex queries.
To search for plastid proteins identified through mass spectrometry and GFP analysis:
Use the linker buttons 'AND' or 'OR' to include:
Use the 'Undo' button to correct mistakes in the query being built. It removes the last entry added to the query.
Use the 'Clear' button to remove the entire constructed query.
Use the 'Submit' button to query the database.
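The way the linker, 'Undo' and 'Clear' buttons combine can be sketched in a few lines of Python. This is only an illustrative model, not SUBA's implementation or an API it exposes: the class names, the method labels "MS" and "GFP", the record layout and the left-to-right evaluation of linkers are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    linker: str    # "AND" or "OR"; ignored for the first term
    method: str    # hypothetical label for the evidence type, e.g. "MS" or "GFP"
    location: str  # e.g. "plastid"


@dataclass
class QueryBuilder:
    """Hypothetical stand-in for the query form; not part of SUBA."""
    terms: list = field(default_factory=list)

    def add(self, method, location, linker="AND"):
        if linker not in ("AND", "OR"):
            raise ValueError("linker must be 'AND' or 'OR'")
        self.terms.append(Term(linker, method, location))

    def undo(self):
        # Mirrors the 'Undo' button: drop the last entry added.
        if self.terms:
            self.terms.pop()

    def clear(self):
        # Mirrors the 'Clear' button: drop the whole query.
        self.terms.clear()

    def matches(self, record):
        # record maps an evidence type to the set of locations reported for it.
        # Linkers are applied left to right (an assumption of this sketch).
        result = None
        for term in self.terms:
            hit = term.location in record.get(term.method, set())
            if result is None:
                result = hit
            elif term.linker == "AND":
                result = result and hit
            else:
                result = result or hit
        return bool(result)


# Plastid proteins identified by mass spectrometry AND by GFP.
query = QueryBuilder()
query.add("MS", "plastid")
query.add("GFP", "plastid", linker="AND")

record = {"MS": {"plastid"}, "GFP": {"nucleus"}}  # made-up example record
print(query.matches(record))  # False: GFP evidence does not point to the plastid
query.undo()                  # remove the GFP term
print(query.matches(record))  # True: only the MS term remains
```

Replacing the 'AND' linker with 'OR' in the example would match proteins located in the plastid by either method rather than by both.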
Once a query has been submitted the contents of the 'RESULT' tab will automatically be displayed. By default 8 columns will be displayed: AGI, TAIR descriptor, location summaries of all predictors, location by Mass Spec, location by GFP, location by annotation (TAIR), location by AmiGO and location by UniProt.
Results can be sorted by field using the function menu. To open the function menu, move the mouse over a column header and click the arrow that appears. New columns can be added to the Result tab window by selecting 'Columns' in the function menu.
Columns can be organized using mouse drag and drop functionality.
Only 50 rows of data are displayed at a time; further rows are available using the 'Next Page' or 'Last Page' buttons at the bottom left of the Result tab window.
The codes or unique identifiers (uid) beside each location are links to the primary data source for each entry at PubMed or ISI.
Each AGI provides a link to a summary page (SUBA Flatfile) for the entry. The flatfile contains a detailed breakdown of subcellular localisation information (experimental and predicted), description and sequence information (TAIR8), physicochemical characteristics, a hydropathy plot, and links to the same entry at other Arabidopsis databases.
All results can be downloaded as a tab delimited file by using the 'Download All Results' button at the top left of the Result tab window. By default this file opens in Excel.
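Because the download is plain tab-delimited text, it can also be processed with a script rather than in Excel. The sketch below uses Python's standard csv module. The column names "AGI", "MS" and "GFP", the identifiers and the locations in the sample are placeholders invented for the example; the headers in a real download should be checked and substituted.

```python
import csv
import io


def load_results(handle):
    """Read a tab-delimited results file into a list of dicts keyed by header."""
    return list(csv.DictReader(handle, delimiter="\t"))


def agis_with_location(rows, column, location):
    """Return the AGIs whose given column mentions the location (case-insensitive)."""
    wanted = location.lower()
    return [row["AGI"] for row in rows if wanted in (row.get(column) or "").lower()]


# Made-up sample standing in for a downloaded file; identifiers are placeholders.
sample = (
    "AGI\tMS\tGFP\n"
    "AT0G00001.1\tplastid\tplastid\n"
    "AT0G00002.1\tcytosol\t\n"
    "AT0G00003.1\tplastid\tnucleus\n"
)

rows = load_results(io.StringIO(sample))
by_ms = agis_with_location(rows, "MS", "plastid")
by_gfp = agis_with_location(rows, "GFP", "plastid")
both = [agi for agi in by_ms if agi in by_gfp]
print(both)  # ['AT0G00001.1']
```

For a real file, open it with open(path, newline="") and pass the file object to load_results in place of the io.StringIO sample.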
For a detailed description of how to use SUBA, refer to the SUBA tutorial.
Current subcellular location data available in SUBA:
If you find this resource useful please cite one of the following two publications:
Heazlewood JL, Verboom RE, Tonti-Filippini J, Small I and Millar AH. (2007) SUBA: the Arabidopsis Subcellular Database. Nucleic Acids Res. 35:D213-D218. (PubMed)
Heazlewood JL, Tonti-Filippini J, Verboom RE, Millar AH. (2005) Combining experimental and predicted datasets for determination of the subcellular location of proteins in Arabidopsis. Plant Physiol. 139(2):598-609. (PubMed)
Harvey Millar (hmillar(at)cyllene.uwa.edu.au) or Julian Tonti-Filippini (tontis(at)iinet.net.au) or Sandra Tanz (stanz(at)cyllene.uwa.edu.au)