SaVanT accepts tab-delimited plain text files containing gene expression data in the format described below.
An example file conforming to this format is provided on the front page via the "Example" link, or directly at:
http://newpathways.mcdb.ucla.edu/savant-dev/SaVanT_InputMatrix.txt
All submitted input files must be plain text. The filename is arbitrary and needs no particular extension (that is, it does not have to end with '.txt').
An annotated example of the first few lines of an acceptable file is below:
FORMAT
- The first line must have the following tab-delimited fields:
(1) Gene field title/descriptor/placeholder (e.g. the literal string "SYMBOL" or "GENE"). It can be anything, but the field must be present: if it is left blank, the file must start with a tab character.
(2+) Sample names: unique identifiers/descriptions of the samples. Spaces are allowed but not advisable (underscores are preferable), and names are best kept short (5-15 characters).
- An optional second line can include group membership information if an ANOVA analysis is desired, with the following fields:
(1) The word "ANOVA_GROUP" (exactly as written: all caps, with an underscore and no whitespace)
(2) Integer group labels, one per sample (consecutive, starting from 1), denoting the group each sample belongs to
- Each remaining line must consist of a gene symbol followed by tab-separated numeric gene expression values, one per sample
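The layout above can be sketched in a few lines of Python. This is only an illustration, not part of SaVanT: the function name build_savant_matrix, the sample names, and the expression values are all made up for the example.

```python
def build_savant_matrix(samples, expression, groups=None, gene_title="SYMBOL"):
    """Return tab-delimited text laid out as described in FORMAT.

    samples:    list of sample names (hypothetical in this example)
    expression: dict mapping gene symbol -> list of numeric values, one per sample
    groups:     optional list of integer group labels, one per sample
    """
    # First line: gene field title followed by the sample names.
    lines = ["\t".join([gene_title] + list(samples))]
    # Optional second line: the literal word ANOVA_GROUP followed by group labels.
    if groups is not None:
        lines.append("\t".join(["ANOVA_GROUP"] + [str(g) for g in groups]))
    # Remaining lines: gene symbol followed by its expression values.
    for gene, values in expression.items():
        lines.append("\t".join([gene] + [str(v) for v in values]))
    return "\n".join(lines) + "\n"


# Illustrative data only; these numbers are not real measurements.
example_text = build_savant_matrix(
    samples=["Ctrl_1", "Ctrl_2", "Treat_1", "Treat_2"],
    expression={
        "GAPDH": [10.2, 10.5, 10.1, 10.4],
        "IL6": [2.1, 2.3, 7.8, 8.0],
    },
    groups=[1, 1, 2, 2],
)
print(example_text, end="")
```

Writing example_text to a file with any name yields a file in the layout described above. Pass groups=None to leave out the ANOVA_GROUP line.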
NOTES
- Sample names must be unique
- All lines of the file must contain a consistent number of fields
- Commas will be replaced with semicolons, and empty lines will be removed
- In cases where gene symbols are duplicated, the values of the last-appearing line with that gene symbol will be used
- Expression values MUST be numeric and cannot be, e.g., "N/A"
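Several of the rules above can be checked locally before uploading. The following Python sketch is one possible pre-flight check, not SaVanT's own validation code. The function name check_savant_matrix is hypothetical, and two details are assumptions: "consecutive, starting from 1" is read as "the group labels used are exactly 1..k", and "numeric" is read as "accepted by Python's float()", which also lets values such as "nan" through.

```python
def check_savant_matrix(text):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    # Empty lines are ignored here, since the notes say they will be removed.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return ["no non-empty lines found"]

    header = lines[0].split("\t")
    samples = header[1:]
    if not samples:
        problems.append("first line needs a gene field plus at least one sample name")
    if len(set(samples)) != len(samples):
        problems.append("sample names are not unique")

    rows = [ln.split("\t") for ln in lines[1:]]
    # Optional ANOVA_GROUP line directly after the header.
    if rows and rows[0][0] == "ANOVA_GROUP":
        labels = rows.pop(0)[1:]
        if len(labels) != len(samples):
            problems.append("ANOVA_GROUP line and first line differ in field count")
        elif not all(lab.isdecimal() for lab in labels):
            problems.append("ANOVA_GROUP values must be integers")
        else:
            used = sorted({int(lab) for lab in labels})
            # Assumed reading of "consecutive, starting from 1".
            if used != list(range(1, len(used) + 1)):
                problems.append("ANOVA_GROUP values must be consecutive, starting from 1")

    for row in rows:
        gene, values = row[0], row[1:]
        if len(row) != len(header):
            problems.append(f"{gene}: expected {len(header)} fields, found {len(row)}")
        for value in values:
            try:
                float(value)  # assumption: "numeric" means float() accepts it
            except ValueError:
                problems.append(f"{gene}: non-numeric expression value {value!r}")
    return problems
```

For example, check_savant_matrix("SYMBOL\tA\tA\nGAPDH\t1.0\tN/A\n") reports both the repeated sample name and the "N/A" value. Duplicate gene symbols are not flagged, since the notes above say the last-appearing line is used.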