List of Options of SeqCode Services

NGS Tools

Compare2Peaks

Application to compare two sets of genomic regions and discriminate the common overlapping regions from the specific ones.

1 First set of peaks (BED format)

Tab-separated plain text file that contains one list of peaks. Each peak is described by its chromosome, start position and end position in BED format.

Example of the format of a file with peaks in BED format:
chr   pos1   pos2
chr   pos1   pos2
...
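
As an illustration only, a file in this layout could be read as sketched below. The function name `parse_bed` is hypothetical and is not part of SeqCode; the sketch assumes exactly the three tab-separated columns described above, with no header line.

```python
from typing import Iterable, List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end)

def parse_bed(lines: Iterable[str]) -> List[Peak]:
    """Parse tab-separated BED lines into (chromosome, start, end) tuples."""
    peaks = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(f"expected at least 3 tab-separated columns: {line!r}")
        peaks.append((fields[0], int(fields[1]), int(fields[2])))
    return peaks

example = ["chr1\t100\t250\n", "chr2\t4000\t4600\n"]
print(parse_bed(example))
```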

2 Second set of peaks (BED format)

Tab-separated plain text file that contains one list of peaks. Each peak is described by its chromosome, start position and end position in BED format.

Example of the format of a file with peaks in BED format:
chr   pos1   pos2
chr   pos1   pos2
...

3 Names of each set of peaks

Text labels that will be used to identify the peaks from each set when performing the comparison and building the final lists of overlapping and specific peaks. Spaces and special characters will be replaced by the symbol "_".
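
The replacement rule can be pictured with the short sketch below. Which characters count as "special" is not stated here, so the sketch assumes that everything other than ASCII letters, digits and "_" is replaced; `sanitize_label` is a hypothetical name.

```python
import re

def sanitize_label(label: str) -> str:
    """Replace spaces and special characters with "_".

    Assumption: any character other than A-Z, a-z, 0-9 and "_" is "special".
    """
    return re.sub(r"[^A-Za-z0-9_]", "_", label)

print(sanitize_label("H3K27me3 (rep 1)"))
```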

4 Definition of the overlap between two peaks

Minimum size of the overlap between two peaks for them to be considered a successful match.
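
A minimal sketch of this rule follows, assuming half-open BED intervals and a simple all-against-all scan. The names `overlap_size` and `split_peaks` are hypothetical, and the actual matching procedure used by the service may differ.

```python
from typing import List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end)

def overlap_size(a: Peak, b: Peak) -> int:
    """Number of positions shared by two peaks (0 if they do not overlap)."""
    if a[0] != b[0]:
        return 0  # different chromosomes never overlap
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))

def split_peaks(set_a: List[Peak], set_b: List[Peak], min_overlap: int = 1):
    """Split set_a into peaks matching some peak of set_b and peaks matching none."""
    common, specific = [], []
    for a in set_a:
        if any(overlap_size(a, b) >= min_overlap for b in set_b):
            common.append(a)
        else:
            specific.append(a)
    return common, specific

first = [("chr1", 100, 200), ("chr1", 500, 600)]
second = [("chr1", 150, 300)]
print(split_peaks(first, second, min_overlap=50))
```

Raising `min_overlap` makes the comparison stricter: in the example, the first peak shares 50 positions with the second set, so it stops counting as common once the threshold exceeds 50.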

5 Graphical parameters of each set

Definition of graphical parameters to customize the Venn diagram between both sets of elements.

- Background color is used to fill the circles inside.
- Circle line color is used for the line around each circle.

Colors must be defined in the R software as shown in the following table.

- Transparency degree: alpha channel value related to the opacity in overlapping areas.
- Extract the number: this option is useful when there is not enough space to put a number inside an area of the diagram.
- Distance of the set name to circle: this option is useful to increase the space between the label of the set and the corresponding circle.

6 Global parameters of the Venn diagram

Definition of graphical parameters to customize the Venn diagram global appearance.

- Font size of gene names is useful to change the size of labels outside the diagram.
- Font size of numbers is useful to change the size of numbers inside the diagram.
- Line width of circles can be used to adapt the thickness of the Venn diagram circles.
- Figure size (small/normal): this option adapts the margins of the page and generates smaller plots (if necessary).
- Extract the common number permits the user to extract the value in the intersection of the diagram.
- Distance of external numbers, used together with the previous option, changes the distance between the extracted numbers and the diagram.
- Proportional sizes: this option allows the users to generate proportional sets in the final diagram according to their sizes.
- Font family (helvetica/times) defines the family of the font (sans-serif or serif).

ComputeChIPlevels

Application to determine the amount of normalized reads of a sequencing experiment within a set of genomic regions.

1 Catalog of ChIPseq available experiments

List of ChIPseq experiments that are available in our web site. Samples are classified into several main groups:

- Mouse embryonic stem cells (serum)
- Mouse embryonic stem cells (2i+LIF)
- Mouse HPC7 cells (hematopoietic precursor)
- Human K562 cells (chronic myelogenous leukemia)
- Human DU145 cells (prostate cancer)
- Drosophila wing imaginal discs (L3)

Users will choose one sample of the list to plot the trend exhibited by the subsets of genes provided at the same time. Optionally, it is possible to select a control experiment from the second list (e.g. Input or IgG samples) to display the background level in the resulting image.

2 Number of subsets of genes to be included

Number of genesets that will be employed in the final graphical representation. For each list of genes provided by the user, the application will generate the resulting profile using the same sequencing experiment and the whole collection of profiles will be gathered into the same picture.

3 Subsets of target genes

Plain text file that contains a list of elements for further characterization. Only one element is stored at each line. The name of the elements is used to index another file of information in which the features for all elements have been previously uploaded as well. The lines of the full set of values that correspond to each key here will be graphically represented afterwards.

Example of the format of a file with the elements that form a subset:
element1
element2
element3
...

4 Captions and titles

Text labels that will be used to characterize the resulting boxplot of ChIPseq levels. Spaces and special characters will be replaced by the symbol "_".

5 Graphical parameters of Boxplots

Definition of graphical parameters to customize the global appearance of the resulting boxplot of ChIPseq levels.

- Color palette is the combination of colors that is used for the boxes.
- Color style for boxes can be used to determine whether the boxes are filled on a solid style, using colors for lines or black and white.
- Log scale converts the distributions of values submitted by the user into log scale distributions.
- Violin plots or violin plots with boxplots inside can be included in the final image instead of canonical boxplots.
- Line width can be used to adapt the thickness of the boxes of the boxplot.
- Outliers of each distribution can be included into the final boxplot.
- Each individual observation can be plotted as a point (optional).
- The size of the individual observations can be customized (see above for observations).
- The value on the X axis and the Y axis for the labels can be changed depending on the final boxplot.

PeakAnnotator

Application to determine the frequency of a set of peaks in each class of gene features (exon, intron, ...).

1 List of peaks (BED format)

Tab-separated plain text file that contains one list of peaks. Each peak is described by its chromosome, start position and end position in BED format.

Example of the format of a file with peaks in BED format:
chr   pos1   pos2
chr   pos1   pos2
...

2 Genome assembly

To match ChIPseq peaks or regions of other classes to genes, users must select the appropriate catalog of RefSeq transcripts. In the current version of SeqCodeWEB, there are three available collections of genes:

- Mouse (mm9)
- Human (hg19)
- Drosophila (dm3)

Users can run the stand-alone version of SeqCode to perform the same operation on any RefSeq catalog of transcripts.

3 Name of the experiment

Text label to characterize the peaks matched to genes in the output files. Spaces and special characters will be replaced by the symbol "_".

4 Rules of association between genes and peaks

Users can define the region of the RefSeq transcripts that is useful to calculate the overlap against the ChIPseq peaks. There are three classes of definitions:

- One place upstream of the TSS until the TSS (excluding the gene body) or the TES (including the gene body)
- From the TES (excluding the gene body) or from the TSS (including the gene body) until one place downstream of the TES
- One region around the TSS (indicating the amount of positions upstream and downstream of the TSS)

In all these cases, it is necessary to set the length of the region that will be scanned when searching for overlaps between peaks and transcripts.
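
As an illustration of the third class of definition (a region around the TSS), the sketch below computes the scanned interval for one transcript. It assumes that upstream and downstream are interpreted relative to the strand of the transcript, which is not stated here; `region_around_tss` and its parameters are hypothetical names.

```python
def region_around_tss(tx_start: int, tx_end: int, strand: str,
                      upstream: int, downstream: int):
    """Return (start, end) of the region scanned around the TSS.

    Assumption: on the "+" strand the TSS is tx_start; on the "-" strand it is
    tx_end and the upstream direction points to higher coordinates.
    """
    if strand == "+":
        tss = tx_start
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        tss = tx_end
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    return max(0, start), end  # never extend before the chromosome start

print(region_around_tss(10000, 20000, "+", upstream=2500, downstream=500))
print(region_around_tss(10000, 20000, "-", upstream=2500, downstream=500))
```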

5 Graphical parameters of the pie charts

Definition of graphical parameters to customize the global appearance of the resulting piecharts of genome annotations.

- High/low detail is useful to switch between two lists of genomic features to characterize the peaks:

(Detailed)
* Distal promoters: the region between 2.5 Kb and 0.5 Kb upstream of the TSS of genes
* Proximal promoters: the region between the TSS and 0.5 Kb upstream of the TSS of genes
* 5'UTR and 3'UTR: untranslated regions upstream/downstream of transcripts (UTR exons)
* CDS: protein coding sequence part of transcripts (coding exons)
* Introns: spliced part of the genes that is not included into the transcripts
* Intergenic regions: genomic regions that do not belong to any of the previous classes

(Simple)
* Promoter region: the region between 2.5 Kb upstream of the TSS and the TSS
* Intragenic: exons and introns of genes
* Intergenic: genomic regions that are not classified as promoters or intragenic


Depending on their size, peaks can overlap with more than one class of genomic region. Thus, the total number of peaks shown in the title of the pie chart will reflect these cases (this value will be equal to or higher than the actual number of peaks provided by the user).
The corresponding spie chart, which represents how significant the results are on the particular genome, will be generated only when the High detail option is active. The circular grid can be optionally hidden.
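
The counting effect described above can be reproduced with a small sketch of the Simple classification. It is deliberately simplified: it assumes genes on the forward strand only and a single chromosome, and `classify_peak` is a hypothetical name rather than the actual implementation.

```python
from collections import Counter

PROMOTER_LENGTH = 2500  # 2.5 Kb upstream of the TSS, as in the Simple list

def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one position."""
    return a_start < b_end and b_start < a_end

def classify_peak(peak_start, peak_end, genes):
    """Return the set of Simple classes hit by one peak.

    genes: list of (tx_start, tx_end) pairs, forward strand only (assumption).
    """
    classes = set()
    for tx_start, tx_end in genes:
        if overlaps(peak_start, peak_end, max(0, tx_start - PROMOTER_LENGTH), tx_start):
            classes.add("promoter")
        if overlaps(peak_start, peak_end, tx_start, tx_end):
            classes.add("intragenic")
    return classes or {"intergenic"}

genes = [(10000, 20000)]
peaks = [(9800, 10200), (15000, 15100), (50000, 50100)]
counts = Counter()
for start, end in peaks:
    counts.update(classify_peak(start, end, genes))
print(dict(counts), "total:", sum(counts.values()))
```

The first peak spans the TSS, so it is counted both as promoter and as intragenic; three input peaks therefore produce a total of four.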

- The color to depict each class of genomic region in the pie charts can be selected by the user
Colors must be defined in the R software coloring scheme as shown in the following table.

ProduceGENEplots

Application to generate the aggregated meta-gene plot of a ChIPseq experiment for a list of genes.

1 Catalog of ChIPseq available experiments

List of ChIPseq experiments that are available in our web site. Samples are classified into several main groups:

- Mouse embryonic stem cells (serum)
- Mouse embryonic stem cells (2i+LIF)
- Mouse HPC7 cells (hematopoietic precursor)
- Human K562 cells (chronic myelogenous leukemia)
- Human DU145 cells (prostate cancer)
- Drosophila wing imaginal discs (L3)

Users will choose one sample of the list to plot the trend exhibited by the subsets of genes provided at the same time. Optionally, it is possible to select a control experiment from the second list (e.g. Input or IgG samples) to display the background level in the resulting image.

2 Number of subsets of genes to be included

Number of genesets that will be employed in the final graphical representation. For each list of genes provided by the user, the application will generate the resulting profile using the same sequencing experiment and the whole collection of profiles will be gathered into the same picture.

3 Subsets of target genes

Plain text file that contains a list of elements for further characterization. Only one element is stored at each line. The name of the elements is used to index another file of information in which the features for all elements have been previously uploaded as well. The lines of the full set of values that correspond to each key here will be graphically represented afterwards.

Example of the format of a file with the elements that form a subset:
element1
element2
element3
...

4 Graphical parameters of each list of genes

Definition of graphical parameters to customize the appearance of the multiple lists of genes into the resulting metaplot.

For each list of genes, users can define the following values:
- The color to represent the list of genes in the metaplot
- The style of the line: solid, dashed or dotted
- The width or thickness of the line: from 1 to 8
- The level of transparency when overlapping to other lines

Colors must be defined in the R software coloring scheme as shown in the following table.

5 General graphical parameters

Definition of graphical parameters to customize the global appearance of the resulting metaplot.

Users can change the values of following parameters:
- The background and foreground colors of the metaplot
- The size of the font legend
- The style of the line depicting the TSS/TES of genes
- The width of the line depicting the TSS/TES of genes
- The color of the line depicting the TSS/TES of genes
- The minimum and maximum value of the Y axis to crop the image

Colors must be defined in the R software coloring scheme as shown in the following table.

ProduceTSSmaps

Application to generate the heatmap of ChIPseq signal intensities of a selected experiment for a list of genes.

1 List of genes

Plain text file that contains a list of elements for further characterization. Only one element is stored at each line. The name of the elements is used to index another file of information in which the features for all elements have been previously uploaded as well. The lines of the full set of values that correspond to each key here will be graphically represented afterwards.

Example of the format of a file with the elements that form a subset:
element1
element2
element3
...

2 Number of ChIPseq experiments to be included

Number of ChIPseq experiments that will be employed in the final graphical representation. For each sample selected by the user, the application will generate the resulting heat map using the same set of genes provided before and the whole collection of maps will be gathered into the same picture.

3 Catalog of ChIPseq available experiments

List of ChIPseq experiments that are available in our web site. Samples are classified into several main groups:

- Mouse embryonic stem cells (serum)
- Mouse embryonic stem cells (2i+LIF)
- Mouse HPC7 cells (hematopoietic precursor)
- Human K562 cells (chronic myelogenous leukemia)
- Human DU145 cells (prostate cancer)
- Drosophila wing imaginal discs (L3)

Users can choose up to five different samples of the list to plot the heat maps corresponding to the same set of genes provided at this moment. It is possible to select a control experiment from the same list (e.g. Input or IgG samples) to display the background level in the resulting heat map.

4 Captions and titles

Text labels that will be used to characterize the resulting heatmap (title). Spaces and special characters will be replaced by the symbol "_".

5 Graphical parameters of each heatmap

Definition of graphical parameters to customize the appearance of the multiple heat maps in the resulting image.

For each ChIPseq sample, users can define the following colors:
- Foreground color to represent the presence of signal at each gene
- Background color to represent the absence of signal at each gene

Colors must be defined in the R software coloring scheme as shown in the following table.

6 General graphical parameters

Definition of graphical parameters to customize the appearance of the image containing the full set of heat maps.

Users can change the values of following parameters:
- The background and foreground colors of the multiple heatmap
- To generate a uniform heat map in which the region associated with each gene displays only presence (foreground color) or absence (background color) of signal. In other words, a binary heat map is generated, with no color gradient proportional to the strength of the ChIP signal
- To normalize all the heat maps using the same value to favor the comparison among them

Colors must be defined in the R software coloring scheme as shown in the following table.
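
The last two options can be pictured with the following sketch. It assumes that signal above a threshold of zero counts as "presence" and that, when no value is supplied, the shared normalization value is the maximum over all maps. Both are assumptions, and `binarize` and `normalize_together` are hypothetical names.

```python
def binarize(matrix, threshold=0.0):
    """Binary heat map: 1 where signal is present, 0 where it is absent."""
    return [[1 if value > threshold else 0 for value in row] for row in matrix]

def normalize_together(matrices, value=None):
    """Divide every heat map by the same value so that colors are comparable.

    Assumption: if no value is given, the maximum over all maps is used.
    """
    if value is None:
        value = max(v for matrix in matrices for row in matrix for v in row)
    if value <= 0:
        raise ValueError("normalization value must be positive")
    return [[[v / value for v in row] for row in matrix] for matrix in matrices]

sample_a = [[1.0, 2.0]]
sample_b = [[4.0, 0.0]]
print(binarize(sample_b))
print(normalize_together([sample_a, sample_b]))
```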

ProduceTSSplots

Application to generate the aggregated plot of a ChIPseq experiment around the TSS of a list of genes.

1 Catalog of ChIPseq available experiments

List of ChIPseq experiments that are available in our web site. Samples are classified into several main groups:

- Mouse embryonic stem cells (serum)
- Mouse embryonic stem cells (2i+LIF)
- Mouse HPC7 cells (hematopoietic precursor)
- Human K562 cells (chronic myelogenous leukemia)
- Human DU145 cells (prostate cancer)
- Drosophila wing imaginal discs (L3)

Users will choose one sample of the list to plot the trend exhibited by the subsets of genes provided at the same time. Optionally, it is possible to select a control experiment from the second list (e.g. Input or IgG samples) to display the background level in the resulting image.

2 Number of subsets of genes to be included

Number of genesets that will be employed in the final graphical representation. For each list of genes provided by the user, the application will generate the resulting profile using the same sequencing experiment and the whole collection of profiles will be gathered into the same picture.

3 Subsets of target genes

Plain text file that contains a list of elements for further characterization. Only one element is stored at each line. The name of the elements is used to index another file of information in which the features for all elements have been previously uploaded as well. The lines of the full set of values that correspond to each key here will be graphically represented afterwards.

Example of the format of a file with the elements that form a subset:
element1
element2
element3
...

4 Graphical parameters of each list of genes

Definition of graphical parameters to customize the appearance of the multiple lists of genes into the resulting metaplot.

For each list of genes, users can define the following values:
- The color to represent the list of genes in the metaplot
- The style of the line: solid, dashed or dotted
- The width or thickness of the line: from 1 to 8
- The level of transparency when overlapping to other lines

Colors must be defined in the R software coloring scheme as shown in the following table.

5 General graphical parameters

Definition of graphical parameters to customize the global appearance of the resulting metaplot.

Users can change the values of following parameters:
- The background and foreground colors of the metaplot
- The size of the font legend
- The style of the line depicting the TSS/TES of genes
- The width of the line depicting the TSS/TES of genes
- The color of the line depicting the TSS/TES of genes
- The minimum and maximum value of the Y axis to crop the image

Colors must be defined in the R software coloring scheme as shown in the following table.

Data Sets

BoxPlotter

Application to generate the boxplot of the distribution of multiple values and perform statistical testing for a list of genes.

1 Full list of elements and values

Tab-separated plain text file that contains the list of elements and features. Every line contains the same number of columns: the key that identifies the element is placed in column 1, while the attributes are stored from column 2 up to column N. Commas (if any) are internally replaced by decimal points. Not available values (NA) are included in the processing.

Example of the format of a file with elements characterized by two features:
element1   value1   value2
element2   value1   value2
element3   value1   value2
...
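
A minimal sketch of how a file in this layout could be loaded is given below. It assumes that NA entries are kept as NaN, which is only one possible reading of how they are processed; `parse_value` and `parse_table` are hypothetical names.

```python
import math

def parse_value(text: str) -> float:
    """Convert one field to a number, accepting decimal commas and NA."""
    text = text.strip()
    if text == "NA":
        return math.nan  # assumption: NA is kept as a missing value
    return float(text.replace(",", "."))

def parse_table(lines):
    """Map each key (column 1) to its list of values (columns 2..N)."""
    table = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if not fields[0]:
            continue  # skip blank lines
        table[fields[0]] = [parse_value(field) for field in fields[1:]]
    return table

rows = ["geneA\t1,5\t2.0\n", "geneB\tNA\t3\n"]
print(parse_table(rows))
```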

2 Subset of elements to be studied

Plain text file that contains a list of elements for further characterization. Only one element is stored at each line. The name of the elements is used to index another file of information in which the features for all elements have been previously uploaded as well. The lines of the full set of values that correspond to each key here will be graphically represented afterwards.

Example of the format of a file with the elements that form a subset:
element1
element2
element3
...
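
The indexing step described above amounts to a lookup by key. The sketch below assumes the full set of values is already available as a dictionary; `select_subset` is a hypothetical name, and how the service treats keys that are missing from the full list is not documented here.

```python
def select_subset(table, keys):
    """Return the rows of `table` listed in `keys`, plus the keys not found."""
    selected, missing = {}, []
    for key in keys:
        key = key.strip()
        if not key:
            continue  # skip blank lines
        if key in table:
            selected[key] = table[key]
        else:
            missing.append(key)
    return selected, missing

full_table = {"element1": [0.5, 1.2], "element2": [3.1, 0.0], "element3": [2.2, 4.8]}
subset_lines = ["element3\n", "element1\n", "element9\n"]
print(select_subset(full_table, subset_lines))
```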

3 Select the features, the input names and the color

Selection of the columns of the file of features uploaded before for the full set of elements. For those items included in the subset, a graphical representation will be generated from their characteristics. Users must label each column with a description that will be inserted in the picture. Spaces will be replaced by the symbol "_". One color defined in the R software must be assigned to each category as well.

Colors must be defined in the R software as shown in the following table.

4 Captions and titles

Text labels that will be used to characterize the resulting boxplot (title and Y axis). Spaces and special characters will be replaced by the symbol "_".

5 Graphical parameters of Boxplots

Definition of graphical parameters to customize the global appearance of the resulting boxplots.

- Font family (helvetica/times) defines the family of the font (sans-serif or serif).
- Font size of the title is useful to change the size of the title above the boxplot.
- Font size of the axis Y is useful to change the size of values on this axis of the boxplot.
- Font size of the labels is useful to change the size of labels of the boxplot.
- Angle of the labels can be used to rotate (0/45/90 degrees) the labels of sets in the boxplot.
- Position of the labels allows the user to set the location of the X axis labels on a defined position.
- Line width can be used to adapt the thickness of the boxes of the boxplot.
- Show outliers must be active to include the outliers of the distributions into the boxplot.
- Log scale converts the distributions of values submitted by the user into log scale distributions.

6 Graphical parameters of Histograms

Definition of graphical parameters to customize the graphical appearance of the complementary histograms.

- Font size of the title is useful to change the size of the title above the boxplot.
- Font size of the axis is useful to change the size of values on the axes of the boxplot.
- Font family (helvetica/times) defines the family of the font (sans-serif or serif).
- Line width can be used to adapt the thickness of the boxes of the boxplot.

BoxPlotter2

Application to generate the boxplot of the distribution of one value and perform statistical testing for multiple lists of genes.

1 Full dataset (element,value)

Tab-separated plain text file that contains the list of elements and features. Every line contains the same number of columns: the key that identifies the element is placed in column 1, while the attributes are stored from column 2 up to column N. Commas (if any) are internally replaced by decimal points. Not available values (NA) are included in the processing.

Example of the format of a file with elements characterized by two features:
element1   value1   value2
element2   value1   value2
element3   value1   value2
...
The user will select in the next step which column must be used to generate the boxplot.

2 Select column of data file

Selection of one column of the file of features uploaded in the first step for the full set of elements. For the subsets of genes that will be included in the third step, a graphical representation will be generated from the value of this column.

3 Subsets of target genes

Plain text file that contains a list of elements for further characterization. Only one element is stored at each line. The name of the elements is used to index another file of information in which the features for all elements have been previously uploaded as well. The lines of the full set of values that correspond to each key here will be graphically represented afterwards.

Example of the format of a file with the elements that form a subset:
element1
element2
element3
...
Up to five different subsets can be incorporated into the same boxplot.
Users will provide a name for each subset that will be used to characterize them in the resulting plot.

4 Captions and titles

Text labels that will be used to characterize the resulting boxplot (title and Y axis). Spaces and special characters will be replaced by the symbol "_".

5 Graphical parameters of Boxplots

Definition of graphical parameters to customize the global appearance of the resulting boxplots.

- Font family (helvetica/times) defines the family of the font (sans-serif or serif).
- Font size of the title is useful to change the size of the title above the boxplot.
- Font size of the axis Y is useful to change the size of values on this axis of the boxplot.
- Font size of the labels is useful to change the size of labels of the boxplot.
- Angle of the labels can be used to rotate (0/45/90 degrees) the labels of sets in the boxplot.
- Position of the labels allows the user to set the location of the X axis labels on a defined position.
- Line width can be used to adapt the thickness of the boxes of the boxplot.
- Show outliers must be active to include the outliers of the distributions into the boxplot.
- Log scale converts the distributions of values submitted by the user into log scale distributions.

One color will be assigned to each box representing a subset in the boxplot.
Colors must be defined in the R software as shown in the following table.

6 Graphical parameters of Histograms

Definition of graphical parameters to customize the graphical appearance of the complementary histograms.

- Font size of the title is useful to change the size of the title above the boxplot.
- Font size of the axis is useful to change the size of values on the axes of the boxplot.
- Font family (helvetica/times) defines the family of the font (sans-serif or serif).
- Line width can be used to adapt the thickness of the boxes of the boxplot.

BoxPlotter3

Application to generate the boxplot of the distribution of multiple values for multiple lists of genes allowing a wide range of graphical options.

1 Full list of elements and values

Tab-separated plain text file that contains the list of elements and features. Every line contains the same number of columns: the key that identifies the element is placed in column 1, while the attributes are stored from column 2 up to column N. Commas (if any) are internally replaced by decimal points. Not available values (NA) are included in the processing.

Example of the format of a file with elements characterized by two features:
element1   value1   value2
element2   value1   value2
element3   value1   value2
...

2 Subset of elements to be studied

Plain text file that contains a list of elements for further characterization. Only one element is stored at each line. The name of the elements is used to index another file of information in which the features for all elements have been previously uploaded as well. The lines of the full set of values that correspond to each key here will be graphically represented afterwards.

Example of the format of a file with the elements that form a subset:
element1
element2
element3
...

3 Select the features and input names

Selection of the columns of the file of features uploaded in the first step for the full set of elements. For those items included in the subset during the second step, a graphical representation will be generated from their characteristics. Users must label each column with a description that will be inserted in the picture. Spaces will be replaced by the symbol "_".

4 Captions and titles

Text labels that will be used to characterize the resulting boxplot (title and both axes). Spaces and special characters will be replaced by the symbol "_".

5 Graphical parameters

Definition of graphical parameters to customize the global appearance of the resulting boxplots.

- Color palette is the combination of colors that is used for the boxes.
- Color style for boxes can be used to determine whether the boxes are filled on a solid style, using colors for lines or black and white.
- Log scale converts the distributions of values submitted by the user into log scale distributions.
- Pseudocount value is added to each value of the distributions to avoid the calculation of log 0.
- Violin plots or violin plots with boxplots inside can be included in the final image instead of canonical boxplots.
- A grid in grey or black and white style can be integrated as a background of the boxplot.
- Line width can be used to adapt the thickness of the boxes of the boxplot.
- Outliers of each distribution can be included into the final boxplot.
- Each individual observation can be plotted as a point (optional).
- The size of the individual observations can be customized (see above for observations).
- The value on the X axis and the Y axis for the labels can be changed depending on the final boxplot.
- The color of lines in the line plot of each individual
- The graphical appearance of lines in the line plot of each individual
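
The interplay between the log scale and the pseudocount can be sketched as follows. The base of the logarithm and the default pseudocount are not specified here, so base 2 and a pseudocount of 1 are assumptions; `log_transform` is a hypothetical name.

```python
import math

def log_transform(values, pseudocount=1.0, base=2.0):
    """Apply log(value + pseudocount) so that zero values remain defined."""
    return [math.log(value + pseudocount, base) for value in values]

print(log_transform([0, 3, 7]))
```

Without the pseudocount, the first value would require computing log 0, which is undefined.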

HeatMapper

Application to generate the heatmap for a list of genes and values from multiple conditions.

1 Full set of records (element,value1,...,valueN)

Tab-separated plain text file that contains the list of elements and features. Every line contains the same number of columns: the key that identifies the element is placed in column 1, while the attributes are stored from column 2 up to column N. Commas (if any) are internally replaced by decimal points. Not available values (NA) are included in the processing.

Example of the format of a file with elements characterized by two features:
element1   value1   value2
element2   value1   value2
element3   value1   value2
...

2 Number of conditions/experiments (columns)

Number of values (starting from the second column) that will be graphically represented for each gene in the heat map.

3 Captions and titles

Text labels that will be used to represent each condition in the gene heat map and to characterize the resulting image (main title and legend features). Spaces and special characters will be replaced by the symbol "_".

4 Graphical parameters

Definition of graphical parameters to customize the global appearance of the resulting gene heat map.

- Color palette is the combination of colors that is used for the boxes
- Log scale converts the distributions of values submitted by the user into log scale distributions
- Defining a max value allows for a direct color normalization of the gene heat map
- It is possible to perform hierarchical clustering on the genes and/or the conditions (dendrograms are optional)
- A grid can be superimposed to the final image and the color is configurable
- It is possible to show or hide the names of genes and/or conditions (depending on the number of elements, it is recommended to hide)
- Optionally, the actual value assigned to each gene and condition can be also displayed inside each cell
- Showing the legend includes the distribution of values in the upper corner of the gene heat map
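
One way to picture the max value option is as a cap applied before values are mapped to colors. The sketch below assumes that values above the maximum are clipped to it and that negative values are clipped to zero. Neither behavior is documented here, and `scale_to_max` is a hypothetical name.

```python
def scale_to_max(matrix, max_value):
    """Clip values to [0, max_value] and rescale them to [0, 1] for coloring."""
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    return [[min(max(value, 0.0), max_value) / max_value for value in row]
            for row in matrix]

print(scale_to_max([[0, 5, 20]], max_value=10))
```

With a cap of 10, a single extreme value such as 20 no longer compresses the color range used by the remaining cells.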

PCAplotter

Application to perform the PCA analysis for a list of genes and values from multiple conditions.

1 Full list of elements and features

Tab-separated plain text file that contains the list of elements and features. Every line contains the same number of columns: the key that identifies the element is placed in column 1, while the attributes are stored from column 2 up to column N. Commas (if any) are internally replaced by decimal points. Not available values (NA) are included in the processing.

Example of the format of a file with elements characterized by two features:
element1   value1   value2
element2   value1   value2
element3   value1   value2
...

2 Select the features to associate names and groups

From the list of experiments (values) provided above, the user will select several ones (up to nine) in order to perform the PCA analysis.

Once one column (condition) is included, it is necessary to use a text label for the identification and indicate the group of this feature. The group is useful in the final PCA plot to show in the same color those features in the space belonging to the same class.

3 Captions and titles

Text label that will be used as a title in the resulting PCA plot. Spaces and special characters will be replaced by the symbol "_".

4 Graphical parameters

Definition of graphical parameters to customize the graphical appearance of the resulting PCA plot:

- One label and its color will be assigned to each group of conditions to highlight them in the bidimensional space plot.
Colors must be defined in the R software as shown in the following table.

- Log scale converts the distributions of values submitted by the user into log scale distributions.
- ncRNAs can be filtered out from the initial list of elements to help to improve the PCA analysis

Scatterplotter

Application to draw the scatterplot of points for a set of genes using two distributions of values.

1 Full set of records (element,value1,value2)

Tab-separated plain text file that contains the list of elements and features. Every line contains the same number of columns: the key that identifies the element is placed in column 1, while the attributes are stored from column 2 up to column N. Commas (if any) are internally replaced by decimal points. Not available values (NA) are included in the processing.

Example of the format of a file with elements characterized by two features:
element1   value1   value2
element2   value1   value2
element3   value1   value2
...

2 Highlight one subset of points [optional]

Tab-separated plain text file that contains a subset of the main list of elements, including the feature values that characterize them. Every line contains the same number of columns: the key that identifies the element is in column 1, while the attributes are stored in columns 2 to N. Decimal commas (if any) are internally converted into decimal points. Not available (NA) values are accepted and handled during processing.

Example about the format on a file with elements characterized using two features:
element1   value1   value2
element2   value1   value2
element3   value1   value2
...
These elements will be included in the final plot, superimposed on the points generated from the whole list of values and highlighted in a distinct color. Up to two different subsets can be integrated into the scatterplot.

3 Highlight another subset of points [optional]

Tab-separated plain text file that contains a subset of the main list of elements, including the feature values that characterize them. Every line contains the same number of columns: the key that identifies the element is in column 1, while the attributes are stored in columns 2 to N. Decimal commas (if any) are internally converted into decimal points. Not available (NA) values are accepted and handled during processing.

Example about the format on a file with elements characterized using two features:
element1   value1   value2
element2   value1   value2
element3   value1   value2
...
These elements will be included in the final plot, superimposed on the points generated from the whole list of values and highlighted in a distinct color. Up to two different subsets can be integrated into the scatterplot.

4 Captions and titles

Text labels that will be used to provide the title and the names of the two conditions being compared in the final scatterplot. Spaces and special characters will be substituted for the symbol "_".
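
As an illustration of the label substitution described above, the following Python sketch replaces every character other than letters, digits and "_" with "_". Which characters the service treats as special is not stated here, so that character class is an assumption, and the function name sanitize_label is hypothetical.

```python
import re


def sanitize_label(label):
    """Replace spaces and special characters with "_" (assumed character class)."""
    return re.sub(r"[^A-Za-z0-9_]", "_", label)


print(sanitize_label("WT vs KO"))  # WT_vs_KO
```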

5 Graphical parameters

Definition of graphical parameters to customize the global appearance of the resulting scatterplot.

- Log scale converts the distributions of values submitted by the user into log scale distributions.
- Null lines that contain 0 in both conditions can be excluded from the scatterplot.
- Color palette is the combination of colors that is used for the points and the background.
- Both optional subsets of points can be customized with a distinct color.
- Several classes of guidelines can be included (diagonal y=x, lines to denote several fold-change conditions).
- The regression line can be integrated into the picture (optional), customizing the color and the thickness.
- Show lowest density area points is useful to highlight the areas with fewer points (potential outliers).
- Binarization values are useful to adjust the balance between image smoothness and size of the scatterplot.
- It is possible to crop the image by defining particular values for X and Y.
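
Three of the options above (log scale, exclusion of null lines and fold-change guidelines) can be sketched in Python as follows. The sketch rests on assumptions that are not documented here: a base-2 logarithm, a pseudocount of 1 added before the transformation so that zeros remain defined, and the hypothetical function names prepare_points and fold_change_offset.

```python
import math


def prepare_points(records, log_scale=True, drop_null=True, pseudocount=1.0):
    """Return (name, x, y) points ready for plotting."""
    points = []
    for name, x, y in records:
        if drop_null and x == 0 and y == 0:
            continue  # null line: 0 in both conditions
        if log_scale:
            # assumed transformation: log2(value + pseudocount)
            x = math.log2(x + pseudocount)
            y = math.log2(y + pseudocount)
        points.append((name, x, y))
    return points


def fold_change_offset(fold_change):
    """Offset of a fold-change guideline from the diagonal y = x on log2 axes."""
    return math.log2(fold_change)


points = prepare_points([("a", 0, 0), ("b", 3, 7)])
print(points)  # [('b', 2.0, 3.0)]
```

On the transformed axes, the guidelines for a fold-change f are the lines y = x + log2(f) and y = x - log2(f), which are parallel to the diagonal.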

Colors must be defined in the R software as shown in the following table.

Gene Sets

AlluvialPlotter

Application to generate the alluvial diagram from a list of gene class assignments and multiple lists of annotations.

1 Membership of each gene in the other lists

Tab-separated plain text file that contains one list of genes. Each gene is described by its membership of a user-defined category.

Genes belonging to the gene sets provided by the user will be annotated and classified following both types of information.

Example about the format on a file with genes and classes:
gene1   classx
gene2   classy
...

2 Choose the number of sets

Number of input sets of elements to be represented as columns in the alluvial diagram. The final plot will contain this number of columns, in the same order.

3 Upload the files

Plain text files that contain a list of elements for further characterization. Only one element is stored on each line. The names of the elements are compared against the rest of the files in order to identify the elements in common and those that are specific to each particular combination of lists.

Example about the format of one file with the elements that form a set.
element1
element2
element3
...
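
The information behind an alluvial diagram can be sketched in Python as a count of genes per path, where a path is the class of the gene followed by its presence or absence in each uploaded set. This is a minimal sketch under that assumption, not the procedure of the service, and the function name alluvial_flows is hypothetical.

```python
from collections import Counter


def alluvial_flows(membership, gene_sets):
    """Count genes per path: (class, in set 1?, in set 2?, ...)."""
    flows = Counter()
    for gene, gene_class in membership.items():
        path = (gene_class,) + tuple(gene in gene_set for gene_set in gene_sets)
        flows[path] += 1
    return flows


membership = {"gene1": "classx", "gene2": "classy", "gene3": "classx"}
gene_sets = [{"gene1", "gene3"}, {"gene3"}]
flows = alluvial_flows(membership, gene_sets)
```

Each key of flows corresponds to one ribbon of the diagram and its count to the width of that ribbon.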

4 Names of each set of genes

Text labels that will be used to identify each set in the resulting alluvial diagram. Spaces and special characters will be substituted for the symbol "_".

5 Graphical parameters of each class

Definition of graphical parameters to customize the Alluvial diagram of elements.

Each class as defined by the user in the membership file will be assigned a color in the same order.

Colors must be defined in the R software as shown in the following table.

Compare2Genes

Application to draw the Venn diagram between two sets with proportional sizes and provide each list of common and specific elements.

1 First set of genes

Plain text file that contains a list of elements for further characterization. Only one element is stored on each line. The names of the elements are compared against the other file in order to identify the elements in common and those that are specific to each list.

Example about the format of one file with the elements that form a set.
element1
element2
element3
...

2 Second set of genes

Plain text file that contains a list of elements for further characterization. Only one element is stored on each line. The names of the elements are compared against the other file in order to identify the elements in common and those that are specific to each list.

Example about the format of one file with the elements that form a set.
element1
element2
element3
...
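
The comparison between both files can be sketched in Python with plain set operations. The sketch assumes exact, case-sensitive matching of element names; the function names read_elements and compare_two_sets are hypothetical.

```python
def read_elements(text):
    """Return the set of non-empty lines (one element per line)."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def compare_two_sets(first, second):
    """Return (common, specific to first, specific to second)."""
    return first & second, first - second, second - first


first = read_elements("gene1\ngene2\ngene3\n")
second = read_elements("gene2\ngene3\ngene4\n")
common, only_first, only_second = compare_two_sets(first, second)
print(sorted(common))  # ['gene2', 'gene3']
```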

3 Names of each set of genes

Text labels that will be used to identify each set in the resulting Venn diagram. Spaces and special characters will be substituted for the symbol "_".

4 Graphical parameters of each set

Definition of graphical parameters to customize the Venn diagram between both sets of elements.

- Background color is used to fill the circles inside.
- Circle line color is used for the line around each circle.

Colors must be defined in the R software as shown in the following table.

- Transparency degree: alpha channel value related to the opacity in overlapping areas.
- Extract the number: this option is useful when there is not enough space to put a number inside an area of the diagram.
- Distance of the set name to circle: this option is useful to increase the space between the label of the set and the corresponding circle.

5 Global parameters of the Venn diagram

Definition of graphical parameters to customize the Venn diagram global appearance.

- Font size of gene names is useful to change the size of labels outside the diagram.
- Font size of numbers is useful to change the size of numbers inside the diagram.
- Line width of circles can be used to adapt the thickness of the Venn diagram circles.
- Figure size (small/normal): this option adapts the margins of the page and generates smaller plots (if necessary).
- Extract the common number permits the user to extract the value in the intersection of the diagram.
- Distance of external numbers, together with the previous option, changes the distance of the numbers to the diagram.
- Proportional sizes: this option allows the users to generate proportional sets in the final diagram according to their sizes.
- Font family (helvetica/times) defines the family of the font (sans-serif or serif).

6 Statistical significance of the overlap

Total number of genes in the genome, which is used to compute the significance of the overlap between both sets.
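
The statistical test applied by the service is not specified here. One common choice for this kind of question is the hypergeometric test, which gives the probability of observing an overlap at least as large as the real one when both sets are drawn at random from the genome. The following Python sketch assumes that test; the function name overlap_p_value is hypothetical.

```python
from math import comb


def overlap_p_value(genome_size, size_first, size_second, overlap):
    """P(X >= overlap) for a hypergeometric X (assumed test, see the text above)."""
    total = comb(genome_size, size_second)
    upper = min(size_first, size_second)
    favourable = sum(
        comb(size_first, i) * comb(genome_size - size_first, size_second - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Toy example: genome of 10 genes, two sets of 5 genes sharing all 5 of them.
print(overlap_p_value(10, 5, 5, 5))  # 1/252, about 0.004
```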

UpSetPlotter

Application to generate the UpSet chart of multiple sets of elements.

1 Choose the number of sets

Number of input sets of elements to be compared (between 2 and 10). According to this number, the same number of files in the section below will be processed in the same order.

2 Upload the files

Plain text files that contain a list of elements for further characterization. Only one element is stored on each line. The names of the elements are compared against the rest of the files in order to identify the elements in common and those that are specific to each particular combination of lists.

Example about the format of one file with the elements that form a set.
element1
element2
element3
...
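
The identification of the elements that belong to each combination of lists can be sketched in Python as follows. The sketch assumes that every element is counted once, in the exact combination of sets that contain it, and that combinations are ranked by decreasing size; the function name upset_combinations is hypothetical.

```python
from collections import Counter


def upset_combinations(named_sets):
    """Return [(combination of set names, number of elements)], largest first."""
    membership = {}
    for name, elements in named_sets.items():
        for element in elements:
            membership.setdefault(element, []).append(name)
    counts = Counter(tuple(names) for names in membership.values())
    # rank by decreasing size, then by combination name for a stable order
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


named_sets = {"A": {"x", "y", "z", "q"}, "B": {"y", "z"}, "C": {"w"}}
ranking = upset_combinations(named_sets)
print(ranking)  # [(('A',), 2), (('A', 'B'), 2), (('C',), 1)]
```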

3 Choose the species

This option indicates the species, to be used when generating the table with the comparison codes.

4 Names of each set of genes

Text labels that will be used to identify each set in the resulting plot. Spaces and special characters will be substituted for the symbol "_".

5 Graphical parameters

Definition of graphical parameters to customize the global appearance of the resulting UpSet diagram.

- Font size of the title is useful to change the size of the title above the plot.
- Point size is useful to change the size of the points denoting presence of an element in a particular combination.
- Font size of the top and bottom axis is useful to change the size of values on both parts of the plot.
- Intersections can be ranked by the number of elements that belong to each class in the dataset. Alternatively, this can be done using a fixed ranking based on the class of combinations, irrespective of the particular example.
- Font size of the number of intersections is useful to change the size of these values in the plot.
- Highlight the perfect combination is useful to paint in a different color the option that contains elements of every class.
- Bar color is useful to change the color of the bars that denote the size of each list of elements.

Colors must be defined in the R software as shown in the following table.

VennPlotter

Application to draw the Venn diagram of multiple sets of elements.

1 Choose the number of sets

Number of input sets of elements to be compared (between 2 and 5). According to this number, the same number of files in the section below will be processed in the same order.

2 Upload the files

Plain text files that contain a list of elements for further characterization. Only one element is stored on each line. The names of the elements are compared against the rest of the files in order to identify the elements in common and those that are specific to each particular combination of lists.

Example about the format of one file with the elements that form a set.
element1
element2
element3
...

3 Names of each set of genes

Text labels that will be used to identify each set in the resulting Venn diagram. Spaces and special characters will be substituted for the symbol "_".

4 Graphical parameters of each set

Definition of graphical parameters to customize the Venn diagram among multiple sets of elements.

- Background color is used to fill the circles inside.
- Circle line color is used for the line around each circle.
Colors must be defined in the R software as shown in the following table.

5 Global parameters of the Venn diagram

Definition of graphical parameters to customize the Venn diagram global appearance.

- Font size of gene names is useful to change the size of labels outside the diagram.
- Font size of numbers is useful to change the size of numbers inside the diagram.
- Line width of circles can be used to adapt the thickness of the Venn diagram circles.
- Figure size (small/normal): this option adapts the margins of the page and generates smaller plots (if necessary).
- Transparency degree: alpha channel value related to the opacity in overlapping areas.
- Distance of external numbers together with the previous option is able to change the distance of numbers to the diagram.
- Show percentages: this option shows the percentage of the total set on each area instead of the actual overlapping number.
- Font family (helvetica/times) defines the family of the font (sans-serif or serif).

List Operations

FCAnalysis

Application to extract the elements of a list that present a fold-change increase/decrease between conditions.

1 Full list of elements and values

Tab-separated plain text file that contains the list of elements and features. Every line contains the same number of columns: the key that identifies the element is in column 1, while the attributes are stored in columns 2 to N. Decimal commas (if any) are internally converted into decimal points. Not available (NA) values are accepted and handled during processing.

Example about the format on a file with elements characterized using two features:
element1   value1   value2
element2   value1   value2
element3   value1   value2
...

2 Select the two features to be compared

Selection of two columns of the file of features uploaded in the first step for the full set of elements. The ratio between both values will be used to establish the fold-change.

3 Captions and titles

Text label that will be used to characterize the resulting list of values. Spaces and special characters will be substituted for the symbol "_".

4 Conditions about the FC

Users can apply two filters on the ratio calculated between both selected values:

- Fold-change: the ratio must be higher/lower than this proportion (up or down elements)
- Minimum value: useful to dismiss the elements under this minimum threshold before calculating the ratio
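
Both filters can be sketched in Python as follows. Several details are assumptions rather than documented behavior: the ratio is computed as second value / first value, an element is dismissed when either value is below the minimum, and elements are reported as down when the ratio is at most 1 / fold-change. The function name fold_change_filter is hypothetical.

```python
def fold_change_filter(records, fold_change=2.0, minimum=1.0):
    """Split (name, first, second) records into up and down elements."""
    if minimum <= 0:
        raise ValueError("minimum must be positive to avoid division by zero")
    up, down = [], []
    for name, first, second in records:
        if first < minimum or second < minimum:
            continue  # dismissed before calculating the ratio (assumed rule)
        ratio = second / first
        if ratio >= fold_change:
            up.append(name)
        elif ratio <= 1.0 / fold_change:
            down.append(name)
    return up, down


records = [("g1", 1.0, 4.0), ("g2", 8.0, 2.0), ("g3", 3.0, 3.0), ("g4", 0.1, 5.0)]
up, down = fold_change_filter(records)
print(up, down)  # ['g1'] ['g2']
```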

FilterValues

Application to extract the elements of a list that fit into a particular condition or filter rule.

1 Full list of elements and values

Tab-separated plain text file that contains the list of elements and features. Every line contains the same number of columns: the key that identifies the element is in column 1, while the attributes are stored in columns 2 to N. Decimal commas (if any) are internally converted into decimal points. Not available (NA) values are accepted and handled during processing.

Example about the format on a file with elements characterized using two features:
element1   value1   value2
element2   value1   value2
element3   value1   value2
...

2 Define the filters on basic conditions

To filter out those lines that do not match certain rules, users can check, on up to three fields, whether the corresponding attributes are lower than, higher than or equal to particular values.
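
The filtering step can be sketched in Python as follows. The sketch assumes that a line is kept only when it satisfies every rule (logical AND), which is not stated here; the names OPERATORS and apply_filters are hypothetical, and columns are indexed from 0, so the key in column 1 of the file is index 0.

```python
import operator

# relation name -> comparison function
OPERATORS = {"lower": operator.lt, "higher": operator.gt, "equal": operator.eq}


def apply_filters(rows, rules):
    """Keep the rows that satisfy all rules; a rule is (column, relation, value)."""
    if len(rules) > 3:
        raise ValueError("up to three rules are supported")
    return [
        row
        for row in rows
        if all(OPERATORS[relation](row[column], value) for column, relation, value in rules)
    ]


rows = [["e1", 5.0, 1.0], ["e2", 2.0, 9.0], ["e3", 7.0, 8.0]]
kept = apply_filters(rows, [(1, "higher", 3.0), (2, "lower", 5.0)])
print(kept)  # [['e1', 5.0, 1.0]]
```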

3 Captions and titles

Text label that will be used to characterize the resulting list of values. Spaces and special characters will be substituted for the symbol "_".

Join2Lists

Application to extract the elements that two lists have in common using the value of a particular attribute.

1 First list of elements

Plain text file that contains a list of elements and values for further comparison. Only one element is stored on each line (together with its set of features). One particular column chosen by the user (the index or key) is compared against the other file in order to identify the elements in common and those that are specific to each list.

Example about the format of one file with the elements that form a set.
value11 ... value1N
value21 ... value2N
value31 ... value3N
...

2 Second list of elements

Plain text file that contains a list of elements and values for further comparison. Only one element is stored on each line (together with its set of features). One particular column chosen by the user (the index or key) is compared against the other file in order to identify the elements in common and those that are specific to each list.

Example about the format of one file with the elements that form a set.
value11 ... value1N
value21 ... value2N
value31 ... value3N
...

3 Captions and titles

Text label that will be used to characterize the elements of each list in the comparison. Spaces and special characters will be substituted for the symbol "_".

4 Define the common column of each set

Users must indicate which column/field of each file will be used as the key or index for the comparison.
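
The comparison by key column can be sketched in Python as follows. This is a minimal sketch, not the code of the service: the function name join_lists is hypothetical, columns are indexed from 0, and each joined line is assumed to be the line of the first file followed by the matching line of the second file.

```python
def join_lists(first_rows, second_rows, first_key, second_key):
    """Join the rows of two lists whose key columns hold the same value."""
    index = {}
    for row in second_rows:
        index.setdefault(row[second_key], []).append(row)
    joined = []
    for row in first_rows:
        for match in index.get(row[first_key], []):
            joined.append(row + match)
    return joined


first_rows = [["g1", "10"], ["g2", "20"]]
second_rows = [["x", "g2"], ["y", "g3"]]
joined = join_lists(first_rows, second_rows, first_key=0, second_key=1)
print(joined)  # [['g2', '20', 'x', 'g2']]
```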