The methods we use are listed below so that you can always read more about them. Which methods are applied in the data analysis depends on the type of analysis you request. Each type of analysis has a strictly defined pipeline, so it is reproducible, and the method is described in the report. You would therefore be able to redo the analysis should this become necessary. We also provide a pipeline reference, so we can always resend you the method used on your data.
Currently, we focus on factorial studies. All methods in use are listed below; because we are always expanding our reach, this list is continuously updated.
We use the median absolute deviation (MAD) in our outlier estimation, as it is a more robust estimator of scale than, for example, the standard deviation.
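As an illustration of MAD-based outlier flagging, the sketch below computes a modified z-score for each value. It is not our exact pipeline: the function name mad_outliers, the cutoff of 3.5 and the example numbers are assumptions chosen for the example. The factor 1.4826 is the usual constant that makes the MAD comparable to the standard deviation for normally distributed data.

```python
import statistics

def mad_outliers(values, threshold=3.5):
    """Flag values whose MAD-based (modified) z-score exceeds `threshold`.

    The threshold of 3.5 is an illustrative assumption, not a fixed rule.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # More than half of the values are identical; no scale can be estimated.
        return [False] * len(values)
    # 1.4826 rescales the MAD to match the standard deviation under normality.
    scaled_mad = 1.4826 * mad
    return [abs(v - med) / scaled_mad > threshold for v in values]

# Hypothetical measurements with one extreme value at the end.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 25.0]
flags = mad_outliers(data)
```

Because both the centre and the scale are medians, the single extreme value hardly affects the cutoff, whereas it would inflate a standard deviation considerably.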
We use principal component analysis (PCA) to convert possibly correlated variables into a set of linearly uncorrelated variables (the principal components), which enables visualization of high-dimensional data in fewer dimensions with minimal loss of information. PCA also emphasizes strong patterns among subjects, which lets us identify potential outliers if any are present in the data.
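To make the idea concrete, the following minimal sketch finds the first principal component of a small table by power iteration on the covariance matrix. It only illustrates the principle, assuming rows are subjects and columns are variables; the function name, the iteration count and the toy data are hypothetical, and a production pipeline would normally rely on an established linear algebra library instead.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, explained_variance_ratio) of the first component."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix of the variables (p x p).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration converges to the eigenvector with the largest eigenvalue,
    # provided the start vector is not orthogonal to it.
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    total_variance = sum(cov[i][i] for i in range(p))
    return v, eigenvalue / total_variance

# Toy data: the second variable is roughly twice the first.
rows = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
direction, ratio = first_principal_component(rows)
```

In this toy case nearly all of the variance lies along a single direction, so one component is enough to describe two correlated variables.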
We use clustering methods to investigate the variance and to detect potential sub-groups or sub-studies in larger datasets. Should potential sub-groups or sub-studies appear in the data, we include statistical analysis between the clusters and the sample groups within them, bringing the analysis a step closer to personalized analysis. Ward's method, also known as the minimum variance method, uses the error sum of squares as its objective function. It is a criterion applied in hierarchical cluster analysis, and we use it to investigate how the expression data cluster according to variance. We include this clustering method because it is more likely to detect clusters of unequal diameter and is less dependent on round-shaped clusters than k-means clustering.
Although Ward's method and k-means use the same objective function, their approaches differ: Ward's method is agglomerative (bottom-up), merging clusters step by step, whereas k-means partitions the data into a fixed number of clusters and iteratively reassigns points between them. Furthermore, k-means works best when clusters are roughly spherical and of similar size. Finally, k-means clusters can change in arbitrary ways when the number of clusters is changed, whereas the clusters produced by a hierarchical method are nested.
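The bottom-up merging described above can be sketched as follows. This is a deliberately naive illustration of Ward's criterion, not the implementation used in our pipeline: at each step it merges the pair of clusters whose union increases the error sum of squares the least. The function name ward_clusters and the example points are hypothetical.

```python
def ward_clusters(points, n_clusters):
    """Merge clusters bottom-up until `n_clusters` remain.

    Returns a list of clusters, each a sorted list of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    dim = len(points[0])

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members)
                for d in range(dim)]

    def merge_cost(a, b):
        # Increase in the error sum of squares caused by merging a and b.
        ca, cb = centroid(a), centroid(b)
        sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * sq_dist

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs,
                   key=lambda p: merge_cost(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Two well-separated hypothetical groups of three samples each.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
groups = ward_clusters(points, 2)
```

Because every merge is recorded on top of the previous ones, stopping at three clusters instead of two would only split one of these groups, which is the nesting property that k-means lacks.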
Depending on the data, we use various statistical tests. In factorial studies, we use parametric tests such as Student's t-test and ANOVA if the data are normally distributed. If normality cannot be achieved, we use nonparametric tests such as the Mann–Whitney U test. Multiple-comparison correction is performed using the Benjamini-Hochberg procedure.
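The Benjamini-Hochberg correction mentioned above can be sketched in a few lines. The function name and the example p-values are made up for illustration; the logic is the standard step-up adjustment, in which each p-value is multiplied by the number of tests divided by its rank and the results are then forced to be monotone.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values never
    # increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values from four tests.
p_values = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(p_values)
significant = [q <= 0.05 for q in adjusted]
```

With these example values, three tests have raw p-values below 0.05, but only the first remains below 0.05 after adjustment.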
Here we address which pathways, for example, the significantly regulated genes/proteins can be annotated to, through enrichment analysis using the DAVID and Reactome databases. We also investigate whether genes/proteins appear in multiple pathways and suggest which of them connect the annotated pathways.
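Enrichment of this kind is commonly assessed with an over-representation test based on the hypergeometric distribution. The sketch below shows that general idea only; it is not necessarily the exact statistic that DAVID or Reactome compute, and the function name, parameter names and gene counts are hypothetical.

```python
from math import comb

def enrichment_p_value(background, pathway_size, selected, overlap):
    """Probability of seeing at least `overlap` pathway genes by chance.

    background   -- number of genes/proteins measured in total
    pathway_size -- how many of those are annotated to the pathway
    selected     -- number of significantly regulated genes/proteins
    overlap      -- how many selected genes/proteins are in the pathway
    """
    upper = min(pathway_size, selected)
    total = comb(background, selected)
    # Upper tail of the hypergeometric distribution.
    tail = sum(comb(pathway_size, k)
               * comb(background - pathway_size, selected - k)
               for k in range(overlap, upper + 1))
    return tail / total

# Hypothetical counts: 20 genes measured, 5 in the pathway,
# 4 found regulated, 3 of which belong to the pathway.
p = enrichment_p_value(background=20, pathway_size=5, selected=4, overlap=3)
```

When many pathways are tested, the resulting p-values would themselves need a multiple-comparison correction.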
The jackknife resampling method is used to make the variance analysis of sub-groups robust. It is a leave-one-out approach, similar to the bootstrap method, and thus makes the variance analysis less sensitive to individual study participants. This method can be useful when dealing with a small sample size.
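The leave-one-out idea can be sketched as below: the statistic is recomputed once per participant, each time with that participant removed, and the spread of those values gives a standard error. The function name, the default estimator and the sample values are assumptions for illustration, not our production code.

```python
import statistics

def jackknife(values, estimator=statistics.mean):
    """Return (mean of leave-one-out estimates, jackknife standard error)."""
    n = len(values)
    # Recompute the estimator n times, each time leaving one observation out.
    leave_one_out = [estimator(values[:i] + values[i + 1:]) for i in range(n)]
    center = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((x - center) ** 2 for x in leave_one_out)
    return center, variance ** 0.5

# Hypothetical measurements from five participants.
values = [2.0, 4.0, 4.0, 5.0, 7.0]
estimate, std_error = jackknife(values)

# Any other statistic can be passed in, for example the sample variance.
var_estimate, var_std_error = jackknife(values, estimator=statistics.variance)
```

For the mean, the jackknife standard error coincides with the familiar standard error of the mean, which makes a convenient sanity check before applying the method to other statistics.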