PTIFS software
The algorithm and software were developed by Mr. Zhanfeng Wang.
The program can also be downloaded from his web page: http://202.38.64.10/~zfw/PTIFS.htm
Supplementary information (simulation results)
Source: Wang Z, Chang YC, Ying Z, Zhu L, Yang Y. (2007) PTIFS: A
parsimonious threshold-independent protein feature selection method through the area under receiver
operating characteristic curve. Bioinformatics. 2007 Oct 15;23(20):2788-9
DOWNLOAD:
PTIFS program, flow chart of PTIFS algorithm, example, data set 1, data set 2
GENERAL DESCRIPTION:
The parsimonious threshold-independent protein feature selection (PTIFS) method was proposed for selecting protein (peptide) biomarkers from mass spectrometry data, but it can also be applied to other types of data, such as gene expression data. PTIFS is a parsimonious feature selection method: it selects features (proteins) in a way similar to the LARS method. The area under the receiver operating characteristic (ROC) curve (AUC) is used as the criterion for selecting features. The threshold parameter is determined by cross-validation, so the method is threshold-independent. The current version of PTIFS is designed for two-class classification problems.
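To make the AUC criterion concrete, the following minimal Python sketch computes the empirical AUC of a single feature as the fraction of (case, control) pairs in which the case value is larger, with ties counted as one half. It is an illustration only, not the PTIFS implementation: the function name empirical_auc and the sample values are made up for this example, and how PTIFS evaluates the AUC internally is not described on this page.

```python
def empirical_auc(case_values, control_values):
    """Empirical AUC of one feature (hypothetical helper, not part of PTIFS).

    Counts the (case, control) pairs where the case value exceeds the
    control value; ties contribute one half.
    """
    if not case_values or not control_values:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for x in case_values:
        for y in control_values:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Made-up intensities for one feature, for illustration only.
case = [2.1, 3.4, 1.9]
control = [1.0, 2.0, 1.5]
auc = empirical_auc(case, control)
print(round(auc, 3))
```

An AUC close to 1 (or close to 0) means the feature separates the two groups well, while a value near 0.5 means it carries little discriminating information.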
PROGRAM:
PTIFS is a release version of PTIFS.f90 and
can be executed on Windows systems. The program is designed for two-class
classification problems. It needs two input data files (casedata.txt and controldata.txt)
and one file for parameter specifications (para.txt).
INPUT:
(1) Data sets (casedata.txt and controldata.txt)
casedata.txt and controldata.txt contain the data for the case (diseased) group and the control (normal) group, respectively. Note that in both files, rows correspond to features and columns to individuals.
(2) Parameter File (para.txt)
1st row: number of features
2nd row: sample sizes for case and control groups
3rd row: training sample sizes for case and control groups
4th row: K (number of partitions of the samples in K-fold cross-validation)
5th row: step size in the gradient descent optimization algorithm
6th row: lambda
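Before running the program, it can help to check that para.txt and the data files agree with each other. The Python sketch below parses the six rows listed above and verifies that a data file has one row per feature and one column per individual. It rests on assumptions not stated on this page: values are taken to be whitespace-separated, each parameter row is taken to occupy one line, and the step size and lambda are read as floating-point numbers. The helper names parse_para and check_matrix and all sample values are hypothetical.

```python
def parse_para(text):
    """Parse para.txt-style content into a dict (hypothetical helper).

    Assumes one parameter row per line and whitespace-separated values.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if len(rows) < 6:
        raise ValueError("expected 6 rows, found %d" % len(rows))
    n_case, n_control = int(rows[1][0]), int(rows[1][1])
    train_case, train_control = int(rows[2][0]), int(rows[2][1])
    if train_case > n_case or train_control > n_control:
        raise ValueError("training sample sizes exceed group sample sizes")
    return {
        "n_features": int(rows[0][0]),
        "n_case": n_case,
        "n_control": n_control,
        "train_case": train_case,
        "train_control": train_control,
        "k_folds": int(rows[3][0]),
        "step_size": float(rows[4][0]),  # assumed to be a real number
        "lam": float(rows[5][0]),        # assumed to be a real number
    }


def check_matrix(text, n_features, n_samples):
    """Return True if the data has n_features rows and n_samples columns."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    return len(rows) == n_features and all(len(r) == n_samples for r in rows)


# Made-up contents, for illustration only.
example_para = "3\n4 6\n2 4\n2\n0.01\n0.5\n"
example_case = "0.1 0.2 0.3 0.4\n1.5 1.7 1.6 1.8\n3.0 2.9 3.1 3.2\n"

params = parse_para(example_para)
ok = check_matrix(example_case, params["n_features"], params["n_case"])
print(params["n_features"], params["k_folds"], ok)
```

To check real input, read casedata.txt, controldata.txt and para.txt from the example folder with open(...).read() and pass their contents to these functions.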
OUTPUT:
Results are stored in the file results.txt:
1st row: number of features selected
2nd row: indices of the selected features
3rd row: threshold parameter tau selected by cross-validation
4th row: AUCs for training and testing data sets
5th row: classification result for training data set
6th row: classification result for testing data set
EXAMPLE:
An example session is available here. Download and extract it; the folder contains the input data sets (casedata.txt and controldata.txt) and the parameter file (para.txt). Click PTIFS.exe to run the program. Results are stored in results.txt.
Last updated July 30, 2007