CSAHdetect package, version 2.5
================================================================================
CREDITS

csahdetect.pl is a wrapper that invokes scan4csah and/or ft_charge and returns
uniformly formatted output. When using this program, scan4csah or ft_charge,
kindly cite one of the following references, preferably the most recent or the
most relevant to your work:

* Akos Kovacs, Daniel Dudola, Laszlo Nyitray, Gabor Toth, Zoltan Nagy,
  Zoltan Gaspari: Detection of single alpha-helices in large protein sequence
  sets using hardware acceleration. Submitted.

* Daniel Dudola, Gabor Toth, Laszlo Nyitray, Zoltan Gaspari: Consensus
  prediction of charged single alpha-helices with CSAHserver. Zhou,
  Kloczkowski, Faraggi, Yang (eds): Prediction of Protein Secondary Structure.
  Methods Mol Biol Vol. 1484, Springer, 2017, pp. 25-34.

* Zoltan Gaspari, Daniel Suveges, Andras Perczel, Laszlo Nyitray, Gabor Toth:
  Charged single alpha-helices in proteomes revealed by a consensus prediction
  approach. Biochim. Biophys. Acta - Proteins and Proteomics (2012)
  1824:637-646.

* Daniel Suveges, Zoltan Gaspari, Gabor Toth, Laszlo Nyitray: Charged single
  alpha-helix: a versatile protein structural motif. Proteins (2009)
  74:905-916.

================================================================================
INSTALLATION

The only specific requirement is that the Math::FFT module (FFT.pm) is
available, as ft_charge needs it to run. Math::FFT is obtainable from CPAN
(http://www.cpan.org). Alternatively, the install script can invoke
'sudo cpan Math::FFT' for a straightforward install, if you choose so.

Installation steps:
- Unpack CSAHdetect.zip in a suitable directory
- Type 'perl INSTALL.PL' from that directory.
- The install script will
    - check the availability of Math::FFT and offer to install it using cpan
      (you can skip this, but you will then have to install Math::FFT manually
      later to use the FT_CHARGE method)
    - ask you where to put the scan4csah.pl and ft_charge.pl executables and
      the EVD parameter files
    - ask you where to put the csahdetect.pl executable
    - put the executables into their respective locations and apply the given
      settings.

If everything went OK, you can now invoke csahdetect.pl --help for usage.

The notes below are for users who might want to use the scan4csah.pl and
ft_charge.pl programs separately.

It is advised that both scan4csah.pl and ft_charge.pl are copied to a location
from which they can be easily executed (e.g. /usr/local/bin). It is also
advised that their respective EVD parameter files are moved to a standard
location and that the default location is changed by editing the default values
of the corresponding variables in the scripts:

[scan4csah.pl, line 206]:
#my = "/home/szpari/csahserver/scan4csah_evdtable.txt";
(remove the "#" to make the change active; by default, scan4csah.pl is able to
run without an external parameter file)

[ft_charge.pl, line 46]:
$opt_e="/home/szpari/csahserver/ftcharge_evdtable.txt";

================================================================================
USAGE

Running csahdetect.pl

Invoking csahdetect.pl without any options will give you an overview of options
and input/output files. By default, you can just type

csahdetect.pl --infasta=

The algorithms used are selected with the '--mode' option:
  C[onsensus] (default): invoke both programs
  S[can4csah]          : invoke scan4csah
  F[t_charge]          : invoke ft_charge
(only the first character is meaningful).

As the FT_CHARGE method is much slower than SCAN4CSAH, in consensus mode
SCAN4CSAH is invoked first, and then a FASTA file is generated containing only
those sequences in which SCAN4CSAH predicted CSAH segments.
FT_CHARGE is then invoked on this reduced sequence set to save time. (If you
are not happy with this, you can get a full FT_CHARGE-based prediction by
invoking 'csahdetect.pl --mode=F' or running ft_charge.pl separately.)

By default, the programs are invoked without any special options, meaning that
they use their default settings, including their default EVD files. If you use
non-standard locations or want to change any of their options to non-default
values, you can either specify these on the command line or change the default
values of the relevant variables in csahdetect.pl.

The user might also choose to use precomputed SCAN4CSAH and/or FT_CHARGE
outputs. In this case the output files specified with --s4coutfile and
--ftcoutfile will be read in, and SCAN4CSAH and/or FT_CHARGE will not be
invoked. This is useful e.g. to extract a consensus from runs of the two
algorithms on different sequence sets.

The minimum length of consensus CSAHs is set to 30 (the default minimum length
in scan4csah, as it is shorter than the default window size of 32-64 in
ft_charge). You can change it using the --minconslen option.

The program checks the helicity of the candidate segments at two steps:
- before reporting the final segment, a check is performed using the value set
  by --helicalP.
- during processing of the FT_CHARGE output, all individual segments are also
  checked, using a probability calculated as the product of the values set with
  --helicalP and --ftcmhP. By default this is half of the value set with
  --helicalP (i.e. --ftcmhP = 0.5).
This step helps to filter out hits that supposedly do not form alpha-helical
structures (e.g. contain proline residues).

Examples:
- using command-line options:
  > csahdetect.pl --scan4csah='/home/pompom/bin/scan4csah.pl --minlen=40' --ft_charge='/programs/ft_charge.pl -p 0.1' --infasta=...
- rewriting defaults in csahdetect.pl:
  [...]
  $SCAN4CSAH="/home/pompom/scan4csah.pl";
  $FTCHARGE="/programs/ft_charge.pl";
  [...]
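
As an illustration of how the two helicity settings described above combine,
the short Python sketch below computes the value used for individual FT_CHARGE
segments as the product of --helicalP and --ftcmhP. The function name and the
--helicalP value of 0.4 are hypothetical, chosen only for this example. The
sketch does not show how csahdetect.pl compares a segment against the
resulting value.

```python
def ftcharge_segment_threshold(helical_p: float, ftcmh_p: float = 0.5) -> float:
    """Return the value used when checking individual FT_CHARGE segments.

    It is the product of the --helicalP and --ftcmhP settings. With the
    default --ftcmhP of 0.5 this is half of the --helicalP value.
    """
    return helical_p * ftcmh_p


# Hypothetical --helicalP value, for illustration only.
helical_p = 0.4

# Default --ftcmhP = 0.5: the per-segment value is half of --helicalP.
print(ftcharge_segment_threshold(helical_p))  # prints 0.2
```

Setting --ftcmhP to 1 would make both checks use the same value. Smaller
settings scale down the value used for the individual FT_CHARGE segments
relative to the one used for the final segment.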

If the --maskedfasta option is given, a masked FASTA file (with residues in
CSAH regions masked as 'x') will also be written.

For more information on the usage of scan4csah.pl and ft_charge.pl, kindly
refer to their own descriptions (invoke 'scan4csah.pl --help' or
'ft_charge.pl -h').

================================================================================
DISCLAIMER AND RIGHTS

This program, as well as scan4csah.pl and ft_charge.pl, is provided on an
"as is" basis in the hope that it will prove useful. The authors are not
responsible for any damage or malfunction caused by the usage of these
programs.

All programs in this csahdetect package are distributed freely, and the user is
free to make any modifications provided that (s)he 1) acknowledges the use of
the programs by citing the references above and 2) distributes the modified
versions as free software.
================================================================================