Software and Tutorials
Six tutorials in PyTorch Machine Learning and R
Overview [Readme]
A simple artificial neural network
[Pytorch_1.pptx] [Pytorch_script][train_example.csv]
Model performance
[Pytorch_2.pptx] [Pytorch_script2][determine_regress_pytorch.cpp][predict_y_v3.cpp][train_example.csv][test_example.csv]
Bells and whistles
[Pytorch_3.pptx] [Pytorch_script3]
Moving from CPU to GPU
[Pytorch_4.pptx] [Pytorch_script4]
A GPU classifier for 7 outputs
[Pytorch_script5][class_train.csv][class_test.csv]
Assessing model performance
[Readme][R_AUC_ROC_script][actual.txt][predicted.txt][plot.pdf]
Convolutional Neural Networks (CNNs) in PyTorch and C++ using the MNIST dataset
Overview [Readme]
A simple convolutional neural network
[Pytorch_MNIST_script]
C++ convolutional neural network based on the weights and biases of a trained PyTorch model
[C++ program]
[weights_0.txt]
[biases_0.txt]
[weights_0.txt]
[biases_1.txt]
[biases_1.txt]
[image_look.cpp]
[add_shuffle.cpp]
[example_input.txt]
[example_input.png]
Advanced CNNs for detection of disease in Sugarcane leaves
Overview
[Readme]
[Pytorch_Sugarcane_model_1_script][Results_pdf]
[TensorFlow_EfficientNet_model_script][Results_pdf]
[Pytorch_Sugarcane_model2_script][Results_pdf]
Advanced CNNs for classifying dog emotions: Sound Waves
Extracting audio files for CNN input
[Readme]
[Data: dog_sounds_spectogram]
[Data: dog_sounds mfcc]
[Results_pdf] [Pytorch_script]
[C++ program to crop sound images]
[Pytorch_Dog_spectrogram_CNN_script][Results_pdf]
[Pytorch_Dog_mfcc_CNN_script][Results_pdf]
Advanced CNNs for classifying dog emotions: Images
Extracting image files for CNN input
[Readme]
[Data: dog_images]
[Results_pdf] [Pytorch_script]
Chaos Game Representation (CGR)
Overview [Readme][Proof][Benchmarking paper][Application paper]
Convert a gene sequence into x- and y-coordinates
[Readme][Source code][test fasta file][output file you should get by running software]
Count sequences within CGR boxes using x- and y-coordinates
[Readme][Source code][test fasta file][output file you should get by running software]
Count the number of targets (e.g., AAACCA) in a specific gene sequence using the CGR coordinates
[Readme][Source code][input file][output file you should get by running software]
Neuroet: a user-friendly Machine Learning tool for scientists
Overview
[Readme][Benchmarking paper][Application paper]
Download the Neuroet app and follow the instructions
[Neuroet] [Instructions]
C++ programs:
[Sensitivity analysis]
[Predict Y's from model]
MS Excel files:
[Extract model]
[Analyze sensitivity results]
Example input files:
[train_x.txt]
[train_y.txt]
[test_x.txt]
[combined_x.txt]
[combined_y.txt]
Natural Language Processing (NLP) to identify smokers from non-smokers in unstructured electronic health records (EHRs)
Determine the number of unique target words and their counts in a document. [Source code][Input File][Output file]
Extract words to the left and right of the target word. [Source code][Input File #1][Input File #2][Output file][Excel file]
Vectorize the words. [Source code][Input File #1][Input File #2][Output file #1][Output file #2]
Make training and testing datafiles and run Neuroet. [Results from Neuroet using 100 vectors with a hidden layer of 3 neurons][Excel file of validation and confusion matrix results]
Determine the Response Curve of the model in R. [R script][Actual results][Predicted results][AUC plot]
ANOVA and SNK test analysis
Overview [Readme]
C++ source code and input/output files:
[Source code][input file][output file]
Analytics to predict future ICD codes based on Bayesian probabilities and network analyses
Build Bayesian Probability Dataset. [C++ source file][Non-redundant patient report file][Non-redundant codes file]
Generate Network edges and nodes based on Bayesian Probabilities of ICDs. [C++ source file][R file to make pretty network diagrams]
Determine the coefficients of an equation using matrix algebra
Overview [Readme][Benchmarking paper][Application paper 1][Application paper 2]
How to compile the source code and example test and output files
[Readme][Source code][matrix_h][Input file #1][Input file #2][output file you should get by running software]
Calibrate a DNA microarray and test the calibration with real data
Fast-track Overview
[short version][long version][Benchmarking paper][Application paper 1][Application paper 2]
Average probe signal intensities
[Readme][Source code][input_1 file][input_2 file][input_3 file][input_4 file][input_5 file][input_6 file][output file you should get by running software]
Calibrate probe signal intensities
[Readme][Source code][input_1 file][output file you should get by running software]
Calculate concentrations from intensities and calibrations
[Readme][Source code][input_1 file][expt_test file][output file you should get by running software]
Create a BLAST database and query with fasta files
How to make an NCBI BLAST database
[Readme][test file]
How to query an NCBI database with a fasta file
[Readme][test file][output file you should get by running software][output file with header]
Purge BLAST output by Percent Similarity and Minimum Alignment Length
[Readme][Source code][test file][output file you should get by running software]
Manipulate fasta files
Convert fasta files to one-line files and remove verbose text
[Readme][Source code][test fasta file][output file you should get by running software]
Insert sequence length and convert one-line sequence files to fasta format
[Readme][Source code][test file][output file you should get by running software]
Determine the GC content of a DNA sequencing run
[Readme][Source code][test file][output file you should get by running software]
Determine the GC content of each sequence in a DNA sequencing run
[Readme][Source code][test file][output file you should get by running software]
Make a GC histogram of a sequencing run
[Readme][Source code][test file][output file you should get by running software]
Calculate hexanucleotide frequencies and determine co-occurring sequences in a metagenomic sample
Determine hexanucleotide frequencies for a metagenomic sample
[Readme][Source code][test file][hexamer file][output file you should get by running software]
Determine the co-occurrence of sequences in a 454 sample
[Readme][Source code][test file][output file you should get by running software]