Database
ModMashup.Database
— Type.
Stores general and in-depth information for network integration and label propagation.
It holds the file names of all similarity networks, the patient name-to-id dictionary, the queries, the patient labels, and the run settings passed as keywords: whether to smooth the networks, whether the query annotation is 1 or -1, and whether the database is used for network selection or patient ranking. For ranking, provide a top_net file listing the selected networks.
Internally, Database uses a dictionary to map each patient's name to an internal id.
Fields are accessed directly. For example, the label information is available as database.labels.
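The name-to-id mapping described above can be illustrated with a minimal sketch. The helper `build_index` and the sample patient names are hypothetical and are not part of ModMashup; the sketch only assumes that ids are assigned in the order the names appear, which the source does not confirm.

```julia
# Hypothetical helper (not part of ModMashup): build a name-to-id table and
# its inverse from a list of patient names, e.g. the lines of an id file.
function build_index(patient_names::Vector{String})
    patients_index = Dict{String,Int}()   # patient name => internal id
    inverse_index = Dict{Int,String}()    # internal id => patient name
    for (i, name) in enumerate(patient_names)
        patients_index[name] = i
        inverse_index[i] = name
    end
    return patients_index, inverse_index
end

# Made-up patient names, used only for illustration.
patient_names = ["P1", "P2", "P3"]
patients_index, inverse_index = build_index(patient_names)
```

Keeping both dictionaries makes lookups cheap in either direction, which matches the patients_index and inverse_index fields listed below.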
Fields
string_nets::Vector{String}
: A vector of similarity network file names.
labels::OneHotAnnotation
: Disease annotation for patients.
n_patients::Int
: The number of patients in the database.
patients_index::Dict{String,Int}
: Maps each patient name to its id.
inverse_index::Dict{Int,String}
: Maps each patient id to its name.
num_cv::Int
: The number of cross-validation rounds. Default is 10.
query_attr::Int
: The annotation value used for query patients. Default is 1.
string_querys::Vector{String}
: A list of query file names.
smooth::Int
: Whether to smooth the similarity networks. Default is true.
int_type::Symbol
: Indicates whether the database is used for network selection or patient ranking. It can be :ranking or :selection. Default is :selection.
thread::Int
: The number of threads used to run the program. Default is 1.
Keywords
num_cv::Int
: The number of cross-validation rounds. Default is 10.
query_attr::Int
: The annotation value used for query patients. Default is 1.
string_querys::Vector{String}
: A list of query file names.
smooth::Int
: Whether to smooth the similarity networks. Default is true.
int_type::Symbol
: Indicates whether the database is used for network selection or patient ranking. It can be :ranking or :selection. Default is :selection.
thread::Int
: The number of threads used to run the program. Default is 1.
top_net::String
: A text file containing the names of the selected top-ranked networks.
Constructor
Database(network_dir, id, query_dir;kwarg...)
Creates a new Database. See the example data in the test/data folder.
Example
# enter example data directory
cd(joinpath(Pkg.dir("ModMashup"), "test/data"))
# network_dir should be a directory containing the similarity network flat files.
network_dir = "networks"
# labels should be a flat file containing the labels for the patients.
labels = "target.txt"
# Directory containing the query flat files, which use the
# same format and naming convention as GeneMANIA queries.
# If the database is used for ranking instead of selection,
# pass a single query file instead of a directory.
# Query file names should contain the keyword `query`.
query_dir = "."
# For patient ranking, only one query file is needed,
# so provide the name of that query file instead of the directory.
query = "CV_1.query"
# The id file contains the names of all patients.
id = "ids.txt"
# Other settings
## Whether to smooth the networks for mashup integration.
smooth = true
## Txt file containing the name of selected networks
## for patients ranking
top_net = "top_networks.txt"
# Construct the database, which holds the preliminary files.
# Mode 1: Construct the database for network selection.
database = ModMashup.Database(network_dir, id,
                              query_dir, labels_file = labels,
                              smooth = smooth,
                              int_type = :selection)
# Mode 2: Construct the database for patient ranking.
database = ModMashup.Database(network_dir, id,
                              query, smooth = smooth,
                              top_net = top_net,
                              int_type = :ranking)