Database

Stores the general and detailed information needed for network integration and label propagation.

It holds the file names of all similarity networks, the patient name-to-id dictionary, the query files, the patient labels, and the options set through keywords: whether to smooth, whether the query is annotated with 1 or -1, whether the database is for network selection or patient ranking, and, for ranking, the top_net file listing the selected networks.

Internally, Database uses a dictionary to map patient names to their internal ids.

Each piece of information is stored as a field. For example, the label information is available as database.labels.

Fields

string_nets::Vector{String}: File names of the similarity networks.

labels::OneHotAnnotation: Disease annotation for patients.

n_patients::Int: The number of patients in the database.

patients_index::Dict{String,Int}: Maps each patient name to its id.

inverse_index::Dict{Int,String}: Maps each patient id to its name.

num_cv::Int: The number of cross-validation rounds. Default is 10.

query_attr::Int: The annotation value that marks the query (1 or -1). Default is 1.

string_querys::Vector{String}: File names of the queries.

smooth::Int: Whether to smooth the similarity networks. Default is true.

int_type::Symbol: Whether the database is for network selection or patient ranking. Either :ranking or :selection. Default is :selection.

thread::Int: The number of threads used to run the program. Default is 1.
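
The patients_index and inverse_index fields are two views of the same mapping. The following is a minimal sketch of how such a pair can be built, not ModMashup's own code. The patient names are made up, and assigning ids 1..n in listing order is an assumption.

```julia
# Hypothetical patient names; in practice these would be read from the id file.
patient_names = ["P1", "P2", "P3"]

# Assumption: ids are assigned 1..n in the order the names are listed.
patients_index = Dict(name => i for (i, name) in enumerate(patient_names))

# Invert the mapping so an internal id can be translated back to a name.
inverse_index = Dict(i => name for (name, i) in patients_index)

n_patients = length(patients_index)
```

Keeping both dictionaries makes lookups cheap in either direction, at the cost of storing every name twice.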

Keywords

num_cv::Int: The number of cross-validation rounds. Default is 10.

query_attr::Int: The annotation value that marks the query (1 or -1). Default is 1.

string_querys::Vector{String}: File names of the queries.

smooth::Int: Whether to smooth the similarity networks. Default is true.

int_type::Symbol: Whether the database is for network selection or patient ranking. Either :ranking or :selection. Default is :selection.

thread::Int: The number of threads used to run the program. Default is 1.

top_net::String: A text file containing the names of the selected top-ranked networks.
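
The int_type and top_net keywords interact: the description above says to provide top_net when the database is used for ranking. The sketch below shows one way to check that combination before constructing a Database. The helper check_mode is hypothetical and not part of ModMashup, and treating top_net as mandatory in ranking mode is an assumption based on that description.

```julia
# Hypothetical helper, not part of ModMashup: validates the mode keywords.
function check_mode(int_type::Symbol, top_net::AbstractString = "")
    # Only the two documented modes are accepted.
    int_type in (:selection, :ranking) ||
        throw(ArgumentError("int_type must be :selection or :ranking"))
    # Assumption: ranking mode always needs the list of selected networks.
    if int_type == :ranking && isempty(top_net)
        throw(ArgumentError("top_net is required when int_type is :ranking"))
    end
    return int_type
end

check_mode(:selection)
check_mode(:ranking, "top_networks.txt")
```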

Constructor

Database(network_dir, id, query_dir; kwargs...)

Create a new Database. See the example data in the test/data folder.

Example

# Load the package so that ModMashup.Database is available.
import ModMashup

# Enter the example data directory.
cd(joinpath(Pkg.dir("ModMashup"), "test/data"))

# network_dir should be a directory containing the similarity network flat files.
network_dir = "networks"

# labels should be a flat file containing the patient labels.
labels = "target.txt"

# Directory containing the query flat files, which use the same
# format and naming convention as GeneMANIA query files.
# If the database is used for ranking instead of selection,
# pass a single query file instead of a directory.
# Query file names should contain the keyword `query`.
query_dir = "."
# For patient ranking, only one query file is needed, so provide
# the name of that query file instead of the directory.
query = "CV_1.query"

# The id file contains the names of all patients.
id = "ids.txt"

# Other settings
## Whether to smooth the networks for mashup integration.
smooth = true
## Text file containing the names of the selected networks
## for patient ranking.
top_net = "top_networks.txt"

# Construct the database, which holds the preliminary files.
# Mode 1: Construct the database for network selection.
database = ModMashup.Database(network_dir, id,
                            query_dir, labels_file = labels,
                            smooth = smooth,
                            int_type = :selection)

# Mode 2: Construct the database for patients ranking.

database = ModMashup.Database(network_dir, id, 
                            query, smooth = smooth,
                            top_net = top_net,
                            int_type = :ranking)
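
Because the constructor takes paths, it can help to confirm that they exist before building the Database. This is a minimal sketch using only Base functions. The helper missing_inputs and its is_ranking keyword are hypothetical and not part of ModMashup. It assumes, as in the example above, that the third argument is a directory for selection and a single file for ranking.

```julia
# Hypothetical pre-flight check, not part of ModMashup.
# Returns a list of human-readable problems; an empty list means all paths exist.
function missing_inputs(network_dir, id, query; is_ranking = false)
    problems = String[]
    isdir(network_dir) || push!(problems, "network directory not found: $network_dir")
    isfile(id) || push!(problems, "id file not found: $id")
    if is_ranking
        # Ranking mode: query is expected to be one query file.
        isfile(query) || push!(problems, "query file not found: $query")
    else
        # Selection mode: query is expected to be a directory of query files.
        isdir(query) || push!(problems, "query directory not found: $query")
    end
    return problems
end
```

Calling missing_inputs("networks", "ids.txt", ".") from the example data directory would return an empty vector when the layout matches the paths used above.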