## A multiple kernel learning algorithm for drug-target interaction prediction

Nascimento *et al.* *BMC Bioinformatics* (2016) 17:46 DOI 10.1186/s12859-016-0890-3
**Open Access**

André C. A. Nascimento1,2,3*, Ricardo B. C. Prudêncio1 and Ivan G. Costa1,3,4

**Background:** Drug-target networks have received a lot of attention in recent years, given their relevance for pharmaceutical innovation and drug lead discovery. Different in silico approaches have been proposed for the identification of new drug-target interactions, many of which are based on kernel methods. Despite technical advances in recent years, these methods are not able to cope with large drug-target interaction spaces or to integrate multiple sources of biological information.

**Results:** We propose KronRLS-MKL, which models the drug-target interaction problem as a link prediction task on bipartite networks. This method allows the integration of multiple heterogeneous information sources for the identification of new interactions, and can also work with networks of arbitrary size. Moreover, it automatically selects the more relevant kernels by returning weights indicating their importance in the drug-target prediction at hand. Empirical analysis on four data sets using twenty distinct kernels indicates that our method has predictive performance higher than or comparable to that of 18 competing methods in all prediction tasks. Moreover, the predicted weights reflect the predictive quality of each kernel in exhaustive pairwise experiments, which indicates that the method succeeds in automatically revealing relevant biological sources.

**Conclusions:** Our analysis shows that the proposed data integration strategy is able to improve the quality of the predicted interactions, and can speed up the identification of new drug-target interactions as well as identify relevant information for the task.

**Availability:** The source code and data sets are available at

**Keywords:** Artificial intelligence, Supervised machine learning, Kernel methods, Multiple kernel learning, Drug discovery

\*Correspondence:
1 Center of Informatics, UFPE, Recife, Brazil
Department of Statistics and Informatics, UFRPE, Recife, Brazil
Full list of author information is available at the end of the article

© 2016 Nascimento et al. **Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

Drug-target networks have received a lot of attention in recent years, given their relevance for pharmaceutical innovation and drug repositioning. Although the number of known interactions between drugs and target proteins has been increasing, the number of targets for approved drugs is still only a small proportion (< 10 %) of the human proteome. Recent advances in high-throughput methods provide ways to produce large data sets about molecular entities such as drugs and proteins. There is also an increase in the availability of reliable databases integrating information about interactions between these entities. Nevertheless, as the experimental verification of such interactions does not scale with the demand for innovation, the use of computational methods for large-scale prediction is mandatory. There is also a clear need for systems-based approaches to integrate these data for drug discovery and repositioning.

Recently, an increasing number of methods have been proposed for drug-target interaction (DTI) prediction. They can be categorized as ligand-based, docking-based, or network-based methods. The docking approach, which can provide accurate estimates of DTIs, is computationally demanding and requires a 3D model of the target protein. Ligand-based methods, such as the quantitative structure-activity relationship (QSAR), are based
on a comparison of a candidate ligand to the known ligands of a biological target. However, the utility of these ligand-based methods is limited when there are few ligands for a given target. Alternatively, network-based approaches use computational methods and known DTIs to predict new interactions. Even though ligand-based and docking-based methods are more precise than network-based approaches, the latter are better suited to estimating new interactions from complete proteomes and drug catalogs. Therefore, they can indicate novel candidates to be evaluated by more accurate methods.

Most network approaches are based on bipartite graphs, in which the nodes are drugs (small molecules) and biological targets (proteins). Edges between drugs and targets indicate a known DTI (Fig. 1). Given a known interaction network, kernel-based methods can be used to predict unknown drug-target interactions. A *kernel* can be seen as a similarity matrix estimated on all pairs of instances. The main assumption behind network kernel methods is that similar ligands tend to bind to similar targets and vice versa.

**Fig. 1** Overview of the proposed method. **a** The drug-target network is a bipartite graph with drugs (left) and proteins (right). Edges between drugs and proteins (solid lines) indicate known drug-protein interactions. The drug-protein interaction problem is defined as finding unknown edges (dashed lines) under the assumption that similar drugs (or proteins) should share the same edges. **b** KronRLS-MKL uses several drug (and protein) kernels to solve the drug-target interaction problem. Distinct kernels are obtained by measuring similarities of drugs (or proteins) using distinct information sources. **c** KronRLS-MKL provides not only novel predicted interactions but also the relevance (weights) of each kernel used in the predictions

These approaches use *base kernels* to measure the similarity between drugs (or targets) using distinct sources of information (e.g., structural, pharmacophore, sequence and function similarity). A *pairwise kernel* function, which measures the similarity between drug-target pairs, is obtained by combining a drug and a protein base kernel via kernel product.

The majority of previous network approaches use classification methods, such as Support Vector Machines (SVM), to perform predictions over the drug-target interaction space. However, such techniques have major limitations. First, they can only incorporate one pair of base kernels at a time (one for drugs and one for proteins) to perform predictions. Second, the computation of the pairwise kernel matrix for the whole interaction space (all possible drug-target pairs) is computationally unfeasible even for a moderate number of drugs and targets. Moreover, most drug-target interaction databases provide no true negative interaction examples. The common solution for these issues is to randomly sample a small proportion of unknown interactions to be used as negative examples. While this approach provides a computationally tractable
small drug-target pairwise kernel, it generates an easier but unrealistic classification task with balanced class sizes.

An emerging machine learning (ML) discipline, called Multiple Kernel Learning (MKL), focuses on the search for an optimal combination of kernels. MKL-like methods have been previously proposed for the problem of DTI prediction and for the closely related protein-protein interaction (PPI) prediction problem. This is extremely relevant, as it allows the use of distinct sources of biological information to define similarities between molecular entities. However, since traditional MKL methods are SVM-based, they are subject to memory limitations imposed by the pairwise kernel and are not able to perform predictions in the complete drug vs. protein space. Moreover, MKL approaches used in the PPI prediction problem and in protein function prediction cannot be applied to bipartite graphs, as in the problem at hand. Currently, we are only aware of two recent works proposing an MKL approach to integrate similarity measures for drugs and targets.

Drug-target prediction fits a link prediction problem, which can be solved by a Kronecker regularized least squares approach (KronRLS). A single-kernel version of this method has recently been applied to the drug-target prediction problem. A recent survey indicated that KronRLS outperforms SVM-based methods in DTI prediction. KronRLS uses algebraic properties of the Kronecker product to perform predictions on the whole drug-target space without the explicit calculation of the pairwise kernels. Therefore, it can cope with problems on large drug vs. protein spaces. However, KronRLS cannot be used in an MKL context.

In this work, we propose a new MKL algorithm to automatically select and combine kernels on a bipartite drug-protein prediction problem, the KronRLS-MKL algorithm (Fig. 1). For this, we extend the KronRLS method to an MKL scenario. Our method uses $L_2$ regularization to produce a non-sparse combination of base kernels. The proposed method can cope with large drug vs. target interaction matrices, does not require sub-sampling of the drug-target network, and is also able to combine and select relevant kernels. We perform an empirical analysis using previously described drug-target datasets and a diverse set of drug kernels (10) and protein kernels (10).

In our experiments, we considered three different scenarios in DTI prediction: pair prediction, where every drug and target in the training set has at least one known interaction; and the 'new drug' and 'new target' settings, where some drugs and targets, respectively, are present only in the test set. A comparative analysis with top-performing single-kernel approaches and all competing integrative approaches demonstrates that our method is better or competitive in the majority of evaluated scenarios. Moreover, KronRLS-MKL was able to select kernels and indicate their relevance in the form of weights.

In this work, we propose an extension of the KronRLS algorithm under recent developments of the MKL framework to address the problem of link prediction on bipartite networks with multiple kernels. Before introducing our method, we describe the RLS and the KronRLS algorithms.

**RLS and KronRLS**

Given a set of drugs $D = \{d_1, \ldots, d_{n_d}\}$, targets $T = \{t_1, \ldots, t_{n_t}\}$, and the set of training inputs $x_i$ (drug-target pairs) with binary labels $y_i \in \mathbb{R}$ (where 1 stands for a known interaction and 0 otherwise), with $1 \le i \le n$ and $n = |D||T|$ (the number of drug-target pairs), the RLS approach minimizes the following function:

$$J(f) = \frac{1}{2}\sum_{i=1}^{n}\left(y_i - f(x_i)\right)^2 + \frac{\lambda}{2}\|f\|_K^2 \qquad (1)$$

where $\|f\|_K$ is the norm of the prediction function $f$ on the Hilbert space associated with the kernel $K$, and $\lambda > 0$ is a regularization parameter which determines the compromise between the prediction error and the complexity of the model. According to the representer theorem, a minimizer of the above objective function admits a dual representation of the following form:

$$f(x) = \sum_{i=1}^{n} a_i K(x, x_i), \qquad (2)$$

where $K : |D||T| \times |D||T| \to \mathbb{R}$ is named the pairwise kernel function and $\mathbf{a}$ is the vector of dual variables corresponding to each separation constraint. The RLS algorithm obtains the minimizer of Eq. (1) by solving a system of linear equations defined by $(K + \lambda I)\mathbf{a} = \mathbf{y}$, where $\mathbf{a}$ and $\mathbf{y}$ are both $n$-dimensional vectors consisting of the parameters $a_i$ and labels $y_i$.

One can construct such a pairwise kernel as the product of two base kernels, namely $K((d, t), (d', t')) = K_D(d, d')\,K_T(t, t')$, where $K_D$ and $K_T$ are the base kernels for drugs and targets, respectively. This is equivalent to the Kronecker product of the two base kernels, $K = K_D \otimes K_T$. The size of the kernel matrix makes the model training computationally unfeasible even for a moderate number of drugs and targets.
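
As a minimal sketch of plain RLS with a product pairwise kernel, the following Python snippet builds $K = K_D \otimes K_T$ explicitly on a toy problem and solves $(K + \lambda I)\mathbf{a} = \mathbf{y}$. The Gaussian base kernel, the random data, the sizes and the helper name `rbf_kernel` are illustrative assumptions, not part of the original study; the example only works because the toy problem is tiny, which is precisely the limitation described above.

```python
import numpy as np

def rbf_kernel(features, gamma=1.0):
    """Gaussian kernel matrix; a stand-in for any positive semidefinite base kernel."""
    sq = np.sum(features ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

rng = np.random.default_rng(0)
n_d, n_t, lam = 4, 3, 0.5                      # toy sizes and regularization (assumed)
K_D = rbf_kernel(rng.normal(size=(n_d, 2)))    # drug base kernel
K_T = rbf_kernel(rng.normal(size=(n_t, 2)))    # target base kernel
Y = rng.integers(0, 2, size=(n_d, n_t)).astype(float)  # toy interaction matrix

# Pairwise kernel over all drug-target pairs; pair (d, t) has index d * n_t + t.
K = np.kron(K_D, K_T)
y = Y.reshape(-1)                              # row-major flattening matches that index

# RLS: solve (K + lambda I) a = y, then score every pair with f = K a.
a = np.linalg.solve(K + lam * np.eye(n_d * n_t), y)
scores = (K @ a).reshape(n_d, n_t)
```

The explicit matrix has $(n_d n_t)^2$ entries, so memory and solve time grow quickly with the number of drugs and targets.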
The KronRLS algorithm is a modification of RLS that takes advantage of two specific algebraic properties of the Kronecker product to speed up model training: the so-called *vec trick* and the relation of the eigendecomposition of the Kronecker product to the eigendecomposition of its factors.

Let $K_D = Q_D \Lambda_D Q_D^T$ and $K_T = Q_T \Lambda_T Q_T^T$ be the eigendecompositions of the kernel matrices $K_D$ and $K_T$. The solution $\mathbf{a}$ can be given by solving the following equation:

$$\mathbf{a} = \mathrm{vec}(Q_T C Q_D^T), \qquad (3)$$

where $\mathrm{vec}(\cdot)$ is the vectorization operator that stacks the columns of a matrix into a vector, and $C$ is a matrix defined by:

$$\mathrm{vec}(C) = (\Lambda_D \otimes \Lambda_T + \lambda I)^{-1}\,\mathrm{vec}(Q_T^T Y^T Q_D), \qquad (4)$$

with $Y$ denoting the $n_d \times n_t$ interaction matrix, so that $\mathbf{y} = \mathrm{vec}(Y^T)$.

The KronRLS algorithm is well suited for the large pairwise space involved in the DTI prediction problem, since the estimation of the vector $\mathbf{a}$ using Eqs. (3) and (4) is much faster than the original RLS estimation process in such a scenario. However, it does not support the use of multiple kernels.

**KronRLS MKL**

In this work, a vector of different kernels is considered, i.e., $\mathbf{k}_D = (K_D^1, K_D^2, \ldots, K_D^{P_D})$ and $\mathbf{k}_T = (K_T^1, K_T^2, \ldots, K_T^{P_T})$, where $P_D$ and $P_T$ indicate the number of base kernels defined over the drug and target sets, respectively. In this section, we propose an extension of KronRLS to handle multiple kernels.

The kernels can be combined by a linear function, i.e., the weighted sum of base kernels, corresponding to the optimal kernels $K_D^*$ and $K_T^*$:

$$K_D^* = \sum_{i=1}^{P_D} \beta_D^i K_D^i, \qquad K_T^* = \sum_{j=1}^{P_T} \beta_T^j K_T^j, \qquad (5)$$

where $\boldsymbol{\beta}_D = \{\beta_D^1, \ldots, \beta_D^{P_D}\}$ and $\boldsymbol{\beta}_T = \{\beta_T^1, \ldots, \beta_T^{P_T}\}$ correspond to the weights of drug and protein kernels, respectively.

It has been demonstrated that MKL can be interpreted as a particular instance of a kernel machine with two layers, in which the second layer is a linear function. That work provides the theoretical basis for the development of an MKL extension of the closely related KronRLS algorithm in our work.

The classification function of Eq. (2) can be written in matrix form, $f_a = K\mathbf{a}$. Applying the well-known property of the Kronecker product, $(A \otimes B)\,\mathrm{vec}(X) = \mathrm{vec}(BXA^T)$, we have:

$$f_a(X) = K\mathbf{a} = (K_D^* \otimes K_T^*)\,\mathrm{vec}(Q_T C Q_D^T) = \mathrm{vec}\left(K_T^* (Q_T C Q_D^T)(K_D^*)^T\right). \qquad (6)$$

This way, we can rewrite the classification function as $f_a = \mathrm{vec}(K_T^* A (K_D^*)^T)$, where $A = \mathrm{unvec}(\mathbf{a})$. Using the same iterative approach considered in previous MKL strategies, we propose a two-step optimization process, in which the optimization of the vector $\mathbf{a}$ is interleaved with the optimization of the kernel weights. Given two initial weight vectors, $\boldsymbol{\beta}_D^0$ and $\boldsymbol{\beta}_T^0$, an optimal value for the vector $\mathbf{a}$ is found using Eq. (3), and with such optimal $\mathbf{a}$ we can proceed to find optimal $\boldsymbol{\beta}_D$ and $\boldsymbol{\beta}_T$. More specifically, Eq. (1) can be redefined when $\mathbf{a}$ is fixed. Knowing that $\|f\|_K^2 = \mathbf{a}^T K \mathbf{a}$, rescaling Eq. (1) by $1/\lambda$ (which leaves the minimizer unchanged) and completing the square, we have:

$$\mathbf{u} = \mathbf{y} - \frac{\lambda}{2}\mathbf{a}, \qquad (7)$$

$$J(f_a) = \frac{1}{2\lambda}\|\mathbf{u} - K\mathbf{a}\|_2^2 + \frac{1}{2}\mathbf{a}^T\left(\mathbf{y} - \frac{\lambda}{4}\mathbf{a}\right). \qquad (8)$$

Since the second term does not depend on $K$ (and thus does not depend on the kernel weights), and as $\mathbf{y}$ and $\mathbf{a}$ are fixed, it can be discarded from the weights optimization procedure. Note that we are not interested in a sparse selection of base kernels; therefore, we introduce an $L_2$ regularization term to control the sparsity of the kernel weights, also known as a ball constraint. This term is parameterized by the $\sigma$ regularization coefficient. Additionally, we can convert $\mathbf{u}$ to its matrix form by the application of the $\mathrm{unvec}$ operator, i.e., $U = \mathrm{unvec}(\mathbf{u})$, and also use a more appropriate matrix norm (Frobenius, $\|A\|_2 \le \|A\|_F$). In this way, for any fixed values of $\mathbf{a}$ and $\boldsymbol{\beta}_T$, the optimal value for the combination vector is obtained by solving the optimization problem defined as:

$$\min_{\boldsymbol{\beta}_D}\; \frac{1}{2\lambda}\|U - \mathbf{m}_D\boldsymbol{\beta}_D\|_F^2 + \sigma\|\boldsymbol{\beta}_D\|_2^2, \qquad (9)$$

where $\mathbf{m}_D = \left(K_T^* A (K_D^1)^T, \ldots, K_T^* A (K_D^{P_D})^T\right)$ and $\mathbf{m}_D\boldsymbol{\beta}_D$ denotes $\sum_{i=1}^{P_D} \beta_D^i K_T^* A (K_D^i)^T$, while the optimal $\boldsymbol{\beta}_T$ can be found by fixing the values of $\mathbf{a}$ and $\boldsymbol{\beta}_D$, according to:

$$\min_{\boldsymbol{\beta}_T}\; \frac{1}{2\lambda}\|U - \boldsymbol{\beta}_T\mathbf{m}_T\|_F^2 + \sigma\|\boldsymbol{\beta}_T\|_2^2, \qquad (10)$$

where $\mathbf{m}_T = \left(K_T^1 A (K_D^*)^T, \ldots, K_T^{P_T} A (K_D^*)^T\right)$.

The optimization method used here is the interior-point optimization algorithm implemented in MATLAB.
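
The eigendecomposition-based solution and the vec trick described in this section can be checked numerically against explicit RLS on a toy problem. The sketch below is illustrative only: the function names `kron_rls` and `predict`, the helper `random_psd`, the random kernels and the fixed kernel weights are assumptions, and the weight-optimization step of KronRLS-MKL (solved in the original work with an interior-point method) is deliberately not reproduced.

```python
import numpy as np

def kron_rls(K_D, K_T, Y, lam):
    """Solve (K_D kron K_T + lam I) a = vec(Y^T) without forming the Kronecker product.

    Returns A (n_t x n_d) such that a = vec(A), with vec stacking columns.
    """
    evals_D, Q_D = np.linalg.eigh(K_D)
    evals_T, Q_T = np.linalg.eigh(K_T)
    # Eigenvalues of the Kronecker product laid out as an n_t x n_d grid:
    # entry (t, d) is evals_T[t] * evals_D[d].
    denom = np.outer(evals_T, evals_D) + lam
    C = (Q_T.T @ Y.T @ Q_D) / denom
    return Q_T @ C @ Q_D.T

def predict(K_D, K_T, A):
    """Vec trick: (K_D kron K_T) vec(A) = vec(K_T A K_D^T); returns drug x target scores."""
    return (K_T @ A @ K_D.T).T

rng = np.random.default_rng(1)
n_d, n_t, lam = 5, 4, 0.3                       # toy sizes (assumed)

def random_psd(n):
    """Random positive semidefinite matrix standing in for a base kernel."""
    X = rng.normal(size=(n, n))
    return X @ X.T / n

# Weighted sums of base kernels; the weights are arbitrary, for illustration only.
drug_kernels = [random_psd(n_d) for _ in range(3)]
target_kernels = [random_psd(n_t) for _ in range(2)]
beta_D = np.array([0.5, 0.3, 0.2])
beta_T = np.array([0.6, 0.4])
K_D = sum(b * K for b, K in zip(beta_D, drug_kernels))
K_T = sum(b * K for b, K in zip(beta_T, target_kernels))
Y = rng.integers(0, 2, size=(n_d, n_t)).astype(float)

A = kron_rls(K_D, K_T, Y, lam)
scores = predict(K_D, K_T, A)

# Reference: explicit RLS on the full pairwise kernel (only feasible for tiny problems).
K = np.kron(K_D, K_T)
a_direct = np.linalg.solve(K + lam * np.eye(n_d * n_t), Y.reshape(-1))
```

Only the two small eigendecompositions and a few matrix products are needed, so the full $n_d n_t \times n_d n_t$ kernel never has to be stored; the explicit solve appears here purely as a reference.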
The datasets considered were first proposed in earlier work and have been used by most competing methods. Each dataset consists of a binary matrix containing the known interactions of a given set of drug targets, namely Enzyme (E), Ion Channel (IC), GPCR and Nuclear Receptors (NR), based on information extracted from the KEGG BRITE, BRENDA, SuperTarget and DrugBank databases. All four datasets are extremely unbalanced if we consider the whole drug-target interaction space, i.e., the number of known interactions is far lower than the number of unknown interactions, as presented in Table 1.

**Table 1** Number of drugs, targets and positive instances (known interactions) vs. the number of negative (or unknown) interactions in each dataset

In order to analyze each type of entity from different points of view, we extracted 20 distinct kernels (10 for targets and 10 for drugs) from chemical structures, side-effects, amino acid sequences, biological function, PPI interactions and network topology (a summary of base kernels is presented in Table 2).

**Table 2** Network entities and respective kernels considered for combination purposes

| Entity | Kernel | Description | Information source |
|---|---|---|---|
| Drugs | AERS-bit | AERS bit | |
| Drugs | AERS-freq | AERS freq | |
| Drugs | GIP | Gaussian Interaction Profile | |
| Drugs | LAMBDA | Lambda-k Kernel | Chem. Struct. |
| Drugs | MARG | Marginalized Kernel | Chem. Struct. |
| Drugs | MINMAX | MinMax Kernel | Chem. Struct. |
| Drugs | SIMCOMP | Graph kernel | Chem. Struct. |
| Drugs | SIDER | Side-effects Similarity | |
| Drugs | SPEC | Spectrum Kernel | Chem. Struct. |
| Drugs | TAN | Tanimoto Kernel | Chem. Struct. |
| Targets | GIP | Gaussian Interaction Profile | |
| Targets | GO | Gene Ontology Semantic Similarity | Func. Annot. |
| Targets | MIS-k3m1 | Mismatch kernel (*k* = 3, *m* = 1) | |
| Targets | MIS-k4m1 | Mismatch kernel (*k* = 4, *m* = 1) | |
| Targets | MIS-k3m2 | Mismatch kernel (*k* = 3, *m* = 2) | |
| Targets | MIS-k4m2 | Mismatch kernel (*k* = 4, *m* = 2) | |
| Targets | PPI | Proximity in protein-protein network | |
| Targets | SPEC-k3 | Spectrum kernel (*k* = 3) | |
| Targets | SPEC-k4 | Spectrum kernel (*k* = 4) | |
| Targets | SW | Smith-Waterman alignment score | |

Here we use the following information sources about target proteins: amino acid sequence, functional annotation and proximity in the protein-protein network. Concerning sequence information, we consider the normalized score of the Smith-Waterman alignment of the amino acid sequence (SW) as well as different parametrizations of the Mismatch (MIS) and the Spectrum (SPEC) kernels. For the Mismatch kernel, we evaluated four combinations of distinct values for the k-mer length (*k* = 3 and *k* = 4) and the maximum number of mismatches per k-mer (*m* = 1 and *m* = 2), namely MIS-k3m1, MIS-k3m2, MIS-k4m1 and MIS-k4m2; for the Spectrum kernel, we varied the k-mer length (*k* = 3 and *k* = 4, giving SPEC-k3 and SPEC-k4, respectively). Both Mismatch and Spectrum kernels were calculated using the R package

The Gene Ontology semantic similarity kernel (GO) was used to encode functional information. GO terms were extracted from the BioMART database, and the semantic similarity scores between the GO annotation terms were calculated using the csbl.go R package with the Resnik algorithm. We also extracted a similarity measure from the human protein-protein network (PPI), obtained from the BioGRID database. The similarity between each pair of targets was calculated based on the shortest distance on the corresponding PPI network, according to:

$$S(p, p') = A e^{bD(p, p')},$$

where the parameters $A$ and $b$ were set as in previous work ($A = 0.9$, $b = 1$), and $D(p, p')$ is the shortest hop distance between proteins $p$ and $p'$.

As drug information sources, we consider 6 distinct chemical structure kernels and 3 side-effect kernels. Chemical structure similarity between drugs was obtained by applying the SIMCOMP algorithm, defined as the ratio of common substructures between two drugs based on the chemical graph alignment. We also computed the Lambda-k kernel (LAMBDA), the Marginalized kernel (MARG), the MINMAX kernel,
the Spectrum kernel (SPEC) and the Tanimoto kernel (TAN). These latter kernels were calculated with the R package Rchemcpp with default parameters.

Two distinct side-effect data sources were also considered. The first is the FDA adverse event reporting system (AERS), from which similarities between drugs based on side-effect keywords (adverse event keywords) were retrieved in an earlier study. The authors of that study introduced two types of pharmacological profiles for drugs, one based on the frequency information of side-effect keywords in adverse event reports (AERS-freq) and another based on the binary information (presence or absence) of a particular side-effect in adverse event reports (AERS-bit). Since not every drug in the Nuclear Receptors, Ion Channel, GPCR and Enzyme datasets is also present in the AERS-based data, we extracted the similarities of the drugs in AERS, and assigned zero similarity to drugs not present.

The second side-effect resource was the SIDER database. This database contains information about commercial drugs and their recorded side effects or adverse drug reactions. Each drug is represented by a binary profile, in which the presence or absence of each side-effect keyword is coded 1 or 0, respectively. Both AERS- and SIDER-based profile similarities were obtained by the weighted cosine correlation coefficient between each pair of drug profiles.

**Network topology information**

We also use the drug-target network structure in the form of a network interaction profile as a similarity measure for both proteins and drugs. The idea is to encode the connectivity behavior of each node in the underlying network. The Gaussian Interaction Profile kernel (GIP) was calculated for both drugs and targets.

We compare the predictive performance of the KronRLS-MKL algorithm against other MKL approaches, as well as in a single-kernel context (one kernel for drugs, and one for targets). In the latter, we evaluate the performance of each possible combination of base kernels (Table 2) with the KronRLS algorithm, recently reported as the best method for predicting drug-target pairs with single paired kernels. This resulted in a total of 10 × 10 = 100 different combinations. The best performing pairs were then used as baselines in our method evaluation, selected according to two distinct criteria: the kernel pair that achieved the largest area under the precision-recall curve (AUPR) on the training set, and, as a more optimistic approach, the pair with the largest AUPR on the testing set.

Besides the combination of single kernels for drugs and targets, two different kinds of methods were adopted to integrate multiple kernels: (1) standard non-MKL kernel methods for DTI prediction, trained on the average of multiple kernels (respectively for drugs and targets); (2) actual MKL methods specifically proposed for DTI prediction.

We extend state-of-the-art methods for the DTI prediction problem to a multiple kernel context. For this, we initially average multiple kernels to produce a single kernel (respectively for drugs and targets). Once we have a single average kernel (one for drugs and one for targets), we adopt a standard kernel method for DTI prediction, i.e., the base learner. In our experiments, two distinct previously proposed combination strategies are used: the mean of base kernels and the kernel alignment (KA) heuristic. We will briefly describe the base learners, followed by a short overview of the two combination strategies considered.

The Bipartite Local Model (BLM) is a machine learning based algorithm in which drug-target pairs are predicted by the construction of so-called 'local models', i.e., an SVM classifier is trained for each drug in the training set, and the same is done for targets. Then, the maximum scores for drugs and targets are used to predict new drug-target interactions. Since BLM demonstrated performance superior to that of the Kernel Regression Method (KRM) in previous studies, we did not consider KRM in our experiments.

The Network-based Random Walk with Restart on the Heterogeneous network (NRWRH) algorithm predicts new interactions between drugs and targets by simulating a random walk in the network of known drug-target interactions as well as in the drug-drug and protein-protein similarity networks. LapRLS and NetLapRLS were proposed together in earlier work. Both are based on the RLS learning algorithm, and perform similarity normalization by the application of the Laplacian operator. Predictions are made for drugs and targets separately, and the final prediction scores are obtained by averaging the prediction results from the drug and target spaces.

As said previously, most previous SVM-based methods found in the literature can be reduced to the Pairwise Kernel Method (PKM), with the distinction being made by the kernels used and the adopted combination strategy. PKM starts with the construction of a pairwise kernel, computed from the drug and target similarities. Given two drug-target pairs, $(d, p)$ and $(d', p')$, and the respective drug and target similarities, $K_D$ and $K_P$, the pairwise kernel is given by $K((d, p), (d', p')) = K_D(d, d') \times K_P(p, p')$. Once the pairwise matrix is computed, it is then used to train an SVM classifier.

The PKM, KronRLS, BLM, NRWRH, LapRLS and NetLapRLS algorithms cannot cope with multiple kernels. For this reason, we consider two simple methods available for kernel combination: the mean of base kernels and
the kernel alignment (KA) heuristic. The mean drug kernel is computed as $K_D^* = \frac{1}{P_D}\sum_{i=1}^{P_D} K_D^i$, and the same can be done for targets, analogously. KA is a heuristic for the estimation of kernel weights based on the notion of kernel alignment. More specifically, the weight vector, $\boldsymbol{\beta}_D$ for instance, can be obtained by:

$$\beta_D^i = \frac{A(K_D^i, \mathbf{y}\mathbf{y}^T)}{\sum_{h=1}^{P_D} A(K_D^h, \mathbf{y}\mathbf{y}^T)},$$

where $\mathbf{y}\mathbf{y}^T$ stands for the ideal kernel and $\mathbf{y}$ is the label vector. The alignment $A(K, \mathbf{y}\mathbf{y}^T)$ of a given kernel $K$ and the ideal kernel $\mathbf{y}\mathbf{y}^T$ is defined as:

$$A(K, \mathbf{y}\mathbf{y}^T) = \frac{\langle K, \mathbf{y}\mathbf{y}^T\rangle_F}{n\sqrt{\langle K, K\rangle_F}},$$

where $\langle K, \mathbf{y}\mathbf{y}^T\rangle_F = \sum_{i=1}^{n}\sum_{j=1}^{n} (K)_{ij}(\mathbf{y}\mathbf{y}^T)_{ij}$. Once such combinations are performed, the resulting drug and protein kernels are then used as input to the learning algorithm. We refer to the mean and KA heuristics by appending -MEAN and -KA, respectively, to the name of each base learner.

**Multiple kernel approaches**

Similarity-based Inference of drug-TARgets (SITAR) constructs a feature vector with the similarity values, where each feature is based on one drug-drug and one gene-gene similarity measure, resulting in a total of $P_D \times P_T$ features. Each one is calculated by combining the drug-drug similarities between the query drug and other drugs and the gene-gene similarities between the query gene and other target genes across all true drug-target associations. The method also performs a feature selection procedure and yields the final classification scores using logistic regression.

Gönen and Kaski proposed the Kernelized Bayesian Matrix Factorization with Twin Multiple Kernel Learning (KBMF2MKL) algorithm, extending a previous work to handle multiple kernels. KBMF2MKL factorizes the drug-target interaction matrix by projecting the drugs and the targets into a common subspace, where the projected drug and target kernels are multiplied. Normally distributed kernel weights for each subspace-projected kernel are then estimated without any constraints. The product of the final combined matrices is then used to make predictions.

Wang et al. propose to use a simple heuristic to first combine the drug and target similarities, and then use an SVM classifier to perform the predictions. Only the maximum similarity values of the drug and target kernel matrices are selected, resulting in two distinct kernels. They are then used to construct a pairwise kernel, computed from the drug and target similarities. Once the pairwise matrix is computed, it is then used to train an SVM classifier. This procedure is also known as the Pairwise Kernel Method (PKM). For this reason, we refer to the approach of Wang et al. as PKM-MAX.

The authors also suggest, as further work, a weighted sum approach. They suggest learning the optimal convex combination of data sources by maximizing the correlation of the obtained kernel matrix with the topology of the drug-protein network. This objective can be achieved by solving a linear programming problem, as follows:

$$\max_{\boldsymbol{\beta}_D}\; \mathit{corr}(K_D^*, \mathit{dist}),$$

where $K_D^*$ corresponds to the optimal combination of drug kernel matrices with weight vector $\boldsymbol{\beta}_D$, $\mathit{dist}$ is the drug-drug distance matrix in the DTI network, and $\mathit{corr}$ represents the correlation coefficient. Analogously, the same can be done for targets. We call this method WANG-MKL.

Previous work suggests that, in the context of paired input problems, one should consider separately the experiments where the training and test sets share common drugs or proteins. In order to achieve a clear notion of the performance of each method, all competing approaches were evaluated under 5 runs of three distinct 5-fold cross-validation (CV) procedures:

1. 'new drug' scenario: it simulates the task of predicting targets for new drugs. In this scenario, the drugs in a dataset were divided into 5 disjoint subsets (folds). Then the pairs associated with 4 folds of drugs were used to train the classifier and the remaining pairs were used for testing;
2. 'new target' scenario: it corresponds in turn to predicting interacting drugs for new targets. This is analogous to the above scenario, but considering 5 folds of targets;
3. pair prediction: it consists of predicting unknown interactions between known drugs and targets. All drug-target interactions were split into five folds, of which 4 were used for training and 1 for testing.

Some of the competing methods (PKM-based, WANG-MKL and SITAR) were trained with sub-sampled datasets, i.e., we randomly selected from the unknown interaction set as many examples as there are known interactions, since these methods cannot be executed on large networks. Although balanced classes are unlikely in real scenarios, we also performed experiments in context (3) using a sub-sampled test set, obtained by sampling as many negative examples as positive examples from the test fold. This experiment is relevant for comparison to previous work, since most previous studies on drug-target prediction performed under-sampling to evaluate predictive performance (see Additional file, Table S1).
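
The three cross-validation scenarios and the balanced sub-sampling described above can be sketched as follows. This is an illustrative reading of the protocol, not the authors' code: all function names are hypothetical, drugs and targets are represented by integer indices, and the interaction matrix `Y` is a plain nested list of 0/1 labels.

```python
import random

def _chunks(items, k, seed):
    """Shuffle items and deal them into k disjoint folds."""
    items = list(items)
    random.Random(seed).shuffle(items)
    return [items[i::k] for i in range(k)]

def new_drug_folds(n_drugs, n_targets, k=5, seed=0):
    """'New drug': folds are built over drugs; a fold holds every pair of its drugs."""
    return [[(d, t) for d in fold for t in range(n_targets)]
            for fold in _chunks(range(n_drugs), k, seed)]

def new_target_folds(n_drugs, n_targets, k=5, seed=0):
    """'New target': same idea, with folds built over targets."""
    return [[(d, t) for t in fold for d in range(n_drugs)]
            for fold in _chunks(range(n_targets), k, seed)]

def pair_folds(n_drugs, n_targets, k=5, seed=0):
    """Pair prediction: individual drug-target pairs are dealt into folds."""
    pairs = [(d, t) for d in range(n_drugs) for t in range(n_targets)]
    return _chunks(pairs, k, seed)

def split(folds, test_index):
    """Use one fold for testing and the remaining folds for training."""
    test = folds[test_index]
    train = [p for i, fold in enumerate(folds) if i != test_index for p in fold]
    return train, test

def balanced_subsample(pairs, Y, seed=0):
    """Keep all positives and an equally sized random sample of unknown pairs."""
    pos = [(d, t) for d, t in pairs if Y[d][t] == 1]
    neg = [(d, t) for d, t in pairs if Y[d][t] == 0]
    return pos + random.Random(seed).sample(neg, min(len(pos), len(neg)))
```

In the 'new drug' split no drug of the test fold appears in training, which is what makes that scenario harder than pair prediction, where every drug and target is typically seen during training.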
The hyperparameters of each competing method were optimized under a nested CV procedure, using the following values: for the SVM-based methods (PKM, BLM and WANG-MKL), the SVM cost parameter was evaluated in the set $\{2^{-1}, \ldots, 2^{3}\}$; for the KronRLS-based methods, the $\lambda$ parameter was evaluated in the set $\{2^{-15}, 2^{-10}, \ldots, 2^{30}\}$. The $\sigma$ regularization coefficient of the KRONRLS-MKL algorithm was also optimized in the set $\{0, 0.25, 0.5, 0.75, 1\}$. The number of components in KBMF2MKL was varied in the set $R \in \{5, 10, \ldots, 40\}$, and for LapRLS and NetLapRLS we varied $\beta_d, \beta_t \in \{0.25, 0.50, \ldots, 1\}$. In NetLapRLS we also considered two distinct values for $\gamma_{d2}, \gamma_{t2} \in \{0.01, 0.1\}$. For NRWRH the restart probability was evaluated in the set $\{0.1, 0.2, \ldots, 0.9\}$. After the hyperparameters were selected for each method, the outer loop evaluated the predictive performance on the test set partition with the model built using the selected values.
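
The grids listed above, together with a generic nested CV loop, can be written out as in the sketch below. Several details are assumptions rather than facts from the text: the SVM cost grid is taken to use unit exponent steps, the inner loop is taken to reuse the remaining outer folds, and `nested_cv` and `fit_and_score` are hypothetical names for a routine and a user-supplied callback (train with one hyperparameter value, return a score such as AUPR).

```python
# Hyperparameter grids (exponent steps for the SVM cost are an assumption).
svm_cost_grid = [2.0 ** e for e in range(-1, 4)]        # 2^-1, ..., 2^3
lambda_grid = [2.0 ** e for e in range(-15, 31, 5)]     # 2^-15, 2^-10, ..., 2^30
sigma_grid = [0, 0.25, 0.5, 0.75, 1]
kbmf_components_grid = list(range(5, 41, 5))            # 5, 10, ..., 40
laprls_beta_grid = [0.25, 0.5, 0.75, 1.0]
netlaprls_gamma_grid = [0.01, 0.1]
nrwrh_restart_grid = [round(0.1 * i, 1) for i in range(1, 10)]  # 0.1, ..., 0.9

def nested_cv(folds, grid, fit_and_score):
    """For each outer fold, pick the grid value with the best mean inner score,
    then evaluate that value on the held-out outer fold."""
    results = []
    for outer in range(len(folds)):
        inner = [f for i, f in enumerate(folds) if i != outer]

        def inner_mean(value):
            scores = []
            for v in range(len(inner)):
                train = [p for i, f in enumerate(inner) if i != v for p in f]
                scores.append(fit_and_score(value, train, inner[v]))
            return sum(scores) / len(scores)

        best = max(grid, key=inner_mean)
        outer_train = [p for f in inner for p in f]
        results.append((best, fit_and_score(best, outer_train, folds[outer])))
    return results
```

Keeping the selection inside the inner loop means the outer test fold never influences the chosen value, so the reported score is not optimistically biased by the search.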
**Fig. 2** Average performance of each single kernel with the KronRLS algorithm as base learner. The boxplots show the AUPR performance of drug and protein kernels across different kernel combinations

**Table 3** Results on MKL Experiments on 5 × 5 cross-validation experiments

[SPEC-k4]-[GIP] ∗
[SPEC-k4]-[MINMAX] †

Best performing methods are indicated in bold. Standard deviation is indicated in brackets. Training of the PKM, SITAR and WANG algorithms was done with the balanced training set.
† best on training
∗ best on testing
The evaluation metric considered was the AUPR, as it allows a good quantitative estimate of the ability to separate the positive interactions from the negative ones. This metric provides a better quality estimate for highly unbalanced data, since it punishes more heavily the existence of false positives (FP). This is especially true for the datasets considered, as demonstrated in Table [...], in which all datasets are extremely [...]

**Results and discussion**

**Paired kernel experiments**

As a base study, we evaluate the performance of KronRLS on all pairs of kernels (10 × 10 pairs). The AUPR results of all pairs of kernels for the Nuclear Receptors, GPCR, Ion Channel and Enzyme datasets are shown in more detail in the supplementary material (see Additional file).

The performance of KronRLS varies drastically with the kernel choice, as clearly demonstrated by the average performance of each kernel on the single kernel experiments (Fig. 2). For Nuclear Receptors, the best kernel pair combination was SPEC-k4 and GIP, while GIP and SW performed best in all other data sets. It is also important to notice the impact of different parametrizations of the Mismatch sequence kernel: its performance decreases as more mismatches are allowed inside a k-mer. Overall, both versions of AERS, SIMCOMP, GIP, MINIMAX and SIDER drug kernels showed better performance, while LAMBDA, MARG, SPEC and TAN performed worse. For targets, GIP, GO, MIS-k4m1, SPEC and SW kernels performed better than other target kernels.

In this section, we compare the competing methods in terms of AUPR for all datasets. Concerning KronRLS, we will use the best kernel pair (Best Pair) with largest AUPR as described in the previous section. This will serve as a baseline to evaluate the MKL approaches. Results are presented in Table 3. In the pair prediction scenario, KRONRLS-MKL obtained the highest AUPR in all datasets. Its results are even superior to the performance of the best kernel pair under the optimistic selection. The results of KRONRLS-MKL in pair prediction are statistically significant against all other methods (at *α* = 0.05), except for KRONRLS-KA and KRONRLS-MEAN, according to the Wilcoxon rank sum test (Additional file). Concerning the subsampled pair prediction, KRONRLS-MKL achieved the highest AUPR in the NR and IC data sets, and SITAR performed best in the GPCR and Enzyme data. There it performed second, just after SITAR (see Additional file, Table S1). The higher AUPR values obtained in the subsampled data sets in comparison to the unbalanced data sets clearly indicate that performing predictions in the complete data is a more difficult task. Moreover, the number of positive examples was negatively correlated to the dataset size for the complete datasets.

In the 'new target' scenario, BLM-KA performed best in 3 of 4 datasets, followed closely by BLM-mean and KRONRLS-MKL, demonstrating that the local SVM model is more effective in such a scenario. BLM-KA performed better than all evaluated methods with the exception of BLM-Mean, KBMF-MKL, KRONRLS-KA, KRONRLS-MEAN and KRONRLS-MKL (*α* = 0.05; Additional file). In the 'new drug' problem, KRONRLS-MKL obtained higher AUPR in the NR and GPCR datasets, while BLM-KA had higher AUPR values in the IC and Enzyme data. Both KRONRLS-MKL and BLM-KA had statistically significant higher AUPR (at *α* = 0.05; Additional file) than all other competing methods. In order to give an overview of the performance of the evaluated methods, an average ranking of the AUPR values obtained by all methods across the four datasets is presented in Table 4.

**Table 4** Average ranking over all four datasets

† best on training
∗ best on testing

Methods also displayed distinct computational requirements. Memory usage was stable across all methods, except for the SVM-based algorithms (BLM, PKM, WANG-MKL), which demonstrated quadratic growth of the memory used in relation to the size of the dataset. This is in part due to the construction of the explicit pairwise kernel (see Additional file, Table S3). This makes such methods inadequate for contexts in which subsampling of pairs is undesirable.

We now discuss computational time in the pair prediction scenario. The precomputed kernel approaches (MEAN and KA) were overall the fastest on average, with PKM-based methods requiring less time to train and test the models (∼1 min), followed by KronRLS-based and LapRLS-based algorithms (∼20 and 27 min, respectively). KBMF2MKL and BLM were the slowest, requiring more than 100 min on average for the same task. The lower computation time of the heuristic-based methods is explained by the absence of complex optimization procedures to find the kernel weights. KronRLS-MKL took a little less time than KBMF2MKL, taking an average over the four datasets of 74 min (see Additional file, Table S4).

**Predictions on new drug-target interactions**

In order to evaluate the quality of final predictions in a more realistic scenario, we performed an experiment similar to one described previously. We take the most highly ranked drug–target pairs as the most likely true interactions, and performed a search on the current release of four major databases (DrugBank, MATADOR, KEGG and ChEMBL). As the training datasets were generated almost eight years ago, new interactions included in these databases serve as an external validation set. We exclude interactions already present in the training data.

We trained all methods with all interactions present in the original datasets. In the specific case of BLM and NRWRH, one model for drugs and another for targets was trained, and then the maximum score for each DT pair was considered for prediction. Then, we calculated the AUPR for each dataset separately, discarding already known interactions (see Additional file, Table S2). The low AUPR values of all methods indicate the difficulty of performing predictions in such a large search space. An average ranking (Fig. 3) of each method across all databases indicates that the KronRLS methods are the best performing algorithms, followed by single kernel approaches. It is also important to highlight the poor performance of BLM-KA and BLM-MEAN in this task.
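The external validation step (AUPR computed per dataset after discarding pairs already known at training time) can be sketched as below. This is an illustration under stated assumptions: the function names and the dictionary/set data layout are hypothetical, and the step-wise average-precision estimate is used as the AUPR estimator, which may differ from the interpolation the authors applied.

```python
def average_precision(scores, labels):
    """Step-wise estimate of the area under the precision-recall curve."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive example is required")
    # Visit pairs from the highest to the lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


def aupr_on_new_interactions(scores, known, new):
    """AUPR restricted to pairs that were unknown at training time.

    scores: dict mapping (drug, target) -> predicted score
    known:  set of (drug, target) pairs present in the training data
    new:    set of (drug, target) pairs found in the updated databases
    """
    pairs = [p for p in scores if p not in known]
    y_score = [scores[p] for p in pairs]
    y_true = [1 if p in new else 0 for p in pairs]
    return average_precision(y_score, y_true)
```

Because the known positives are removed before ranking, a method cannot earn credit for re-discovering its own training interactions.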

**Fig. 3** Mean AUPR ranking of each method when compared to the new interactions found on updated databases. The KronRLS-based methods achieved superior performance when compared to other integration strategies
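A mean ranking of this kind can be computed by ranking the methods within each dataset and averaging the ranks per method. The sketch below assumes that rank 1 is the highest AUPR and that ties share the mean of their positions; both conventions, and the function names, are assumptions rather than details given in the paper.

```python
def rank_descending(values):
    """Rank 1 = highest value; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def average_ranking(aupr_by_dataset):
    """Map {dataset: {method: aupr}} to {method: mean rank across datasets}."""
    collected = {}
    for results in aupr_by_dataset.values():
        methods = list(results)
        ranks = rank_descending([results[m] for m in methods])
        for method, rank in zip(methods, ranks):
            collected.setdefault(method, []).append(rank)
    return {m: sum(r) / len(r) for m, r in collected.items()}
```

A lower mean rank indicates a method that is consistently near the top across datasets, even when its absolute AUPR values are low.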

This indicates a poor generalization capacity of the BLM framework to the drug-target prediction problem.

Next, a more practical assessment of the predictive power of KRONRLS-MKL is done by looking at the top 5 ranked interactions predicted by our method (Table 5). We observe that the great majority of interactions (14 out of 20) have already been described in ChEMBL, DrugBank or Matador. We focus our discussion on selected novel interactions. For example, in the Nuclear Receptor database, the 5th ranked prediction indicates the association of Tretinoin with the nuclear factor RAR-related orphan receptor A (RORa). Tretinoin is a drug currently used for the treatment of acne. Interestingly, its molecular activity is associated with the activation of nuclear receptors of the closely related RAR family.

This is also a good example to illustrate the benefits of incorporating multiple sources of data. RORa and Tretinoin do not share nodes in the training set. All targets of Tretinoin have a high GO similarity to RORa (mean value of 0.8368) despite their low sequence similarity (SW mean value is 0.1563). In addition, one of these targets is NR0B1 (nuclear receptor subfamily 0, group B, member 1). This protein is very close to RORa in the PPI network (similarity score of 0.90).

Concerning the Ion Channel models, the predictions ranked 2 and 3 indicate the interaction of Verapamil and Diazoxide with ATP-binding cassette sub-family C (ABCC8). ABCC8 is one of the proteins encoding the sulfonylurea receptor (SUR1) and is associated with calcium regulation and diabetes type I. Interestingly, there are positive reports of Diazoxide treatments to prevent diabetes in rats.

**Table 5** Top five predicted interactions by KRONRLS-MKL

Nuclear Receptors
*estrogen receptor 1*
*estrogen receptor 1*
*estrogen receptor 1*
*RAR-related orphan receptor A*
*adrenoceptor beta 2, surface*
*dopamine receptor D3*
*adenosine A2a receptor*
*adenosine A1 receptor*
*adrenoceptor beta 3*
*glutamate receptor, ionotropic, kainate 2*
*ATP-binding cassette, sub-family C (CFTR*
*ATP-binding cassette, sub-family C (CFTR*
*potassium channel, two pore domain subfamily K, member 12*
*cholinergic receptor, nicotinic, alpha 1 (muscle)*
*cytochrome P450, family 2, subfamily E, polypeptide 1*
*cytochrome P450, family 2, subfamily C, polypeptide 9*
*cytochrome P450, family 2, subfamily A, polypeptide 7*
*cytochrome P450, family 4, subfamily A, polypeptide 11*
*cytochrome P450, family 1, subfamily A, polypeptide 1*
Interactions found in KEGG, DrugBank, ChEMBL and Matador are marked as K, D, C and M respectively

**Evaluation of kernel weights**

The kernel weights given by KBMF2MKL, KRONRLS-MKL and WANG-MKL, as well as the KA heuristic, can be used to analyze the ability of such methods to identify the most relevant information sources. As there is no guideline or gold standard for this, we resort to a simple approach: compare the kernel weights (Fig. 4) with the average performance of each kernel on the single kernel experiments (Fig. 2). First, it is noticeable that the KA weights are very similar to the average selection (0.10). This indicates that no clear kernel selection is performed. WANG-MKL and KRONRLS-MKL give low weights to drug kernels LAMBDA, MARG, MINIMAX, SPEC and TAN and protein kernel MIS-k3m2. These kernels have the overall worst AUPR in the single kernel experiments, which indicates an agreement with both selection procedures. Although the weights assigned by KBMF2MKL are not subject to convex constraints, as indicated by the larger weights assigned to all kernels, they also provide a notion of quality of base kernels. We can observe a stronger preference for the GIP kernel in all datasets, even though the algorithm assigned a high weight to the lower quality MIS-k3m2 in three of the four datasets.

**Fig. 4** Comparison of the average final weights obtained by the Kernel Alignment (KA) heuristic, KBMF2MKL, KronRLS-MKL and WANG-MKL algorithms. As one can note, the KA heuristic demonstrated close to mean weights, while KRONRLS-MKL and WANG-MKL effectively discarded the most irrelevant kernels

**Conclusions**

We have presented a new Multiple Kernel Learning algorithm for the bipartite link prediction problem, which is able to identify and select the most relevant information sources for DTI prediction. Most previous MKL methods mainly solve the problem of MKL when kernels are built over the same set of entities, which is not the case for the bipartite link prediction problem, e.g. drug-target networks. Regarding predictions in drug-target networks, the sampling of negative/unknown examples, as a way to cope with large data sets, is a clear limitation. Our method takes advantage of the KronRLS framework to efficiently perform link prediction on data with arbitrary size.
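The efficiency of the KronRLS framework comes from never forming the Kronecker product kernel over all drug-target pairs. The sketch below illustrates this with NumPy, using the standard eigendecomposition-based closed form for Kronecker regularized least squares. It is not the authors' implementation: the kernel weights are taken as given (the MKL weight optimization itself is not reproduced here), and the function names are hypothetical.

```python
import numpy as np


def combine_kernels(kernels, weights):
    """Weighted sum of same-sized kernel matrices (weights assumed already learned)."""
    return sum(w * K for w, K in zip(weights, kernels))


def kronrls_predict(K_d, K_t, Y, lam):
    """Closed-form KronRLS scores for every drug-target pair.

    Solves (K_t kron K_d + lam * I) vec(A) = vec(Y) through the
    eigendecompositions of the two symmetric kernels, so the full
    (n_d * n_t) x (n_d * n_t) pairwise kernel is never built.
    """
    vals_d, Q_d = np.linalg.eigh(K_d)
    vals_t, Q_t = np.linalg.eigh(K_t)
    # Eigenvalues of the Kronecker kernel are all products of eigenvalue pairs.
    C = 1.0 / (np.outer(vals_d, vals_t) + lam)
    A = Q_d @ (C * (Q_d.T @ Y @ Q_t)) @ Q_t.T
    return K_d @ A @ K_t
```

With `n_d` drugs and `n_t` targets, the cost is dominated by two eigendecompositions of sizes `n_d` and `n_t`, instead of a linear solve of size `n_d * n_t`.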

In our experiments, the KronRLS-MKL algorithm demonstrated an interesting balance between accuracy and computational cost in relation to other approaches. It performed best in the "pair" prediction problem and the "new target" problem. In the 'new drug' and 'new target' prediction tasks, BLM-KA was also top ranked. This method has a high computational cost, which arises from the fact that it requires a classifier for each DT pair. Moreover, it obtained poor results in the evaluation scenario to predict novel drug-protein pair interactions.

The convex constraint estimation of kernel weights correlated well with the accuracy of a brute force pair kernel search. This non-sparse combination of kernels possibly increased the generalization of the model by reducing the bias for a specific type of kernel. This usually leads to better performance, since the model can benefit from different heterogeneous information sources in a systematic way. Finally, the algorithm performance was not sensitive to class unbalance and can be trained over the whole interaction space without sacrificing performance.

**Endnote**

² NRWRH cannot be applied to the pair prediction, which is why this method was not considered in such context.

**Additional files**

**Figure.** Single kernel experiments on the Nuclear Receptor dataset with the KronRLS algorithm as base learner. The heatmap shows the AUPR performance of different kernel combinations; red means higher AUPR. (PDF 460 kb)

**Spreadsheet.** *p*-values under pairwise Wilcoxon Rank Sum statistical tests of all competing methods in pair, drug and target prediction tasks. (XLS 24 kb)

**Supplementary Tables.** AUPR results of competing methods under pair prediction setting considering subsampled test sets (S1); AUPR results of predicted scores against new interactions found on current release of KEGG, Matador, Drugbank and ChEMBL databases (S2); Average memory (MB) usage during training and testing of competing methods (S3); Average time (minutes) required to train and test the models with the competing methods (S4). (PDF 89.7 kb)

**Competing interests**

The authors declare that they have no competing interests.

**Authors' contributions**

Conceived and designed the experiments: AN RP IC. Performed the experiments: AN. Analyzed the data: AN RP IC. All authors read and approved the final manuscript.

**Acknowledgements**

The authors thank the authors of the studies by [...] for making their data publicly available. This work was supported by the Interdisciplinary Center for Clinical Research (IZKF Aachen), RWTH Aachen University Medical School, Aachen, Germany; DAAD; and Brazilian research agencies: FACEPE, CAPES and [...]

**Author details**

¹Center of Informatics, UFPE, Recife, Brazil. ²Department of Statistics and Informatics, UFRPE, Recife, Brazil. ³IZKF Computational Biology Research Group, Institute for Biomedical Engineering, RWTH Aachen University Medical School, Aachen, Germany. ⁴Aachen Institute for Advanced Study in Computational Engineering Science (AICES), RWTH Aachen University, Aachen, Germany.

Received: 23 July 2015 Accepted: 5 January 2016

**References**

1. Csermely P, Korcsmáros T, Kiss HJM, London G, Nussinov R. Structure and dynamics of molecular networks: a novel paradigm of drug discovery: a comprehensive review. Pharmacol Ther. 2013;138(3).
2. Ding H, Takigawa I, Mamitsuka H, Zhu S. Similarity-based machine learning methods for predicting drug-target interactions: a brief review. Brief Bioinform. 2013.
3. Chen X, Yan CC, Zhang X, Zhang X, Dai F. Drug-target interaction prediction: databases, web servers and computational models. Brief Bioinform. 2015:1–17.
4. Yamanishi Y. Chemogenomic approaches to infer drug–target interaction networks. Data Min Syst Biol. 2013;939:97–113.
5. Dudek AZ, Arodz T, Gálvez J. Computational methods in developing quantitative structure-activity relationships (QSAR): a review. Comb Chem High Throughput Screen. 2006;9(3):213–8.
6. Sawada R, Kotera M, Yamanishi Y. Benchmarking a wide range of chemical descriptors for drug-target interaction prediction using a chemogenomic approach. Mol Inform. 2014;33(11-12):719–31.
7. Cheng F, Liu C, Jiang J, Lu W, Li W, Liu G, et al. Prediction of drug-target interactions and drug repositioning via network-based inference. PLoS Comput Biol. 2012;8(5):1002503.
8. Chen X, Liu MX, Yan GY. Drug-target interaction prediction by random walk on the heterogeneous network. Mol BioSyst. 2012;8(7):1970–8.
9. Yamanishi Y, Kotera M, Kanehisa M, Goto S. Drug-target interaction prediction from chemical, genomic and pharmacological data in an integrated framework. Bioinformatics (Oxford, England). 2010;26(12):246–54.
10. van Laarhoven T, Nabuurs SB, Marchiori E. Gaussian interaction profile kernels for predicting drug-target interaction. Bioinformatics (Oxford, England). 2011;27(21):3036–43.
11. Pahikkala T, Airola A, Pietila S, Shakyawar S, Szwajda A, Tang J, et al. Toward more realistic drug-target interaction predictions. Brief Bioinform.
12. Pahikkala T, Airola A, Stock M, Baets BD, Waegeman W. Efficient regularized least-squares algorithms for conditional ranking on relational data. Mach Learn. 2013;93:321–356. arXiv:1209.4825v2.
13. Gönen M, Alpaydın E. Multiple kernel learning algorithms. J Mach Learn
14. Perlman L, Gottlieb A, Atias N, Ruppin E, Sharan R. Combining drug and gene similarity measures for drug-target elucidation. J Comput Biol. 2011;18(2):133–45.
15. Wang YC, Zhang CH, Deng NY, Wang Y. Kernel-based data fusion improves the drug-protein interaction prediction. Comput Biol Chem. 2011;35(6):353–62.
16. Wang Y, Chen S, Deng N, Wang Y. Drug repositioning by kernel-based integration of molecular structure, molecular activity, and phenotype data. PLoS ONE. 2013;8(11):78518.
17. Ben-Hur A, Noble WS. Kernel methods for predicting protein-protein interactions. Bioinformatics (Oxford, England). 2005;21 Suppl 1:38–46.
18. Hue M, Riffle M, Vert J-p, Noble WS. Large-scale prediction of protein-protein interactions from structures. BMC Bioinforma.
19. Ammad-Ud-Din M, Georgii E, Gönen M, Laitinen T, Kallioniemi O, Wennerberg K, et al. Integrative and Personalized QSAR Analysis in Cancer by Kernelized Bayesian Matrix Factorization. J Chem Inf Model. 2014;1.
20. Lanckriet GR, Deng M, Cristianini N, Jordan MI, Noble WS. Kernel-based data fusion and its application to protein function prediction in yeast. In: Pacific Symposium on Biocomputing. World Scientific; 2004. p. 300–11.
21. Yu G, Zhu H, Domeniconi C, Guo M. Integrating multiple networks for protein function prediction. BMC Syst Biol. 2015;9(Suppl 1):3.
22. Gönen M, Kaski S. Kernelized Bayesian Matrix Factorization. IEEE Trans Pattern Anal Mach Intell. 2014;36(10):2047–2060.
23. Yamanishi Y, Araki M, Gutteridge A, Honda W, Kanehisa M. Prediction of drug-target interaction networks from the integration of chemical and genomic spaces. Bioinformatics (Oxford, England). 2008;24(13):232–40.
24. Park Y, Marcotte EM. Flaws in evaluation schemes for pair-input computational predictions. Nat Methods. 2012;9(12):1134–6.
25. Xia Z, Wu LY, Zhou X, Wong STC. Semi-supervised drug-protein interaction prediction from heterogeneous biological spaces. BMC Syst Biol. 2010;4 Suppl 2(Suppl 2):6.
26. Bleakley K, Yamanishi Y. Supervised prediction of drug-target interactions using bipartite local models. Bioinformatics (Oxford, England). 2009;25(18):2397–403.
27. Jacob L, Vert JP. Protein-ligand interaction prediction: an improved chemogenomics approach. Bioinformatics (Oxford, England). 2008;24(19).
28. Dinuzzo F. Learning functions with kernel methods. 2011. PhD thesis, University of Pavia.
29. Rifkin R, Yeo G, Poggio T. Regularized least-squares classification. Nato Science Series Sub Series III Computer and Systems Sciences. 2003;190.
30. Kimeldorf G, Wahba G. Some results on Tchebycheffian spline functions. J Math Anal Appl. 1971;33(1):82–95.
31. Kashima H, Oyama S, Yamanishi Y, Tsuda K. On pairwise kernels: an efficient alternative and generalization analysis. Adv Data Min Knowl Disc.
32. Laub AJ. Matrix Analysis for Scientists and Engineers. Davis, California: SIAM; 2005, pp. 139–44.
33. Kloft M, Brefeld U, Laskov P, Sonnenburg S. Non-sparse multiple kernel learning. In: NIPS Workshop on Kernel Learning: Automatic Selection of Optimal Kernels (Vol. 4); 2008.
34. Byrd RH, Hribar ME, Nocedal J. An interior point algorithm for large-scale nonlinear programming. SIAM J Optim. 1999;9(4):877–900.
35. MATLAB. version 8.1.0 (R2013a). Natick, Massachusetts: The MathWorks
36. Kanehisa M, Araki M, Goto S, Hattori M, Hirakawa M, Itoh M, et al. KEGG for linking genomes to life and the environment. Nucleic Acids Res.
37. Schomburg I, Chang A, Ebeling C, Gremse M, Heldt C, Huhn G, et al. BRENDA, the enzyme database: updates and major new developments. Nucleic Acids Res. 2004;32(suppl 1):431–3.
38. Günther S, Kuhn M, Dunkel M, Campillos M, Senger C, Petsalaki E, et al. SuperTarget and Matador: resources for exploring drug-target relationships. Nucleic Acids Res. 2008;36(suppl 1):919–22.
39. Wishart DS, Knox C, Guo AC, Cheng D, Shrivastava S, Tzur D, et al. DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res. 2008;36(suppl 1):901–6.
40. Eskin E, Weston J, Noble WS, Leslie CS. Mismatch String Kernels for SVM Protein Classification. In: Advances in neural information processing systems-NIPS; 2002. p. 1417–1424.
41. Leslie CS, Eskin E, Noble WS. The spectrum kernel: a string kernel for SVM protein classification. In: Pac Symp Biocomput vol. 7; 2002. p. 566–575.
42. Palme J, Hochreiter S, Bodenhofer U. KeBABS - an R package for kernel-based analysis of biological sequences. Bioinformatics. 2015;31(15):2574–2576.
43. Smedley D, Haider S, Durinck S, Al E. The BioMart community portal: an innovative alternative to large, centralized data repositories. Nucleic Acids Res. 2015.
44. Ovaska K, Laakso M, Hautaniemi S. Fast Gene Ontology based clustering for microarray experiments. BioData Min. 2008;1(1):11.
45. Resnik P. Semantic Similarity in a Taxonomy: An Information Based Measure and Its Application to Problems of Ambiguity in Natural Language. J Artif Intell Res. 1999;11:95–130.
46. Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M. BioGRID: a general repository for interaction datasets. Nucleic Acids Res.
47. Hattori M, Okuno Y, Goto S, Kanehisa M. Development of a chemical structure comparison method for integrated analysis of chemical and genomic information in the metabolic pathways. J Am Chem Soc.
48. Klambauer G, Wischenbart M, Mahr M, Unterthiner T, Mayr A, Hochreiter S. Rchemcpp: a web service for structural analoging in ChEMBL, Drugbank and the Connectivity Map. Bioinformatics. 2015. *Advance access*
49. Kashima H, Tsuda K, Inokuchi A. Marginalized kernels between labeled graphs. In: ICML, vol. 3; 2003. p. 321–328.
50. Ralaivola L, Swamidass SJ, Saigo H, Baldi P. Graph kernels for chemical informatics. Neural Netw. 2005;18(8):1093–110.
51. Takarabe M, Kotera M, Nishimura Y, Goto S, Yamanishi Y. Drug target prediction using adverse event report systems: A pharmacogenomic
52. Kuhn M, Campillos M, Letunic I, Jensen LJ, Bork P. A side effect resource to capture phenotypic effects of drugs. Mol Syst Biol. 2010;6(1):343.
53. Qiu S, Lane T. A framework for multiple kernel support vector regression and its applications to siRNA efficacy prediction. IEEE/ACM Trans Comput Biol Bioinf. 2009;6(2):190–9.
54. Cristianini N, Kandola J, Elisseeff A, Shawe-Taylor J. On kernel-target alignment. In: Advances in Neural Information Processing Systems 14. Cambridge MA: MIT Press; 2002. p. 367–73.
55. Gönen M. Predicting drug-target interactions from chemical and genomic kernels using Bayesian matrix factorization. Bioinformatics (Oxford, England). 2012;28(18):2304–10.
56. Davis J, Goadrich M. The relationship between Precision-Recall and ROC curves. In: Proceedings of the 23rd international conference on Machine learning - ICML '06. New York, NY, USA: ACM; 2006. p. 233–40.
57. Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28(1):27–30.
58. Bento AP, Gaulton A, Hersey A, Bellis LJ, Chambers J, Davies M, et al. The ChEMBL bioactivity database: an update. Nucleic Acids Res. 2014;42(D1):1083–90.
59. Webster GF. Topical tretinoin in acne therapy. J Am Acad Dermatol.
60. Reis A, Velho G. Sulfonylurea receptor-1 (sur1): Genetic and metabolic evidences for a role in the susceptibility to type 2 diabetes mellitus. Diabetes Metab. 2002;28(1):14–19.
61. Huang Q, Bu S, Yu Y, Guo Z, Ghatnekar G, Bu M, et al. Diazoxide prevents diabetes through inhibiting pancreatic *β*-cells from apoptosis via bcl-2/bax rate and p38-*β* mitogen-activated protein kinase.

Source: https://publications.rwth-aachen.de/record/660984/files/s12859-016-0890-3.pdf
