
AMIE: Association Rule Mining under Incomplete Evidence
in Ontological Knowledge Bases
Luis Galárraga¹, Christina Teflioudi¹, Katja Hose², Fabian M. Suchanek¹
¹Max-Planck Institute for Informatics, Saarbrücken, Germany
²Aalborg University, Aalborg, Denmark
¹{lgalarra, chteflio, suchanek}, ²{khose}

Copyright is held by the International World Wide Web Conference Committee (IW3C2). IW3C2 reserves the right to provide a hyperlink to the author's site if the Material is used in electronic media. WWW '13, Rio de Janeiro, Brazil.

ABSTRACT
Recent advances in information extraction have led to huge knowledge bases (KBs), which capture knowledge in a machine-readable format. Inductive Logic Programming (ILP) can be used to mine logical rules from the KB. These rules can help deduce and add missing knowledge to the KB. While ILP is a mature field, mining logical rules from KBs differs in two aspects: First, current rule mining systems are easily overwhelmed by the amount of data (state-of-the-art systems cannot even run on today's KBs). Second, ILP usually requires counterexamples. KBs, however, implement the open world assumption (OWA), meaning that absent data cannot be used as counterexamples. In this paper, we develop a rule mining model that is explicitly tailored to support the OWA scenario. It is inspired by association rule mining and introduces a novel measure for confidence. Our extensive experiments show that our approach outperforms state-of-the-art approaches in terms of precision and coverage. Furthermore, our system, AMIE, mines rules orders of magnitude faster than state-of-the-art approaches.

Categories and Subject Descriptors
H.2.8 [Information Systems]: Database Applications

Keywords
Rule Mining, Inductive Logic Programming, ILP

INTRODUCTION
In recent years, we have experienced the rise of large knowledge bases (KBs), such as Cyc, YAGO, DBpedia, and Freebase. These KBs provide information about a great variety of entities, such as people, countries, rivers, cities, universities, movies, animals, etc. Moreover, KBs also contain facts relating these entities, e.g., who was born where, which actor acted in which movie, or which city is located in which country. Today's KBs contain millions of entities and hundreds of millions of facts.

Yet, even these large KBs are not complete. Some of them are extracted from natural language resources that inevitably exhibit gaps. Others are created and extended manually. Making these KBs complete requires great effort to extract facts, check them for correctness, and add them to the KB. However, KBs themselves often already contain enough information to derive and add new facts. If, for instance, a KB contains the fact that a child has a mother, then the mother's husband is most likely the father:

motherOf(m, c) ∧ marriedTo(m, f) ⇒ fatherOf(f, c)

As for any rule, there can be exceptions, but in the vast majority of cases, the rule will hold. Finding such rules can serve four purposes: First, by applying such rules on the data, new facts can be derived that make the KB more complete. Second, such rules can identify potential errors in the knowledge base. If, for instance, the KB contains the statement that a totally unrelated person is the father of a child, then maybe this statement is wrong. Third, the rules can be used for reasoning. Many reasoning approaches rely on other parties to provide rules. Last, rules describing general regularities can help us understand the data better. We can, e.g., find out that countries often trade with countries speaking the same language, that marriage is a symmetric relationship, that musicians who influence each other often play the same instrument, and so on.

The goal of this paper is to mine such rules from KBs. We focus on RDF-style KBs in the spirit of the Semantic Web, such as YAGO, Freebase, and DBpedia. These KBs provide binary relationships in the form of RDF triples. Since RDF has only positive inference rules, these KBs contain only positive statements and no negations. Furthermore, they operate under the Open World Assumption (OWA). Under the OWA, a statement that is not contained in the KB is not necessarily false; it is just unknown. This is a crucial difference to many standard database settings that operate under the Closed World Assumption (CWA). Consider an example KB that does not contain the information that a particular person is married. Under the CWA, we can conclude that the person is not married. Under the OWA, however, the person could be either married or single.

Mining rules from a given dataset is a problem that has a long history. It has been studied in the context of association rule mining and inductive logic programming (ILP). Association rule mining is well known in the context of sales databases. It can find rules such as "If a client bought beer and wine, then he also bought aspirin". The confidence of such a rule is the ratio of cases where beer and wine were actually bought together with aspirin. Association rule mining inherently implements a closed world assumption: A rule that predicts new items that are not in the database has a low confidence. It cannot be used to (and is not intended to be used to) add new items to the database.
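To make the rule-application idea from the introduction concrete, here is a minimal sketch (not the authors' code) that applies the example rule motherOf(m, c) ∧ marriedTo(m, f) ⇒ fatherOf(f, c) to a toy KB of (subject, relation, object) triples. The KB contents and the helper name `apply_rule` are illustrative assumptions.

```python
# Toy KB: a set of (subject, relation, object) triples.
kb = {
    ("Priscilla", "motherOf", "Lisa"),
    ("Priscilla", "marriedTo", "Elvis"),
}

def apply_rule(kb):
    """Predictions of motherOf(m,c) ∧ marriedTo(m,f) ⇒ fatherOf(f,c):
    the head atom of every instantiation whose body atoms are in the KB."""
    derived = set()
    for (m, r1, c) in kb:
        if r1 != "motherOf":
            continue
        for (m2, r2, f) in kb:
            if r2 == "marriedTo" and m2 == m:
                derived.add((f, "fatherOf", c))
    return derived - kb  # keep only predictions that go beyond the current KB

print(apply_rule(kb))  # {('Elvis', 'fatherOf', 'Lisa')}
```

Under the OWA, the derived fact is not contradicted by its absence from the KB; it is simply a candidate for addition.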
ILP approaches deduce logical rules from ground facts. Yet, current ILP systems cannot be applied to semantic KBs for two reasons: First, they usually require negative statements as counter-examples. Semantic KBs, however, usually do not contain negative statements. The semantics of RDF are too weak to deduce negative evidence from the facts in a KB. Because of the OWA, absent statements cannot serve as counter-evidence either. Second, today's ILP systems are slow and cannot handle the huge amount of data that KBs provide. In our experiments, we ran state-of-the-art approaches on YAGO2 for a couple of days without obtaining any results.

In this paper, we propose a rule mining system that is inherently designed to work under the OWA, and efficient enough to handle the size of today's KBs. More precisely, our contributions are as follows:
(1) A method to simulate negative examples for positive KBs (the Partial Completeness Assumption)
(2) An algorithm for the efficient mining of rules.
(3) A system, AMIE, that mines rules on millions of facts in a few minutes without the need for parameter tuning or expert input.

The rest of this paper is structured as follows: we first discuss related work and introduce preliminaries; the main part of the paper then presents our mining model and its implementation; finally, we report on our experiments and conclude.

RELATED WORK
We aim to mine rules of the form

motherOf(m, c) ∧ marriedTo(m, f) ⇒ fatherOf(f, c)

Technically, these are Horn rules on binary predicates. Rule mining has been an area of active research for the past couple of years. Some approaches mine association rules, some mine logical rules, others mine a schema for the KB, and again others use rule mining for application purposes.

The WARMR system mines patterns in databases that correspond to conjunctive queries. It uses a declarative language bias to reduce the search space. An extension of the system, WARMER, modified the approach to support a broader range of conjunctive queries and increase the efficiency of search space exploration.

ALEPH is a general purpose ILP system, which implements Muggleton's Inverse Entailment algorithm in Prolog. It employs a variety of evaluation functions for the rules, and a variety of search strategies.

These approaches are not tailored to deal with large KBs under the Open World Assumption. We compare our system, AMIE, to WARMR and ALEPH, which are the only ones available for download. Our experiments not only show that these systems mine less sensible rules than AMIE, but also that it takes them much longer to do so.

Expert Rule Mining. Another rule mining approach over RDF data was proposed to discover causal relations in RDF-based medical data. It requires a domain expert who defines targets and contexts of the mining process, so that the correct transactions are generated. Our approach, in contrast, does not rely on the user to define any context or target. It works out-of-the-box.

Generating Schemas. In this paper, we aim to generate Horn rules on a KB. Other approaches use rule mining to generate the schema or taxonomy of a KB. One line of work applies clustering techniques based on context vectors and formal concept analysis to construct taxonomies. Other approaches use clustering and ILP-based techniques. For the friend-of-a-friend network on the Semantic Web, clustering has been applied to identify classes of people, and ILP to learn descriptions of these groups. Another example of an ILP-based approach is the DL-Learner, which has successfully been applied to generate OWL class expressions from YAGO. As an alternative to ILP techniques, a statistical method that does not require negative examples has been proposed. In contrast to our approach, these techniques aim at generating a schema for a given RDF repository, not logical rules in general.
Association Rule Mining. Association rules are mined on a list of transactions. A transaction is a set of items. For example, in the context of sales analysis, a transaction is the set of products bought together by a customer in a specific event. The mined rules are of the form {ElvisCD, ElvisBook} ⇒ ElvisCostume, meaning that people who bought an Elvis CD and an Elvis book usually also bought an Elvis costume. However, these are not the kind of rules that we aim to mine in this paper. We aim to mine Horn rules.

One problem for association rule mining is that, for some applications, the standard measurements for support and confidence do not produce good results. A number of alternatives have been discussed to measure the interestingness of a rule in general. Our approach is inspired by this work, and we also make use of a language bias to reduce the search space.

Logical Rule Mining. Sherlock is an unsupervised ILP method to learn first-order Horn clauses from a set of extracted facts for a given target relation. It uses probabilistic graphical models (PGMs) to infer new facts. It tackles the noise of the extracted facts by extensive filtering in a preprocessing step and by penalizing longer rules in the inference part. For mining the rules, Sherlock uses two heuristics: statistical significance and statistical relevance.

Learning Rules From Hybrid Sources. It has also been proposed to learn association rules from hybrid sources (RDBMS and ontologies) under the OWA. For this purpose, the definition of frequency (and thus of support and confidence) is changed so that unknown statements contribute with half of the weight of the true statements. Another approach makes use of an ontology and a constraint Datalog program. The goal is to learn association rules at different levels of granularity w.r.t. the type hierarchy of the ontology. While these approaches focus more on the benefits of combining hybrid sources, our approach focuses on pure RDFS KBs.

Further Applications of Rule Mining. An algorithm has been proposed for frequent pattern mining in KBs that use DL-safe rules. Such KBs can be transformed into a disjunctive Datalog program, which allows seeing patterns as queries. This approach does not mine the Horn rules that we aim at.

Some approaches use rule mining for ontology merging and alignment. The AROMA system, e.g.,

³ RDF has only positive rules and no disjointness constraints or similar concepts.
uses association rules on extracted terms to find subsumption relations between classes and properties of different ontologies. Again, these systems do not mine the kind of rules we are interested in.

Furthermore, association rules and frequency analysis have been used to identify and classify common misusage patterns for relations in DBpedia. In contrast to our work, this approach does not mine logical rules, but association rules on the co-occurrence of values. Since RDF data can be seen as a graph, mining frequent subtrees is another related field of research. However, as the URIs of resources in knowledge bases are unique, these techniques are limited to mining frequent combinations of classes.

Several approaches, such as Markov Logic or URDF, use Horn rules to perform reasoning. These approaches can be consumers of the rules we mine with AMIE.

PRELIMINARIES
RDF KBs. In this paper, we focus on RDF knowledge bases. An RDF KB can be considered a set of facts, where each fact is a triple of the form ⟨x, r, y⟩, with x denoting the subject, r the relation (or predicate), and y the object of the fact. There are several equivalent alternative representations of facts; in this paper we use a logical notation and represent a fact as r(x, y). For example, we write father(Elvis, Lisa). The facts of an RDF KB can usually be divided into an A-Box and a T-Box. While the A-Box contains instance data, the T-Box is the subset of facts that define classes, domains, ranges for predicates, and the class hierarchy. Although T-Box information can also be used by our mining approach, we are mainly concerned with the A-Box, i.e., the set of facts relating one particular entity to another.

In the following, we assume a given KB K as input. Let R = π_relation(K) denote the set of relations contained in K, and E = π_subject(K) ∪ π_object(K) the set of entities.

Functions. A function is a relation r that has at most one object for every subject, i.e., ∀x : |{y : r(x, y)}| ≤ 1. A relation is an inverse function if each of its objects has at most one subject. Since RDF KBs are usually noisy, even relations that should be functions (such as hasBirthdate) may exhibit two objects for the same subject. Therefore, we use the notion of functionality. The functionality of a relation r is a value between 0 and 1, which is 1 if r is a function:

fun(r) := (#x : ∃y : r(x, y)) / (#(x, y) : r(x, y))

with #x : X as an abbreviation for |{x : X ∈ K}|. The inverse functionality is defined accordingly as ifun(r) := fun(r⁻¹). Without loss of generality, we assume that ∀r ∈ R : fun(r) ≥ ifun(r) (FUN-Property). If this is not the case for a relation r, we can replace all facts r(x, y) with the inverse relation, r⁻¹(y, x), which entails fun(r⁻¹) ≥ ifun(r⁻¹). For example, if the KB contains the inverse functional relation directed(person, movie), we can create the functional relation isDirectedBy(movie, person) and use only that one in the rule mining process. Manual inspection shows, however, that relations in semantic KBs tend to be more functional than inverse functional. Intuitively, this allows us to consider a fact r(x, y) as a fact about x.

Rules. An atom is a fact that can have variables at the subject and/or object position. A (Horn) rule consists of a head and a body, where the head is a single atom and the body is a set of atoms. We denote a rule with head r(x, y) and body {B1, ..., Bn} by an implication

B1 ∧ B2 ∧ ... ∧ Bn ⇒ r(x, y)

which we abbreviate as B ⇒ r(x, y). One example of such a rule is

hasChild(p, c) ∧ isCitizenOf(p, s) ⇒ isCitizenOf(c, s)

An instantiation of a rule is a copy of the rule, where all variables have been substituted by entities. A prediction of a rule is the head atom of an instantiated rule if all body atoms of the instantiated rule appear in the KB. For example, the above rule can predict isCitizenOf(Lisa, USA) if the KB knows a parent of Lisa (hasChild(Elvis, Lisa)) who is American (isCitizenOf(Elvis, USA)).

Language Bias. As most ILP systems, AMIE uses a language bias to restrict the search space. We say that two atoms in a rule are connected if they share a variable or an entity. A rule is connected if every atom is connected transitively to every other atom of the rule. AMIE mines only connected rules, i.e., it avoids constructing rules that contain unrelated atoms. We say that a rule is closed if every variable in the rule appears at least twice. Such rules do not predict merely the existence of a fact (e.g., diedIn(x, y) ⇒ ∃z : wasBornIn(x, z)), but also concrete arguments for it (e.g., diedIn(x, y) ⇒ wasBornIn(x, y)). AMIE mines only closed rules. We allow recursive rules that contain the head relation in the body.

Parallels to Association Rule Mining. Association rule mining discovers correlations in shopping transactions. Thus, association rules are different in nature from the Horn rules we mine. Still, we can show some similarities between the two approaches. Let us define one transaction for every set of n entities that are connected in the KB. For example, in Figure 1, we define a transaction for the entities Elvis, Lisa, and Priscilla, because they are connected through the facts mother(Priscilla, Lisa), father(Elvis, Lisa), and marr(Elvis, Priscilla). We label the transaction with the set of these entities. Each atom r(xi, xj) on variables indexed by 1 ≤ i, j ≤ n corresponds to an item. A transaction with label ⟨C1, ..., Cn⟩ contains an item r(xi, xj) if r(Ci, Cj) is in the KB. For example, the transaction ⟨Elvis, Lisa, Priscilla⟩ contains the items {mother(x3, x2), father(x1, x2), marr(x1, x3)}, since the ground atoms mother(Priscilla, Lisa), father(Elvis, Lisa), and marr(Elvis, Priscilla) are in the KB. In this representation, association rules are Horn rules. In the example, we can mine the association rule

{mother(x3, x2), marr(x1, x3)} ⇒ {father(x1, x2)}

which corresponds to the Horn rule

mother(x3, x2) ∧ marr(x1, x3) ⇒ father(x1, x2)

Figure 1: Mining Rules with 3 Variables (table of transaction labels and transaction items)
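The functionality definition above can be sketched directly in code. This is an illustrative helper (the names `fun` and `ifun` mirror the paper's notation; the toy KB is an assumption), using the movie-direction example to show a relation where fun(r) < ifun(r):

```python
# Functionality fun(r) = (#subjects with some object) / (#(subject, object) pairs),
# and inverse functionality ifun(r) = fun of the inverse relation.
def fun(kb, r):
    subjects = {x for (x, rel, y) in kb if rel == r}
    pairs = {(x, y) for (x, rel, y) in kb if rel == r}
    return len(subjects) / len(pairs)  # 1.0 iff r is a function

def ifun(kb, r):
    objects = {y for (x, rel, y) in kb if rel == r}
    pairs = {(x, y) for (x, rel, y) in kb if rel == r}
    return len(objects) / len(pairs)

kb = {
    ("Kubrick", "directed", "The Shining"),
    ("Kubrick", "directed", "2001: A Space Odyssey"),
    ("Lisa", "hasBirthdate", "1968-02-01"),
}

print(fun(kb, "directed"), ifun(kb, "directed"))  # 0.5 1.0
# Since fun < ifun here, the FUN-Property would lead us to mine with the
# inverse relation isDirectedBy(movie, person) instead.
```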
Constructing such a table with all possible combinations of entities is practically not viable. Apart from that, it faces a number of design issues (e.g., how to deal with transactions that contain the same entities in different orderings). Therefore, association rule mining cannot be used directly to mine Horn rules. However, we take inspiration from the parallels between the two types of mining for our system, AMIE.

MINING MODEL
Model. Let us consider a given Horn rule B ⇒ r(x, y), and let us look at all facts with relation r (Figure 2). We distinguish 4 types of facts: true facts that are known to the KB (KBtrue), true facts that are unknown to the KB (NEWtrue), facts that are known to be false in the KB (KBfalse), and facts that are false but unknown to the KB (NEWfalse). The rule will make certain predictions (blue circle). These predictions can be known to be true (A), known to be false (C), or unknown (B and D). When they are unknown to the KB, they can still be true (B) or false (D) with respect to the real world.

Figure 2: Prediction under Incompleteness

Goal. Our goal is to find rules that make true predictions that go beyond the current KB. In the figure, we wish to maximize the area B and to minimize the area D. There are two obvious challenges in our context: First, the areas NEWtrue and NEWfalse are unknown. So if we wish to maximize B at the expense of D, we are operating in an area outside our KB. We would want to use the areas KBtrue and KBfalse to estimate the unknown area. This, however, leads to the second challenge: Semantic KBs do not contain negative evidence. Thus, the area KBfalse is empty. We will now present different measures that address these challenges.

Support. The support of a rule quantifies the number of correct predictions, i.e., the size of A. There are several ways to define the support: It can be the number of instantiations of the rule that appear in the KB. This is what our analogy to association rule mining suggests. This measure, however, is not monotonic if we add atoms to the body. Consider, for example, the rule

marriedTo(x, y) ⇒ marriedTo(y, x)

If we add hasGender(x, male) to the body, the number of instantiations that are in the KB decreases. If we add an atom with a fresh variable, e.g., hasFriend(x, z), to the body, the number of instantiations increases for every friend of x. This is true even if we add another atom with z to make the rule closed. Alternatively, we can count the number of facts in one particular body atom. This definition, however, depends on the choice of the body atom, so that the same rule can have different supports. We can also count the number of facts of the head atom. This measure decreases monotonically if more body atoms are added and avoids equivalent rules with different support values. With this in mind, we define the support of a rule as the number of distinct pairs of subjects and objects in the head of all instantiations that appear in the KB:

supp(B ⇒ r(x, y)) := #(x, y) : ∃z1, ..., zm : B ∧ r(x, y)

where z1, ..., zm are the variables of the rule apart from x and y.

Head Coverage. Support is an absolute number. This means that a user who thresholds on support has to know the absolute size of the KB to give meaningful values. To avoid this, we also define a proportional version of support. A naive way would be to use the absolute number of support, as defined in the previous paragraph, over the size of the KB. In this case, however, relations that do not have many facts (either because of the incompleteness of the KB or because of their nature) will not be considered in the head of rules, i.e., we will not learn rules predicting such relations. Therefore, we propose the notion of head coverage. This is the proportion of pairs from the head relation that are covered by the predictions of the rule:

hc(B ⇒ r(x, y)) := supp(B ⇒ r(x, y)) / (#(x′, y′) : r(x′, y′))

Negative Examples. The central challenge of our setting is to provide counter-examples for the rule mining. These can take the role of KBfalse, so that we can estimate the areas NEWtrue and NEWfalse. There are several approaches to this problem: the standard confidence, the standard positive-only learning evaluation score of ILP, and our new partial completeness assumption.

Standard Confidence. The standard confidence measure takes all facts that are not in the KB (i.e., NEWtrue and NEWfalse) as negative evidence. Thus, the standard confidence of a rule is the ratio of its predictions that are in the KB, i.e., the share of A in the set of predictions:

conf(B ⇒ r(x, y)) := supp(B ⇒ r(x, y)) / (#(x, y) : ∃z1, ..., zm : B)

The standard confidence is blind to the distinction between "false" and "unknown". Thus, it implements a closed world setting. It mainly describes the known data and penalizes rules that make a large number of predictions in the unknown region. We, in contrast, aim to maximize the number of true predictions that go beyond the current knowledge. We do not want to describe data, but to predict data.

Positive-Only Learning. For cases where the KB does not contain negative examples, Muggleton has developed a positive-only learning evaluation score for ILP. It takes random facts as negative evidence:

Score = log(P) − log((R + 1) / (Rsize + 2)) − L/P

Here, P is the number of known true facts covered (A in the figure), R is the number of randoms covered, Rsize is the total number of randoms, and L is the number of atoms in the hypothesis. The intuition is that a good rule should cover many positive examples, and few or no randomly generated examples. This ensures that the rule is not overly general. Furthermore, the rule should use as few atoms as possible, and thus achieve a high compression. This measure is implemented (among others) in the ALEPH system.
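The measures defined above can be illustrated on a toy KB with the citizenship rule from the preliminaries. This is a sketch under assumed data (the helper names `predictions`, `supp`, `hc`, and `conf` are not from the paper's code); Deborah's citizenship is deliberately left out of the KB:

```python
# Toy KB for hasChild(p, c) ∧ isCitizenOf(p, s) ⇒ isCitizenOf(c, s).
kb = {
    ("Elvis", "hasChild", "Lisa"),
    ("Elvis", "hasChild", "Deborah"),
    ("Elvis", "isCitizenOf", "USA"),
    ("Lisa", "isCitizenOf", "USA"),
    ("Priscilla", "isCitizenOf", "USA"),
}

def predictions(kb):
    """(c, s) pairs predicted by the rule body."""
    return {(c, s)
            for (p, r1, c) in kb if r1 == "hasChild"
            for (p2, r2, s) in kb if r2 == "isCitizenOf" and p2 == p}

# Support: predicted pairs whose head fact is in the KB (area A).
supp = len({(c, s) for (c, s) in predictions(kb)
            if (c, "isCitizenOf", s) in kb})
head_facts = {(x, y) for (x, r, y) in kb if r == "isCitizenOf"}

hc = supp / len(head_facts)          # head coverage: 1/3
conf = supp / len(predictions(kb))   # standard confidence: 1/2
```

The prediction isCitizenOf(Deborah, USA) is merely unknown, yet standard confidence counts it as a counter-example; this is the penalty on the unknown region that the paper's PCA confidence is designed to avoid.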
Partial Completeness. We propose to generate negative evidence by the partial completeness assumption (PCA). This is the assumption that if r(x, y) ∈ KBtrue for some x and y, then

∀y′ : r(x, y′) ∈ KBtrue ∪ NEWtrue ⇒ r(x, y′) ∈ KBtrue

In other words, we assume that if the database knows some r-attribute of x, then it knows all r-attributes of x. This assumption is certainly true for functional relations r, such as birth dates, capitals, etc. Thanks to the FUN-Property (see the preliminaries), it is also true for inverse functional relations, such as owns, created, etc. The assumption is also true in the vast majority of cases for relations that are not functional, but that have a high functionality. Even for other relations, the PCA is still reasonable for knowledge bases that have been extracted from a single source (such as DBpedia and YAGO). These usually contain either all r-values or none for a given entity.

PCA Confidence. Under the PCA, we normalize the confidence not by the entire set of facts, but by the set of facts of which we know that they are true, together with the facts of which we assume that they are false. If the head atom of the rule is r(x, y), then this set is just the set of facts {r(x, y′) : r(x, y′) ∈ K}. Thanks to the FUN-Property, the PCA is always applied to the first argument of the head atom:

confPCA(B ⇒ r(x, y)) := supp(B ⇒ r(x, y)) / (#(x, y) : ∃z1, ..., zm, y′ : B ∧ r(x, y′))

We show in our experiments that the PCA confidence identifies much more productive rules than the other measures.

MINING ALGORITHM
After having outlined the basic definitions and the mining model, we now outline the core algorithm of our framework and its implementation.

Goal. Our goal is to mine rules of the form defined in the preliminaries. One of the main problems of any mining approach is to find an efficient way to explore the search space. The naive algorithm of enumerating all possible rules is infeasible for large KBs. Hence, we explore the search space by iteratively extending rules by mining operators.

Mining Operators. We see a rule as a sequence of atoms. The first atom is the head atom and the others are the body atoms. In the process of traversing the search space, we can extend a rule by using one of the following operators:

1. Add Dangling Atom (OD): This operator adds a new atom to a rule. The new atom uses a fresh variable for one of its two arguments. The other argument (variable or entity) is shared with the rule, i.e., it occurs in some other atom of the rule.

2. Add Instantiated Atom (OI): This operator adds a new atom to a rule that uses an entity for one argument and shares the other argument (variable or entity) with the rule.

3. Add Closing Atom (OC): This operator adds a new atom to a rule so that both of its arguments are shared with the rule.

By repeated application of these operators, we can generate the entire space of rules defined by our language bias. The operators generate even more rules than those we are interested in, because they also produce rules that are not closed. An alternative set of operators could consist of OD and an operator for instantiation. But these operators would not be monotonic, in the sense that an atom generated by one operator can be modified in the next step by the other operator. Therefore, we chose the above 3 operators as a canonic set.

Algorithm. We mine rules with Algorithm 1. The algorithm maintains a queue of rules, which initially just contains the empty rule. The algorithm iteratively dequeues a rule from the queue. If the rule is closed and not pruned for output, the rule is output; otherwise, it is not. Then, the algorithm applies all operators to the rule and adds the resulting rules to the queue (unless they are pruned out, see below). This process is repeated until the queue is empty. We parallelize this process by maintaining a centralized queue, from which the threads dequeue and enqueue. We do not feed predictions of the rules back into the KB. All measures (such as confidence and support) are always computed on the original KB.

Algorithm 1 Rule Mining
function AMIE(KB K)
  q := ⟨empty rule⟩
  Execute in parallel:
  while ¬q.isEmpty() do
    r := q.dequeue()
    if r is closed ∧ r is not pruned for output then
      output(r)
    for all operators o do
      for all rules r′ ∈ o(r) do
        if r′ is not pruned then
          q.enqueue(r′)

Pruning. If executed naively, our algorithm will have prohibitively high run-times. The instantiation operator OI, in particular, generates atoms in the order of |R| × |E|. We first observe that we are usually not interested in rules that cover only very few facts of the head relation. Rules that cover, for example, less than 1% of the facts of the head relation can safely be assumed to be marginal. Therefore, we set θ = 0.01 as a lower bound for the head coverage. We observe that head coverage decreases monotonically as we add more atoms. This allows us to safely discard any rule that trespasses the threshold.

The monotonicity of head coverage gives us another opportunity to prune: If a rule B1 ∧ ... ∧ Bn ∧ Bn+1 ⇒ H does not have larger confidence than the rule B1 ∧ ... ∧ Bn ⇒ H, then we do not output the longer rule. This is because both the confidence and the head coverage of the longer rule are necessarily dominated by the shorter rule. This way, we can reduce the number of produced rules.

Last, we never enqueue a rule that is already in the queue. It is expensive to check two rules for equality. However, it is easy to compute measures such as head coverage, confidence, and PCA confidence for each rule. Two rules can only be equal if they have the same values for these measures. This restricts the rules that have to be checked. If a rule is a duplicate, we do not enqueue it. We can be sure that any potential duplicates will still be in the queue. This is because the length of the rules increases monotonically: When we dequeue a rule with n atoms, no rule with n + 1 atoms has ever been dequeued. Thus, when we apply the operators to the rule with n atoms and generate a rule with n + 1 atoms, any potential duplicate of that new rule must be in the queue.
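The effect of the PCA on the confidence normalization can be sketched on the citizenship rule. This is an illustrative computation on an assumed toy KB, not the authors' implementation; under the PCA, a predicted pair (c, s) only counts in the denominator if the KB knows some citizenship for c:

```python
# Toy KB: Deborah's citizenship is unknown to the KB.
kb = {
    ("Elvis", "hasChild", "Lisa"),
    ("Elvis", "hasChild", "Deborah"),
    ("Elvis", "isCitizenOf", "USA"),
    ("Lisa", "isCitizenOf", "USA"),
}

def body_pairs(kb):
    """(c, s) pairs satisfying hasChild(p, c) ∧ isCitizenOf(p, s)."""
    return {(c, s)
            for (p, r1, c) in kb if r1 == "hasChild"
            for (p2, r2, s) in kb if r2 == "isCitizenOf" and p2 == p}

pairs = body_pairs(kb)                      # Lisa and Deborah, both with USA
supp = sum((c, "isCitizenOf", s) in kb for (c, s) in pairs)

std_conf = supp / len(pairs)                # 0.5: Deborah counts as a counter-example
pca_pairs = {(c, s) for (c, s) in pairs
             if any(r == "isCitizenOf" and x == c for (x, r, y) in kb)}
pca_conf = supp / len(pca_pairs)            # 1.0: Deborah's citizenship is unknown,
                                            # so the PCA generates no counter-example
```

The same rule thus scores 0.5 under the closed-world reading but 1.0 under the PCA, which matches the intuition that the prediction for Deborah is plausible rather than wrong.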
require a nested query. Our experience shows that already Projection Queries. No matter what operator is applied running the non-nested query on a database of a few mil- in particular, the algorithm needs to choose a relation for lion facts can easily take several minutes on an off-the-shelf the new atom that is added to the rule. In addition, the RDBMS. Hence, efficient SPARQL engines such as RDF- instantiation operator OI also allows the choice of an entity.
3X are an alternative option. In SPARQL 1.1, the pro- In order to select only relations and entities that will fulfill jection query template is: the head coverage constraint, we rely on the KB to answerprojection queries. These are queries of the form SELECT ?x WHERE H ∧ B1 ∧ . ∧ Bn H.x1, H.xr, H.x2 .
HAVING COUNT(H)≥ k B1.x1, B1.xr, B1.x2 .
. .
where B1, ., Bn are atoms and k is a natural number. H is the projection atom on which we project. ?x is the selection n.x1, Bn.xr , Bn.x2 .
We call ?x the selection variable. It is a variable that appears in one or more atoms at the position of one of the arguments or at the position of the relation (as it is common in SPARQL). Such queries select an entity or relation x such that the result of the query H ∧ B1 ∧ ... ∧ Bn on the KB contains more than k distinct query answers for H.

Using Projection Queries. Projection queries allow us to select the relationship for the operators OD, OI, and OC in such a way that the head coverage of the resulting rule is above θ. This works by firing a projection query of the form

  SELECT ?r WHERE H ∧ B1 ∧ ... ∧ Bn ∧ ?r(X, Y) HAVING COUNT(H) ≥ k

where X and Y are variables or constants, depending on the type of atoms that the operator generates. The results for ?r will be the relations that, once bound in the query, ensure that the support of the rule B1 ∧ ... ∧ Bn ∧ ?r(X, Y) ⇒ H is greater than k. If we choose k equal to θ times the number of facts of the relation of H, then the head coverage of the resulting rules will be greater than θ – which is what we want. For the instantiation operator OI, we first fire a projection query to select relations, and then fire projection queries to retrieve entities. This way, projection queries allow us to choose the relationships and entities for the operators in such a way that the head coverage for the new rules is guaranteed to be above θ. Next, we discuss how to implement projection queries efficiently.

SQL and SPARQL. Projection queries are essential for the efficiency of our system. Yet, standard database implementations do not provide special support for these types of queries. Assuming that the KB K is stored as a three-column table (i.e., each fact is a row with three elements), the projection query template in SQL would join one instance of this table per atom, with join conditions of the form H.xi = Bj.xm, and end in

  GROUP BY ?x HAVING COUNT(*) ≥ k

Again, this is only for the case where ?x appears in H. RDF-3X does not support aggregate functions in this way. Thus, we would need extensive postprocessing of query results to compute a projection query – already in the case where ?x appears in H.

In-Memory Database. We have implemented a vanilla in-memory database for semantic KBs. Our implementation indexes the facts aggressively with one index for each permutation of subject, relation, and object. Each index is a map from the first item to a map from the second item to a set of third items (e.g., a map from relations to a map from subjects to a set of objects). This allows retrieving the instantiations of a single atom in constant time. The existence of a query answer can be checked naively by selecting the atom with the fewest instantiations, running through all of its instantiations, instantiating the remaining atoms accordingly, and repeating this process recursively until we find an instantiation of the query that appears in the KB. Select queries are similar.

Projection Queries. Algorithm 2 shows how we answer projection queries. The algorithm takes as input a selection variable ?x, a projection atom H = R(X, Y), remaining atoms B1, ..., Bn, a constant k, and a KB K. We first check whether ?x appears in the projection atom. If that is the case, we run through all instantiations of the projection atom, instantiate the query accordingly, and check for existence. Each existing instantiation increases the counter for the respective value of ?x. We return all values whose counter exceeds k. If the selection variable does not appear in the projection atom, we iterate through all instantiations of the projection atom. We instantiate the query accordingly, and fire a SELECT query for ?x. We increase the counter for each value of ?x. We report all values whose counter exceeds k.

Summary. We have identified projection queries as the crucial type of queries for rule mining. Since standard database systems and standard SPARQL systems provide no specifically tuned support for these queries, we have implemented a vanilla in-memory database, which has specific support for projection queries. Our entire implementation is in Java.
Algorithm 2 Answering Projection Queries

  SELECT(?x, R(X, Y) ∧ B1 ∧ ... ∧ Bn, k, K)
    map := {}
    if ?x ≡ R ∨ X ≡ ?x ∨ Y ≡ ?x then
      for all instantiations r(x, y) of R(X, Y) ∈ K do
        q := B1 ∧ ... ∧ Bn, instantiated accordingly
        if exists instantiation q ∈ K then
          map(value of ?x)++
    else
      for all instantiations r(x, y) of R(X, Y) ∈ K do
        q := B1 ∧ ... ∧ Bn, instantiated accordingly
        for all x ∈ SELECT ?x FROM K WHERE q do
          map(x)++
    return {x : map(x) ≥ k}

Evaluations. In all experiments, our goal is twofold: First, we want to produce as many predictions as possible beyond the current KB. Second, the percentage of correct predictions shall be as large as possible. The particular challenge is that we want to evaluate predictions that go beyond the current KB. We are not interested in describing the existing data, but in generating new data. Therefore, we proceed as follows: We run the systems on an older dataset (YAGO2). We generate all predictions, i.e., the head atoms of the instantiated rules. We remove all predictions that are in the old KB. Then we compare the remaining predicted facts to the successor of that dataset (YAGO2s). A prediction is "correct" if it appears in the newer KB. A prediction is "incorrect" if it has a highly functional or highly inverse functional relation and contradicts an existing fact in the newer KB, e.g., a different birth place. For all other predictions, we manually validated the facts by checking a sample of 30 of them against Wikipedia pages. This classifies the remaining predictions as "correct" or "incorrect" – except for a few cases where the fact is "unknown", such as the death place of a person that is still alive. The ratio of correct predictions out of the correct and incorrect predictions yields the precision of the rule.
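Algorithm 2 can be sketched in Python as follows. This is illustrative only: the paper's implementation is in Java, and the helper names below are ours. Facts and atoms are represented as (subject, relation, object) triples, with variables marked by a leading "?".

```python
from collections import Counter

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def match(atom, fact, binding):
    """Try to extend `binding` so that `atom` equals `fact`; None on failure."""
    b = dict(binding)
    for t, v in zip(atom, fact):
        t = b.get(t, t)          # apply the current binding first
        if is_var(t):
            b[t] = v
        elif t != v:
            return None
    return b

def solve(kb, atoms, binding):
    """Yield every binding that satisfies the conjunction `atoms` over `kb`."""
    if not atoms:
        yield binding
        return
    for fact in kb:
        b = match(atoms[0], fact, binding)
        if b is not None:
            yield from solve(kb, atoms[1:], b)

def select(x, proj_atom, body, k, kb):
    """Projection query: values of `x` counted at least k times (Algorithm 2)."""
    counts = Counter()
    if x in proj_atom:
        # Case 1: ?x occurs in the projection atom. Count one hit per
        # instantiation of the projection atom whose induced query has at
        # least one answer in the KB (an existence check suffices).
        for fact in kb:
            b = match(proj_atom, fact, {})
            if b is not None and next(solve(kb, body, b), None) is not None:
                counts[b[x]] += 1
    else:
        # Case 2: ?x occurs only in the body. For each instantiation of the
        # projection atom, count every distinct value of ?x returned by the
        # remaining query.
        for fact in kb:
            b = match(proj_atom, fact, {})
            if b is not None:
                for val in {ans[x] for ans in solve(kb, body, b)}:
                    counts[val] += 1
    return {v for v, c in counts.items() if c >= k}
```

For example, the refinement operators would ask for all relations ?r such that adding the atom ?r(X, Y) keeps the support of the rule above k, by making ?r the selection variable.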
Outlook. We note that with the project of predicting beyond current knowledge, we are entering a new and very risky area of research. We do not expect Horn rules to have extraordinary precisions in the unknown region. Rules can only yield hypotheses about possible facts.

Experiments. We conducted 3 groups of experiments: In the first group, we compare AMIE to two popular, state-of-the-art systems that are publicly available, WARMR and ALEPH. In the second group of experiments, we compare the standard confidence to the novel PCA confidence that we have introduced in this paper. In the third group of experiments, we run AMIE on different datasets to show the applicability of the system.

Settings. By default, AMIE finds all rules whose head coverage exceeds the default threshold of θ = 1%. AMIE ranks the resulting rules by decreasing PCA confidence. There is no need to deviate from this default configuration when a user runs AMIE. There are no parameters to tune. All experiments with AMIE on all datasets are run in this setting, unless otherwise mentioned.

For some experiments, we want to compare AMIE's runtime with other systems. To have an equal basis, we make AMIE simulate the metrics of the competitor systems. AMIE can threshold on support, head coverage, confidence, and PCA confidence, and it can rank by any of these. AMIE can also count the support not on two variables, but on a single variable. AMIE can also output non-closed rules. Since this is just a choice of what to output, it does not influence the runtime. All experiments with all systems are run on a server with 48 GB RAM and 8 CPUs. We always mine rules without constants (i.e., without the instantiation operator), unless otherwise mentioned.

Knowledge Bases. We run our experiments on different KBs. In all cases, we removed the rdf:type relationship, because it inflates the size of the KBs. We are aware that the rdf:type relationship can be very helpful for rule mining. However, currently no approach (including ours) makes specific use of it. We plan to make use of it in future work. Furthermore, we removed all facts with literals (numbers and strings) from the KBs. Literal values (such as geographical coordinates) are shared by only very few entities, which makes them less interesting for rule mining.

AMIE vs. WARMR and ALEPH

In this section, we compare AMIE to WARMR and ALEPH. For each system, we conduct 3 experiments: We first compare the usability of the competitor system to AMIE. Then, we compare their runtimes. Last, we compare their outputs.

Usability. WARMR is a system that unifies ILP and association rule mining. Similar to APRIORI algorithms, it performs a breadth-first search in order to find frequent patterns. WARMR generates Datalog queries of the form "?− A1, A2, ..., An", where the Ai are logical atoms. To discover frequent patterns (as in association rule mining), we need to have a notion of frequency. Given that WARMR considers queries as patterns and that queries can have variables, it is not immediately obvious what the frequency of a given query is. Therefore, the user needs to specify the predicate that is being counted by the system (the key predicate). In the usual scenario of market basket analysis, e.g., the system counts customer transactions. In a scenario in which the database is a KB, one solution is to count entities. Since the key predicate determines what is counted, it is necessary that it is contained in all queries. Therefore, we add a predicate entity(x), which we fill with all entities of the KB. AMIE does not require such a choice. For WARMR, the user also needs to provide specific information about which predicates can be added to a query, which of their variables can be fresh, and which arguments of predicates are allowed to be unified (type declarations). In contrast, AMIE requires none of these. AMIE simply takes as input the KB in triple format.
WARMR is also able to mine rules with constants. The user can define which predicates and arguments should be instantiated with constants (we call this mode MODE1). WARMR then checks all the constants appearing in the facts of that specific predicate and argument and afterwards uses them in the queries. MODE1 naturally entails an increase of the branching factor in the search space and an explosion in the number of candidates that need to be evaluated. Alternatively, WARMR allows the user to set a maximum number of constants to be used for each predicate and argument (MODE2). Unfortunately, though, it does not provide a way for the user to influence the selection of these constants. In other words, there is no guarantee that the constants that WARMR will use are the most promising ones.

WARMR produces rules as output. These rules are not necessarily connected. For example, WARMR mines

  isMarriedTo(B, C) ∧ isLeaderOf(A, D) ⇒ hasAcademicAdvisor(C, E)

This rule is not only nonsensical from a semantic perspective, but also redundant, because the second atom does not influence the implication. Therefore, the user has to filter out these rules from the output. Thus, we conclude that the broader mission and the broader applicability of WARMR entails that much more configuration, acquaintance, and expert knowledge is needed to make it mine Horn rules on semantic KBs.

Runtime. YAGO2 contains around 940K facts about 470K entities. WARMR was not able to terminate on this data in a time period of 1 day. Therefore, we created a sample of YAGO2. Randomly selecting a number of facts from the initial dataset could break the interesting links between the entities. Therefore, we randomly selected 10,000 seed entities and included their 3-hop neighborhood. This yielded 14K entities and 47K facts. This sample contains all available information in a radius of 3 hops around the seed entities, but much less information about the entities at the periphery of the subgraph. Therefore, we restricted the values for the key predicate to the seed entities only. Since the sample is much smaller than the original KB, we lowered the support threshold to 5 entities. We ran AMIE with these parameters on the sample. AMIE mined its rules in 3.90 seconds. WARMR, in contrast, took 18 hours. We also ran both systems allowing them to mine rules with constants. AMIE completed the task in 1.53 minutes. WARMR did not terminate in 3 days in MODE1 for all relations. Therefore, we ran it only for the relations diedIn, livesIn, wasBornIn, for which it took 48h. We also ran WARMR in MODE2. To have reasonable runtimes, we allowed WARMR to find constants only for one predicate (diedIn). We also restricted it to find only 20 constants. WARMR ran 19 hours. Table summarizes the runtime results. We conclude that AMIE is better suited for large KBs than WARMR. This is because WARMR is an ILP algorithm written in a logic programming environment, which makes the evaluation of all candidate queries inefficient.

Table: Runtimes on YAGO2 Sample

Results. After filtering out non-connected rules, WARMR mined 41 closed rules. AMIE, in contrast, mined 207 closed rules, which included the ones mined by WARMR. We checked back with the WARMR team and learned that for a given set of atoms B1, ..., Bn, WARMR will mine only one rule, picking one of the atoms as head atom (e.g., B1 ∧ ... ∧ Bn−1 ⇒ Bn). AMIE, in contrast, will mine one rule for each possible choice of head atom (as long as the thresholds are met). In other words, AMIE with the standard support and confidence measures simulates WARMR, but mines more rules. Furthermore, it runs orders of magnitude faster. Thus, for large datasets for which the user would have needed to use complicated sampling schemes in order to use WARMR, AMIE can be a very attractive alternative. Even for smaller datasets with rules with constants, AMIE can provide results while WARMR cannot. Moreover, AMIE comes with metrics that go beyond the standard confidence and the standard support. We will show later that these improve the quality of the results.

Usability. ALEPH can be run with different commands that influence the search strategy. We chose the induce command, which runs fastest. For running ALEPH, the user has to specify the target predicate for learning (the head predicate of the rules). In the following, we ran ALEPH successively with all predicates of the KB as targets. In addition, the user has to specify a series of type and mode declarations (similar to WARMR), which will be used as a language bias in order to restrict the search space. In addition, the user needs to provide ALEPH with files containing the background knowledge and positive examples for the target predicate. In contrast, AMIE requires no such input. It will run on a KB without any prespecified choices of predicates.

Runtime. We ran AMIE and ALEPH on YAGO2. For ALEPH, we used the positive-only evaluation function with Rsize = 50, and we considered only clauses that were able to explain at least 2 positive examples, so that we would not get grounded facts as rules in the output. For a fair comparison, we also instructed AMIE to run with a support threshold of 2 facts. AMIE terminated in 3.62 minutes and found rules for all relations. ALEPH ran for one head relation at a time. For some relations (e.g., isPoliticianOf), it terminated in a few seconds. For others, however, we had to abort the system after 1 day without results. For each relation, ALEPH treats one positive example at a time. Some examples need little processing time, others block the system for hours. We could not figure out a way to choose examples in such a way that ALEPH runs faster. Hence, we used the sample of YAGO2 that we created for WARMR. Again, runtimes varied widely between relations. Some relations ran in a few seconds, others did not terminate in a day. The runtimes with constants are similarly heterogeneous, with at least 7 relations not terminating in 1 day.

Table: Runtimes ALEPH vs. AMIE (4.96s to > 1 day; 0.05s to > 1 day)

Table: Runtimes of ALEPH on YAGO2 (relations grouped by runtime: isPoliticianOf, hasCapital, hasCurrency; dealsWith, hasOfficialLanguage, imports; isMarriedTo, livesIn, worksAt, isLocatedIn)

Table: Runtimes of ALEPH on YAGO2 Sample (relations grouped by runtime: diedIn, directed, hasAcademicAdvisor; graduatedFrom, isPoliticianOf, playsFor; wasBornIn, worksAt, isLeaderOf; exports, livesIn, isCitizenOf; actedIn, produced, hasChild, isMarriedTo)

Results. We compared the output of ALEPH on the head relations for which it terminated to the output of AMIE on these head relations, on the sample dataset. ALEPH mined 56 rules, while AMIE mined 335 rules. We sorted the rules by decreasing score (ALEPH) and decreasing PCA confidence (AMIE). Table shows the number of predictions and their total precision, as described above. We show the aggregated values at the points where both approaches have produced around 3K, 5K and 8K predictions. AMIE's PCA confidence succeeds in sorting the rules roughly by descending precision, so that the initial rules have an extraordinary precision compared to ALEPH's. AMIE needs more rules to produce the same number of predictions as ALEPH (but it also mines more). ALEPH's positives-only evaluation function manages to filter out overly general rules only to some extent. ALEPH will mine, e.g., livesIn(A, C) ∧ isLocatedIn(C, B) ⇒ isPoliticianOf(A, B). The problem is that ALEPH generates counterexamples by randomly using valid constants for the variables A and B. This means that the probability of creating a random example in which B is the place of residence of the specific person A is very low.

Table: Top Rules of ALEPH vs. AMIE

AMIE with Different Metrics

In this section, we compare the standard confidence measure to the PCA confidence measure. We ran AMIE with the default head coverage threshold on the YAGO2 dataset. It contains nearly 500K entities and 948K facts. We sort the rules first by descending PCA confidence, and then by descending standard confidence, and look at the top rules. For each rule, we evaluated the predictions beyond YAGO2 as described above. Figure uses aggregated predictions and aggregated precision to illustrate the results. The n-th dot from the left tells us the total number of predictions and the total precision of these predictions, aggregated over the first n rules. As we see, ranking the rules by standard confidence is a very conservative approach: It identifies rules with reasonable precision, but these do not produce many predictions. Going down in the list of ranked rules, the rules produce more predictions – but at lower precision. The top 30 rules produce 113K predictions at an aggregated precision of 32%. If we rank the rules by PCA confidence, in contrast, we quickly get large numbers of predictions. The top 10 rules already produce 135K predictions – at a precision of 39%. The top 30 rules produce 3 times more predictions than the top 30 rules by standard confidence – at comparable precision. This is because the PCA confidence is less conservative than the standard confidence.

Figure: Std. Confidence vs. PCA Confidence (aggregated precision over aggregated predictions beyond the initial KB, for PCA confidence rules 1-30, standard confidence rules 1-30, and standard confidence rules 31-46)

Discussion. The precision of the rules is in the range of 30%-40%. Thus, only a third of the predictions in the unknown region will be correct. Still, even imperfect rules can be useful: If, e.g., a human checks the facts before they are added, then reducing the number of false predictions is a great advantage. If games with a purpose are employed, then the rules can help pre-select candidate facts. If multiple sources are combined, then rules can contribute evidence. If reasoning approaches are used, then rules can be taken into consideration according to their estimated performance. Finally, the precision is better if standard confidence is used.

Predicting Precision. The confidence measures can serve to estimate the actual precision of a rule. In Table, we rank the mined rules by their precision and report the average absolute error of the standard and PCA confidence, weighted by the number of predictions produced by the rules. We can observe that, on average, the PCA confidence estimates the precision of the rules better than the normal confidence. Thus, reasoning approaches can use the PCA confidence as a weight for the rule.

Table: Average Absolute Error to Precision

We also note that our rules are insightful. Table shows some of the rules we mined. Being able to mine reasonable rules on semantic KBs of this size is an achievement beyond the current state of the art.

  isMarriedTo(x, y) ∧ livesIn(x, z) ⇒ livesIn(y, z)
  isCitizenOf(x, y) ⇒ livesIn(x, y)
  hasAdvisor(x, y) ∧ graduatedFrom(x, z) ⇒ worksAt(y, z)
  wasBornIn(x, y) ∧ isLocatedIn(y, z) ⇒ isCitizenOf(x, z)
  hasWonPrize(x, G. W. Leibniz) ⇒ livesIn(x, Germany)
  hasWonPrize(x, Grammy) ⇒ hasMusicalRole(x, Guitar)

Table: Some Rules by AMIE

AMIE on Different Datasets

As a proof of concept, we ran AMIE on YAGO2, YAGO2 with constants, and DBpedia. We chose an older version of DBpedia (2.0), so that we can evaluate the output against a newer version of DBpedia (3.8). Due to the large number of relations in DBpedia 2.0, there is an enormous number of rules to be found. We show the time taken to mine rules with 2 atoms. We also provide the number of predicted facts that are in the newer version of the KB but not in the old one (hits). As Table shows, AMIE can produce rules with or without constants in a reasonable time.
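For illustration, the measures used throughout these experiments can be sketched for the simple case of a rule with a single body atom of the form B(x, y) ⇒ r(x, y). This is a Python sketch with our own helper names (the system itself is written in Java); the KB is a set of (subject, relation, object) facts. The standard confidence divides the support by all body instantiations, whereas the PCA confidence divides it only by those instantiations (x, y) for which the KB knows some object r(x, y').

```python
def measures(kb, body_rel, head_rel):
    """Support, head coverage, standard confidence, and PCA confidence for
    the rule body_rel(x, y) => head_rel(x, y) (one body atom, shared
    variables), over a KB given as a collection of (s, r, o) facts."""
    body = {(s, o) for s, r, o in kb if r == body_rel}
    head = {(s, o) for s, r, o in kb if r == head_rel}
    support = len(body & head)           # predictions confirmed by the KB
    head_coverage = support / len(head)  # support / size of head relation
    std_conf = support / len(body)       # every unknown prediction counts
                                         # as a counterexample
    known_x = {s for s, _ in head}       # x with some head_rel(x, y') known
    pca_body = [p for p in body if p[0] in known_x]
    pca_conf = support / len(pca_body)   # partial-completeness assumption:
                                         # only x with known objects count
    return support, head_coverage, std_conf, pca_conf
```

For instance, for the rule isCitizenOf(x, y) ⇒ livesIn(x, y), a person with a known citizenship but no known place of residence lowers the standard confidence but not the PCA confidence, which is exactly the behavior motivated in the comparison above.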
Table: AMIE on Different Datasets

All rules and results are available online.

Conclusion

In this paper, we have presented an approach for mining Horn rules on large RDF knowledge bases. We have introduced a formal model for rule mining under the Open World Assumption, a novel measure to simulate counterexamples, and a scalable algorithm for the mining. In contrast to state-of-the-art approaches, our system (AMIE) requires no input other than the KB, and does not need configurations or parameter tuning. As our extensive experiments have shown, AMIE runs on millions of facts in only a few minutes and outperforms state-of-the-art approaches not only in terms of runtime, but also in terms of the number and quality of the output rules. Our confidence measure can reasonably predict the precision of the rules. In our future work, we plan to consider also the T-Box of the KB in order to produce more precise rules. We also aim to explore the synergies when several rules predict the same fact, and to extend the set of rules beyond Horn rules, so that even more complex facts and hidden knowledge can be predicted.

References

[1] Z. Abedjan, J. Lorey, and F. Naumann. Reconciling ontologies and the web of data. In CIKM, 2012.
[2] H. Adé, L. De Raedt, and M. Bruynooghe. Declarative bias for specific-to-general ILP systems. Machine Learning, 20, 1995.
[3] R. Agrawal, T. Imieliński, and A. Swami. Mining association rules between sets of items in large databases. In SIGMOD, 1993.
[4] R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A. I. Verkamo. Fast discovery of association rules. In Advances in Knowledge Discovery and Data Mining, 1996.
[5] S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. G. Ives. DBpedia: A nucleus for a Web of open data. In ISWC, 2007.
[6] Y. Chi, R. R. Muntz, S. Nijssen, and J. N. Kok. Frequent Subtree Mining – An Overview. Fundam. Inform., 66(1-2).
[7] P. Cimiano, A. Hotho, and S. Staab. Comparing Conceptual, Divisive and Agglomerative Clustering for Learning Taxonomies from Text. In ECAI, 2004.
[8] C. d'Amato, V. Bryl, and L. Serafini. Data-driven logical reasoning. In URSW, 2012.
[9] C. d'Amato, N. Fanizzi, and F. Esposito. Inductive learning for the Semantic Web: What does it buy? Semantic Web, 1(1-2), 2010.
[10] J. David, F. Guillet, and H. Briand. Association Rule Ontology Matching Approach. Int. J. Semantic Web Inf. Syst., 3(2), 2007.
[11] L. Dehaspe and H. Toivonen. Discovery of relational association rules. In Relational Data Mining. Springer-Verlag New York, 2000.
[12] L. Dehaspe and H. Toivonen. Discovery of frequent DATALOG patterns. Data Min. Knowl. Discov., 3(1), 1999.
[13] B. Goethals and J. Van den Bussche. Relational Association Rules: Getting WARMER. In Pattern Detection and Discovery, volume 2447. Springer, 2002.
[14] G. A. Grimnes, P. Edwards, and A. D. Preece. Learning Meta-descriptions of the FOAF Network. In ISWC, 2004.
[15] S. Hellmann, J. Lehmann, and S. Auer. Learning of OWL Class Descriptions on Very Large Knowledge Bases. Int. J. Semantic Web Inf. Syst., 5(2), 2009.
[16] J. Hoffart, F. M. Suchanek, K. Berberich, and G. Weikum. YAGO2: a spatially and temporally enhanced knowledge base from Wikipedia. Artificial Intelligence Journal, 2013.
[17] J. Jozefowska, A. Lawrynowicz, and T. Lukaszewski. The role of semantics in mining frequent patterns from knowledge bases in description logics with rules. Theory Pract. Log. Program., 10(3), 2010.
[18] M. Kuramochi and G. Karypis. Frequent Subgraph Discovery. In ICDM. IEEE Computer Society, 2001.
[19] J. Lehmann. DL-Learner: Learning Concepts in Description Logics. Journal of Machine Learning Research (JMLR), 10, 2009.
[20] F. A. Lisi. Building rules on top of ontologies for the semantic web with inductive logic programming. TPLP.
[21] A. Maedche and V. Zacharias. Clustering Ontology-Based Metadata in the Semantic Web. In PKDD, 2002.
[22] T. Mamer, C. Bryant, and J. McCall. L-modified ILP evaluation functions for positive-only biological grammar learning. In Inductive Logic Programming, number 5194 in LNAI. Springer-Verlag, 2008.
[23] C. Matuszek, J. Cabral, M. Witbrock, and J. Deoliveira. An introduction to the syntax and content of Cyc. In AAAI Spring Symposium, 2006.
[24] D. L. McGuinness, R. Fikes, J. Rice, and S. Wilder. An Environment for Merging and Testing Large Ontologies.
[25] S. Muggleton. Inverse entailment and Progol. New Generation Computing, 13(3-4), 1995.
[26] S. Muggleton. Learning from positive data. In ILP.
[27] N. Nakashole, M. Sozio, F. Suchanek, and M. Theobald. Query-time reasoning in uncertain RDF knowledge bases with soft and hard rules. In Workshop on Very Large Data Search (VLDS) at VLDB, 2012.
[28] V. Nebot and R. Berlanga. Finding association rules in semantic web data. Knowl.-Based Syst., 25(1), 2012.
[29] T. Neumann and G. Weikum. RDF-3X: a RISC-style engine for RDF. Proc. VLDB Endow., 1(1), 2008.
[30] N. F. Noy and M. A. Musen. PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment. In AAAI/IAAI. AAAI Press, 2000.
[31] M. Richardson and P. Domingos. Markov logic networks. Machine Learning, 62(1-2), 2006.
[32] S. Schoenmackers, O. Etzioni, D. S. Weld, and J. Davis. Learning first-order Horn clauses from web text. In EMNLP, 2010.
[33] F. M. Suchanek, S. Abiteboul, and P. Senellart. PARIS: Probabilistic Alignment of Relations, Instances, and Schema. PVLDB, 5(3), 2011.
[34] F. M. Suchanek, J. Hoffart, E. Kuzey, and E. Lewis-Kelham. YAGO2s: Modular High-Quality Information Extraction. In German Database Symposium (BTW), 2013.
[35] F. M. Suchanek, G. Kasneci, and G. Weikum. Yago: a core of semantic knowledge. In WWW, 2007.
[36] P.-N. Tan, V. Kumar, and J. Srivastava. Selecting the right interestingness measure for association patterns. In KDD.
[37] J. Völker and M. Niepert. Statistical schema induction.

