can form the basis for more sophisticated inferences and ways to curate training data. Deep-learning techniques can
be applied to problems of entity deduplication and attribute inference [2].
Knowledge inference and verification. Making sure that facts are correct is a core task in constructing a
knowledge graph, and at this scale
it is not remotely possible to verify everything manually. This requires
an automated approach: advances in
knowledge representation and reasoning, probabilistic graphical models,
and natural language inference can
be used to construct an automatic or
semi-automatic system for consistency
checking and fact verification.
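As a rough illustration of what the rule-based part of such a consistency checker might look like, the sketch below flags triples that violate two assumed constraints: a functional predicate (at most one value per subject) and a pair of disjoint types. The triple layout, the predicate names, the constraint tables, and the toy data are all hypothetical and are not taken from any of the systems discussed in this article; a real system would combine rules like these with probabilistic models.

```python
from collections import defaultdict

# Hypothetical constraints, chosen only for illustration.
FUNCTIONAL_PREDICATES = {"birth_date"}                 # at most one value per subject
DISJOINT_TYPES = {frozenset({"Person", "Hospital"})}   # no entity may carry both types


def find_violations(triples):
    """Return a sorted list of (subject, reason) pairs for violated constraints."""
    values = defaultdict(set)  # (subject, predicate) -> set of objects
    for s, p, o in triples:
        values[(s, p)].add(o)

    violations = []
    for (s, p), objs in values.items():
        if p in FUNCTIONAL_PREDICATES and len(objs) > 1:
            violations.append((s, f"multiple values for {p}"))
        if p == "type":
            for pair in DISJOINT_TYPES:
                if pair <= objs:  # both disjoint types asserted for one subject
                    violations.append((s, "disjoint types " + "/".join(sorted(pair))))
    return sorted(violations)


# Toy triples (subject, predicate, object); the data is made up.
triples = [
    ("St. Jude", "type", "Hospital"),
    ("St. Jude", "type", "Person"),
    ("alice", "birth_date", "1980-01-01"),
    ("alice", "birth_date", "1981-01-01"),
    ("alice", "type", "Person"),
]
violations = find_violations(triples)
```

Flagged facts would typically go to a semi-automatic review queue rather than being deleted outright, since the check cannot tell which of the conflicting assertions is wrong.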
Federation of global, domain-specific, and customer-specific knowledge.
In cases such as IBM's clients, who build
their own custom knowledge graphs,
the clients are not expected to teach the
graph basic knowledge. For example, a cancer researcher is not going
to teach the knowledge graph that skin
is a form of tissue, or that St. Jude is a
hospital in Memphis, Tennessee. This
is known as “general knowledge,” captured in a general knowledge graph.
The next level of information is
knowledge that is well known to anybody in the domain—for example, carcinoma is a form of cancer or NHL more
often stands for non-Hodgkin lymphoma than National Hockey League
(though in some contexts it may still
mean that—say, in the patient record of
an NHL player). The client should need
to input only the private and confidential knowledge or any knowledge that
the system does not yet know. Isolation,
federation, and online updates of the
base and domain layers are some of the
major issues that surface because of
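The layering described above can be sketched, under simplifying assumptions, as a chain of lookups in which the client layer shadows the domain layer, which in turn shadows the general layer. The sketch uses Python's standard `collections.ChainMap`; the flat key-to-description dictionaries, the layer names, and the client entries are hypothetical stand-ins for real graph stores.

```python
from collections import ChainMap

# Toy layers; the facts echo the examples in the text, while the
# client entries and all key names are hypothetical.
general = {
    "skin": "a form of tissue",
    "St. Jude": "a hospital in Memphis, Tennessee",
    "NHL": "National Hockey League",
}
oncology_domain = {
    "carcinoma": "a form of cancer",
    "NHL": "non-Hodgkin lymphoma",
}
client_private = {"trial-42": "a confidential client study"}

# Lookups search the client layer first, then the domain, then general.
graph = ChainMap(client_private, oncology_domain, general)

nhl = graph["NHL"]    # the domain layer shadows the general meaning
skin = graph["skin"]  # falls through to the general layer

# Isolation: writes through the ChainMap land only in the first (client) layer.
graph["patient-7"] = "a confidential client record"
leaked = "patient-7" in general or "patient-7" in oncology_domain

# Online update: a change to the shared base layer is visible immediately.
general["Memphis"] = "a city in Tennessee"
memphis = graph["Memphis"]
```

This design keeps client data out of the shared layers by construction, but it resolves conflicts purely by layer priority; the context-dependent case mentioned in the text (an NHL player's patient record) would need more than a fixed precedence order.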
Security and privacy for personalized, on-device knowledge graphs.
Knowledge graphs by definition are
enormous, since they aspire to create
an entity for every noun in the world,
and thus can only reasonably run in the
cloud. Realistically, however, most people
do not care about all entities that
exist in the world, but rather a small
fraction or subset that is personally relevant
to them. There is a lot of promise
in the area of personalizing knowledge
graphs for individual users, perhaps
even to the extent that they can shrink
to a small enough size to be shippable
to mobile devices. This will allow developers
to keep providing user value in
a privacy-respecting manner by doing
more on-device learning and computation,
over local small knowledge-graph
instances. (We are eager to collaborate
with the research community in pursuit
of this goal.)
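One simple way to picture such shrinking, offered only as a sketch, is to keep the part of the graph that lies within a few hops of the entities a user actually interacts with. The function below does this with a breadth-first search; the triple format, the notion of "seed" entities, and the hop limit are assumptions made for illustration and do not describe how any of the authors' systems select personal subgraphs.

```python
from collections import deque


def personal_subgraph(triples, seeds, max_hops=1):
    """Keep triples whose endpoints are both within max_hops of a seed entity."""
    # Build an undirected adjacency map over subjects and objects.
    neighbors = {}
    for s, _, o in triples:
        neighbors.setdefault(s, set()).add(o)
        neighbors.setdefault(o, set()).add(s)

    # Breadth-first search from the seeds, stopping at max_hops.
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue
        for nxt in neighbors.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)

    return [(s, p, o) for s, p, o in triples if s in distance and o in distance]


# Toy graph; "user" and the predicate names are hypothetical.
triples = [
    ("user", "lives_in", "Memphis"),
    ("St. Jude", "located_in", "Memphis"),
    ("Memphis", "located_in", "Tennessee"),
    ("Paris", "capital_of", "France"),
]
one_hop = personal_subgraph(triples, ["user"], max_hops=1)
two_hop = personal_subgraph(triples, ["user"], max_hops=2)
```

Raising `max_hops` trades device storage for coverage; facts unrelated to the user, such as the Paris triple here, never leave the cloud copy.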
Multilingual knowledge systems.
A comprehensive knowledge graph
must cover facts expressed in multiple
languages and conflate the concepts
expressed in those languages into a cohesive set. In addition to the challenges
in knowledge extraction from multilingual sources, different cultures may
conceptualize the world in subtly different ways, which poses challenges in
the design of the ontology as well.
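A minimal sketch of conflating labels from several languages into one set of concepts is shown below: each concept gets a single language-independent identifier, and labels are attached per language code. The identifiers, the record layout, and the helper names are hypothetical; the sketch addresses only label lookup and says nothing about the harder problem of cultures conceptualizing the world differently.

```python
# Hypothetical entity records: one language-independent ID per concept,
# with labels attached per language code.
ENTITIES = {
    "E1": {"en": "Germany", "de": "Deutschland", "fr": "Allemagne"},
    "E2": {"en": "skin", "de": "Haut", "fr": "peau"},
}


def build_label_index(entities):
    """Map (language, case-folded label) to the shared entity ID."""
    index = {}
    for entity_id, labels in entities.items():
        for lang, label in labels.items():
            index[(lang, label.casefold())] = entity_id
    return index


def resolve(index, lang, mention):
    """Return the entity ID for a mention, or None if it is unknown."""
    return index.get((lang, mention.casefold()))


index = build_label_index(ENTITIES)

# Mentions in different languages resolve to the same concept.
same_concept = resolve(index, "de", "Deutschland") == resolve(index, "fr", "allemagne")
unknown = resolve(index, "en", "Haut")  # a German label is not an English one
```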
The natural question from our discussion in this article is whether different
knowledge graphs can someday share
certain core elements, such as descriptions of people, places, and similar
entities. One of the avenues toward
sharing these descriptions could be to
contribute them to Wikidata as a common, multilingual core. In the nearer
term, we hope to continue sharing the
results of research that each of us may
have done with researchers and practitioners outside of our companies.
Knowledge representation is a difficult skill to learn on the job. The pace
of development and the scale at which
knowledge-representation choices impact users and data do not foster an environment in which to understand and
explore its principles and alternatives.
The importance of knowledge representation in diverse industry settings,
as evidenced by the discussion in this
article, should reinforce the idea that
knowledge representation should be
a fundamental part of a computer science curriculum—as fundamental as
data structures and algorithms.
Finally, we all agree that AI systems
will unlock new opportunities for organizations in how they interact with customers, provide unique value in their
space, and transform their operations
and workforces. To realize this promise, these organizations must figure out
how to build new systems that unlock
knowledge to make them truly intelligent organizations.
The article summarizes and expands
on a panel discussion the authors conducted at the International Semantic Web
Conference in Asilomar, CA, in Oct. 2018
(https://bit.ly/2ZYVLJh). The discussion
is based on practical experiences and represents the views of the authors and not
necessarily their employers.
1. Höffner, K., Walter, S., Marx, E., Usbeck, R., Lehmann,
J. and Ngonga Ngomo, A.C. Survey on challenges of
question answering in the semantic Web. Semantic
Web 8, 6 (2017), 895–920.
2. Lin, Y., Liu, Z., Sun, M., Liu, Y. and Zhu, X. Learning
entity and relation embeddings for knowledge
graph completion. In Proceedings of the Assoc.
Advancement of Artificial Intelligence 15, (2015),
3. Nickel, M., Murphy, K., Tresp, V. and Gabrilovich, E.
A review of relational machine learning for
knowledge graphs. In Proceedings of the IEEE 104, 1 (2016),
4. Paulheim, H. Knowledge graph refinement: a survey
of approaches and evaluation methods. Semantic Web
8, 3 (2017), 489–508.
Natasha Noy is a scientist at Google, where she works
on making structured data accessible and leads Google
Dataset Search. Previously, she worked on ontology
engineering and semantic Web at Stanford University,
Stanford, CA, USA.
Yuqing Gao is the general manager of Microsoft’s
Artificial Intelligence – Knowledge Graph organization.
She has been a key leader behind intelligent features for
Microsoft Office products, Bing Entity Search, and other
prominent AI knowledge-driven Microsoft technologies.
Anshu Jain works at IBM Watson, where he is
responsible for the architecture of the core knowledge and
language capabilities, including Knowledge Graph, natural
language understanding, and Watson Knowledge Studio,
Anant Narayanan is an engineering manager at Facebook,
where he helps build knowledge platforms to develop
a deeper understanding of entities and relationships.
Previously, he led the development of large-scale data
pipelines at Ozlo to support conversational AI systems.
Alan Patterson is a Distinguished Engineer at eBay,
heading up eBay’s efforts to build a product knowledge
graph that contains eBay’s knowledge of products, as well
as organizations, brands, people, places, and standards.
Previously, he worked at the startup True Knowledge (also
Jamie Taylor manages the Schema Team for Google’s
Knowledge Graph. The team’s responsibilities include
extending KG’s underlying semantic representation,
growing coverage of the ontology, and enforcing semantic
policy. Previously, he worked for Metaweb Technologies.
Copyright held by authors/owners.
Publication rights licensed to ACM.