pling, resulting in richer datasets at
lower costs. Interpreting sensor data
onboard allows autonomous vehicles
to make decisions guided by real-time
variations in data, or to react to unexpected deviations from the current
3. Crowdsourcing data collection for
costly observations. Citizen scientists
can contribute useful data (for example, collected through geolocated mobile devices) that would otherwise be
very costly to acquire. One challenge
in data collection through crowdsourcing is ensuring the high data quality
that geoscience research requires. A potential area of research is to improve
methods of evaluating crowdsourced
data collection empirically, and to
gain an understanding of the biases involved in the collection process.
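As one illustration of empirical evaluation, crowdsourced readings can be compared against co-located reference measurements. The sketch below is only a minimal example under that assumption: the function name `bias_and_rmse` and the sample values are hypothetical and do not come from the article.

```python
import math


def bias_and_rmse(crowd, reference):
    """Return (mean error, root-mean-square error) of crowd vs. reference.

    Both arguments are equal-length sequences of paired readings;
    a positive bias means the crowd readings run high on average.
    """
    if len(crowd) != len(reference) or not crowd:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [c - r for c, r in zip(crowd, reference)]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return bias, rmse


# Hypothetical paired readings (e.g., temperatures in degrees Celsius).
crowd = [11.0, 9.0, 13.0, 11.0]
reference = [10.0, 10.0, 12.0, 10.0]
bias, rmse = bias_and_rmse(crowd, reference)
print(f"bias={bias:.2f}, rmse={rmse:.2f}")
```

A nonzero bias points to a systematic offset in the collection process, while an RMSE well above the absolute bias suggests additional random scatter between contributors.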
Research vision: Model-driven sensing.
New research on sensors will create
a new generation of devices that will
contain more knowledge of the scientific context for the data being collected. These devices will use that knowledge to optimize their performance
and improve their effectiveness. This
will result in new model-driven sensors
that will have more autonomy and exploratory capabilities.
Information integration. Data, models, information, and knowledge are
scattered across different communities and disciplines, which greatly
limits current geoscience
research. Their integration presents
major research challenges that will require the use of scientific knowledge
for information integration.
1. Integrating data from distributed
repositories. The geosciences have phenomenal data integration challenges.
Most of the hard geoscience problems
require that scientists work across sub-disciplinary boundaries and share very
large amounts of data. Another facet
of this issue is that the data spans a
wide variety of modalities and greatly
varying temporal and spatial scales.
Distributed data discovery tools, metadata translators, and more descriptive
standards are emerging in this context.
Open issues include cross-domain
concept mapping, entity resolution
and scientifically valid data linking,
and effective tools for finding, integrating, and reusing data.
and the application of Linked Open
Data are all areas of active research to
facilitate search and integration of data
without a great deal of manual effort. 5
2. Capturing scientific processes,
hypotheses, and theories. To complement the ontologies and data representations just discussed, a great
challenge is representing the ever-evolving, uncertain, complex, and
dynamic scientific knowledge and
information. Important challenges
will arise in representing dynamic processes, uncertainty, theories and models, hypotheses and claims, and many
other aspects of a constantly growing
scientific knowledge base. These representations need to be expressive
enough to capture complex scientific
knowledge, but they also need to support scalable reasoning that integrates
disparate knowledge at different
scales. In addition, scientists will need
to understand the representations and
trust the outcomes.
3. Interoperation of diverse scientific
knowledge. Scientific knowledge comes
in many forms that use different tacit
and explicit representations: hypotheses, models, theories, equations, assumptions, data characterizations,
and others. These representations
are all interrelated, and it should be
possible to translate knowledge fluidly as needed from one representation
to another. A major research challenge is the seamless interoperation
of alternative representations of scientific knowledge, from descriptive
to taxonomic to mathematical, from
facts to interpretation and alternative
hypotheses, from smaller to larger
scales, and from isolated processes to
complex integrated phenomena.
4. Authoring scientific knowledge
collaboratively. Formal knowledge
representation languages, especially
if they are expressive and complex, are
not easily accessible to scientists for
encoding understanding. A major challenge will be creating authoring tools
that enable scientists to create, interlink, reuse, and disseminate knowledge. Scientific knowledge needs to be
updated continuously, allow for alternative models, and separate facts from
interpretation and hypotheses. These
are new challenges for knowledge capture and authoring research. Finally,
scientific knowledge should be created
collaboratively, allowing different contributors to weigh in based on their diverse expertise and perspectives.
5. Automated extraction of scientific
knowledge. Not all scientific knowledge
needs to be authored manually. Much
of the data known to geoscientists is
stored in semi-structured formats, such
as spreadsheets or text, and is inaccessible to structured search mechanisms.
Automated techniques are needed to
identify and import these kinds of data
into structured knowledge bases.
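To make the idea of importing semi-structured data concrete, the following sketch turns spreadsheet-style CSV text into simple (subject, attribute, value) triples that a knowledge base could ingest. It is an illustrative assumption, not a method described in the article: the function `rows_to_triples`, the column names, and the sample rows are all hypothetical, and real extraction would also need unit handling, type inference, and entity resolution.

```python
import csv
import io


def rows_to_triples(csv_text, key_column):
    """Convert CSV text into (subject, attribute, value) string triples.

    The value in key_column identifies each row's subject; empty cells
    are skipped rather than stored as blank values.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    triples = []
    for row in reader:
        subject = row[key_column].strip()
        for column, value in row.items():
            # Skip the key itself, missing cells, and overflow fields.
            if column == key_column or not isinstance(value, str):
                continue
            if not value.strip():
                continue
            triples.append((subject, column.strip(), value.strip()))
    return triples


# Hypothetical spreadsheet export; the second row has a missing age.
sample = "site,rock_type,age_ma\nA1,basalt,12.5\nB2,granite,\n"
for triple in rows_to_triples(sample, "site"):
    print(triple)
```

Keeping every value as a string defers interpretation (numbers, units, dates) to a later step, which is a deliberate simplification in this sketch.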
Research vision: Knowledge maps. We
envision rich knowledge graphs that
will contain explicit interconnected
representations of scientific knowledge linked to time and space to form
multidimensional knowledge maps.
Interpretations and assumptions will
be well documented and linked to observational data and models. Today’s
semantic networks and knowledge
graphs link together distributed facts
on the Web, but they contain simple
facts that lack the depth and grounding needed for scientific research.
Knowledge maps will have deeper spatiotemporal representations of processes, hypotheses, and theories and
will be grounded in the physical world,
interconnecting the myriad models of
Robotics and sensing. Knowledge-informed sensing and data collection has great potential to do more
cost-effective data gathering across
1. Optimizing data collection.
Geoscience data is needed across many
scales, both spatial and temporal.
Since it is not possible to monitor every measurement at all scales all of the
time, there is a crucial need for intelligent methods for sensing. New research is needed to estimate the cost
of data collection prior to sensor deployment, whether that means storage
size, energy expenditure, or monetary
cost. A related research challenge is
trade-off analysis of the cost of data
collection versus the utility of the data
to be collected.
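The cost-versus-utility trade-off can be sketched as a budgeted selection problem. The example below assumes each candidate deployment already has an estimated cost and an estimated utility score; the `Candidate` class, the greedy ratio heuristic, and all numbers are hypothetical choices for illustration, and a greedy pass is not guaranteed to find the optimal set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    cost: float     # estimated cost (storage, energy, or money); must be > 0
    utility: float  # estimated scientific value of the data (hypothetical score)


def select_within_budget(candidates, budget):
    """Greedy heuristic: pick candidates by descending utility per unit cost.

    Returns the chosen candidates and the total cost spent.
    """
    chosen, spent = [], 0.0
    ranked = sorted(candidates, key=lambda c: c.utility / c.cost, reverse=True)
    for c in ranked:
        if spent + c.cost <= budget:
            chosen.append(c)
            spent += c.cost
    return chosen, spent


# Hypothetical deployment options with made-up estimates.
candidates = [
    Candidate("stream_gauge", cost=4.0, utility=8.0),
    Candidate("soil_probe", cost=2.0, utility=5.0),
    Candidate("weather_mast", cost=6.0, utility=9.0),
]
chosen, spent = select_within_budget(candidates, budget=7.0)
print([c.name for c in chosen], spent)
```

In practice the hard part is the estimation itself: the utility of data not yet collected is uncertain, which is why the text frames both cost estimation and trade-off analysis as open research questions.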
2. Active sampling. Geoscience
knowledge can be exploited to inform
autonomous sensing systems to not
only enable long-term data collection,
but to also increase the effectiveness
of sensing through adaptive sam-