transformation synthesis also infers similar queries, but
infers from very few examples and over a richer language
that combines lookup operations with syntactic manipulations. Our table layout synthesis addresses the challenges
of dealing with spreadsheet tables, which, unlike database
relations, have ordering relationships between rows and
carry meta-information in some cells.
The PADS project in the programming languages community has simplified ad hoc data-processing tasks for
programmers by developing domain-specific languages
for describing data formats, and learning algorithms for
inferring such formats using annotations provided by
the user. 3 The learned format can then be used by programmers to implement custom data-analysis tools. In
contrast, we focus on automating the entire end-to-end
process for relatively simpler tasks, which include not only
learning the text structure from the inputs, but also learning the desired transformation from the outputs, without
any user annotations.
The area of program synthesis is gaining renewed interest. However, it has traditionally focused on synthesizing
nontrivial algorithms20 (e.g., graph algorithms9 and program inverses19) and discovering intricate code-snippets
(e.g., bit-vector tricks, 7 switching logic in hybrid systems21).
In this paper, we apply program synthesis to discover simpler programs required by the much larger class of spreadsheet end-users. Various classes of techniques have been
explored for program synthesis: exhaustive search, logical reasoning, probabilistic inference, and version-space
algebras (for a recent survey, see Gulwani5). Because the
data manipulations required by end users are usually relatively simple, we can apply version-space algebras, which
allow real-time performance, unlike other techniques.
Version-space algebras were pioneered by Mitchell for
refinement-based learning of Boolean functions, 17 while
Lau et al. extended the concept to learning more complex
functions in a PBD setting. 14 Our synthesis algorithms lift
the concepts of version-space algebra to the PBE setting,
for a fairly expressive string expression language involving
conditionals and loops.
7. ConCLuSion AnD FutuRE WoRk
General-purpose computational devices, such as smartphones and computers, are becoming accessible to people
at large at an impressive rate. In the future, even robots
will become household commodities. Unfortunately, programming such general-purpose platforms has never been
easy, because we are still mostly stuck with the model of
providing step-by-step, detailed, and syntactically correct
instructions on how to accomplish a certain task, instead
of simply describing what the task is. Program synthesis
has the potential to revolutionize this landscape, when
targeted for the right set of problems and using the right
This paper reports our initial experience with design-
ing domain-specific languages and inductive synthe-
sizers for string and table transformations. Our choice
of domains was motivated by our study of help-forum
problems that end users struggled with. A next step is to
develop a general framework that can allow synthesizer
writers to easily develop domain-specific synthesizers of
the kind described in this paper, similar to how declara-
tive parsing frameworks allow a compiler writer to easily
write a parser.
We thank Guy L. Steele Jr. for providing insightful and
detailed feedback on multiple versions of this draft.
1. Cypher, a., ed. Watch What I Do:
Programming by Demonstration, mIt
2. Das sarma, a., parameswaran,
a., garcia-molina, h., Widom, j.
synthesizing view definitions from
data. In ICD T (2010).
3. fisher, k., Walker, D. the paDs project:
an overview. In ICDT (2011).
4. gualtieri, m. Deputize end-user
developers to deliver business agility
and reduce costs. In Forrester Report
for Application Development and
Program Management Professionals
5. gulwani, s. Dimensions in program
synthesis. In PPDP (2010).
6. gulwani, s. automating string
processing in spreadsheets using input-output examples. In POPL (2011).
7. gulwani, s., jha, s., tiwari, a.,
venkatesan, r. synthesis of loop-free
programs. In PLDI (2011).
8. harris, W.r., gulwani, s. spreadsheet
table transformations from examples.
In PLDI (2011).
9. Itzhaky, s., gulwani, s., Immerman,
n., sagiv, m. a simple inductive
synthesis methodology and its
applications. In OOPSLA (2010).
10. jha, s., gulwani, s., seshia, s., tiwari,
a. oracle-guided component-based
program synthesis. In ICSE (2010).
11. kandel, s., paepcke, a., hellerstein, j.,
heer, j. Wrangler: Interactive visual
specification of data transformation
scripts. In CHI (2011).
12. ko, a.j., myers, b.a., aung, h.h.
six learning barriers in end-user
programming systems. In VL/HCC
13. lau, t. Why p b D systems fail:
lessons learned for usable aI. In CHI
2008 Workshop on Usable AI (2008).
14. lau, t., Wolfman, s., Domingos,
p., Weld, D. programming by
demonstration using version space
algebra. Mach. Learn. 53( 1–2) (2003).
15. lieberman, h. Your Wish is My
Command: Programming by Example,
morgan kaufmann, 2001.
16. miller, r.C., myers, b.a., Interactive
simultaneous editing of multiple text
regions. In USENIX Annual Technical
17. mitchell, t.m. generalization as search.
Artif. Intell. 18, 2 (1982).
18. singh, r., gulwani, s. learning
semantic string transformations from
examples. PVLDB 5 (2012), in press.
19. srivastava, s., gulwani, s., Chaudhuri,
s., foster, j.s. path-based inductive
synthesis for program inversion. In
20. srivastava, s., gulwani, s., foster, j.
from program verification to program
synthesis. In POPL (2010).
21. taly, a., gulwani, s., tiwari, a.
synthesizing switching logic using
constraint solving. In VMCAI (2009).
22. tran, Q.t., Chan, C.y., parthasarathy, s.
Query by output. In SIGMOD (2009).
23. Walkenbach, j. Excel 2010 Formulas,
john Wiley and sons, 2010.
Sumit Gulwani ( firstname.lastname@example.org),
microsoft research, redmond, Wa.
William R. harris ( email@example.com),
univ. of Wisconsin, madison, WI.
Rishabh Singh ( firstname.lastname@example.org),
mIt CsaIl, Cambridge, ma.
© 2012 aCm 0001-0782/12/08 $15.00