In Figure 8, we show a case where two input photos are
used to create one model of an object: the Obelisk in Paris.
First, the base of the Obelisk is modeled from a close-up
view in (a), allowing more detail to be captured. The partial
3D model is then moved to another photo where the entire
Obelisk is visible, but the base is occluded. As in a copy-and-paste
operation, the user positions the extracted base
inside the image, and it snaps to the image contours in (b).
The user then continues the modeling process with other
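The text does not spell out how the snapping works; one common way to realize such contour snapping is to minimize an edge distance transform over candidate placements. The sketch below is our illustration under that assumption, not the paper's implementation; all function names are ours. It searches integer translations of the part's silhouette points for the one whose points land closest to image edges:

```python
import numpy as np

def edge_distance(edge_map):
    """Distance from every pixel to the nearest edge pixel
    (brute force; fine for the small example below)."""
    ys, xs = np.nonzero(edge_map)
    yy, xx = np.mgrid[:edge_map.shape[0], :edge_map.shape[1]]
    d2 = (yy[..., None] - ys) ** 2 + (xx[..., None] - xs) ** 2
    return np.sqrt(d2.min(axis=-1))

def snap_to_contours(edge_map, silhouette_pts, search=10):
    """Return the integer shift (dy, dx), |dy|, |dx| <= search, that best
    aligns the silhouette points with image edges."""
    dist = edge_distance(edge_map)
    h, w = edge_map.shape
    best_cost, best_shift = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            p = silhouette_pts + np.array([dy, dx])
            if (p < 0).any() or (p[:, 0] >= h).any() or (p[:, 1] >= w).any():
                continue  # this shift pushes points off the image
            cost = dist[p[:, 0], p[:, 1]].mean()
            if cost < best_cost:
                best_cost, best_shift = cost, (dy, dx)
    return best_shift
```

For example, silhouette points drawn three pixels to the left of a vertical image edge are snapped back onto it by the shift (0, 3). A real system would also search over scale and rotation.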
parts. The texture of the transported part is blended to match
the shading of the region in the new image, to maintain
consistency: see the rotated view (c). Details of the base can be
seen in the close-up view (d) of the final model of the Obelisk.
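The paper does not specify the blending operator; a minimal stand-in is per-channel mean/variance matching (Reinhard-style color transfer), which shifts and scales the transported texture's colors toward the statistics of the destination region. The function below is our hedged sketch, not the authors' code:

```python
import numpy as np

def match_shading(src, ref, eps=1e-8):
    """Shift/scale each channel of `src` so its mean and standard
    deviation match those of `ref` (a simple color-transfer sketch)."""
    out = np.empty_like(src, dtype=float)
    for c in range(src.shape[-1]):
        s = src[..., c].astype(float)
        r = ref[..., c].astype(float)
        out[..., c] = (s - s.mean()) / (s.std() + eps) * r.std() + r.mean()
    return np.clip(out, 0.0, 255.0)
```

A production system would likely combine this with gradient-domain (Poisson) blending so that the seam between the transported texture and the photograph is invisible.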
In Figure 9, we show a photograph with a collection of
objects that were modeled and copied from other photos.
The online video (https://vimeo.com/148236679) shows the
process of modeling and editing these objects. The modeling
and editing time for each example is shown in Table 1,
as well as the number of manually provided geo-semantic
constraints. Objects in oblique views typically need more
manual constraints, most of which designate coplanar axes,
which are difficult to infer automatically.
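A coplanarity constraint of the kind mentioned above can be scored numerically: stack the relevant 3D points (e.g., axis endpoints), center them, and take the smallest singular value, which is zero exactly when the points lie in a common plane. This residual is our illustration; the paper's solver formulation may differ:

```python
import numpy as np

def coplanarity_residual(points):
    """~0 when the 3D points are coplanar: the smallest singular value
    of the centered point matrix measures the point set's extent along
    the normal of the best-fit plane."""
    P = np.asarray(points, dtype=float)
    return np.linalg.svd(P - P.mean(axis=0), compute_uv=False)[-1]
```

Such residuals can be added as soft penalties to the geo-semantic optimization, or driven to zero as hard constraints once the user marks two axes as coplanar.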
6.2. Comparison to sketch-based modeling
As discussed in Section 2, our method shares some similarities
with the one in Shtof et al.,19 which models objects
from sketches. We previously discussed the main differences
between the methods.
between the methods. Their system is based on sketches
rather than photographs, which makes it easier to assume that
parts have sufficient bounding curves around them. It relies
on labeling, using a drag-and-drop metaphor for choosing and
positioning the primitives before snapping them to the sketches.
We make a comparison based on their sketch inputs (see Figure
10), since their method cannot handle the examples presented
in this paper. A comparison of modeling times shows that the
sketch-labeling and drag-and-drop snapping steps are significantly
less efficient and less intuitive than our 3-Sweep method.
Our modeling time (60s on average) is significantly lower than
the time they report for their technique (180s on average).
6.3. Limitations
Our work has several limitations. First, many shapes cannot
be decomposed into generalized cylinders and cuboids, and
cannot be modeled using our framework (e.g., the base of
the menorah in Figure 1). It would be desirable to extend the
types of primitives which can be modeled using similar principles. 3-Sweep also relies on the fact that the object modeled
Figure 7. Top: modeling and replicating parts for image editing. Orange parts are replicated or deformed. Bottom: editing a telescope.
The leftmost images are the original photos. Note that different parts have been scaled differently.
Figure 8. Modeling the Obelisk in Paris from two photos. Top: the
base of the Obelisk is modeled from a closer view which captures
more details. Bottom: (a) The partial 3D model is transported to
a more distant view (in which part of the base is occluded). (b) A
rotated textured Obelisk; the texture of the transported part is
blended into the region it occupied. (c) Details of the base are visible
in the close-up of the new view.