The key idea is that if the projection of a part is fixed, its
position and orientation can be determined by only one or
two depth values. We first describe the method for simple
parts that can be modeled by a single parameter, namely
parts that were modeled using a straight axis. General
cylinders and cuboids with curved axes will later be approximated using two arbitrarily connected straight-axis primitives at the start and end of the whole part.
Determining straight shapes. The position and orientation of a generalized cylinder i with a straight axis can be
determined by two points we call anchors, Ci,1 and Ci,2, on its
main axis (see Figure 4). Similarly, a cuboid part can be represented by six anchors Ci,j, j ∈ [1, 6], positioned at the center
of each face. Every opposite pair of anchors defines one main
axis of the cuboid. Even though four anchors are enough to
fix the position and orientation of a cuboid, we use six to
simplify attaching various geo-semantic constraints from
other parts to each side of the cuboid.
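As a minimal sketch of the anchor representation, assume each part is described by a center point, unit axis directions, and half-extents. These names, and the ordering of the returned anchors, are illustrative assumptions rather than the notation used above.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def add_scaled(p: Vec3, d: Vec3, s: float) -> Vec3:
    """Return p + s * d, computed component-wise."""
    return (p[0] + s * d[0], p[1] + s * d[1], p[2] + s * d[2])


def cylinder_anchors(center: Vec3, axis: Vec3, half_length: float) -> List[Vec3]:
    """Two anchors at the ends of a straight main axis."""
    return [add_scaled(center, axis, -half_length),
            add_scaled(center, axis, half_length)]


def cuboid_anchors(center: Vec3, axes: Sequence[Vec3],
                   half_extents: Sequence[float]) -> List[Vec3]:
    """Six anchors, one at the center of each face.

    Assumed ordering: anchors 2k and 2k+1 are the opposite faces along axes[k].
    """
    anchors: List[Vec3] = []
    for axis, h in zip(axes, half_extents):
        anchors.append(add_scaled(center, axis, -h))
        anchors.append(add_scaled(center, axis, h))
    return anchors
```

With this ordering, consecutive anchors in the returned list form the opposite pairs, so each pair spans one main axis of the cuboid.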
We define a local 3D orthogonal coordinate system for
each part using the three strokes defined by the user for the
three dimensions of the part. First, we define the origin of
the coordinate system of part i at a reference point Ri on the
part’s projection. For a cuboid part, we pick the point connecting
the first and second user strokes, and for a cylinder
we pick the point connecting the second and third strokes.
Due to the internal orthogonality of the straight part, the profile
of the part is perpendicular to the main axis. Therefore,
we can use the endpoints of the user’s strokes (after snapping
them to the image edges) to define three points that
together with Ri create an orthogonal system (red points and
lines in Figure 5). Note that this coordinate system is defined
subsequent profile propagation step can tolerate a limited
number of missing intersections.
When an intersection point is found, we snap the contour point pij to it. If both contour points of the profile are
snapped, we adjust the location of Ai to lie at their midpoint. If only one side is successfully snapped, we mirror
the length of this side to the other side and move the other
contour point accordingly. Lastly, if neither contour point
is snapped, the size of the previous profile is retained.
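The three snapping cases can be sketched as follows. This is an illustrative sketch, not the system's implementation: it assumes 2D image points, that the two contour points lie on opposite sides of the axis point, and that mirroring means reflecting the snapped point through the axis point. The function and parameter names are hypothetical.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]


def reflect(p: Point, center: Point) -> Point:
    """Reflect p through center, preserving its distance to center."""
    return (2.0 * center[0] - p[0], 2.0 * center[1] - p[1])


def update_profile(axis_pt: Point, left: Point, right: Point,
                   hit_left: Optional[Point],
                   hit_right: Optional[Point]) -> Tuple[Point, Point, Point]:
    """Return (axis point, left contour point, right contour point).

    left/right are the contour points carried over from the previous profile;
    hit_left/hit_right are the detected intersection points, or None if missing.
    """
    if hit_left is not None and hit_right is not None:
        # Both sides snapped: move the axis point to their midpoint.
        mid = ((hit_left[0] + hit_right[0]) / 2.0,
               (hit_left[1] + hit_right[1]) / 2.0)
        return mid, hit_left, hit_right
    if hit_left is not None:
        # Only the left side snapped: mirror its length to the right side.
        return axis_pt, hit_left, reflect(hit_left, axis_pt)
    if hit_right is not None:
        # Only the right side snapped: mirror its length to the left side.
        return axis_pt, reflect(hit_right, axis_pt), hit_right
    # Neither side snapped: keep the size of the previous profile.
    return axis_pt, left, right
```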
Post-processing. The above modeling steps closely follow user gestures, especially when modeling the profile.
This provides a more intelligent understanding of the shape,
but it is less accurate. Therefore, after modeling each primitive, we apply a post-snapping stage to better fit the primitive to the image as well as to correct the view. We search for
small transformations (±10% of primitive size) and changes
of vertical angle of view (±10°) that create a better fit of the
primitive’s projection to the edge curves it was snapped to
in the editing process.
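One simple way to realize such a search is an exhaustive scan over a coarse grid of offsets and angles. The sketch below assumes this grid strategy, interprets the small transformations as image-plane translations, and takes a hypothetical caller-supplied fit_error function that scores how far the primitive's projection lies from its edge curves. None of these details are specified above.

```python
from itertools import product
from typing import Callable, List, Tuple


def grid(limit: float, steps: int) -> List[float]:
    """Evenly spaced samples in [-limit, +limit]."""
    if steps < 2:
        return [0.0]
    return [-limit + 2.0 * limit * k / (steps - 1) for k in range(steps)]


def post_snap(fit_error: Callable[[float, float, float], float],
              size: float, steps: int = 5,
              max_shift_frac: float = 0.10,
              max_angle_deg: float = 10.0) -> Tuple[float, float, float]:
    """Return the (dx, dy, d_angle) with the lowest fit error on the grid.

    fit_error is a hypothetical scoring function (lower is better).
    """
    shifts = grid(max_shift_frac * size, steps)  # within ±10% of primitive size
    angles = grid(max_angle_deg, steps)          # within ±10 degrees
    best = (0.0, 0.0, 0.0)                       # start from the unmodified fit
    best_err = fit_error(*best)
    for dx, dy, da in product(shifts, shifts, angles):
        err = fit_error(dx, dy, da)
        if err < best_err:
            best, best_err = (dx, dy, da), err
    return best
```

The search starts from the unmodified primitive, so it never returns a correction that fits worse than the result of the interactive step.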
In many cases, the modeled object type has special
properties that can be used as priors to constrain the
modeling. For example, if we know that a given part has
a straight spine, we can constrain the sweep to progress
along a straight line. Similarly, we can constrain the sweep
to preserve a constant or linearly changing profile radius.
In this case, the detected radii are averaged or fitted to a
line along the sweep. We can also constrain the profile to
be a square or a circle. In fact, a single primitive can contain segments with different constraints: it can start with
a straight axis and then bend, or use a constant radius only
in a specific part. Such constraints are extremely helpful
when the edge detection provides poor results.
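The two radius priors can be illustrated with a short sketch. It assumes the detected radii are sampled at evenly spaced steps along the sweep and uses an ordinary least-squares line for the linearly changing case; both the sampling assumption and the function names are illustrative.

```python
from typing import List, Sequence


def constant_radius(radii: Sequence[float]) -> List[float]:
    """Replace every detected radius by the average radius."""
    mean = sum(radii) / len(radii)
    return [mean] * len(radii)


def linear_radius(radii: Sequence[float]) -> List[float]:
    """Fit r = a * t + b by least squares, with t = 0, 1, ..., n - 1."""
    n = len(radii)
    if n < 2:
        return list(radii)
    t_mean = (n - 1) / 2.0
    r_mean = sum(radii) / n
    cov = sum((t - t_mean) * (r - r_mean) for t, r in enumerate(radii))
    var = sum((t - t_mean) ** 2 for t in range(n))
    a = cov / var
    b = r_mean - a * t_mean
    return [a * t + b for t in range(n)]
```

Either function can be applied to just a sub-range of the sweep, which matches the idea that one primitive may use a constant radius only in a specific part.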
To further assist in modeling interaction, we also provide
a copy and paste tool. The user can drag a selected part that
is already snapped over to a new location in the image and
snap it again in the new position. While copying, the user
can rotate, scale, or flip the part.
5. COMPOSITE OBJECT CONSTRUCTION
The technique described above generates parts that fit the
object outlines. The positions of these parts in 3D are still
ambiguous and inaccurate. However, the assumption is that
these parts are components of a coherent man-made object,
and semantic geometric relationships exist among them.
Constraining the shape to satisfy such relationships allows
creation of meaningful models.
Since each component has many degrees of freedom,
direct global optimization of the positions of parts while
considering their geo-semantic relationships is computationally intensive and prone to getting trapped in local
minima. In our setting, the modeled components are
also constrained to agree with the outlines of the object
in the image. These constraints can significantly reduce
the degrees of freedom for each part, reducing the
dimensionality of the optimization space and avoiding
local minima. In the following discussion, we describe
how we simplify the general positioning problem and
ensure that geo-semantic constraints are satisfied among
the 3-swept parts.
Figure 4. Three examples where we infer geo-semantic constraints
from primitives where such relationships “almost” hold: Collinear axes
(left), Parallel axes (top right), and Perpendicular axes (bottom right).
Figure 5. Determining coordinates Ci,j for axis endpoints of a cuboid
from the depth value zi of the reference point Ri.