Proof Sketch. Suppose that the algorithm runs for R time steps, where R is a (specific, but unspecified for now) polynomial in the model parameters K, V, 1/ε, and ln(1/δ). If the algorithm was already ε-optimal on a fraction (1 − ε) of the R time steps, then we can argue that it will continue to be ε-optimal on at least a fraction (1 − ε) of future time steps, since the algorithm's performance should improve on average over time as estimates become more accurate.
On the other hand, if the algorithm chose sub-optimal allocations on at least a fraction ε of the R time steps, then by Lemma 4, the algorithm must have incremented N^t_{i,c^t_i} for some venue i and cut-off c^t_i approximately ε²R/(8V) times. By definition of the c^t_i, it can never be the case that N^t_{i,c^t_i} was incremented too many times for any fixed values of i and c^t_i (where "too many" is a polynomial in V, 1/ε, and ln(1/δ)); otherwise the cut-off would have increased. Since there are only K venues and V possible cut-off values to consider in each venue, the total number of increments can be no more than KV times this polynomial, another polynomial in V, 1/ε, ln(1/δ), and now K. If R is sufficiently large (but still polynomial in all of the desired quantities) and approximately ε²R/(8V) increments were made, we can argue that every venue must have been fully explored, in which case, again, future allocations will be ε-optimal.
We remark that our optimistic tail modifications of the
Kaplan–Meier estimators are relatively mild. This leads us to
believe that using the same estimate–allocate loop with an
unmodified Kaplan–Meier estimator would frequently work
well in practice. We investigate a parametric version of this
learning algorithm in the experiments described below.
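For concreteness, the standard Kaplan–Meier product-limit estimator for this censored setting can be sketched as follows. This is an illustrative implementation, not the authors' code; the function name and the (value, censored) data layout are our own. An observation is censored when the order was fully filled, so the true available volume is only known to be at least the executed amount.

```python
from collections import Counter

def kaplan_meier_tail(observations):
    """Product-limit (Kaplan-Meier) estimate of P(available >= s).

    observations: list of (shares, censored) pairs. censored=False means
    exactly `shares` were available (the order was only partially filled);
    censored=True means at least `shares` were available (fully filled).
    Returns a dict mapping each observed value s to the estimated
    tail probability P(available >= s).
    """
    # "deaths": uncensored observations, where the exact value was seen
    deaths = Counter(s for s, censored in observations if not censored)
    tail, surv = {}, 1.0
    for s in sorted({s for s, _ in observations}):
        # risk set: observations whose (possibly censored) value is >= s
        at_risk = sum(1 for v, _ in observations if v >= s)
        tail[s] = surv
        surv *= 1.0 - deaths[s] / at_risk
    return tail
```

Censored observations contribute to the risk sets at all values up to their executed amount but never count as exact outcomes, which is what keeps the estimate from being biased downward by full fills.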
5. THE DARK POOL PROBLEM
The remainder of this article is devoted to the application
of our techniques to the dark pool problem. We begin with
a description of the trading data we used, and go on to
describe a variety of experiments we performed.
5.1. Summary of the dark pool data
Our data set is from the internal dark pool order flow for a
major US broker–dealer. Each (possibly censored) observation is of the form discussed throughout the paper—a triple
consisting of the dark pool name, the number of shares
sent to that pool, and the number of shares subsequently
executed within a short time interval. It is important to highlight some limitations of the data. First, note that the data set
conflates the policy the brokerage used for allocation across
the dark pools with the liquidity available in the pools themselves. For our data set, the policy in force was very similar
to the bandit-style approach we discuss below. Second, the
“parent” orders determining the overall volumes to be allocated across the pools were determined by the brokerage’s
trading needs, and are similarly out of our control.
The data set contains submissions and executions
for four active dark pools: BIDS Trading, Automated
Trading Desk, D.E. Shaw, and NYFIX, each for a dozen relatively actively traded stocks,f thus yielding 48 distinct
stock–pool data sets. The average daily trading volume
of these stocks across all exchanges (light and dark)
ranges from 1 to 60 million shares, with a median volume of 15 million shares. Energy, Financials, Consumer,
Industrials, and Utilities industries are represented. Our
data set spans 30 trading days. For every stock–pool pair
we have on average 1,200 orders (from 600 to 2,000), which corresponds to 1.3 million shares (from 0.5 to 3 million).
Individual order sizes range from 100 to 50,000 shares,
with 1,000 shares being the median. Sixteen percent of
orders are filled at least partially (meaning that fully 84%
result in no shares executed), 9% of the total submitted
volume was executed, and 11% of all observations were
censored.
5.2. Parametric models for dark pools
The theory and algorithm we have developed for censored
exploration permit a very general form for the venue distributions P_i. The downside of this generality is that we are
left with the problem of learning a very large number of
parameters. More parameters generally mean that more
data is necessary to guarantee that the model will generalize
well, which means more rounds of exploration are needed
before the algorithm’s future performance is near-optimal.
In some applications, it is therefore advantageous to employ a less general but simpler parametric form for these
distributions.
We experimented with a variety of common parametric
forms for the distributions. For each such form, the basic
methodology was the same. For each of the 4 × 12 = 48
venue–stock pairs, the data for that pair was split evenly
into a training set and a test set. The training data was
used to select the maximum likelihood model from the
parametric class. Note that we can no longer directly apply
the nonparametric Kaplan–Meier estimator—within each
model class, we must directly maximize the likelihood on
the censored training data. This is a relatively straightforward and efficient computation for each of the model
classes we investigated. The test set was then used to measure the generalization performance of each maximum
likelihood model.
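As a sketch of what maximizing the likelihood on censored data involves, consider the exponential form mentioned below: an uncensored observation of v shares contributes P(v) to the likelihood, while a censored one contributes the tail probability P(available ≥ v). The function names, the grid search, and the grid itself are our own illustrative assumptions; the paper does not specify the optimizer used.

```python
import math

def censored_loglik(lam, observations, V):
    """Censored log-likelihood under p(s) proportional to exp(-lam*s), s = 1..V."""
    w = [math.exp(-lam * s) for s in range(1, V + 1)]
    Z = sum(w)
    p = [x / Z for x in w]                 # p[s-1] = P(exactly s shares available)
    tail = [sum(p[i:]) for i in range(V)]  # tail[s-1] = P(at least s available)
    ll = 0.0
    for v, censored in observations:
        # censored observations only tell us the available volume was >= v
        ll += math.log(tail[v - 1] if censored else p[v - 1])
    return ll

def fit_censored_mle(observations, V, grid=None):
    """Pick the rate maximizing the censored log-likelihood by grid search."""
    grid = grid if grid is not None else [0.01 * k for k in range(1, 301)]
    return max(grid, key=lambda lam: censored_loglik(lam, observations, V))
```

Data concentrated at small volumes pulls the fitted rate up (fast decay), while many censored observations at large volumes pull it down, since a heavy tail is needed to explain full fills.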
Our investigations revealed that the best models maintained a separate parameter for the probability of zero shares being available (that is, P_i(0) is explicitly estimated)—a zero bin or ZB parameter. This is due to the fact that the vast majority of submissions (84%) to dark pools result in no shares being executed. We then examined various parametric forms for the nonzero portions of the venue distributions, including uniform (which of course requires no additional parameters) and the Poisson, exponential, and power law forms (each of which requires a single additional parameter); each of these forms was applied up to the largest volume submitted in the data sets and then normalized.
The generalization results strongly favor the power law
form, in which the probability of s shares being available
is proportional to 1/s^b for real b—a so-called heavy-tailed
f Tickers represented are AIG, ALO, CMI, CVX, FRE, HAL, JPM, MER, MIR,
NOV, XOM, and NRG.