[ts-gen] Download Bulk historical data through Java - Ruby script
R P Herrold
herrold at owlriver.com
Sat Mar 22 13:31:50 EDT 2008
Bill posted this on the TWS Yahoo! mailing list earlier this
week. As I was responding to an item on the IB chat board, it
occurred to me that not all our users may be on as many lists
as I happen to monitor, and so it was worthy of a report here.
The context should be clear from the post, as it describes a
simple Ruby script, designed to 'show off' a Ruby co-process
driving the shim, to retrieve one minute OHLC data on each Dow
30 component, from the start of the year to the present.
---------- Forwarded message ----------
Date: Wed, 19 Mar 2008 15:48:00 -0400
From: Bill Pippin <pippin at owlriver.net>
Reply-To: TWSAPI at yahoogroups.com
To: TWSAPI at yahoogroups.com
Subject: Download Bulk historical data through Java
Aon Lazio asked:
> Has anyone here tried to download bulk historical data through
> Java? ... I have a set of 10 symbols and I want hist data from
> 20080101-present.
> Could you show me some simple code? I could not find it
> anywhere. Thank you so much.
As Kurt Bigler noted, you could modify the sample client for this
task. It's in Java, and provides only gui inputs as written. If
you try this approach, you'll probably yearn for a simple command
line text tool sooner or later.
I'm including a ruby script below that does bulk download of history
data. It works by driving the trading-shim, a C++ application that
provides a concise command language interface to the IB tws api. The
shim is GPL v3, and available from www.trading-shim.com.
E.g., for the ruby printf below, the resulting shim command queries for
one minute bars for Alcoa on Jan 2 2008.
  printf "select past %3u %2u Ymd_T(2008%02u%02u %s);\n",
         11,          # database stock contract index
          5,          # history query configuration, and bar size
          1,          # month
          2,          # day
         "16:00:00"   # time of last bar
  # results in the string, and valid shim command:
  #   select past 11 5 Ymd_T(20080102 16:00:00);
The full script follows my sig. If you need help with shim installation,
we have a mailing list at http://www.trading-shim.org/mailman/listinfo
Thanks,
Bill Pippin
________________________________________________________________________
#!/usr/bin/ruby
# author: Bill Pippin, <pippin at owlriver.com>
# copyright (c) 2008 Trading-shim.com, LLC Columbus, OH
# GPL version 3 or later, see COPYING for details
=begin
Collect one minute history bars for each:
o weekday from jan 1st to yesterday
o regular trading hour from 9:30 to 16:00
o contract of the dow 30
Data will show up in the file ShimText in the current directory; the
shim must be installed, and exist as a binary in the current
directory. You'll need to uncomment the appropriate line in the
body of the inner loop to divert text output from stdout to the
shim subprocess.
The PastFilter index of 5 in the query selects one minute TRADE bars
during regular trading hours for the day; other parameters can be used
to change the query granularity, and the time array is provided for
that purpose. If you do want to modify this script, see the comments
following the loop nest.
=end
Shim = IO.popen("./shim --data file save", "w")
mons = [ 0, 1, 2 ]
days = [
#M T W R F M T W R F M T W R F M T W R F M T W R F
[ 2, 3, 4, 7, 8, 9,10,11, 14,15,16,17,18, 22,23,24,25, 28,29,30,31 ],
[ 1, 4, 5, 6, 7, 8, 11,12,13,14,15, 19,20,21,22, 25,26,27,28,29],
[3, 4, 5, 6, 7, 10,11,12,13,14, 17,18] ]
time = [" 9:30:00", "10:00:00", "10:30:00", "11:00:00", "11:30:00",
"12:00:00", "12:30:00", "13:00:00", "13:30:00",
"14:00:00", "14:30:00", "15:00:00", "15:30:00",
"16:00:00"]
syms = [ 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, # contract uids
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, # of the dow 30
31, 32, 33, 34, 35, 36, 37, 38, 39, 40 ] # from LocalSet
past = "select past %3u 5 Ymd_T(2008%02u%02u %s);\t" # 5: PastFilter
wait = "wait 11;\n"
quit = "exit;\n" # named quit so the local does not shadow Kernel#exit
# for each month in jan..mar
#   for each weekday in the month until yesterday, excluding holidays
#     for each contract of the dow 30
#       emit query, pause the shim for 11 sec, and sleep for 11 sec
for i in mons
  for j in days[i]
    for s in syms
      printf past, s, i+1, j, "16:00:00"; printf wait;
#     Shim.printf past, s, i+1, j, "16:00:00"; Shim.printf wait; sleep(11);
    end
  end
end
Shim.printf quit
exit
=begin
For other symbols besides those selected here, you'll want to modify
the sql table LocalSet, and perhaps extend this script to look up the
resulting contract indices using sql queries.
The PastFilter index controls query configuration, with indices
between 1 and 11 selecting s01, s05, s15, s30, m01, m02, m05, m15,
m30, h01, or d01 bars, respectively, where s, m, h, and d refer to
seconds, minutes, hours, and days.
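The index list above can be kept in a small lookup table, so that a
script names the bar size instead of hard-coding the index. This is only
a sketch; the names BAR_SIZES and past_filter_index are hypothetical and
not part of the shim:

```ruby
# Bar sizes in PastFilter index order, as listed above (index 1 is s01).
BAR_SIZES = %w[s01 s05 s15 s30 m01 m02 m05 m15 m30 h01 d01].freeze

# Return the 1-based PastFilter index for a bar-size name such as "m01".
# (Hypothetical helper, for illustration only.)
def past_filter_index(bar)
  pos = BAR_SIZES.index(bar)
  raise ArgumentError, "unknown bar size: #{bar}" if pos.nil?
  pos + 1
end

puts past_filter_index("m01")   # one minute bars, the index used in this script
puts past_filter_index("s01")   # one second bars
```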
As written, with queries for one minute bars over the trading day,
this script produces 1590 queries. That's 53 days, times 30 symbols
in the dow 30. At eleven seconds each, this'll take nearly 5 hours to
run.
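The arithmetic behind that estimate can be checked in a few lines of
ruby. The day counts are read off the days table in the script; the
variable names are only for illustration:

```ruby
days_per_month = [21, 20, 12]   # entries per row of the days table: jan, feb, mar
symbols        = 30             # contracts in the dow 30
wait_seconds   = 11             # pause after each query

queries = days_per_month.inject(:+) * symbols
hours   = queries * wait_seconds / 3600.0

printf "%d queries, about %.1f hours\n", queries, hours
# prints: 1590 queries, about 4.9 hours
```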
If you want to increase the bar frequency, e.g., to one second
intervals, consider first that using a half-hour duration for queries
will require *22260* queries, or nearly two 40 hour collection weeks.
Catching up is hard to do!
As noted above, for one second bars, the PastFilter index of 5, which
corresponds to the fifth of the IB history bars, would be replaced
by 1, standing for the first, what we call s01 above. In addition,
since queries can return at most 2k records, multiple queries would be
needed for each day, so that it would also be necessary to insert a
for loop using the table "time", and replace the end-of-day literal in
the query print statement with the value indexed from time.
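A rough sketch of that extra loop follows, for one contract on one day.
The helper name half_hour_queries is hypothetical; the command format
and the time table are those of the script above, with the PastFilter
index passed in rather than fixed at 5:

```ruby
# End-of-window times, copied from the script's time table.
times = [" 9:30:00", "10:00:00", "10:30:00", "11:00:00", "11:30:00",
         "12:00:00", "12:30:00", "13:00:00", "13:30:00",
         "14:00:00", "14:30:00", "15:00:00", "15:30:00",
         "16:00:00"]

# Build one shim query per half-hour window for a contract and date.
# (Hypothetical helper, for illustration only.)
def half_hour_queries(sym, filter, mon, day, times)
  times.map do |t|
    format("select past %3u %u Ymd_T(2008%02u%02u %s);", sym, filter, mon, day, t)
  end
end

# Contract 11, PastFilter index 1 (one second bars), Jan 2 2008.
cmds = half_hour_queries(11, 1, 1, 2, times)
puts cmds
puts "#{cmds.length} queries per contract per day"
```

With 14 entries in the time table, 1590 contract-days become the 22260
queries mentioned above.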
At this point you'll be needing to stop and restart the script from
one day to the next. Modifying this script to parameterize the
PastFilter index, choose subranges of the data, understand the
calendar, check regular trading hours for the route and symbol, and
otherwise preen data, is left as an exercise.
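As a starting point for the calendar part of that exercise, the days
table could be computed rather than typed in. The sketch below uses
ruby's standard Date class; the holiday list is hard-coded for early
2008 only, and the names HOLIDAYS and trading_days are hypothetical:

```ruby
require 'date'

# Market holidays in the range covered by the script (2008 only):
# New Year's Day, Martin Luther King Jr. Day, Presidents' Day.
HOLIDAYS = [Date.new(2008, 1, 1), Date.new(2008, 1, 21), Date.new(2008, 2, 18)]

# Return the weekdays between first and last, inclusive, skipping holidays.
def trading_days(first, last)
  (first..last).reject { |d| d.saturday? || d.sunday? || HOLIDAYS.include?(d) }
end

days = trading_days(Date.new(2008, 1, 1), Date.new(2008, 3, 18))
puts days.length   # 53, the same count as the hand-written days table
```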
=end