[ts-gen] Download Bulk historical data through Java - Ruby script

R P Herrold herrold at owlriver.com
Sat Mar 22 13:31:50 EDT 2008


Bill posted this on the TWS Yahoo! mailing list earlier this 
week.  As I was responding to an item on the IB chat board, it 
occurred to me that not all our users may be on as many lists 
as I happen to monitor, and so it was worthy of a report here.

The context should be clear from the post, as it describes a 
simple Ruby script, designed to 'show off' a Ruby co-process 
driving the shim, to retrieve one minute OHLC data on each Dow 
30 component, from the start of the year to the present.


---------- Forwarded message ----------
Date: Wed, 19 Mar 2008 15:48:00 -0400
From: Bill Pippin <pippin at owlriver.net>
Reply-To: TWSAPI at yahoogroups.com
To: TWSAPI at yahoogroups.com
Subject: Download Bulk historical data through Java

Aon Lazio asked:

> Has anyone here tried to download bulk historical data through
> Java?  ... I have a set of 10 symbols and I want hist data from
> 20080101-present.

> Could you show me some simple code?  I could not find it
> anywhere.  Thank you so much.

As Kurt Bigler noted, you could modify the sample client for this
task.  It's in Java, and provides only gui inputs as written.  If
you try this approach, you'll probably yearn for a simple command
line text tool sooner or later.

I'm including a ruby script below that does bulk download of history
data.  It works by driving the trading-shim, a C++ application that
provides a concise command language interface to the IB tws api.  The
shim is GPL v3, and available from www.trading-shim.com.

E.g., for the ruby printf below, the resulting shim command queries for
one minute bars for Alcoa on Jan 2 2008.

     printf "select past %3u %2u Ymd_T(2008%02u%02u  %s);\n",
         11,		# database stock contract index
          5,		# history query configuration, and bar size
          1,		# month
          2,		# day
          "16:00:00"	# time of last bar
 			# results in the string, and valid shim command:

#   select past  11  5 Ymd_T(20080102  16:00:00);

The full script follows my sig.  If you need help with shim installation,
we have a mailing list at http://www.trading-shim.org/mailman/listinfo

Thanks,

Bill Pippin
________________________________________________________________________

#!/usr/bin/ruby

#  author: Bill Pippin, <pippin at owlriver.com>
#  copyright (c) 2008 Trading-shim.com, LLC  Columbus, OH
#  GPL version 3 or later, see COPYING for details

=begin

   Collect one minute history bars for each:

     o week day from jan 1st to yesterday
     o regular trading hours from 9:30 to 16:00
     o contract of the dow 30

   Data will show up in the file ShimText in the current directory; the
   shim must be installed, and exist as a binary in the current
   directory.  You'll need to uncomment the appropriate line in the body
   of the inner loop to divert text output from stdout to the shim
   subprocess.

   The PastFilter index of 5 in the query selects one minute TRADE bars
   during regular trading hours for the day; other parameters can be used
   to change the query granularity, and the time array is provided for
   that purpose.  If you do want to modify this script, see the comments
   following the loop nest.

=end

Shim = IO.popen("./shim --data file save", "w")
mons = [ 0, 1, 2 ]
days = [

#M  T  W  R  F   M  T  W  R  F   M  T  W  R  F   M  T  W  R  F   M  T  W  R  F
[      2, 3, 4,  7, 8, 9,10,11, 14,15,16,17,18,    22,23,24,25, 28,29,30,31   ],
[            1,  4, 5, 6, 7, 8, 11,12,13,14,15,    19,20,21,22, 25,26,27,28,29],
[3, 4, 5, 6, 7, 10,11,12,13,14, 17,18] ]

time = [" 9:30:00", "10:00:00", "10:30:00", "11:00:00", "11:30:00",
                     "12:00:00", "12:30:00", "13:00:00", "13:30:00",
                     "14:00:00", "14:30:00", "15:00:00", "15:30:00",
         "16:00:00"]

syms = [ 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,	# contract uids
          21, 22, 23, 24, 25, 26, 27, 28, 29, 30,	# of the dow 30
          31, 32, 33, 34, 35, 36, 37, 38, 39, 40 ]	# from LocalSet

past = "select past %3u  5 Ymd_T(2008%02u%02u  %s);\t"	# 5: PastFilter
wait = "wait 11;\n"
exit = "exit;\n"

# for each month in jan..mar
# for each weekday in the month until yesterday, excluding holidays
# for each contract in the dow 30
#   emit query, pause the shim for 11 sec, and sleep for 11 sec

   for i in mons
   for j in days[i]
   for s in syms
          printf past, s, i+1, j, "16:00:00";      printf wait;
#   Shim.printf past, s, i+1, j, "16:00:00"; Shim.printf wait; sleep(11);
   end end end
   Shim.printf exit
Kernel.exit	# a bare exit here would name the local string variable above

=begin

   For other symbols besides those selected here, you'll want to modify
   the sql table LocalSet, and perhaps extend this script to look up the
   resulting contract indices using sql queries.

   The PastFilter index controls query configuration, with indices
   between 1 and 11 selecting s01, s05, s15, s30, m01, m02, m05, m15,
   m30, h01, or d01 bars, respectively, where s, m, h, and d refer to
   seconds, minutes, hours, and days.

   As written, with queries for one minute bars over the trading day,
   this script produces 1590 queries.  That's 53 days, times 30 symbols
   in the dow 30.  At eleven seconds each, this'll take nearly 5 hours to
   run.

   If you want to increase the bar frequency, e.g., to one second
   intervals, consider first that using a half-hour duration for queries
   will require *22260* queries, that is 1590 times the 14 entries of
   the time table, or nearly two 40 hour collection weeks.  Catching up
   is hard to do!

   As noted above, for one second bars, the PastFilter index of 5, which
   corresponds to the fifth of the IB history bars, would be replaced
   by 1, standing for the first, what we call s01 above.  In addition,
   since queries can return at most 2k records, multiple queries would be
   needed for each day, so that it would also be necessary to insert a
   for loop using the table "time", and replace the end-of-day literal in
   the query print statement with the value indexed from time.

   At this point you'll be needing to stop and restart the script from
   one day to the next.  Modifying this script to parameterize the
   PastFilter index, choose subranges of the data, understand the
   calendar, check regular trading hours for the route and symbol, and
   otherwise preen data, is left as an exercise.

=end
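
The closing comments leave the one second variant as an exercise.  What
follows is a minimal sketch of that variant, not part of Bill's script:
it only builds the command text, derives the weekdays from Ruby's Date
class instead of the hand-written days table, and adds the loop over the
time table.  The names HOLIDAYS, trading_days, and commands are
hypothetical; the holiday list is an assumption inferred from the gaps in
the days table; and it assumes that PastFilter index 1 is configured for
a half-hour query duration, which the post does not spell out.

```ruby
require 'date'

# Assumption: holidays inferred from the gaps in the script's days table.
HOLIDAYS = [Date.new(2008, 1, 1), Date.new(2008, 1, 21), Date.new(2008, 2, 18)]

# End-of-query times, copied from the script's time table.
TIMES = [" 9:30:00", "10:00:00", "10:30:00", "11:00:00", "11:30:00",
         "12:00:00", "12:30:00", "13:00:00", "13:30:00", "14:00:00",
         "14:30:00", "15:00:00", "15:30:00", "16:00:00"]

SYMS = (11..40).to_a   # contract uids of the dow 30, as in the script

# Same query layout as the script, with the PastFilter index and the
# year turned into parameters.
PAST = "select past %3u %2u Ymd_T(%04u%02u%02u  %s);\t"
WAIT = "wait 11;\n"

# Weekdays from first to last inclusive, skipping the holidays above.
def trading_days(first, last)
  (first..last).reject { |d| d.saturday? || d.sunday? || HOLIDAYS.include?(d) }
end

# One command line per day, contract, and half-hour end time.
def commands(first, last, filter)
  trading_days(first, last).flat_map do |d|
    SYMS.flat_map do |s|
      TIMES.map { |t| format(PAST, s, filter, d.year, d.month, d.day, t) + WAIT }
    end
  end
end

if __FILE__ == $PROGRAM_NAME
  # PastFilter index 1 selects s01 bars, per the list of indices above.
  cmds = commands(Date.new(2008, 1, 2), Date.new(2008, 3, 18), 1)
  print cmds.first
  hours = cmds.size * 11 / 3600.0
  warn "#{cmds.size} queries, about #{hours.round} hours at 11 sec each"
end
```

For the date range covered by the days table, the sketch yields 53 days
and 22260 commands, matching the counts quoted in the comments.  Sending
the commands to the shim, with the matching sleep between queries, would
follow the same pattern as the commented-out line in the loop nest.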

