[ts-gen] Database for tick data

R P Herrold herrold at owlriver.com
Tue Oct 9 16:30:10 EDT 2007


On Tue, 9 Oct 2007, an anonymous poster wrote:

> Anyhow, I was hoping that the historybar could store the 
> current flow of incoming ticks.

The real issue is that the shim does not have a database 
designed to hold non-summarized (raw) tick information.  For 
at least the last couple of years, IB provided ticks at 300 
mSec intervals.  IB has recently moved to 100 mSec intervals 
for some Symbols.

Doing some math, we end up with an expected maximum 'detail 
line' count on some data thus:

One-second OHLC history for a single NYSE symbol (09:30 to 
16:00 US ET) covers 6.5 hours, or 390 minutes; at 60 seconds 
per minute that is
 	23400 samples
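
As a rough cross-check of that arithmetic, and to extend it to the 
300 and 100 millisecond tick intervals mentioned earlier, here is a 
small sketch.  It assumes exactly one record per interval for the 
whole 6.5 hour session, which is an upper bound and not a statement 
of what IB actually delivers.  The helper name samples_per_session 
is made up for illustration.

```sh
#!/bin/sh
# Upper bound on records for one symbol over one NYSE session
# (09:30 to 16:00 US ET), ASSUMING one record per interval.
SESSION_MS=$(( (6 * 60 + 30) * 60 * 1000 ))     # 6.5 hours in milliseconds

# samples_per_session INTERVAL_MS   (hypothetical helper name)
samples_per_session() {
        echo $(( SESSION_MS / $1 ))
}

echo "1 second bars:   $(samples_per_session 1000)"
echo "300 mSec ticks:  $(samples_per_session 300)"
echo "100 mSec ticks:  $(samples_per_session 100)"
```

Under that assumption, the 100 millisecond case is ten times the 
one-second bar count for each symbol, each day.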

HistoryBar is designed to hold _summarized_ OHLC data:

mysql> describe HistoryBar ;
+----------+------------------+------+-----+---------+----------------+
| Field    | Type             | Null | Key | Default | Extra          |
+----------+------------------+------+-----+---------+----------------+
| uid      | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| cid      | int(10) unsigned | NO   | MUL |         |                |
| bid      | int(10) unsigned | NO   | MUL |         |                |
| time     | datetime         | NO   |     |         |                |
| open     | decimal(10,4)    | NO   |     |         |                |
| high     | decimal(10,4)    | NO   |     |         |                |
| low      | decimal(10,4)    | NO   |     |         |                |
| close    | decimal(10,4)    | NO   |     |         |                |
| vol      | int(11)          | NO   |     |         |                |
| wap      | decimal(10,4)    | NO   |     |         |                |
| has_gaps | tinyint(1)       | NO   |     |         |                |
+----------+------------------+------+-----+---------+----------------+

The shim command language query elements can be built up with 
a trivial shell script:

[herrold at centos-4 shim.070928]$ cat ../get_day.sh
#!/bin/sh
#
#       emit a series of commands to get a day's worth of History
#       on Contract.uid 15 (AIG), using PastFilter.uid 13
for i in ` seq 9 16`; do
         [ $i -gt 9 ] && {
                 echo -n "past add 15 13 Ymd_T(20071009  "
                 echo "${i}:00:00);"
                 echo "wait 10;"
                 }
         [ $i -lt 16 ] && {
                 echo -n "past add 15 13 Ymd_T(20071009  "
                 [ $i -lt 10 ] && echo -n "0"
                 echo "${i}:30:00);"
                 echo "wait 10;"
                 }
done
echo "quit;";
#
[herrold at centos-4 shim.070928]$ ../get_day.sh
past add 15 13 Ymd_T(20071009  09:30:00);
wait 10;
past add 15 13 Ymd_T(20071009  10:00:00);
wait 10;
past add 15 13 Ymd_T(20071009  10:30:00);
wait 10;
past add 15 13 Ymd_T(20071009  11:00:00);
wait 10;
past add 15 13 Ymd_T(20071009  11:30:00);
wait 10;
past add 15 13 Ymd_T(20071009  12:00:00);
wait 10;
past add 15 13 Ymd_T(20071009  12:30:00);
wait 10;
past add 15 13 Ymd_T(20071009  13:00:00);
wait 10;
past add 15 13 Ymd_T(20071009  13:30:00);
wait 10;
past add 15 13 Ymd_T(20071009  14:00:00);
wait 10;
past add 15 13 Ymd_T(20071009  14:30:00);
wait 10;
past add 15 13 Ymd_T(20071009  15:00:00);
wait 10;
past add 15 13 Ymd_T(20071009  15:30:00);
wait 10;
past add 15 13 Ymd_T(20071009  16:00:00);
wait 10;
quit;
[herrold at centos-4 shim.070928]$

Which we can then feed into a shim instance using the standard 
pipe shell construct:

 	../get_day.sh | ./shim --data logd

which will emit the day's data through syslog into a file (here 
the one configured in /etc/syslog.conf; i.e., /var/log/messages).

Or to standard output:

 	../get_day.sh | ./shim --data cout

where it may be captured into a file:

 	../get_day.sh | ./shim --data cout > shimout.txt 2> /dev/null

Or of course, as is the stock configuration, into the shim's 
own detail file:

 	../get_day.sh | ./shim --data

which places the information in ShimText in the current 
working directory, for later manual parsing.  Tick data is no 
different in its ability to reach syslog, 'cout', or 
'ShimText', but is not well structured for an immediate 
database insert.
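
If the contract, filter, and date need to vary, the get_day.sh 
idea can be wrapped in a function.  This is only a sketch: it 
assumes the 'past add' and 'wait' command forms are exactly those 
in the get_day.sh output shown earlier, and the name emit_day is 
hypothetical.  It uses printf in place of 'echo -n', which not 
every /bin/sh handles the same way.

```sh
#!/bin/sh
# emit_day CID FID YYYYMMDD   (hypothetical helper name)
# Emit one 'past add' request per half hour from 09:30 to 16:00,
# in the same form as the get_day.sh output shown earlier.
emit_day() {
        cid=$1 fid=$2 day=$3
        h=9 m=30
        # loop until the 16:00:00 request has been written
        while [ "$h" -lt 16 ] || { [ "$h" -eq 16 ] && [ "$m" -eq 0 ]; }; do
                printf 'past add %s %s Ymd_T(%s  %02d:%02d:00);\n' \
                        "$cid" "$fid" "$day" "$h" "$m"
                printf 'wait 10;\n'
                m=$(( m + 30 ))
                if [ "$m" -ge 60 ]; then
                        m=0
                        h=$(( h + 1 ))
                fi
        done
        echo "quit;"
}

# Same contract, filter, and day as get_day.sh
emit_day 15 13 20071009
```

The output can be piped into the shim in the same way as before, 
for example:  emit_day 15 13 20071009 | ./shim --data cout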

The History data will also end up in HistoryLog as OHLC detail 
rows.

[Caveat:  We noticed today that the current shim can over-run 
the history database insert this way.  Process id 9894 pulled 
those 25k lines (3.4 million characters) from IB in about two 
minutes.  We'll push a new release addressing this in a bit.]

[herrold at centos-4 shim.070928]$ sudo tail -2 /var/log/messages ; \
 	sudo grep 9894 /var/log/messages | wc
Oct  9 16:10:55 centos-4 :  9894|58252| 132062070|3| 1|
 	1|20071009  15:59:58|70.09|70.1|70.09|70.1|
 	18|70.09|false|STK.SMART.AIG.
Oct  9 16:10:55 centos-4 :  9894|58252| 132062089|3| 1|
 	1|20071009  15:59:59|70.11|70.12|70.11|70.12|
 	10|70.12|false|STK.SMART.AIG.
   25264  277849 3421453
[herrold at centos-4 shim.070928]$
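
History lines like the two above (which are wrapped here for mail, 
but are single lines in the log) can be reduced to CSV with awk.  
This is a sketch under stated assumptions: the field positions are 
inferred from the sample lines only, taking fields 7 through 14 of 
the '|' separated record to be time, open, high, low, close, vol, 
wap, and has_gaps in the column order of HistoryBar.  That mapping 
is not confirmed by the shim documentation, and the name 
history_to_csv is hypothetical.  Lines with fewer than 15 fields 
are skipped as a crude filter for non-history messages.

```sh
#!/bin/sh
# history_to_csv   (hypothetical helper name)
# Read unwrapped history lines on stdin, write CSV on stdout:
#   time,open,high,low,close,vol,wap,has_gaps
# ASSUMPTION: field positions are inferred from the sample lines.
history_to_csv() {
        awk -F'|' '
        NF >= 15 {
                # trim the padding around every field
                for (i = 1; i <= NF; i++)
                        gsub(/^[ \t]+|[ \t]+$/, "", $i)
                d = $7
                # 20071009  15:59:58  ->  2007-10-09 15:59:58
                day = substr(d, 1, 4) "-" substr(d, 5, 2) "-" substr(d, 7, 2)
                ts = day " " substr(d, length(d) - 7)
                gaps = ($14 == "true") ? 1 : 0
                printf "%s,%s,%s,%s,%s,%s,%s,%d\n",
                        ts, $8, $9, $10, $11, $12, $13, gaps
        }'
}

# Demonstration on the first sample line, unwrapped
printf '%s\n' 'Oct  9 16:10:55 centos-4 :  9894|58252| 132062070|3| 1| 1|20071009  15:59:58|70.09|70.1|70.09|70.1| 18|70.09|false|STK.SMART.AIG.' |
        history_to_csv
```

The cid and bid columns of HistoryBar are not derived here; they 
would have to come from the request that produced the data.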

Tick data needs more massaging as there is less structure to 
it, as is well known.

-- Russ Herrold

