[ts-gen] Database for tick data
R P Herrold
herrold at owlriver.com
Tue Oct 9 16:30:10 EDT 2007
On Tue, 9 Oct 2007, an anonymous poster wrote:
> Anyhow, I was hoping that the historybar could store the
> current flow of incoming ticks.
The real issue is that the shim does not have a database
designed to hold non-summarized (raw) tick information. For
at least the last couple of years, IB has provided ticks at
300 ms intervals. IB has recently moved to 100 ms intervals
for some symbols.
Doing some math, we end up with an expected maximum 'detail
line' count on some data thus:
One-second OHLC history for a single NYSE symbol (09:30 to
16:00 US ET) spans 6.5 hours, or 390 minutes; at 60 seconds
per minute, that is 23400 samples.
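The same arithmetic can be extended to the tick intervals mentioned above. This is only a rough sketch: the function name samples_per_session is made up for illustration, and it assumes one sample arrives in every interval for the whole 390-minute session, so the results are upper bounds rather than observed counts.

```shell
#!/bin/sh
# samples_per_session INTERVAL_MS
# Hypothetical helper: upper bound on samples for one symbol in one
# regular NYSE session, assuming one sample per interval.
samples_per_session() {
    interval_ms=$1
    session_ms=$((390 * 60 * 1000))   # 09:30 to 16:00 ET = 390 minutes
    echo $((session_ms / interval_ms))
}

samples_per_session 1000   # one-second bars  -> 23400
samples_per_session 300    # 300 ms ticks     -> 78000
samples_per_session 100    # 100 ms ticks     -> 234000
```

So a raw tick store at 100 ms could see roughly ten times as many rows per symbol per day as one-second history does.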
HistoryBar is designed to hold _summarized_ OHLC data:
mysql> describe HistoryBar ;
+----------+------------------+------+-----+---------+----------------+
| Field    | Type             | Null | Key | Default | Extra          |
+----------+------------------+------+-----+---------+----------------+
| uid      | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| cid      | int(10) unsigned | NO   | MUL |         |                |
| bid      | int(10) unsigned | NO   | MUL |         |                |
| time     | datetime         | NO   |     |         |                |
| open     | decimal(10,4)    | NO   |     |         |                |
| high     | decimal(10,4)    | NO   |     |         |                |
| low      | decimal(10,4)    | NO   |     |         |                |
| close    | decimal(10,4)    | NO   |     |         |                |
| vol      | int(11)          | NO   |     |         |                |
| wap      | decimal(10,4)    | NO   |     |         |                |
| has_gaps | tinyint(1)       | NO   |     |         |                |
+----------+------------------+------+-----+---------+----------------+
The shim command language query elements can be built up with
a trivial shell script:
[herrold at centos-4 shim.070928]$ cat ../get_day.sh
#!/bin/sh
#
# emit a series of commands to get a day's worth of History
# on Contract.uid 15 (AIG), using PastFilter.uid 13
for i in `seq 9 16`; do
    [ $i -gt 9 ] && {
        echo -n "past add 15 13 Ymd_T(20071009 "
        echo "${i}:00:00);"
        echo "wait 10;"
    }
    [ $i -lt 16 ] && {
        echo -n "past add 15 13 Ymd_T(20071009 "
        [ $i -lt 10 ] && echo -n "0"
        echo "${i}:30:00);"
        echo "wait 10;"
    }
done
echo "quit;";
#
[herrold at centos-4 shim.070928]$ ../get_day.sh
past add 15 13 Ymd_T(20071009 09:30:00);
wait 10;
past add 15 13 Ymd_T(20071009 10:00:00);
wait 10;
past add 15 13 Ymd_T(20071009 10:30:00);
wait 10;
past add 15 13 Ymd_T(20071009 11:00:00);
wait 10;
past add 15 13 Ymd_T(20071009 11:30:00);
wait 10;
past add 15 13 Ymd_T(20071009 12:00:00);
wait 10;
past add 15 13 Ymd_T(20071009 12:30:00);
wait 10;
past add 15 13 Ymd_T(20071009 13:00:00);
wait 10;
past add 15 13 Ymd_T(20071009 13:30:00);
wait 10;
past add 15 13 Ymd_T(20071009 14:00:00);
wait 10;
past add 15 13 Ymd_T(20071009 14:30:00);
wait 10;
past add 15 13 Ymd_T(20071009 15:00:00);
wait 10;
past add 15 13 Ymd_T(20071009 15:30:00);
wait 10;
past add 15 13 Ymd_T(20071009 16:00:00);
wait 10;
quit;
[herrold at centos-4 shim.070928]$
Which we can then feed into a shim instance using the standard
pipe shell construct:
../get_day.sh | ./shim --data logd
which will emit the day's data into a file (here the one
pointed to by /etc/syslog.conf, i.e., /var/log/messages).
Or to the standard out:
../get_day.sh | ./shim --data cout
where it may be captured into a file:
../get_day.sh | ./shim --data cout > shimout.txt 2> /dev/null
Or of course, as is the stock configuration, into the shim's
own detail file:
../get_day.sh | ./shim --data
which places the information in ShimText in the current
working directory, for later manual parsing. Tick data is no
different in its ability to reach syslog, 'cout', or
'ShimText', but is not well structured for an immediate
database insert.
The History data will also end up in HistoryLog as OHLC detail
rows.
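The get_day.sh script above hard-codes the date, the Contract.uid and the PastFilter.uid. A parameterised variant might look like the sketch below. The function name emit_day and its argument order are made up for illustration; the emitted commands simply mirror the script's output format, and printf is used in place of 'echo -n' so that it does not depend on /bin/sh being bash.

```shell
#!/bin/sh
# emit_day DATE CID FID [PAUSE]
# Hypothetical generalisation of get_day.sh: emit 'past add' commands
# for every half-hour mark from 09:30 to 16:00 on DATE (YYYYMMDD),
# for Contract.uid CID and PastFilter.uid FID, with a 'wait' of PAUSE
# (default 10) after each request.
emit_day() {
    day=$1
    cid=$2
    fid=$3
    pause=${4:-10}
    m=570                        # 09:30 as minutes after midnight
    while [ "$m" -le 960 ]; do   # 960 minutes = 16:00
        printf 'past add %s %s Ymd_T(%s %02d:%02d:00);\n' \
            "$cid" "$fid" "$day" $((m / 60)) $((m % 60))
        printf 'wait %s;\n' "$pause"
        m=$((m + 30))
    done
    printf 'quit;\n'
}
```

With the function defined, 'emit_day 20071009 15 13' should reproduce the get_day.sh output shown earlier, and it can be piped into the shim in the same way, e.g. 'emit_day 20071009 15 13 | ./shim --data cout'.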
[Caveat: We noticed today that the current shim can over-run
the history database insert this way (it pulled those 25k
lines (3.4 million characters) in about two minutes from IB in
process id: 9894); we'll push a new release addressing this in
a bit.]
[herrold at centos-4 shim.070928]$ sudo tail -2 /var/log/messages ; \
sudo grep 9894 /var/log/messages | wc
Oct 9 16:10:55 centos-4 : 9894|58252| 132062070|3| 1|
1|20071009 15:59:58|70.09|70.1|70.09|70.1|
18|70.09|false|STK.SMART.AIG.
Oct 9 16:10:55 centos-4 : 9894|58252| 132062089|3| 1|
1|20071009 15:59:59|70.11|70.12|70.11|70.12|
10|70.12|false|STK.SMART.AIG.
25264 277849 3421453
[herrold at centos-4 shim.070928]$
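For later manual parsing, history lines like the two above can be split on the '|' delimiter. The sketch below is an assumption-laden illustration, not part of the shim: the function name history_to_csv is made up, and the mapping of fields 7 through 15 to time, open, high, low, close, vol, wap, has_gaps and symbol is inferred from the HistoryBar column order and the sample lines rather than from any documented format. It also assumes each record sits on a single line, not wrapped as in this mail.

```shell
#!/bin/sh
# history_to_csv: read shim history lines on stdin, write CSV on stdout.
# ASSUMPTION: fields 7..15 are time, open, high, low, close, vol, wap,
# has_gaps, symbol (inferred from the sample lines, not documented).
history_to_csv() {
    awk -F'|' '
        NF >= 15 {
            # Strip the padding blanks around each field.
            for (i = 1; i <= NF; i++) gsub(/^ +| +$/, "", $i)
            print $7 "," $8 "," $9 "," $10 "," $11 "," \
                  $12 "," $13 "," $14 "," $15
        }'
}
```

Something like 'sudo grep 9894 /var/log/messages | history_to_csv > aig.csv' would then give a file that is easier to load into a table by hand.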
Tick data needs more massaging as there is less structure to
it, as is well known.
-- Russ Herrold