[ts-gen] data output post

R P Herrold herrold at owlriver.com
Sun Oct 26 12:59:10 EDT 2008


initially posted by me to the IBPY mailing list

On Tue, Oct 21, 2008 at 9:29 AM, vfulco1 at gmail.com 
<vfulco1 at gmail.com> wrote:

> Wondering if any list members might have best practices 
> suggestions for collecting RTD for further manipulation?

As you might remember, Vince, we (in the trading-shim project, 
inspired by Troy's work [thanks, Troy]) emit output into 
several forms [including RTD of several forms].

As I am sure you are finding, since you ask this question, 
one can easily end up with mountains of data in fairly short 
order, depending on the time frame for further use and the 
management approach [collecting huge piles of numbers is 
trivial; making sense of it, less so]:

[0. stdout, of course -- 'In the beginning was the command 
line']

1.  Into a file with a well defined layout as 'plain old 
text', so that it can be post processed ad infinitum

2.  The file can be a pipe, or be 'tee' -ed into a pipe, post 
processed as a stream, and dropped

3.  The file can be sliced and diced into CSVs (the 'tree' 
project uses well defined and well-named 'csv' -ish result 
files)

4.  We can emit a well-formed UDP stream -- think: man logger 
-- for a connectionless data stream, with acceptance of drops

5.  Multicast is a trivial extension of generic UDP.  We have 
not expressly written it, as it is trivial, a la method 2, to 
tee into a multicast broadcaster

6.  In-process [Python] handling, or cross-process native 
'pickle', were mentioned by Troy and make perfect sense [for 
IBPY].

7.  I helped drill SQLite into Red Hat's offerings a few years 
ago at v2 to have a lightweight sqlish ondisk database -- JBJ 
added 'hooks' for RPM to do time trials against Berkeley DB 
(BDB is MUCH faster, at the expense of a less portable 
syntax);  Seth picked it up in YUM as well, so it is in there 
to stay. [We do NOT carry it in the shim, but I mention it 
for completeness]

8.  We use plain old socket based SQL (implemented via MySQL 
presently), but know exactly the change points to drill in 
Oracle, PostgreSQL, and DB2;  some customer will pay us to do 
those additions ;)

9.  We may add an 'in shim' generation of a stream-xml markup, 
which would let us emit the subclass markup of a FIX stream

Our internal output handlers were consciously chosen at 
requirements and design time to give us this flexibility.
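
To make the handler idea concrete, here is a minimal sketch in 
Python (chosen only because this thread started on the IBPY 
list).  It is NOT the shim's code: the Tick record, its field 
names, and the text_sink / csv_sink / sqlite_sink / emit helpers 
are all hypothetical, and an in-memory SQLite table stands in 
for a real SQL back end.  It only shows one record being fanned 
out to several interchangeable outputs (methods 1, 3 and 8 
above):

```python
import csv
import io
import sqlite3
from collections import namedtuple

# Hypothetical record layout; the shim's real fields are not given in this post.
Tick = namedtuple("Tick", ["ts", "symbol", "price", "size"])


def text_sink(stream):
    """Handler that writes one pipe-delimited 'plain old text' line per tick."""
    def handle(tick):
        stream.write("|".join(str(v) for v in tick) + "\n")
    return handle


def csv_sink(stream):
    """Handler that writes one CSV row per tick."""
    writer = csv.writer(stream, lineterminator="\n")

    def handle(tick):
        writer.writerow(tick)
    return handle


def sqlite_sink(conn):
    """Handler that inserts each tick into a SQL table (SQLite as a stand-in)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ticks "
        "(ts REAL, symbol TEXT, price REAL, size INTEGER)"
    )

    def handle(tick):
        conn.execute("INSERT INTO ticks VALUES (?, ?, ?, ?)", tick)
        conn.commit()
    return handle


def emit(tick, handlers):
    """Fan one record out to every configured output handler."""
    for handle in handlers:
        handle(tick)


# Usage: in-memory streams and database so the sketch has no side effects.
text_out = io.StringIO()
csv_out = io.StringIO()
db = sqlite3.connect(":memory:")
handlers = [text_sink(text_out), csv_sink(csv_out), sqlite_sink(db)]

emit(Tick(1225040350.0, "IBM", 84.25, 300), handlers)
```

Swapping io.StringIO for sys.stdout, an open file, or a pipe 
covers methods 0 through 2 without touching the code that 
produces the records, which is the point of keeping the 
handlers separate from the data source.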

[Later note: the UDP logger method is perhaps vestigial, but 
it really helped me to be able to catch the data stream more 
easily early on in our project]
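
For readers who have not used the connectionless approach of 
method 4, a rough Python sketch follows.  The send_line helper, 
the loopback address, and the sample payload are assumptions 
made for illustration and say nothing about the shim's actual 
wire format.  The receiver treats a timeout as an acceptable 
drop, since UDP gives no delivery guarantee:

```python
import socket


def send_line(sock, addr, line):
    """Send one text record as a single UDP datagram (fire and forget)."""
    sock.sendto(line.encode("ascii"), addr)


# Receiver: bind to loopback; port 0 asks the OS for any free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
addr = receiver.getsockname()

# Sender: no connection set-up is needed, just address each datagram.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_line(sender, addr, "1225040350.0|IBM|84.25|300")

try:
    data, _ = receiver.recvfrom(4096)
    received = data.decode("ascii")
except socket.timeout:
    received = None  # a dropped datagram is tolerated, not retried
finally:
    sender.close()
    receiver.close()
```

Because the sender never waits on the listener, a slow or 
absent consumer cannot stall the producer, which is what makes 
this handy for casually watching a data stream.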

I hope this helps.

-- Russ herrold

