
One year ago I was a stock market analyst [Part 1]
68 weeks ago

I'm in the final stages of the project I mentioned a few posts ago and it's coming together really well, so I was thinking about the best way to explain what's so good about it and why I think I did a good job with it. This reminded me of my past projects, and I realized I had never told anyone about any of them.

This seems very unlikely now, but a year ago I was a kind of stock market analyst. It was a very, very technical job, but in the end it was stock market analysis. There were about a thousand tasks I was responsible for, and I can't really talk about quite a few of them, so I will focus on the technical side. Read on if you're interested in programming or stock market technology in general.

What we were trying to accomplish was to automate stock trading as much as possible. Obviously this starts with data collection and analysis: we had a QuotePlus subscription for dailies and IQFeed for live intraday data. I implemented access modules for both of them in Python. The next step was to analyze the data. We were primarily aiming for active trading strategies, and for that we were trying to find patterns that evolve in such a way that, statistically, we could make a profit by catching one of the price changes that make up the pattern. Even once we had a pattern, we needed to find the best parameters to maximize the profit, minimize the risk and optimize some other characteristics. This meant scanning through all the data we had countless times. It also involved a lot of programming, so using C or C++ was not an option; we wouldn't be caught dead using one of those monsters. So we had to get as much performance out of Python as we could.
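To give a feel for why the same data gets scanned over and over, here is a minimal sketch of an exhaustive parameter search. It is not the original code: the trading rule (buy after a fall of at least `drop`, sell `hold` bars later) and the names `backtest` and `best_parameters` are invented purely for illustration, and it scores profit only, ignoring risk.

```python
from itertools import product


def backtest(closes, drop, hold):
    """Score a toy rule on one price series.

    Hypothetical rule: buy at the close of any bar that fell by at least
    `drop` (a fraction, e.g. 0.05) from the previous close, then sell
    `hold` bars later. Returns the sum of the per-trade returns.
    """
    profit = 0.0
    for i in range(1, len(closes) - hold):
        if closes[i] <= closes[i - 1] * (1.0 - drop):
            profit += closes[i + hold] / closes[i] - 1.0
    return profit


def best_parameters(closes, drops, holds):
    """Try every (drop, hold) pair and return the highest-scoring one."""
    return max(product(drops, holds), key=lambda p: backtest(closes, *p))
```

Every candidate pair triggers a full pass over the series, so with thousands of symbols and a realistic grid the raw read speed of the data store quickly becomes the bottleneck.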

We were scanning dailies for the most part, and QuotePlus is very slow for such a demanding application. I tried to see if any existing DB module could give us the speed we needed. I tried marshal, DBM, SQLite, MetaKit, PyTables and probably some others I've forgotten; none of them were even close. It seemed that we would not be able to scan through the data fast, but I had a go at implementing my own simplistic database optimized for our task. And voila -- it was blazing fast. It was almost unbelievable: it beat everything else I had tried by an order of magnitude, while the entire source code for it was... 1.7K of pure Python. No, seriously.
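The post does not show the format itself, so the following is only a guess at what a tiny task-specific store of this kind could look like: one flat binary file per symbol, holding fixed-width daily bars. The record layout and the names `BAR`, `write_bars` and `read_bars` are assumptions made for this sketch.

```python
import struct

# Assumed record layout (not taken from the original code): date as a
# YYYYMMDD integer, open/high/low/close as 32-bit floats, volume as a
# 32-bit unsigned integer. Little-endian, 24 bytes per daily bar.
BAR = struct.Struct("<IffffI")


def write_bars(path, bars):
    """Write (date, open, high, low, close, volume) tuples to one file."""
    with open(path, "wb") as f:
        for bar in bars:
            f.write(BAR.pack(*bar))


def read_bars(path):
    """Read a symbol's whole history in one call and decode every bar."""
    with open(path, "rb") as f:
        data = f.read()
    return list(BAR.iter_unpack(data))
```

With fixed-width records there is no parsing, no index and no query layer: a scan is one sequential read per symbol followed by a single `struct` pass, which is plausibly where most of the speed of such a design would come from.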

The whole QuotePlus database takes about 2 GB of disk space. It fits on a CD when RAR'ed. Converted to my own format (no compression, just raw data) it took a whopping 268 MB. And if you enabled NTFS file compression on that folder, it shrank to 229 MB (it only mattered a little because of disk caching). Remember, this is the daily trading history of almost 9,000 stocks, some of them with 10 years of history or more. So how long did a scan take? A cold 'dummy' scan (no processing, and the data not yet in cache) of all that took less than 2 minutes on a rather dated AMD machine with 512 MB of DDR1 RAM and an IDE drive. Give it more memory and use SATA (like we did later) and it takes less than 1 minute. That's impressive. We are talking about the full daily history of 8,670 stocks and indexes!!
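For anyone wanting to repeat this kind of measurement, a 'dummy' scan can be sketched as below. This is an illustration rather than the original benchmark; the function name `dummy_scan` and the assumption that the data lives as plain files under one folder are mine.

```python
import os
import time


def dummy_scan(folder):
    """Read every file under `folder` once, doing no processing.

    Returns (number of files, total bytes read, elapsed seconds).
    """
    files = 0
    total = 0
    start = time.perf_counter()
    for root, _dirs, names in os.walk(folder):
        for name in names:
            with open(os.path.join(root, name), "rb") as f:
                total += len(f.read())
            files += 1
    return files, total, time.perf_counter() - start
```

Only the first run after a reboot counts as cold; a repeat run mostly measures the operating system's disk cache. Taking the figures above at face value, 268 MB in under two minutes is a sustained rate of a little over 2 MB per second, and 268 MB spread over 8,670 symbols averages roughly 31 KB per symbol.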

Now, how many other stock market analysts can do that? :) OK, I think that's it for today, I'll have to write part two sometime later.
Cheers!
