Getting your hands on high-quality financial data can be really challenging, but once you do, it's totally worth it.
Why? Let's find out.
But first we must ask ourselves: how important is it to have a good, reliable source of data?
Well, it should be the most important thing in the world for you if you care about building reliable strategies 🙂
Before you start gathering data from your vendor and building tools to interact with it, you want to run checks and make sure you are working with a reliable source. Not doing so can have significant consequences later that will cost you time, money and energy. You'll want to verify at least once that you are dealing with a reliable vendor and, from that point onwards, focus on the things that matter to you: analyzing the data, and building and testing strategies.
Finding out later that there were issues with the data will force you to switch vendors, spend time fixing many things, and spend more money, as you might need to re-download everything from scratch since you don't know how far the mistakes reach.
Which tests are good to run on any data vendor?
Some of the actions you can take when testing whether or not a data vendor can meet your needs:
- Where is the data coming from? Make sure the data comes directly from the sources. You don't want to deal with 3rd party vendors or other entities that mess with the data. The fewer middlemen the data went through, the better. You most likely would not want to go straight to the sources yourself, as that would be inefficient for an individual and cost way more, but your ideal scenario is that the vendor you work with gets all of its data from the sources.
- Understand how the data is normalized and cleaned. This is important so you know whether any data is omitted or changed after it is retrieved from the sources.
- Granularity of the data: what's the depth of the data covered by the vendor? Is it missing some parts, markets or universes that you want? If the vendor has only partial datasets, that will double your work and force you to work with multiple data providers, when you can easily find one that has it all.
- How frequently does the vendor update the data? This is very important so you know what to expect and can be prepared in case you need daily updates. It also gives you an indication of how important data quality is to the vendor you're working with.
Here are the important things to check when choosing a financial data vendor:
- Splits and Dividends
- Delisted securities
- Outstanding shares
- Symbol changes
- Mergers and acquisitions
- Which markets are covered? NASDAQ, NYSE, OTC
- Fundamentals and SEC-Filings
Let's understand the importance of each one:
Splits and Dividends
It's unbelievably important to have a good data source for splits and dividends, in addition to high-quality market data (open/high/low/close/volume, etc.).
Why? Because when pulling historical data you will need both adjusted and unadjusted data. Splits and dividends allow you to construct the adjusted data, and inaccurate or missing splits or dividends will result in inaccurate adjusted data. This will hurt your analysis and skew your tests.
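To make this concrete, here is a minimal sketch of how adjusted closes can be rebuilt from unadjusted closes plus the corporate actions. The helper name `back_adjust`, the input shapes and the sample prices are made up for illustration, and the dividend factor (`1 - dividend / previous close`) is one common convention that I'm assuming here, not necessarily what any given vendor uses.

```python
from datetime import date

def back_adjust(closes, splits, dividends):
    """Back-adjust closes for splits and cash dividends (hypothetical helper).

    closes:    list of (date, unadjusted close), sorted by date ascending
    splits:    dict {ex_date: ratio}, e.g. 2.0 for a 2-for-1 split
    dividends: dict {ex_date: cash amount per share}
    """
    adjusted = []
    factor = 1.0
    # Walk backwards so each event only rescales the bars before its ex-date.
    for i in range(len(closes) - 1, -1, -1):
        day, close = closes[i]
        adjusted.append((day, round(close * factor, 4)))
        if i > 0:
            if day in splits:
                factor /= splits[day]
            if day in dividends:
                prev_close = closes[i - 1][1]
                factor *= 1.0 - dividends[day] / prev_close
    adjusted.reverse()
    return adjusted

# Made-up prices around a 2-for-1 split with ex-date 2024-01-04.
closes = [
    (date(2024, 1, 2), 100.0),
    (date(2024, 1, 3), 102.0),
    (date(2024, 1, 4), 51.5),
    (date(2024, 1, 5), 52.0),
]
splits = {date(2024, 1, 4): 2.0}

with_split = back_adjust(closes, splits, {})
missing_split = back_adjust(closes, {}, {})
print([c for _, c in with_split])     # smooth series
print([c for _, c in missing_split])  # looks like a ~50% overnight crash
```

If the vendor misses that one split, the "adjusted" series still contains the artificial drop from 102.0 to 51.5, and any backtest that touches those bars will treat it as a real move.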
Delisted securities
Keeping delisted securities and not deleting them from the database is vital. When you test strategies and scan historical data, you want to get all results and not exclude anything. Deleting delisted securities is a common thing data vendors do, and it produces survivorship bias in your results. SUPER IMPORTANT.
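A toy example shows how much survivorship bias can distort a result. The tickers and returns below are invented purely for illustration, and `average_return` is a hypothetical helper, not part of any vendor API.

```python
from statistics import mean

# Invented universe: total return over the test period, plus listing status.
universe = [
    {"ticker": "AAA", "total_return": 0.40, "delisted": False},
    {"ticker": "BBB", "total_return": 0.10, "delisted": False},
    {"ticker": "CCC", "total_return": -0.90, "delisted": True},
    {"ticker": "DDD", "total_return": -0.60, "delisted": True},
]

def average_return(securities, include_delisted):
    """Average total return, with or without the delisted names."""
    rows = [s for s in securities if include_delisted or not s["delisted"]]
    return mean(s["total_return"] for s in rows)

survivors_only = average_return(universe, include_delisted=False)
full_universe = average_return(universe, include_delisted=True)
print(f"survivors only: {survivors_only:+.2%}")
print(f"full universe:  {full_universe:+.2%}")
```

The same "strategy" looks profitable on the survivors and loses money on the full universe, and the only difference is whether the delisted names were kept.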
Outstanding shares
Having a reliable outstanding shares dataset is extremely important. Outstanding shares determine the market cap of a company, a very significant metric that many traders base their decisions on or build strategies around. It's very important that the vendor you work with obtains all of its o/s data directly from the sources, which are the SEC-filings, and not from other 3rd party websites, as many do. Another challenge is to get the o/s from all of a company's filings and not only the main ones, which are the 10-Qs and 10-Ks. You'll want to get them from the S-3, F-3, 424, 10-Q, 10-K, 20-F, etc., as the count can change in any of them, and you need to store them all to have good point-in-time data.
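Here is a small sketch of what "point-in-time" means in practice. The filing history is invented, and `shares_as_of` is a hypothetical helper that assumes the history is already sorted by filing date.

```python
from bisect import bisect_right
from datetime import date

# Invented filing history for one company:
# (filing date, form type, shares outstanding reported in that filing).
filings = [
    (date(2023, 3, 15), "10-K", 10_000_000),
    (date(2023, 5, 10), "10-Q", 10_200_000),
    (date(2023, 6, 20), "424B5", 14_000_000),  # offering between quarterly reports
    (date(2023, 8, 9), "10-Q", 14_100_000),
]

def shares_as_of(history, as_of, forms=None):
    """Latest share count known on `as_of`, optionally limited to some form types."""
    rows = [r for r in history if forms is None or r[1] in forms]
    dates = [r[0] for r in rows]
    idx = bisect_right(dates, as_of)  # number of filings dated on or before as_of
    return rows[idx - 1][2] if idx else None

as_of = date(2023, 7, 1)
print(shares_as_of(filings, as_of))                          # uses every filing
print(shares_as_of(filings, as_of, forms={"10-K", "10-Q"}))  # main filings only
```

On 2023-07-01 the full history already knows about the offering, while a dataset built only from the 10-Qs and 10-Ks is still reporting the pre-offering count.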
Symbol changes
Companies change their ticker symbols all the time. It's not an uncommon thing to see. A really good vendor will have both tickers (before and after the change) stored in its database, as it's useful in many cases to see past historical data with both tickers in place. Unfortunately, many data providers don't put enough emphasis on this.
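One way to picture this is a symbol history keyed by a permanent security id, so that old and new tickers both resolve to the same security. Everything below (the id format, the tickers, the `resolve` and `all_tickers` helpers) is a made-up sketch, not a description of any vendor's actual schema.

```python
from datetime import date

# Invented symbol history keyed by a permanent security id.
# Each entry: (ticker, first valid date, last valid date or None if current).
symbol_history = {
    "SEC-0001": [
        ("OLDT", date(2015, 1, 2), date(2021, 6, 30)),
        ("NEWT", date(2021, 7, 1), None),
    ],
}

def resolve(ticker, on_date, history=symbol_history):
    """Map a ticker, as it was known on `on_date`, to its permanent security id."""
    for security_id, symbols in history.items():
        for symbol, start, end in symbols:
            if symbol == ticker and start <= on_date and (end is None or on_date <= end):
                return security_id
    return None

def all_tickers(security_id, history=symbol_history):
    """Every ticker the security has traded under, oldest first."""
    return [symbol for symbol, _, _ in history[security_id]]

print(resolve("OLDT", date(2020, 1, 1)))
print(resolve("NEWT", date(2022, 1, 1)))
print(all_tickers("SEC-0001"))
```

With a mapping like this, a historical scan that matched the old ticker and a chart pulled under the new ticker both land on the same price history.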
Mergers and acquisitions
Having these and similar events recorded can really help build interesting fundamental data points and insights. It gives you more power, understanding and context around market events, which you can then incorporate into your analysis.
Which markets are covered?
Be sure to check carefully which markets your data vendor covers and what the universe of securities is. Having some but not all markets gives you fewer options for research. A great example of this is vendors that offer market data from NASDAQ & NYSE but no OTC data. You are missing an entire universe of securities to play around with 🙁
Omitting securities is bad, of course, as you'll miss many potential candidates for your strategies.
spikeet.com is the only platform that not only gives you high-quality financial data, all coming from the sources and covering all markets, including OTC, but also takes all of these things really seriously.
The exchanges disseminate market data in its raw form, which is tick data. You'll want to ask how the OHLCV aggregates are built: what calculations are made, and which guidelines are followed to create the bars?
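To show what "building the bars" involves, here is a deliberately naive sketch that rolls ticks up into 1-minute OHLCV bars. The function name `build_minute_bars` and the tick tuples are invented, and the big simplifying assumption is that every tick counts. Real feeds typically flag some trades as not eligible to update the open, high, low or close, and how a vendor handles those flags is exactly the kind of thing to ask about.

```python
from datetime import datetime

def build_minute_bars(ticks):
    """Aggregate (timestamp, price, size) ticks into 1-minute OHLCV bars.

    Naive on purpose: no filtering by trade condition, every tick is used.
    """
    bars = {}
    for ts, price, size in sorted(ticks, key=lambda t: t[0]):
        minute = ts.replace(second=0, microsecond=0)
        bar = bars.get(minute)
        if bar is None:
            # First tick of the minute sets all four prices.
            bars[minute] = {"open": price, "high": price, "low": price,
                            "close": price, "volume": size}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last tick seen so far in this minute
            bar["volume"] += size
    return bars

# Made-up ticks spanning two minutes.
ticks = [
    (datetime(2024, 1, 4, 9, 30, 1), 10.00, 100),
    (datetime(2024, 1, 4, 9, 30, 20), 10.50, 200),
    (datetime(2024, 1, 4, 9, 30, 59), 9.90, 50),
    (datetime(2024, 1, 4, 9, 31, 5), 10.10, 300),
]

for minute, bar in build_minute_bars(ticks).items():
    print(minute, bar)
```

Two vendors that start from the same ticks but apply different eligibility rules will give you different highs, lows and volumes for the same minute.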
Fundamentals and SEC-Filings
Extremely important! Many vendors out there provide inaccurate fundamental data taken from various questionable sources. Some examples of these data points are outstanding shares, float, institutional ownership, cash, liabilities, and other financial data. The reason they do this is that it's easy (and in some cases illegal!) to scrape data off some random website rather than actually extract it all from the sources, which are the SEC-filings. It takes a lot of hard work and creativity to make sure you are consistently pulling the data from the filings (which are very complex and messy, thank you SEC) and updating it every second, for every new filing that comes out... yes, every single filing 🙂
I don't need to explain the repercussions of having inaccurate fundamental data: it will cause major issues in your analysis, and in many cases it is tied to your technical data. For example, market cap is calculated as o/s * previous close price. Well, guess what happens if you have inaccurate outstanding shares in your historical records?
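A quick worked example, with invented numbers, of how a stale share count flows straight into market cap. The `market_cap` helper just applies the formula above.

```python
def market_cap(shares_outstanding, previous_close):
    """Market cap as o/s * previous close price."""
    return shares_outstanding * previous_close

previous_close = 2.50
true_shares = 14_000_000   # invented: count after a recent offering
stale_shares = 10_200_000  # invented: count from an older filing

true_cap = market_cap(true_shares, previous_close)
stale_cap = market_cap(stale_shares, previous_close)
error_pct = (stale_cap - true_cap) / true_cap * 100

print(f"true market cap:  ${true_cap:,.0f}")
print(f"stale market cap: ${stale_cap:,.0f}")
print(f"error: {error_pct:.1f}%")
```

With these numbers the stale count understates market cap by roughly 27%, which is enough to push the stock in or out of any scan that filters on market cap.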
At spikeet.com we take pride in implementing all of the points above, and quite frankly we've worked really hard on building some of them.
From the start we set ourselves an important priority: providing only high-quality data, no matter what. We'd rather not offer a certain data point than offer it inaccurately.
How are we implementing all of these things?
spikeet.com gets all of its data entirely from the sources. We never use 3rd party tools or vendors. That means the exchanges for technical data and the SEC-filings for fundamental data. We also consume and update data on a second/minute/daily basis, depending on the feed, and constantly run tests and comparisons to make sure our data is at the highest level.
spikeet.com has a huge list of splits and dividends going back 20+ years for all US equities. We cross-check everything and build both adjusted and unadjusted data from these events, which is crucial for historical data!
We keep all delisted securities and never delete any ticker or security, ensuring you can pull candidates that matched your historical scans, even if they are no longer active or trading.
We pull and parse all SEC-filings, every second of the day. We are able to extract all information from every filing that comes out 🙂 That gives us a huge advantage, as we are able to get outstanding shares and other earnings or fundamental data points directly from the source, the minute they are published. We don't go through any 3rd party vendors for our fundamental and earnings data. We are also able to combine dilutions and splits to make sure our outstanding share counts are as accurate as possible.
In the event of a ticker symbol change, we make sure to keep both the old and new tickers so that we have a point-in-time reference and you can see both tickers when looking at the security's historical data.
We're one of the few vendors that cover all US equity markets, including OTC.
So there you go: a one-stop shop for all your data needs. When pulling data from spikeet.com you can be certain you are getting the highest data quality out there, coming directly from the sources. We take extreme measures and build special tools so that you don't have to. Your job is to consume the data and start analyzing it, not to clean it, test it or question its integrity :)
Yours truly — Noam — A Data addict 🤓