How Investment Firms Should Use AI in 2026
February 28, 2026
I was looking at an NVIDIA 10-K a while back, running it through our system, and it caught something that past models would have buried or missed entirely.
Two customers. Majority of revenue. Right there in the filing, documented, flagged, cross-referenced against the earnings narrative. The kind of customer concentration risk that only becomes obvious after the quarter beats on every metric and everyone on CNBC acts surprised. The model read it the way a careful analyst would, not just pulling the number, but contextualizing what it meant for the earnings story being sold to the street. And 10x faster too.
Earlier versions of these models couldn't do that. They'd find the number. They wouldn't understand why it mattered. That gap closing is not a small thing.
I work at Terminal X. We build AI research systems for institutional investors: hedge funds, family offices, asset managers. I'm an AI engineer. My job is understanding what these systems can actually do, where they break, and how to build them to produce real output for real investment decisions. I've spent the last few years with a front-row seat to what works and what's mostly marketing.
This is what I've learned.
The moment everything changed
For a long time, AI in investment research meant a better search bar. You could ask a question and get an answer faster than you'd find it manually. Useful. Not transformative.
What changed the picture was function calling and MCP servers. The infrastructure that lets AI models talk to external tools, APIs, and data sources in real time. Suddenly you weren't just asking a model a question. You were giving it the ability to go get things, check things, run things, and chain those steps together without a human in the loop at each stage.
That's what made semi-autonomous research systems possible. A query comes in, the system decides which data sources are relevant, pulls from them simultaneously, runs the analysis, and surfaces a finding, all before you'd finished opening your second tab manually. We have thousands of individual pipelines that fire depending on what's being asked and what type of data is involved. Three years of work, encoding how investment-grade AI actually has to function. None of it existed before function calling made it viable.
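To make the pattern concrete, here is a minimal sketch of that dispatch loop in Python. Everything in it is a hypothetical stand-in: the tool names, the keyword router, and the stub data fetchers are invented for illustration. In a real function-calling or MCP setup, the model itself chooses which tools to invoke, and the tools hit live data sources rather than returning strings.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical "tools" standing in for real data-source connectors.
def fetch_filings(ticker):
    return f"{ticker}: latest 10-K sections"

def fetch_transcripts(ticker):
    return f"{ticker}: last earnings call transcript"

def fetch_prices(ticker):
    return f"{ticker}: 90-day price history"

TOOLS = {
    "filings": fetch_filings,
    "transcripts": fetch_transcripts,
    "prices": fetch_prices,
}

def route(query):
    """Decide which tools are relevant (a keyword stand-in for what
    the model decides itself via function calling)."""
    wanted = []
    if "risk" in query or "filing" in query:
        wanted.append("filings")
    if "earnings" in query or "call" in query:
        wanted.append("transcripts")
    if "price" in query:
        wanted.append("prices")
    return wanted or list(TOOLS)

def research(query, ticker):
    """Fan out to the selected tools in parallel, then hand the
    combined context to the next analysis step."""
    names = route(query)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda n: TOOLS[n](ticker), names))
    return dict(zip(names, results))

context = research("customer concentration risk in the filing", "NVDA")
```

The point of the structure is the chaining: the router's output feeds the fetchers, and the fetchers' output feeds the next analysis step, with no human gating each hop.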
The firms that understand this aren't just moving faster. They're accessing data that was effectively invisible before because no human would have had time to look.
Most firms are using AI wrong and blaming the model
Only 11% of family offices had implemented generative AI as of last year. Another 30% said they wanted to. That gap, between wanting it and actually building something that works, is where most of the industry is stuck right now.
And I get it, honestly. The firms that have tried it and walked away frustrated usually ran into the same problem: they asked too much at once and got mush back.
I see this constantly. Someone wants to compare 25 tickers across every metric: consensus estimates, price action, recent news, macro exposure, earnings history. They throw it all into a single prompt, get a wall of text that's technically accurate and practically useless, and conclude that AI isn't ready for serious investment work.
The model isn't the problem. The workflow is.
Think about how you'd actually assign this to an analyst. You wouldn't hand them 25 companies and say "go figure out everything about all of them." You'd start with a screen. Then you'd pull the top candidates and go deeper. Then you'd stress-test the ones that looked interesting. Each step has a defined scope and a clear output before the next one starts.
AI works the same way. The results when you break research into focused, sequential tasks (each one narrow enough to do well) are not incrementally better than the single-prompt approach. They're a different category of output entirely. The firms figuring this out aren't just more patient with the tool. They've rebuilt how they think about research workflows from scratch.
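The screen, deep-dive, stress-test decomposition above can be sketched as a pipeline of narrow steps, each with a defined scope and a defined output. The functions below are toy placeholders (a real version would make a focused model call at each stage); the shape of the workflow is the point, not the logic inside each step.

```python
def screen(tickers, metrics):
    """Step 1: narrow the universe with a cheap, well-defined filter."""
    scored = sorted(tickers, key=lambda t: metrics[t], reverse=True)
    return scored[:3]

def deep_dive(ticker):
    """Step 2: one focused task per surviving candidate, with a
    clear output contract for the next stage."""
    return {"ticker": ticker, "summary": f"fundamentals for {ticker}"}

def stress_test(report):
    """Step 3: challenge only the candidates that survived step 2."""
    report["stress_tested"] = True
    return report

def run_pipeline(tickers, metrics):
    # Each stage finishes and produces a defined output before the
    # next one starts, mirroring how you'd assign the work to analysts.
    shortlist = screen(tickers, metrics)
    reports = [deep_dive(t) for t in shortlist]
    return [stress_test(r) for r in reports]
```

Compare that to stuffing all 25 tickers and every metric into one prompt: the sequential version gives the model one narrow job at a time, which is exactly what it's good at.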
The lie that keeps circulating on X
Every few weeks someone posts that a new model can "one-shot a full DCF." I won't name names but the engagement is always good because it sounds impressive and most people don't think too hard about what that actually means.
Here's what it actually means: the model can pull historical financials from the filing, drop numbers into a template, and produce something that looks like a model. And yes, it can do that. That part works.
What it cannot do is the part analysts actually get paid for.
The assumptions. The judgment calls buried in a WACC estimate. The revenue growth rate that reflects a real view on the competitive dynamics of an industry, not a trend line extrapolated from the last four quarters. The terminal value that accounts for where a business is in its cycle. Those numbers look like inputs. They're actually the entire analytical product. A model with wrong assumptions is worse than no model because it gives you false precision to hide behind.
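A toy DCF makes the point concrete. The mechanics below are the part a model can "one-shot"; the output swings roughly 40% between two sets of assumptions that would each look defensible in a pitch deck. All numbers are illustrative, not a real valuation of anything.

```python
def dcf(fcf, growth, wacc, terminal_growth, years=5):
    """Present value of `years` of growing free cash flow plus a
    Gordon-growth terminal value. Pure mechanics; the analysis
    lives entirely in the four assumption arguments."""
    assert wacc > terminal_growth
    value = 0.0
    cash = fcf
    for t in range(1, years + 1):
        cash *= 1 + growth
        value += cash / (1 + wacc) ** t
    terminal = cash * (1 + terminal_growth) / (wacc - terminal_growth)
    value += terminal / (1 + wacc) ** years
    return value

# Two assumption sets that would each sound reasonable in a memo:
base = dcf(fcf=100, growth=0.10, wacc=0.09, terminal_growth=0.025)
bull = dcf(fcf=100, growth=0.12, wacc=0.08, terminal_growth=0.030)
```

Nudging growth by two points, WACC by one, and terminal growth by half a point moves the answer by about 40%. That spread is where the actual analytical work lives, and it's exactly the part the "one-shot DCF" posts skip over.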
Any AI company telling you their product replaces that process is either confused about what financial modeling is, or they're hoping you are. The honest version, and a few products are honest about this, is that AI can automate the data collection and template population that used to take a junior analyst half a day. That's real value. It's just not what the posts are implying.
The assumptions still require someone who's spent years building an intuition for what's realistic in a given sector, what management teams tend to overpromise, what the buy-side consensus is missing. That doesn't live in a model. It lives in the person running it.
Why generic AI underperforms here specifically
We benchmarked Terminal X against ChatGPT, Claude, and Gemini across 500 retail investment queries. Terminal X scored 89.9% composite accuracy. The next best was 70.6%.
The gap isn't about which foundation model is smarter. It's about data.
General-purpose AI tools pulled from Tier 1 sources (SEC filings, earnings call transcripts, institutional broker research, Bloomberg) less than 6% of the time. Terminal X pulled from Tier 1 sources 49% of the time. The models aren't different. What they have access to is different, and so is how that information gets processed before the model ever sees it.
The difference isn't just which sources get called. It's whether the exact document you need gets found. A 10-K gets processed differently than a CIM, which gets processed differently than a broker note, an internal memo, or a real estate document. Each one has its own chunking logic, metadata extraction, image analysis where relevant, and storage structure. When a query comes in, the retrieval layer knows what kind of document it's looking at and how to pull from it accurately, using all of our tagged metadata. A general-purpose model calling the same tools doesn't have any of that. It gets the source, but not the specific document: the right version, from the right date, sitting in your drive. We get the documents and what's inside them, properly tagged and ready to use.
That processing gap is what took three years to build. It's not a prompt. It's not a clever system message. It's thousands of specialized pipelines that understand the difference between how you handle an SEC filing versus a broker note versus a real-time price feed, and process each one accordingly. That retrieval architecture is the actual moat in this industry right now. Data quality in, analysis quality out.
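A heavily simplified sketch of what type-aware ingestion means in practice. The document types, chunking rules, and metadata fields here are invented placeholders; the idea being illustrated is that ingestion dispatches on document type instead of running one generic splitter over everything.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)

def chunk_sec_filing(text):
    # Filings split on item headers so a section like
    # "Item 1A. Risk Factors" stays one coherent, retrievable unit.
    parts = [p.strip() for p in text.split("Item ") if p.strip()]
    return [
        Chunk("Item " + p, {"doc_type": "10-K", "section": p.split(".", 1)[0]})
        for p in parts
    ]

def chunk_broker_note(text):
    # Broker notes split on paragraphs; a real pipeline would also
    # extract rating, price target, and analyst metadata here.
    return [
        Chunk(p.strip(), {"doc_type": "broker_note"})
        for p in text.split("\n\n") if p.strip()
    ]

# Dispatch table: one processor per document type.
PROCESSORS = {"10-K": chunk_sec_filing, "broker_note": chunk_broker_note}

def ingest(doc_type, text):
    return PROCESSORS[doc_type](text)
```

The generic alternative, one fixed-size splitter for every document, is what general-purpose tools effectively do, and it's why they can retrieve the source but lose the structure inside it.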
What honest AI adoption looks like
I'm not going to tell you AI changes everything and traditional research is dead. That's not what I've seen.
What I've seen is that AI removes the ceiling on how much ground a small, focused team can cover. An analyst who knows how to work with these systems isn't just faster. They can hold more context simultaneously, catch more signals in more places, and spend the hours they save on the part of the job that actually requires their judgment.
The NVDA customer concentration finding didn't make the investment decision. It surfaced a question worth asking. A human still had to figure out whether it was priced in, how it compared historically, what management had said about it on the last call, and whether it changed the thesis. That's still the job. The job just starts from a better place.
Most of the industry is still at "we use enterprise GPT for some things." That's fine as a starting point. It's not fine as a strategy. The firms building real workflows now, the ones treating AI infrastructure as seriously as they treat their data subscriptions, are compounding an advantage that's going to be very hard to close in two or three years.
The gap between using AI and using AI well is still wide enough to matter. Not for much longer.
-- Gabriel