Dec 3, 2024
Preparing Your Files for AI Success
Note: This is the second post in a series about how financial firms can prepare their internal data for AI search. It is written with a non-technical or semi-technical audience in mind. If you have any questions about the content here, feel free to shoot a note to john@rogo.ai.
Intro
In the first article in this series, we explored why you, as a financial firm, need to “get your files ready” to leverage AI for internal search. For anyone who’s attempted to implement internal search at scale, the reasons we discussed will sound familiar. Here’s a quick recap:
With that foundation covered, let’s dive into the next step: How do you prepare your files for AI search?
Start with the Users
The most critical (and often overlooked) step is starting with your end users. This may sound obvious, but like eating well or exercising regularly, it's a fundamental that is easy to agree with and easy to skip.
The way you prepare your file system must align with the specific user workflows you aim to support.
Here’s the good news: You don’t need to interview every single employee. In a 1,000-person firm, talking to 30 users will likely surface recurring themes. Focus your conversations on these three questions:
What use cases are they targeting?
What files do they need? Which files are irrelevant or distracting?
What context do they have when asking a question?
The last question is especially important. Do users know the company name, date, or sector? Are they searching for a specific valuation method? Or are they coming in blind?
This context—the information users supply at query time—is what the AI system uses to filter and refine its search. Your system must be structured to interpret and act on this context effectively.
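To make this concrete, here is a minimal Python sketch of how query-time context could narrow the set of candidate files before any AI search runs. The QueryContext and FileRecord classes, their field names, and the sample file paths are all hypothetical and invented for illustration; your own metadata will look different.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class QueryContext:
    """What the user already knows when asking. Every field is optional."""
    company: Optional[str] = None
    sector: Optional[str] = None
    after: Optional[date] = None  # only keep files dated on or after this day


@dataclass
class FileRecord:
    """Hypothetical metadata attached to a single file."""
    path: str
    company: str
    sector: str
    dated: date


def matches(record: FileRecord, ctx: QueryContext) -> bool:
    """A file passes if it agrees with every field the user supplied."""
    if ctx.company is not None and record.company.lower() != ctx.company.lower():
        return False
    if ctx.sector is not None and record.sector.lower() != ctx.sector.lower():
        return False
    if ctx.after is not None and record.dated < ctx.after:
        return False
    return True


def narrow(records: List[FileRecord], ctx: QueryContext) -> List[FileRecord]:
    """Keep only the files consistent with the supplied context."""
    return [r for r in records if matches(r, ctx)]


# Made-up sample files, for illustration only.
files = [
    FileRecord("pitches/acme_2024.pptx", "Acme", "EdTech", date(2024, 5, 1)),
    FileRecord("pitches/acme_2019.pptx", "Acme", "EdTech", date(2019, 3, 1)),
    FileRecord("pitches/globex_2024.pptx", "Globex", "SaaS", date(2024, 6, 1)),
]

# A user who knows the sector and a rough date narrows the search a lot.
informed = narrow(files, QueryContext(sector="edtech", after=date(2023, 1, 1)))

# A user "coming in blind" supplies nothing, so nothing is filtered out.
blind = narrow(files, QueryContext())
```

The point of the sketch is the asymmetry: the more context a user brings, the smaller the pile the AI system has to search, which is why it pays to learn what context your users typically have.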
Identify Key “Swim Lanes”
“Swim lanes” are structured use cases: predictable searches where users can reliably expect strong results. While the goal remains open-ended search across your files, swim lanes help focus your efforts initially.
A different article in this series will cover which use cases are best suited for internal search. For now, here are some illustrative swim lanes:
Industry Credentials
Example: What recent IPOs have we done in EdTech?
Pulling Numbers
Example: What were our entry EBITDA multiples for these transactions?
Finding Comps
Example: What high-growth SaaS companies did we comp XYZ to in last year’s pitch?
Summarizing Content
Example: How have we positioned sponsor-owned CPG companies in distress?
Tracking Deal Teams
Example: Who was the ECM banker on these pitches?
Once you identify swim lanes, focus on designing workflows that make these searches as seamless as possible.
Map Out Informal Knowledge
For each swim lane, create 5-10 example questions. Then, work with users to document how they would answer those questions manually.
This step uncovers the informal knowledge embedded in your firm. It might explain why a user prefers an EMEA folder for an American company, or why they check the valuation report dated right after a deal closes rather than the latest one.
This is your golden ticket. Documenting these workflows not only highlights your firm’s unique practices but also reveals patterns that can guide your AI system.
Organize and Process Your Files
Now comes the hard part: translating user workflows into a structured, AI-ready file system. For each swim lane, describe how an analyst would answer a question. Then, think about how the AI system can replicate that process.
Here’s a toy example:
Swim Lane: Industry Credentials
Example Question: What are our recent EdTech IPOs?
Workflow:
Pull all deal folders in the sector.
Identify which deals were IPOs.
Cross-check dates for relevance.
Locate tombstone pages and extract the IPO details from them.
Compile results into a centralized list, excluding deals that didn’t close.
From this workflow, it’s clear the AI system will need metadata for sector, deal type, and date. This metadata might come from file content, tags, or folder structures, depending on what’s available.
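The workflow above can be sketched in a few lines of Python, assuming the sector, deal type, and date metadata already exist for each deal. The Deal class, its field names, and the sample deals are hypothetical. The sketch also skips the tombstone step, since extracting details from pages depends on your document tooling.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Deal:
    """Hypothetical per-deal metadata; the field names are illustrative."""
    name: str
    sector: str
    deal_type: str
    closed_on: Optional[date]  # None means the deal did not close


def recent_ipos(deals: List[Deal], sector: str, since: date) -> List[Deal]:
    """Mirror the analyst workflow using metadata filters."""
    hits = [
        d for d in deals
        if d.sector == sector          # pull all deals in the sector
        and d.deal_type == "IPO"       # identify which deals were IPOs
        and d.closed_on is not None    # exclude deals that didn't close
        and d.closed_on >= since       # cross-check dates for relevance
    ]
    # Compile the results into one list, most recent first.
    return sorted(hits, key=lambda d: d.closed_on, reverse=True)


# Made-up deals, for illustration only.
deals = [
    Deal("Example EdTech IPO A", "EdTech", "IPO", date(2024, 3, 1)),
    Deal("Example EdTech M&A", "EdTech", "M&A", date(2024, 5, 1)),
    Deal("Example EdTech IPO B (pulled)", "EdTech", "IPO", None),
    Deal("Example EdTech IPO C", "EdTech", "IPO", date(2019, 6, 1)),
    Deal("Example SaaS IPO", "SaaS", "IPO", date(2024, 2, 1)),
]

results = recent_ipos(deals, sector="EdTech", since=date(2022, 1, 1))
```

Each filter line corresponds to one piece of metadata. If any of the three is missing or unreliable in your file system, that is the gap to close first.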
Address Common Challenges
As you proceed, you’ll likely encounter hurdles. Here are a few common ones and how to tackle them:
Iterate and Refine
With workflows and metadata structures in place, the real work begins: testing and iterating. Start small, gather user feedback, and refine your processes.
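One lightweight way to test is to reuse the example questions you wrote for each swim lane and check whether the file an analyst would have used shows up near the top of the results. The sketch below assumes a search function that takes a question and returns a ranked list of file paths. That function, the file paths, and the canned results are hypothetical stand-ins, not a real API.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# (swim lane, example question, file an analyst would have used)
Case = Tuple[str, str, str]


def hit_rate_by_lane(
    cases: List[Case],
    search: Callable[[str], List[str]],
    top_k: int = 5,
) -> Dict[str, float]:
    """Share of questions per swim lane whose expected file is in the top_k results."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for lane, question, expected_path in cases:
        totals[lane] += 1
        if expected_path in search(question)[:top_k]:
            hits[lane] += 1
    return {lane: hits[lane] / totals[lane] for lane in totals}


# Hypothetical test cases; the paths are invented for illustration.
cases = [
    ("Industry Credentials", "What recent IPOs have we done in EdTech?",
     "deals/edtech/tombstones.pdf"),
    ("Industry Credentials", "What are our recent EdTech IPOs?",
     "deals/edtech/tombstones.pdf"),
    ("Pulling Numbers", "What were our entry EBITDA multiples for these transactions?",
     "models/entry_multiples.xlsx"),
]

# Stand-in for a real search system: canned results per question.
canned = {
    "What recent IPOs have we done in EdTech?":
        ["deals/edtech/tombstones.pdf", "deals/edtech/notes.docx"],
    "What are our recent EdTech IPOs?":
        ["deals/saas/pitch.pptx"],
    "What were our entry EBITDA multiples for these transactions?":
        ["models/entry_multiples.xlsx"],
}


def fake_search(question: str) -> List[str]:
    return canned.get(question, [])


scores = hit_rate_by_lane(cases, fake_search)
```

Tracking this number per swim lane after each change gives you a simple signal of whether a refinement helped, and which lanes still need work.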
Key Takeaways
Focus on high-impact use cases first.
Embrace feedback to refine workflows.
Remember: The goal is progress, not perfection.
In the next article, we’ll delve into the technical details of indexing, tagging, and embedding files, as well as strategies for keeping your system up to date. But for now, the groundwork you’ve laid is the most critical step.
By aligning your file system with user workflows, you’re not just enabling AI search—you’re building the foundation for your firm’s future in data-driven decision-making.