Empowering users with Snowflake’s Connector for SharePoint
SharePoint is a commonly used document management platform, and while it is a hub of information that facilitates collaboration and knowledge sharing, it is not without its challenges. A company’s SharePoint can be a maze of files nested in various folders, each organized differently, and some contributors may no longer be with the company to answer questions about their work. Sifting through this corpus to find a particular document or answer a particular question can be frustrating and time-consuming. As a result, people may avoid the knowledge library altogether, and its value goes to waste.
To tackle this exact issue, Snowflake has introduced its new Connector for SharePoint, which allows you to ingest files from SharePoint into your Snowflake account. Combined with the Cortex Search Service that the Connector sets up, this enables organizations to significantly improve their document management processes using Generative AI (GenAI). The Connector for SharePoint is a managed solution that automatically ingests your unstructured data, including PowerPoints, Word documents, and PDFs, into a Cortex Search Service. It also comes with security features that limit access based on controls set in SharePoint, a scalable framework, and the flexibility to build and deploy a UI that best fits your specific use case. This blog post explores an AI Assistant we created on top of the Connector for SharePoint to help users answer questions about internal documents and track down materials more efficiently.
Using the SharePoint Connector to power an AI Assistant
Snowflake’s Connector for SharePoint made it straightforward for us to develop a Retrieval-Augmented Generation (RAG) pipeline, which is a common GenAI design pattern for this type of use case. A RAG system has two parts: a retriever that searches the data and returns information related to the input query, and a generator that uses the returned information and an LLM to produce a relevant and coherent answer for the user. For additional information on creating a RAG in Snowflake, check out this blog post by our very own Elizabeth Khan and Shruti Misra here.
Under the hood, the Connector for SharePoint leverages the Cortex Parse Document and Cortex Search services, which allow for easy development and deployment of the retrieval step of the RAG pipeline. The Cortex Parse Document service extracts information from the documents within SharePoint by identifying their text and structural elements and organizing them into tables that Cortex Search can then use. Once the information is accessible, Cortex Search generates embeddings of the data and manages parameter tuning so that the information can be efficiently queried and compared to an input question.
The generation component of the pipeline is then built out using the Cortex Complete function. This function uses a pretrained LLM, such as Llama 3.1, Snowflake Arctic, or Gemma, along with a prompt to develop the answer for the user. To ensure the generated response is accurate and relevant to the user’s questions, we augment the prompt with the information gathered by the retriever prior to sending it to the LLM.
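To make the prompt augmentation step concrete, here is a minimal sketch in Python. It is not the exact prompt from our application: the `Chunk` class, the `build_prompt` function, and the prompt wording are all hypothetical names and text chosen for illustration. The resulting string is what would be passed, along with a model name, to the Cortex Complete function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """One retrieved passage. Field names are illustrative assumptions."""
    text: str
    web_url: str


def build_prompt(question: str, chunks: list[Chunk]) -> str:
    """Augment the user's question with the retrieved passages."""
    # Number each passage so the model can refer back to its sources.
    context = "\n\n".join(
        f"[{i}] {c.text.strip()}" for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


if __name__ == "__main__":
    demo = [Chunk("Example passage from a slide deck.", "https://example.com/deck")]
    print(build_prompt("What training is offered?", demo))
```

Telling the model to rely only on the supplied context is one common way to keep answers grounded in the retrieved documents; the wording usually needs tuning for the chosen LLM.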
This complete system is summarized in the image below with sections highlighted to indicate Snowflake features that help reduce the lift for deployment.
The RAG chain is orchestrated through a Streamlit in Snowflake App, where the user inputs their question. The query is then sent to the Cortex Search Service where relevant content is retrieved and sent to the Cortex LLM Function (Complete) to generate the final response. Document citation links to SharePoint are also returned so users can navigate directly to the SharePoint documents used to answer their question. The overall steps in the architecture are as follows:
- Content and metadata are extracted from documents in SharePoint. This is done via Cortex Parse Document on the backend and can be set up to refresh at daily, weekly, or monthly intervals.
- The user inputs their question within the Streamlit UI. The UI leverages the new Streamlit chat elements and tracks session information to identify the current user’s permissions.
- The Cortex Search Service set up by the Connector queries the data extracted in the first step for content related to the user’s question.
- Relevant content and document links are returned. The Connector for SharePoint extracts content from the SharePoint documents as well as metadata, such as the specific link to the document, which can then be passed through the rest of the workstream.
- A prompt is augmented with returned content and then sent to the Cortex Complete function. Prompt augmentation enables the pretrained LLM to compose a more relevant and accurate answer.
- The LLM generates a response, which is returned to the Streamlit UI along with the SharePoint links.
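The steps above can be sketched as a single function. This is an illustration of the flow rather than our production code: `Hit`, `Answer`, and `answer_question` are hypothetical names, and the `retrieve` and `generate` arguments are stand-ins for calls to the Cortex Search Service and the Cortex Complete function. Passing them in as plain callables keeps the sketch runnable without a Snowflake connection.

```python
from typing import Callable, NamedTuple


class Hit(NamedTuple):
    chunk: str    # text extracted from a SharePoint document
    web_url: str  # link back to that document


class Answer(NamedTuple):
    text: str
    sources: list[str]


def answer_question(
    question: str,
    retrieve: Callable[[str], list[Hit]],
    generate: Callable[[str], str],
) -> Answer:
    """Run one pass of the chain: retrieve, augment, generate, cite."""
    hits = retrieve(question)
    if not hits:
        # Skip the LLM call when there is nothing to ground the answer in.
        return Answer("No relevant documents were found.", [])

    context = "\n\n".join(h.chunk for h in hits)
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    text = generate(prompt)

    # Several chunks can come from one file; keep each link once, in rank order.
    sources = list(dict.fromkeys(h.web_url for h in hits))
    return Answer(text, sources)


if __name__ == "__main__":
    fake_retrieve = lambda q: [Hit("Passage A", "https://example.com/a")]
    fake_generate = lambda p: "stubbed answer"
    print(answer_question("demo question", fake_retrieve, fake_generate))
```

In a Streamlit app, the returned `text` would be rendered as the assistant's chat message and `sources` as the list of citation links beneath it.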
In the example below, a user accesses the Streamlit app and asks questions about different types of training offerings from Aimpoint to help prepare a sales deck for an upcoming call.
Some additional features that can be incorporated in this system to better fit your particular use case include filtering the search corpus based on a given user’s permissions in SharePoint, modifying the number of documents to return, and tuning the LLM selection and prompt to best suit your organization’s needs.
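As a rough sketch of the first two options, the function below builds the JSON body of a search request with an optional permissions filter and a configurable result count. The function name `build_search_request` is hypothetical, and the column names `chunk`, `web_url`, and `user_emails` are assumptions about how the search service is defined, so check them against your own deployment. The `@contains` filter follows the Cortex Search filter syntax as we understand it.

```python
import json
from typing import Optional, Sequence


def build_search_request(
    question: str,
    user_email: Optional[str] = None,
    limit: int = 5,
    columns: Sequence[str] = ("chunk", "web_url"),
) -> str:
    """Return the JSON body for a Cortex Search query."""
    if limit < 1:
        raise ValueError("limit must be at least 1")

    request = {"query": question, "columns": list(columns), "limit": limit}
    if user_email is not None:
        # Assumed column: an array of emails allowed to read the source file.
        request["filter"] = {"@contains": {"user_emails": user_email}}
    return json.dumps(request)


if __name__ == "__main__":
    print(build_search_request("training offerings", "user@example.com", limit=3))
```

Raising `limit` gives the LLM more context at the cost of a longer prompt, so it is worth tuning alongside the model choice.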
Enable your team today
This self-service AI assistant application is a great example of how our team can help implement more productivity engineering across your organization. In this case, we leverage GenAI tools to accelerate finding the right resources and materials to enable users to be more productive, but there are countless other productivity engineering solutions that can transform your organization by saving time, streamlining workflows, and enabling more effective work.
With our expertise, we can design and deploy these solutions to maximize efficiency, optimize processes, and drive measurable results, helping your business stay ahead in a competitive landscape.
Interested in meeting with one of our experts to learn more? Reach out here!
Who are we?
Aimpoint Digital is a market-leading analytics firm at the forefront of solving the most complex business and economic challenges through data and analytical technology. From integrating self-service analytics to implementing AI at scale and modernizing data infrastructure environments, Aimpoint Digital operates across transformative domains to improve the performance of organizations. Connect with our team and get started today.