Hallucinations in AI are dangerous because artificial intelligence systems can produce inaccurate, deceptive, or completely fabricated information. These fabrications can undermine confidence in AI, particularly in fields where precise data is crucial, such as healthcare, finance, and law. Inaccurate AI outputs can spread misinformation, misguide decision-making, and, in some cases, be exploited for malicious purposes, such as creating fake news or fraudulent documents. Moreover, finding and rectifying such inaccuracies can be difficult, because AI may present seemingly reasonable yet wrong content. The risks of hallucinations thus underline the importance of transparency, accountability, and rigorous safeguards in AI systems to minimise harm and maintain public trust.
Murad Wagh, Director of Sales Engineering at Snowflake, spoke with V&D about addressing AI risks, how businesses can ensure fair and accurate models, business strategies for tackling AI hallucinations and bias effectively, and much more. Have a look:
V&D: How can businesses tackle challenges such as hallucinations, bias, transparency, and privacy when deploying AI?
Murad Wagh: Cooperation between industry and government is essential for effective regulation, especially as AI-driven software must comply with existing standards. For emerging AI use cases that were previously unfeasible, there is an increasing need for either industry-led self-regulation or government oversight. Addressing these challenges demands a collaborative approach that balances legitimate concerns with the promotion of smart regulations and thoughtful legislation.
At Snowflake, we prioritize transparency and bias prevention by ensuring AI models are auditable, fair, and transparent to minimize risks. Our initiatives also aim to reduce AI hallucinations by developing systems capable of recognizing when to refrain from answering a question. Achieving this requires a robust data strategy to ensure high-quality training data, ultimately leading to more accurate and reliable outcomes. AI hallucinations can also be mitigated by combining large language models (LLMs) with private data sets, either through LLM customization (fine-tuning) or retrieval-augmented generation (RAG). The RAG framework gives an LLM access to a specific knowledge base containing the most up-to-date, accurate information before it generates a response. Because there is no need to retrain the model, this approach cost-effectively extends the capability of any LLM to specific domains. The Snowflake platform's rich foundation for data governance and management, which includes a vector data type, makes it possible to develop and deploy an end-to-end AI app using RAG without separate integrations, infrastructure management, or data movement, using three key features: Snowflake Cortex, Streamlit in Snowflake, and Snowpark.
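To make the RAG pattern concrete, here is a minimal, dependency-free Python sketch of the flow described above: retrieve the most relevant entries from a private knowledge base, then ground the model's prompt in them. The character-count embedding, the sample documents, and the prompt wording are illustrative stand-ins, not Snowflake Cortex APIs; a real deployment would use a proper embedding model and LLM endpoint.

```python
# Minimal sketch of the RAG pattern: retrieve the most relevant documents
# from a private knowledge base, then ground the LLM prompt in them.
# The toy embedding below is a stand-in for a real embedding model.
from math import sqrt

def embed(text: str) -> list[float]:
    # Toy embedding: a character-frequency vector over a-z.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a)) or 1.0
    nb = sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    # Instructing the model to answer only from the supplied context is
    # one way to reduce hallucinations: it can decline instead of inventing.
    return ("Answer using ONLY the context below. If the context is "
            f"insufficient, say so.\n\nContext:\n{context}\n\nQuestion: {query}")

docs = [
    "Refund requests must be filed within 30 days of purchase.",
    "Enterprise support is available 24/7 via the customer portal.",
]
print(build_prompt("What is the refund window?", docs))
```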
V&D: What is the difference between using enterprise data and public data for training AI models?
Murad Wagh: AI is only as effective as the quality of the data it relies on. Deploying AI without clean, centralized, and well-structured data is unlikely to succeed or deliver the desired outcomes.
Foundational LLMs, trained on the vast expanse of internet data, can address a wide array of topics but often compromise accuracy. In contrast, private enterprise LLMs operate within a narrower scope and, when developed properly, achieve greater precision by relying exclusively on enterprise-specific data.
Augmenting foundational LLMs with proprietary, meticulously vetted datasets allows companies to significantly enhance the accuracy of their AI deployments. This ensures these models serve as highly factual tools for critical tasks, such as engineering or product modeling, where precision is paramount.
V&D: What’s the biggest challenge enterprises face when implementing AI solutions?
Murad Wagh: Enterprises often possess vast amounts of data that are fragmented and dispersed across various systems. To implement an effective AI strategy, it is essential to first establish a robust data strategy. This includes breaking down data silos and ensuring all data is unified and accessible within a cohesive framework, thereby enabling consistency and quality.
A strong data foundation is more critical than ever, especially with the rise of generative AI, which allows datasets to be created at unprecedented speeds and tailored to specific business needs. Since data forms the backbone of AI, its reliability and effectiveness depend entirely on the quality and integrity of the underlying datasets.
In today’s landscape, every company functions as a data-driven entity, yet many lack the tools and infrastructure necessary to excel in this role. A unified source of truth is vital, enabling all stakeholders to seamlessly access and utilize data without excessive reliance on technical teams. Equally important is safeguarding and protecting data while identifying efficient ways to extract its full value, facilitating faster and more informed decision-making.
V&D: With the rapid adoption of GenAI, what specific security measures does Snowflake implement?
Murad Wagh: With the rapid adoption of generative AI, the need for robust governance and data protection has become paramount. Snowflake is committed to ensuring that enterprise AI solutions are both secure and efficient. A cornerstone of this approach is enabling customers to leverage AI models with their own data while maintaining full ownership and control. These models operate entirely within the customer's account, ensuring that Snowflake does not access their data, use it to enhance the models, or share it with other customers.
Through Snowflake Cortex, customers retain complete control over their data. We provide pre-built large language models that can be fine-tuned using the customer's proprietary data, all within the secure confines of their account. No data ever leaves the customer's environment. This design ensures that AI models interact with the data in place, eliminating the need for data movement and upholding governance and security at every stage. Additionally, Cortex Guard, a new feature of Snowflake Cortex AI, enables enterprises to easily implement safeguards that filter out potentially inappropriate or unsafe LLM responses. Cortex Guard introduces a foundational safety capability that further helps our customers feel confident moving from proof of concept to a production-ready generative AI application.
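As a rough illustration of the response-filtering idea behind a guardrail like Cortex Guard, consider the Python sketch below. The keyword screen and refusal message are hypothetical assumptions for the example; Cortex Guard itself relies on a trained safety model rather than this logic.

```python
# Hypothetical sketch of guardrail-style response filtering: screen the
# model's raw output against a safety check before it reaches the app.
# BLOCKED_PHRASES and REFUSAL are illustrative assumptions; a production
# guardrail would use a trained safety classifier, not a keyword list.

BLOCKED_PHRASES = ("build a weapon", "steal credentials")
REFUSAL = "Response withheld: it did not pass the safety filter."

def is_unsafe(text: str) -> bool:
    lowered = text.lower()
    return any(phrase in lowered for phrase in BLOCKED_PHRASES)

def guarded_complete(llm_complete, prompt: str) -> str:
    raw = llm_complete(prompt)  # call the underlying LLM
    return REFUSAL if is_unsafe(raw) else raw

# Usage with a stubbed model:
stub_llm = lambda p: "Enterprise support is available 24/7 via the portal."
print(guarded_complete(stub_llm, "When is support available?"))
```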
V&D: How does Snowflake ensure data privacy and security, especially for enterprises handling sensitive information in industries like finance?
Murad Wagh: The financial services industry operates within a highly regulated environment, making compliance with regulations essential. From its inception, Snowflake has prioritized security and privacy, providing tailored protections specifically designed for industries like financial services. All data is encrypted both at rest and in transit, forming the cornerstone of Snowflake's data security strategy. Beyond encryption, safeguarding sensitive information remains a critical focus.
Snowflake empowers banks, asset managers, insurers, payment providers, and other intermediaries to simplify data access, enable secure collaboration, and deploy AI solutions to address critical business challenges. For example, within organizations, teams can derive insights from data without exposing sensitive information by implementing policies such as data masking. This approach allows users to query data while restricting access to confidential details. Advanced techniques like differential privacy further enhance this capability, enabling insights without revealing underlying data, with raw data access limited to a few privileged users.
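A toy Python sketch can illustrate both techniques mentioned above. The masking rule and noise scale are hypothetical examples; in Snowflake these concepts are applied declaratively through masking policies and built-in privacy features, not application code.

```python
# Illustrative sketches of data masking and differential privacy.
# The masking rule and epsilon value are assumed examples.
import math
import random

# Dynamic masking idea: non-privileged users see a redacted value,
# while the underlying data stays intact and queryable in aggregate.
def mask_email(email: str, privileged: bool) -> str:
    if privileged:
        return email
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"

# Differential privacy idea: release aggregates with calibrated Laplace
# noise so no individual row can be inferred from the result.
def laplace_noise(scale: float) -> float:
    # The difference of two exponential samples is Laplace-distributed.
    e1 = -math.log(1.0 - random.random())
    e2 = -math.log(1.0 - random.random())
    return scale * (e1 - e2)

def dp_count(true_count: int, epsilon: float = 1.0) -> float:
    # A count query has sensitivity 1, so the noise scale is 1/epsilon.
    return true_count + laplace_noise(1.0 / epsilon)

print(mask_email("jane.doe@example.com", privileged=False))  # j***@example.com
print(round(dp_count(1_000), 1))  # roughly 1000, with small random noise
```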
Snowflake's clean rooms add another layer of security: if a query risks identifying individuals, results are withheld whenever the number of contributing rows falls below a predefined threshold. This ensures privacy while still delivering meaningful analytics.
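In pseudocode terms, that threshold rule might look like the sketch below. The 25-row cutoff and the averaging query are assumed for illustration; an actual clean-room policy is configured within the platform rather than written by hand.

```python
# Sketch of threshold-based suppression in a clean room: aggregate
# results are withheld whenever too few rows contribute, so small
# groups cannot be singled out. MIN_ROWS is an assumed example value.

MIN_ROWS = 25

def safe_group_average(rows: list[dict], group_key: str, value_key: str) -> dict:
    groups: dict[str, list[float]] = {}
    for row in rows:
        groups.setdefault(row[group_key], []).append(row[value_key])
    results = {}
    for group, values in groups.items():
        if len(values) < MIN_ROWS:
            results[group] = None  # withheld: group too small to report
        else:
            results[group] = sum(values) / len(values)
    return results

# Usage: averages are reported only for sufficiently large groups.
rows = ([{"region": "APAC", "spend": 120.0}] * 30
        + [{"region": "EU", "spend": 90.0}] * 5)
print(safe_group_average(rows, "region", "spend"))  # {'APAC': 120.0, 'EU': None}
```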
Industry leaders such as AXA, BlackRock, Capital One, Refinitiv, Square, State Street, Western Union, and Wise are already part of the Snowflake AI Data Cloud for Financial Services.