KnowBe4 is committed to leveraging artificial intelligence (AI) to enhance cybersecurity training and phishing defense while maintaining the highest standards of data protection and privacy. This document outlines our approach to AI implementation and data usage, focusing on the following key areas:
- Robust Data Protection: We implement strict logical database separation in our multi-tenant architecture, safeguarding individual customer data integrity and privacy.
- Leveraging Pre-trained LLMs: Rather than training Large Language Models (LLMs) from scratch, we utilize existing models pre-trained on vast, general datasets by specialized companies.
- Customization for Human Risk Management: We will fine-tune these pre-trained LLMs using our datasets to enhance their performance in our context. We also employ prompt engineering techniques to guide LLM outputs for our specific use cases.
- Controlled Environment: All LLM operations occur within our own secure AWS environment, ensuring that data and model interactions remain under our direct control.
- Continuous Product Improvement: We use aggregated, de-identified data to improve and optimize our products, as detailed in our contracts and in the "How We Use Your Data" section below.
How We Use Your Data
We use aggregated and de-identified data in the following ways:
- Product Enhancement: We analyze usage patterns to identify areas for improvement in our products, ensuring they remain effective against the latest cybersecurity threats.
- Service Optimization: Aggregated data helps us optimize our services, ensuring smooth operation and minimal downtime for all users.
- Personalization: We use de-identified data to tailor our products to your organization's specific needs, improving the relevance and effectiveness of our security training and simulations.
- Threat Intelligence: De-identified data from across our user base helps us stay ahead of emerging threats, allowing us to update our products proactively.
- AI-Driven Features: Our AI systems, including AIDA and PhishML, use this data to power advanced features like personalized training and automated phishing detection.
Examples of Aggregated Data Usage
| Dataset | Feature | Example Data |
|---|---|---|
| De-identified aggregated usage data | Personalized training assignments | user-123 failed 3 phishing simulations in the last 30 days |
| De-identified user responses | Knowledge refresher quizzes | 70% of users struggled with question-101 about password policies, indicating a need for reinforcement in this area |
| De-identified click data | Phishing simulation effectiveness | link-234 in phishing template-567 received a 15% click rate across all simulations |
| De-identified policy interaction data | Policy quiz generation | policy-890 was accessed by 60% of users but only 40% completed the associated quiz |
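To make the idea of de-identified, aggregated data more concrete, the following is a minimal illustrative sketch in Python. It is not KnowBe4's implementation: the key, the function names (`pseudonymize`, `click_rates`), the event layout, and the sample email addresses are all hypothetical and chosen only for illustration. It shows one common way to replace raw identifiers with keyed hashes and then compute a per-template click rate of the kind shown in the table above.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical key for illustration only; a real system would keep
# this in a secrets manager and never hard-code it.
PSEUDONYM_KEY = b"example-key-not-for-production"


def pseudonymize(user_id: str, key: bytes = PSEUDONYM_KEY) -> str:
    """Replace a raw identifier with a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "user-" + digest[:12]


def click_rates(events):
    """Aggregate (template_id, user_id, clicked) events into a click rate per template."""
    sent = defaultdict(int)
    clicked = defaultdict(int)
    for template_id, _user_id, was_clicked in events:
        sent[template_id] += 1
        if was_clicked:
            clicked[template_id] += 1
    return {template: clicked[template] / sent[template] for template in sent}


# Made-up sample events: (template, raw identifier, clicked?)
events = [
    ("template-567", "alice@example.com", True),
    ("template-567", "bob@example.com", False),
    ("template-567", "carol@example.com", False),
    ("template-567", "dave@example.com", False),
]

# Strip raw identifiers first, then aggregate.
deidentified = [(t, pseudonymize(u), c) for t, u, c in events]
rates = click_rates(deidentified)
print(rates)  # {'template-567': 0.25}
```

In this sketch the aggregate (a 25% click rate for `template-567`) carries no raw identifier, and the keyed hash lets repeated events from the same person be counted consistently without storing the original email address. Keyed hashing on its own is pseudonymization rather than full anonymization, so a real pipeline would likely combine it with further safeguards such as minimum group sizes.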
We're committed to transparency and will always keep our customers informed of significant changes in how we use data to improve our services. Our goal is to leverage AI responsibly to provide you with the most effective and personalized training experience possible while maintaining the highest standards of data protection. We understand the sensitivity of your data and are dedicated to using it in ways that comply with regulations and align with your expectations of privacy and security.
If you have any further questions about our AI implementation or data handling practices, please contact your account representative or our privacy team. The privacy team can be reached by email at privacy@knowbe4.com.