Cleardox FAQ | Answers about AI, security, integrations and pricing

How does Cleardox work?

Cleardox automatically detects and helps remove personal and sensitive information in documents using a combination of machine learning and rule-based detection. It is an assistive tool: a user must always review the results before they are finalized.

What categories of sensitive data can Cleardox detect?

Cleardox can detect sensitive and personal data across 20+ categories, including names, email addresses, phone numbers, addresses, social security numbers (SSNs), company information, dates, account numbers, authorization IDs, and geographic data such as regions and municipalities.

The tool also identifies additional sensitive content such as financial data, license plates, titles, and confidential information across documents. Cleardox can also identify indirect personal data such as job title and organization, which in combination can be used to identify a person.

How good is Cleardox at detecting sensitive information?

Within our field, performance is typically measured using two metrics:

Recall - how much of the sensitive information that is actually present we find
Precision - how much of what we flag is actually sensitive

Depending on the quality of your data, you can typically expect:

Recall: around 97%
Precision: around 90%

The reason precision is often slightly lower than recall is that we prefer to identify slightly too much rather than too little. In other words, we prioritize ensuring that sensitive information is not missed - even if it occasionally leads to slight over-anonymization.
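As a rough illustration of how the two metrics relate, here is a minimal sketch in Python. The function name and the counts are invented for the example and are not Cleardox measurements.

```python
def precision_recall(true_positives: int, false_positives: int,
                     false_negatives: int) -> tuple[float, float]:
    """Return (precision, recall) computed from detection counts."""
    flagged = true_positives + false_positives   # everything the tool marked
    present = true_positives + false_negatives   # everything that should be marked
    precision = true_positives / flagged if flagged else 0.0
    recall = true_positives / present if present else 0.0
    return precision, recall


# Hypothetical document: 100 sensitive items present, 97 found, 11 false alarms.
precision, recall = precision_recall(true_positives=97,
                                     false_positives=11,
                                     false_negatives=3)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

In this made-up case, 3 items are missed (lower recall) while 11 harmless items are flagged (lower precision). Accepting more of the second kind of error to reduce the first is the trade-off described above.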

Which languages does Cleardox support?

Cleardox supports all languages, although accuracy may vary depending on language complexity and available training data.

Can Cleardox make non-searchable PDFs readable (OCR)?

Yes. Using Optical Character Recognition (OCR), Cleardox can convert non-searchable PDFs (so-called flat PDFs) into machine-readable documents.

The process happens automatically, so the user does not need to take any action. Cleardox intelligently detects whether a page already contains machine-readable text. If no text is detected, the page is automatically processed with OCR, ensuring the document becomes readable and ready for further processing.
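The page-level decision can be pictured with a small sketch. It assumes the text layer of each page has already been extracted into a string by some PDF library. The function name is hypothetical and this is not Cleardox's actual implementation.

```python
def pages_needing_ocr(page_texts: list[str]) -> list[int]:
    """Return 1-based page numbers whose extracted text layer is empty."""
    return [number
            for number, text in enumerate(page_texts, start=1)
            if not text.strip()]  # only whitespace counts as "no text detected"


# Hypothetical four-page document: pages 2 and 3 are scanned images.
extracted = ["Contract between A and B", "", "   \n", "Signed in Copenhagen"]
print(pages_needing_ocr(extracted))
```

Only the pages returned by the function would be sent through OCR. Pages that already contain machine-readable text are left as they are.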

Do you support different file formats?

Yes, we support more than 50 different file formats. When uploading documents, you can choose whether they should be converted to PDF. This is the most common setup and is called Cleardox Convert. When Cleardox Convert is enabled, we automatically convert different file formats into PDF upon upload, ensuring a consistent format across all documents in the system.

What does it mean that Cleardox generates a new PDF?

When a document is uploaded (regardless of format), Cleardox automatically creates a new PDF version during processing.

This ensures that redactions are permanent (non-reversible) and that full control over metadata and sensitive content is maintained.

Can you upload a Word or Excel file and download it in the same format?

Yes – this is what we call native support. The term may sound technical, but it simply means that you can download the document in the same format as the one you uploaded to Cleardox. Our native support includes Word files (.docx) and Excel files.

Is there a limit to document size or volume?

No, there are no limits on the number of documents you can upload to Cleardox, and there is no page limit for individual documents. It is not uncommon for files with 10,000+ pages to be uploaded to the system. However, if you upload a document with 20,000 pages, you might want to grab an extra cup of coffee while it processes.

How does case sharing work in Cleardox?

Users can share cases internally with colleagues as either viewers or editors. External sharing can be enabled with additional security controls.

What are the support response times?

At Cleardox, we offer support directly within the system - and we always prioritize responding to inquiries as quickly as possible (we’re pretty quick to respond, if we do say so ourselves 😉).

Our support is available during normal business hours (09:00–17:00 CET), and support requests are categorized and prioritized based on their level of criticality.

How is support access to customer data handled?

Support access is strictly limited to authorized personnel only. Documents are transferred securely, and support staff have read-only access with no ability to modify data.

Does Cleardox have an API?

Yes. Cleardox provides an API that allows integration with external systems. Through the API, you can upload documents, retrieve anonymized/redacted documents, and return processed files back into your systems. This makes it easy to integrate Cleardox into existing workflows.

Does Cleardox use customer data to train AI models?

No. Customer data is not used for training or continuous model retraining unless explicitly agreed in writing. Data is only used for processing and anonymization/redaction of documents.

Can users create custom categories?

Yes. As a user, you can create your own categories with custom search criteria.

As an administrator, you can also create and maintain categories that are shared across your organization, so individual users do not need to recreate the same categories.

If a category is more complex and cannot be solved with simple search terms, our technical team can also help create it for you.

However, most customers find that the existing categories already cover their needs.

Why choose Cleardox?

At Cleardox, we work to make document redaction simpler, safer, and less time-consuming. Our goal is to reduce complexity and workload, so you can end your workday with peace of mind - knowing that your documents have been anonymized correctly.

Cleardox is built with a strong focus on usability, security, and collaboration, making it easier to work with even large document volumes.

What our customers especially highlight:

Cleardox is intuitive and easy to use
It is easy to collaborate with colleagues in the system
You can pseudonymize content – not just redact it
As a lawyer, you have access to all relevant features in one unified PDF solution
We provide support directly within the system

Does Cleardox pseudonymize documents?

Yes, Cleardox also pseudonymizes content.

This means that sensitive information can be replaced with consistent pseudonyms (e.g. P1, P2, V1, etc.). This ensures that the structure and context of the document are preserved.

You can choose between black redaction and pseudonymization, and this can be configured per category in the system.

What is the difference between pseudonymization and black redaction?

Pseudonymization removes sensitive information and replaces it with pseudonyms (e.g. P1, V1, Address 1, etc.), allowing documents to retain their context and become easier to read and work with. This is especially relevant when organizations want to reuse data containing valuable knowledge in internal processes such as knowledge bases or AI-related use cases.

Black redaction removes sensitive information and replaces it with black boxes.

Both methods ensure that sensitive information is removed and is no longer available in the document after anonymization.

Does Cleardox ensure consistent pseudonymization across documents?

Yes. Cleardox ensures consistent pseudonymization both across documents and across users within your organization.

This means that everyone works with the same pseudonymization logic, ensuring that the same type of information is always pseudonymized in the same way – regardless of who is working in Cleardox. You therefore avoid manually maintaining pseudonymization lists, where you would otherwise have to track that, for example, “Oliver” always equals P1 (Person 1) and “Louise” equals P2 (Person 2).

Cleardox provides a set of predefined pseudonyms, which you can customize and adjust as needed. Examples:

Persons: P1, P2, etc.
Companies: C1, C2, etc.
Emails: E1, E2, etc.
Social security numbers: SSN 1, SSN 2, etc.
Addresses: A1, A2, etc.
Phone numbers: Tel. 1, Tel. 2, etc.

The result is a more consistent, efficient, and secure pseudonymization process.
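To illustrate what consistent pseudonymization means in practice, here is a minimal sketch of a shared mapping. The class name, category keys, and normalization rule are assumptions made for the example, not the Cleardox implementation.

```python
class Pseudonymizer:
    """Assigns a stable pseudonym to each distinct value within a category."""

    def __init__(self, templates: dict[str, str]):
        self.templates = templates                      # category -> label template
        self.mapping: dict[tuple[str, str], str] = {}   # (category, value) -> pseudonym
        self.counters: dict[str, int] = {}              # category -> last number used

    def pseudonym_for(self, category: str, value: str) -> str:
        # Assumed normalization: ignore surrounding whitespace and letter case.
        key = (category, value.strip().casefold())
        if key not in self.mapping:
            number = self.counters.get(category, 0) + 1
            self.counters[category] = number
            self.mapping[key] = self.templates[category].format(n=number)
        return self.mapping[key]


shared = Pseudonymizer({"person": "P{n}", "company": "C{n}", "phone": "Tel. {n}"})
print(shared.pseudonym_for("person", "Oliver"))   # first person seen
print(shared.pseudonym_for("person", "Louise"))   # second person seen
print(shared.pseudonym_for("person", "Oliver"))   # same person, same pseudonym
```

Because every document and every user goes through the same mapping, "Oliver" receives the same label each time without anyone maintaining a list by hand.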

What technology does Cleardox use to detect sensitive information?

Cleardox uses a combination of rule-based detection and machine learning-based Named Entity Recognition. These methods can be used separately or together to maximize coverage.

Does Cleardox use generative AI or large language models (LLMs)?

No. Cleardox uses machine learning strictly for classification and extraction, not for generating text. This improves predictability and security.

Is Cleardox vulnerable to prompt injection or jailbreak attacks?

No. Since Cleardox does not accept prompts or generate text, it is not exposed to prompt injection or jailbreak-style attacks.

What machine learning models does Cleardox use?

Cleardox uses transformer-based models for named entity recognition that are trained on openly licensed and proprietary Cleardox datasets.

How accurate is Cleardox's AI?

Accuracy depends on category and document quality. Cleardox is optimized for high recall to minimize the risk of missing sensitive information.

How does Cleardox detect CPR numbers and similar identifiers?

CPR numbers (Danish personal identification numbers) are detected using rule-based patterns, including validation of the date structure, which provides very high accuracy.
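A rule of this kind can be sketched as follows. A CPR number consists of a six-digit date (DDMMYY) followed by a four-digit sequence number, usually written with a hyphen. The pattern, function name, and leap-year simplification below are assumptions for illustration, not Cleardox's actual rules, and the sample numbers are invented.

```python
import re
from datetime import date

# DDMMYY, optional hyphen, then four digits.
CPR_PATTERN = re.compile(r"\b(\d{2})(\d{2})(\d{2})-?(\d{4})\b")


def find_cpr_candidates(text: str) -> list[str]:
    """Return substrings that look like CPR numbers and start with a valid day/month."""
    candidates = []
    for match in CPR_PATTERN.finditer(text):
        day, month = int(match.group(1)), int(match.group(2))
        try:
            # Simplification: check against a leap year so 29 February is accepted
            # for any two-digit year; the century is not resolved in this sketch.
            date(2000, month, day)
        except ValueError:
            continue  # e.g. day 32 or month 13: not a plausible birth date
        candidates.append(match.group(0))
    return candidates


sample = "Valid-looking: 010190-1234. Invalid date: 320190-1234."
print(find_cpr_candidates(sample))
```

The date check is what separates a likely CPR number from an arbitrary ten-digit string, which keeps false positives low for this category.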

Can machine learning be disabled in Cleardox?

Yes. Machine learning can be disabled in specific configurations, for example to improve performance or meet special compliance requirements.

Does Cleardox use open-source technologies?

Yes. Cleardox is built using several open-source technologies as part of its software stack.

Where is Cleardox hosted?

Cleardox is hosted exclusively on European servers. There are no third-country data transfers.

How does Cleardox protect customer data?

All customer data is encrypted both in transit and at rest. Access is strictly controlled using role-based permissions.

Does Cleardox support Single Sign-On (SSO)?

Yes. Cleardox supports Single Sign-On via standard technologies such as SAML, OAuth, and LDAP, typically integrated with Active Directory.

How are user roles and permissions managed?

Cleardox supports both administrator and user roles. Permissions are typically managed via Active Directory or directly within Cleardox.

Can administrators access other users’ cases?

By default, administrators cannot access other users’ active cases. This can be configured if required.

What administrative features are available in Cleardox?

As an admin user, you have access to a range of additional features. These primarily include the ability to create, share, and maintain shared categories, exhibit stamps, templates, and header and footer templates.

How does Cleardox handle multi-factor authentication (MFA)?

Multi-factor authentication is typically handled via the customer’s identity provider, such as Microsoft Entra ID, or via Cleardox’s own supported OTP.

Is Cleardox a cloud-only solution, or can it also be hosted on-premise?

Both options are available.

So far, all of our customers use our hosted cloud solution, as we have a secure setup in place and because most customers value access to continuous and fast product updates.

However, we also recognize that some organizations operate in industries with specific security or compliance requirements. Therefore, we also offer the option to install Cleardox on-premise on your own servers.

Please feel free to contact us to learn more.

How does Cleardox monitor security and detect threats?

Cleardox uses runtime threat detection, infrastructure monitoring, and centralized logging to detect and respond to security incidents.

How long are log files stored?

Log files are stored for 1 year and are then automatically deleted.

Where are encryption keys (KEK) stored?

Cleardox uses an application-managed secret service rather than a hardware security module (HSM) or a cloud key management service (KMS). The key-encryption key (KEK) is held in an encrypted secrets file (secrets.txt) that is loaded into the secret service at startup. The file is encrypted with AES-256 using a password-derived key (PBKDF2), and each entry carries an HMAC-SHA256 checksum for integrity. The secret service runs as an isolated container in the cluster, with the secrets file mounted as a volume. The password that unlocks the file is supplied at service startup via an environment variable and is not persisted to disk.
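The two standard-library building blocks named here, PBKDF2 key derivation and HMAC-SHA256 checksums, can be sketched as below. The iteration count, salt handling, environment variable name, and function names are assumptions for illustration. The AES-256 encryption step is left out because it requires a third-party library, and a real design would normally use separate keys for encryption and integrity.

```python
import hashlib
import hmac
import os


def derive_key(password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte (256-bit) key from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                               salt, iterations, dklen=32)


def entry_checksum(key: bytes, entry: bytes) -> str:
    """Compute the HMAC-SHA256 checksum of one secrets-file entry."""
    return hmac.new(key, entry, hashlib.sha256).hexdigest()


def verify_entry(key: bytes, entry: bytes, expected: str) -> bool:
    """Constant-time comparison of the stored and recomputed checksum."""
    return hmac.compare_digest(entry_checksum(key, entry), expected)


# Hypothetical usage: the password arrives via an environment variable at startup.
password = os.environ.get("SECRETS_PASSWORD", "example-only-password")
salt = os.urandom(16)          # assumed to be stored next to the encrypted file
key = derive_key(password, salt)
checksum = entry_checksum(key, b"kek-v1=<encrypted bytes>")
print(verify_entry(key, b"kek-v1=<encrypted bytes>", checksum))
```

If an entry is altered on disk, its recomputed checksum no longer matches, so tampering is detected before the entry is used.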

Who can access the KEK management service?

Access to the secret service is restricted at two layers:

Network layer: The service is not exposed publicly. It is reachable only from within the internal network, on a private port.
Transport layer: The service uses mutual TLS (mTLS). Clients must present a valid certificate signed by the internal CA to establish a connection. Only the Cleardox backend application holds such a certificate, which is generated inside the cluster.

Cleardox personnel cannot access the underlying host, and the default access privileges do not permit container exec or viewing secrets. Elevated permissions are therefore needed to access the encrypted key and its password.

Is KEK access restricted during runtime?

Yes. At runtime, only the backend application can issue requests to the secret service, enforced by mTLS client certificate validation - the service rejects any connection that does not present a matching certificate. There is no API exposed to end users or external parties.
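For readers unfamiliar with mTLS, the server-side requirement can be sketched with Python's standard ssl module. The function name and file paths are hypothetical. The paths are optional only so the sketch can be run without certificate files, and this is not a description of how the Cleardox service is built.

```python
import ssl
from typing import Optional


def make_mtls_server_context(certfile: Optional[str] = None,
                             keyfile: Optional[str] = None,
                             cafile: Optional[str] = None) -> ssl.SSLContext:
    """Build a TLS server context that rejects clients without a trusted certificate."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail unless the client presents a
    # certificate that chains to a CA loaded below.
    context.verify_mode = ssl.CERT_REQUIRED
    if certfile:
        context.load_cert_chain(certfile, keyfile)     # the service's own identity
    if cafile:
        context.load_verify_locations(cafile=cafile)   # the internal CA
    return context


context = make_mtls_server_context()
print(context.verify_mode == ssl.CERT_REQUIRED)
```

With a context like this, a connection from any client lacking a certificate signed by the internal CA is refused during the TLS handshake, before any request reaches the service.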

Are decryption requests logged?

Not individually. The application requests decryption continuously, so logging every request would produce a flood of low-signal logs. Instead, we log in aggregate and alert on suspicious behavior such as bursts of usage or use of a rotated key.
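Aggregate logging of this kind can be pictured as counting requests per time window and raising alerts only on the totals. The class name, threshold, and key identifiers in this sketch are invented for illustration and do not describe Cleardox's monitoring setup.

```python
from collections import Counter


class DecryptionMonitor:
    """Counts decryption requests per key and reports anomalies once per window."""

    def __init__(self, burst_threshold: int, retired_key_ids: set[str]):
        self.burst_threshold = burst_threshold
        self.retired_key_ids = retired_key_ids
        self.counts: Counter = Counter()

    def record(self, key_id: str) -> None:
        self.counts[key_id] += 1          # no log line per request

    def flush(self) -> list[str]:
        """Close the current window and return any alerts for it."""
        alerts = []
        total = sum(self.counts.values())
        if total > self.burst_threshold:
            alerts.append(f"burst: {total} decryption requests in window")
        for key_id in sorted(self.counts):
            if key_id in self.retired_key_ids:
                alerts.append(f"rotated key used: {key_id} "
                              f"({self.counts[key_id]} requests)")
        self.counts.clear()
        return alerts


monitor = DecryptionMonitor(burst_threshold=5, retired_key_ids={"v1"})
for key_id in ["v2"] * 6 + ["v1"]:
    monitor.record(key_id)
print(monitor.flush())
```

A quiet window produces no output at all, which keeps the signal-to-noise ratio of the logs high.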

Does Cleardox support key rotation?

Key rotation is supported through a versioned multi-key scheme. When a new key is introduced, it is used for new incoming projects and documents, while the service continues to decrypt data protected by older keys.

Documents on the platform are deleted once the project is completed or after a time limit. We monitor how many projects are using the old key versus the new key, and the old key is decommissioned once its usage reaches zero. The system always keeps the newest key. Each key's creation timestamp is recorded, and the service logs the age of each loaded key at startup.
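The lifecycle described here (new projects use the newest key, older keys stay until nothing uses them) can be sketched as a small bookkeeping structure. Class and method names are hypothetical, and no actual cryptography is performed in this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class KeyVersion:
    version: int
    created_at: datetime          # recorded so the key's age can be logged
    projects_in_use: int = 0


class KeyRing:
    """Tracks which key versions exist and how many projects depend on each."""

    def __init__(self) -> None:
        self.keys: dict[int, KeyVersion] = {}

    def add_key(self) -> int:
        version = max(self.keys, default=0) + 1
        self.keys[version] = KeyVersion(version, datetime.now(timezone.utc))
        return version

    def start_project(self) -> int:
        newest = max(self.keys)   # new data always uses the newest key
        self.keys[newest].projects_in_use += 1
        return newest

    def finish_project(self, version: int) -> None:
        self.keys[version].projects_in_use -= 1   # project data has been deleted

    def decommission_unused(self) -> list[int]:
        """Remove old keys with zero usage; the newest key is always kept."""
        newest = max(self.keys)
        retired = [v for v, key in self.keys.items()
                   if v != newest and key.projects_in_use == 0]
        for version in retired:
            del self.keys[version]
        return retired


ring = KeyRing()
ring.add_key()
old_project_key = ring.start_project()   # encrypted under version 1
ring.add_key()                           # rotation: version 2 becomes newest
ring.finish_project(old_project_key)     # last project on version 1 is gone
print(ring.decommission_unused())
```

Because projects are deleted on completion or after a time limit, usage of an old key drops to zero on its own, and no bulk re-encryption is needed in this model.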

How are updates deployed in Cleardox?

Updates are tested in staging environments before being rolled out gradually to production systems.

How stable is Cleardox in daily operation?

Cleardox is built for high availability with monitoring and alerting to ensure stable day-to-day operations.

Does Cleardox have an IT audit report from an independent third party?

Yes. Cleardox’s security and internal controls are documented in an ISAE 3000 Type II assurance report with a high level of assurance, which is ISO 27001 compliant.

The report describes our security processes, controls, and procedures related to the handling of customer data and the operation of the platform. You can download the report here.

Which systems does Cleardox integrate with?

Cleardox integrates with systems such as iManage, SharePoint, Acadre, Workzone, HighQ from Thomson Reuters, and other case and document management systems, either through direct integrations or via the Cleardox API. New integrations are continuously added and can easily be built by customers themselves using our external API.

How does Cleardox integrate with iManage and SharePoint?

Integrations use secure authentication flows. Cleardox only accesses documents that users already have permission to view in the source system.

How does Cleardox respect access rights in integrations?

Cleardox fully respects permissions from the source system. Users can only access and process documents they are authorized to view.

How does implementation work?

At Cleardox, you will be assigned a dedicated customer manager who will help you get started quickly and smoothly.

The onboarding process is tailored to your needs and can include:

System setup
Configuration of categories
Training in how to use the system

Implementation is usually very straightforward. Once users have been introduced to the system, we typically find that they quickly become self-sufficient without the need for additional training.

What is a user-based model in Cleardox?

In a user-based model, you pay per named user with access to the system. Pricing is tied to specific users regardless of usage volume.

What is a seat-based model in Cleardox?

In a seat-based model, you purchase a number of active users per month. Anyone in the organization can access the system, but only unique active users in a given month count as a seat.

What is the difference between seat-based and user-based pricing?

The seat model is based on active monthly usage and is flexible for fluctuating teams. The user-based model is based on fixed named users and is best for stable user groups.

What are the benefits of a seat-based model?

A seat-based model allows broad access without paying for all potential users. You only pay for those who actively use the system in a given month, making it flexible and cost-efficient.

How is pricing calculated in the seat model?

Pricing is based on the number of unique active users per month, regardless of how many different people use the system over time.
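The counting rule can be made concrete with a short sketch: each person counts at most once per calendar month, however often they log in. The function name and activity data are invented for the example, and no prices are implied.

```python
from collections import defaultdict
from datetime import date


def seats_per_month(activity: list[tuple[str, date]]) -> dict[str, int]:
    """Count unique active users per calendar month from (user, day) records."""
    users_by_month: dict[str, set[str]] = defaultdict(set)
    for user, day in activity:
        users_by_month[day.strftime("%Y-%m")].add(user)
    return {month: len(users) for month, users in sorted(users_by_month.items())}


# Hypothetical activity log: Anna is active twice in January but counts once.
activity = [
    ("anna", date(2024, 1, 3)),
    ("anna", date(2024, 1, 17)),
    ("ben", date(2024, 1, 9)),
    ("carl", date(2024, 2, 1)),
]
print(seats_per_month(activity))
```

Three different people used the system over the two months in this example, but no single month needs more than two seats.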

Which pricing model should I choose?

If usage is spread across many potential users, the seat model is usually best. If usage is limited to a defined group of users, the user-based model may be more suitable.

Does Cleardox offer fixed-price projects?

Yes. Cleardox offers fixed-price project engagements for larger, defined tasks such as data subject access requests (DSARs). Pricing depends on document volume and complexity. Contact us for a tailored assessment.

Can I switch between pricing models?

Yes. You can switch models if your needs change over time.

Can the trial period be extended?

Yes. The standard trial period is typically two weeks, but extensions can be arranged if more evaluation time is needed.

What is the minimum subscription or project size at Cleardox?

Cleardox is typically offered either as a user-based model, a seat-based model, or as a project-based engagement.

The smallest setup will usually be either 1 seat, 3 named users, or a limited project based on a specific number of pages.

The exact price depends on your needs, data volume, and use case. Feel free to contact us to receive a tailored quote.

Contact Us

contact@cleardox.io

Bag Elefanterne 1, 2. tv
1799 Copenhagen

CVR-nr. 40984992