Data redaction
Overview
The Data Redaction feature automatically removes sensitive information from the node-chat application at regular intervals to comply with data retention policies and address privacy concerns. It runs as a cron job named 'data-redaction-task' every day at midnight and soft-deletes or redacts the specified data once it is older than a configurable retention period. Because this is a soft delete, files and other data assigned to a chat are not deleted; file deletion is handled by a dedicated feature.
Activation
To activate the Data Redaction feature, define the following environment variable in node-chat:
Environment Variable: DATA_RETENTION_IN_DAYS
Description: Specifies the number of days that data is retained before being redacted. Data older than this duration is redacted.
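As an illustration of how such a setting might be read, the following minimal TypeScript sketch parses the variable from an environment map. The helper name parseRetentionDays, the rule that an unset variable means the feature is inactive, and the positive-integer validation are assumptions made for this example; the actual node-chat implementation may behave differently.

```typescript
// Hypothetical helper, not taken from the node-chat source.
// It reads DATA_RETENTION_IN_DAYS from an environment-like map.
type Env = Record<string, string | undefined>;

function parseRetentionDays(env: Env): number | undefined {
  const raw = env["DATA_RETENTION_IN_DAYS"];

  // Assumption: an unset or blank variable means the feature is not activated.
  if (raw === undefined || raw.trim() === "") {
    return undefined;
  }

  // Assumption: only positive whole numbers of days are accepted.
  const days = Number(raw);
  if (!Number.isInteger(days) || days <= 0) {
    throw new Error(
      `DATA_RETENTION_IN_DAYS must be a positive integer, got "${raw}"`
    );
  }
  return days;
}
```

In a Node.js process the helper could be called as parseRetentionDays(process.env). Failing fast on an invalid value is a design choice of this sketch, intended to avoid silently running with an unintended retention period.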
Redaction Process
The feature runs a daily cron job that performs the following actions on data older than the defined retention period:
Soft Delete Chats: Mark chats as soft deleted instead of hard deleting them, which allows the data to be kept for analytics and metrics purposes.
Redact Specific Fields: To comply with data privacy, specific sensitive fields in chats and messages are emptied or sanitized, including:
  Chat:
    Title: Set to empty.
  Feedback:
    Text: Set to empty.
    Additional Info: Set to empty.
  Short Term Memory:
    Data: Set as an empty object.
  Messages:
    Text: Set to an empty string.
    Original Text: Set to an empty string.
    GPT Request: Set to an empty object.
    Debug Info: Set to empty while retaining userAgent and chosenModule.
Delete Benchmarks: Remove benchmark data older than the retention date to ensure outdated performance metrics are not stored.
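The soft-delete and field-redaction rules listed above can be sketched in TypeScript as follows. The record shapes, the property names (deletedAt, shortTermMemory, gptRequest, and so on), the placement of feedback on a message, and the use of an empty string for "set to empty" are all assumptions for illustration; only userAgent and chosenModule are names taken from this page. Benchmark deletion is not covered by this sketch.

```typescript
// Hypothetical record shapes; the real node-chat schema may differ.
interface DebugInfo {
  userAgent?: string;
  chosenModule?: string;
  [key: string]: unknown;
}

interface Feedback {
  text: string;
  additionalInfo: string;
}

interface Message {
  text: string;
  originalText: string;
  gptRequest: Record<string, unknown>;
  debugInfo: DebugInfo;
  feedback: Feedback | null; // assumed location of feedback
}

interface Chat {
  title: string;
  deletedAt: Date | null; // assumed soft-delete marker
  shortTermMemory: Record<string, unknown>;
  messages: Message[];
}

// Empties the debug info but keeps userAgent and chosenModule when present.
function redactDebugInfo(info: DebugInfo): DebugInfo {
  const kept: DebugInfo = {};
  if (info.userAgent !== undefined) kept.userAgent = info.userAgent;
  if (info.chosenModule !== undefined) kept.chosenModule = info.chosenModule;
  return kept;
}

// Empties the free-text and request fields of a single message.
function redactMessage(message: Message): Message {
  return {
    text: "",
    originalText: "",
    gptRequest: {},
    debugInfo: redactDebugInfo(message.debugInfo),
    feedback: message.feedback ? { text: "", additionalInfo: "" } : null,
  };
}

// Returns a redacted copy of the chat, marked as soft deleted.
// The chat record itself is kept; only its sensitive fields are emptied.
function redactChat(chat: Chat, now: Date): Chat {
  return {
    title: "",
    deletedAt: chat.deletedAt ?? now,
    shortTermMemory: {},
    messages: chat.messages.map(redactMessage),
  };
}
```

Returning new objects instead of mutating the input keeps the sketch easy to test; a real implementation would more likely issue database updates directly.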
Key Points
Customizable Retention Duration: The retention duration is set using the DATA_RETENTION_IN_DAYS environment variable, allowing the feature to adapt to different compliance or internal data retention policies.
Soft Deletion for Analytics: Instead of hard deleting chats, this feature soft-deletes them and redacts their sensitive fields, so the remaining data can still be used for benchmark and performance metrics analysis while maintaining user privacy.
File Retention: Files assigned to a chat are not deleted by this feature.
Dedicated File Retention Feature: File retention is handled separately; see "Set file retention of uploads in a chat".
Example Usage
To activate the feature and set the retention period to 30 days, add the following environment variable:
export DATA_RETENTION_IN_DAYS=30
This will ensure that all data older than 30 days is redacted as per the rules defined above.
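To make the 30-day example concrete, the following TypeScript sketch computes the retention cutoff for a given run time and checks whether a record falls before it. The function names are hypothetical, and this page does not state which timestamp (for example creation or last update) the real job compares against, so the recordTimestamp parameter is an assumption.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the point in time before which data counts as older than the
// retention period, relative to the time the job runs.
function retentionCutoff(now: Date, retentionDays: number): Date {
  return new Date(now.getTime() - retentionDays * MS_PER_DAY);
}

// Assumption: a record is eligible when its timestamp is strictly older
// than the cutoff.
function isEligibleForRedaction(
  recordTimestamp: Date,
  now: Date,
  retentionDays: number
): boolean {
  return recordTimestamp.getTime() < retentionCutoff(now, retentionDays).getTime();
}

// A run at midnight UTC on 1 July 2024 with a 30-day retention period.
const runTime = new Date("2024-07-01T00:00:00Z");
console.log(retentionCutoff(runTime, 30).toISOString());
// prints "2024-06-01T00:00:00.000Z"
console.log(isEligibleForRedaction(new Date("2024-05-20T12:00:00Z"), runTime, 30));
// prints true
console.log(isEligibleForRedaction(new Date("2024-06-15T12:00:00Z"), runTime, 30));
// prints false
```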
Considerations
The redaction process runs daily at midnight, so data that passes the retention threshold is redacted at the next nightly run.
Soft deletion keeps the redacted chat records available for analytics purposes, so metrics that rely on them remain unaffected. Benchmark data older than the retention date is removed, as described above.
This feature helps comply with privacy regulations by limiting the amount of sensitive information retained unnecessarily.
For any issues or to further customize the behavior of this feature, please refer to the node-chat source code or contact an administrator.
Author: @Sebastien Barbier
© 2024 Unique AG. All rights reserved.