Dynamic Data Masking
Dynamic Data Masking (DDM) obscures sensitive information in query results while you maintain access to the masked columns.
DDM presents data in an obscured form during SELECT queries without altering the stored data.
This feature protects against accidental data exposure.
|
Users with direct access to SSTable files can read the clear data.
HCD disables DDM by default.
Set the |
How masking functions work
DDM uses CQL native functions to obscure sensitive information.
You can apply these functions directly in SELECT queries or attach them permanently to table columns.
| Function | Description | Example |
|---|---|---|
|
Replaces the value with null. The function returns a non-existent column, not a not-null column representing a null value. |
|
|
Replaces the value with a fixed default value of the same type.
The function converts text values to |
|
|
Replaces the value with the specified replacement. The replacement must have the same type as the original value. |
|
|
Replaces characters in the middle of text, varchar, or ascii values.
The function exposes the specified number of prefix and suffix characters.
The function uses |
|
|
Replaces characters at the beginning and end of text, varchar, or ascii values.
The function masks the specified number of prefix and suffix characters.
The function uses |
|
|
Returns a blob containing the hash of the value. The function uses SHA-256 by default. You can specify an alternative algorithm supported by your Java security provider. |
|
Apply masking functions in queries
You can use masking functions in SELECT queries to view obscured data.
-
Create a table with sensitive data:
CREATE TABLE customer_orders ( order_id timeuuid PRIMARY KEY, customer_email text, order_date date ); -
Insert sample data:
INSERT INTO customer_orders(order_id, customer_email, order_date) VALUES (now(), 'user_one@shop.com', '2024-01-15'); INSERT INTO customer_orders(order_id, customer_email, order_date) VALUES (now(), 'user_two@shop.com', '2024-02-20'); -
Query the data with masking functions:
SELECT mask_inner(customer_email, 4, null), mask_default(order_date) FROM customer_orders;Resultsystem.mask_inner(customer_email, 4, NULL) | system.mask_default(order_date) --------------------------------------------+---------------------------------- user_t*********** | 1970-01-01 user_o*********** | 1970-01-01
Attach masking functions to columns
You can attach masking functions permanently to table columns to automatically mask data in all SELECT queries.
Users cannot see that you masked a column unless they examine the table definition.
Define column masks during table creation
Use the MASKED WITH clause to apply masking functions when you define columns.
Specify masking functions in the CREATE TABLE statement.
CREATE TABLE customer_orders (
order_id timeuuid PRIMARY KEY,
customer_email text MASKED WITH mask_inner(4, null),
order_date date MASKED WITH mask_default()
);
When you attach a function to a column, omit the first argument.
HCD automatically uses the column value as the first argument.
For example, mask_inner requires three arguments in a SELECT query but only two when you attach it to a column.
You cannot apply masking functions to PRIMARY KEY columns. Apply masking only to non-PRIMARY KEY columns.
-- This fails with a syntax error:
CREATE TABLE users (
id int PRIMARY KEY MASKED WITH mask_default(),
email text
);
-- Apply masking to non-PRIMARY KEY columns:
CREATE TABLE users (
id int PRIMARY KEY,
email text MASKED WITH mask_default()
);
Query tables with masked columns
Insert data normally into tables with masked columns. HCD stores the data without altering it.
INSERT INTO customer_orders(order_id, customer_email, order_date) VALUES (now(), 'user_one@shop.com', '2024-01-15');
INSERT INTO customer_orders(order_id, customer_email, order_date) VALUES (now(), 'user_two@shop.com', '2024-02-20');
Query the table to see the masked data. HCD automatically applies the masking function to the column values.
SELECT customer_email, order_date FROM customer_orders;
customer_email | order_date
-------------------+------------
user_o*********** | 1970-01-01
user_t*********** | 1970-01-01
You can update the masking function applied to a column at any time without dropping and recreating the table.
Modify column masks
You can modify existing column masks to change the masking function or remove masking entirely.
Change the masking function on a column using ALTER TABLE:
ALTER TABLE customer_orders ALTER customer_email
MASKED WITH mask_default();
Remove a masking function from a column using ALTER TABLE with DROP MASKED:
ALTER TABLE customer_orders ALTER customer_email
DROP MASKED;
Manage masking permissions
You can control who sees unmasked data using the UNMASK and SELECT_MASKED permissions.
Enable authentication
|
Without authentication, all users have the |
To control access to masked data, you must enable authentication, if you haven’t done so already:
-
Edit the
cassandra.yamlfile:authenticator: class_name: com.datastax.cassandra.auth.AdvancedAuthenticator parameters: enabled: true default_scheme: internal -
Restart HCD.
-
Connect with the default credentials or create an admin user.
Default behavior
HCD creates ordinary users without the UNMASK permission.
These users see masked values in query results.
HCD automatically grants the UNMASK permission to superusers, who see unmasked values in query results.
Grant UNMASK permission
Grant the UNMASK permission to users who need to see clear data.
To grant the UNMASK permission, do the following:
-
Create a table with masked columns:
CREATE TABLE customer_orders ( order_id timeuuid PRIMARY KEY, customer_email text MASKED WITH mask_inner(4, null), order_date date MASKED WITH mask_default() ); -
Insert sample data:
INSERT INTO customer_orders(order_id, customer_email, order_date) VALUES (now(), 'user_one@shop.com', '2024-01-15'); INSERT INTO customer_orders(order_id, customer_email, order_date) VALUES (now(), 'user_two@shop.com', '2024-02-20'); -
Create a privileged user with
UNMASKpermission:CREATE USER admin_user WITH PASSWORD 'secure_pass_123'; GRANT SELECT ON TABLE customer_orders TO admin_user; GRANT UNMASK ON TABLE customer_orders TO admin_user; -
Create an unprivileged user without
UNMASKpermission:CREATE USER report_user WITH PASSWORD 'secure_pass_456'; GRANT SELECT ON TABLE customer_orders TO report_user;
The privileged user sees unmasked data:
LOGIN admin_user
SELECT customer_email, order_date FROM customer_orders;
customer_email | order_date
--------------------+------------
user_one@shop.com | 2024-01-15
user_two@shop.com | 2024-02-20
The unprivileged user sees masked data:
LOGIN report_user
SELECT customer_email, order_date FROM customer_orders;
customer_email | order_date
-------------------+------------
user_o*********** | 1970-01-01
user_t*********** | 1970-01-01
Revoke UNMASK permission
Revoke the UNMASK permission like any other permission:
REVOKE UNMASK ON TABLE customer_orders FROM admin_user;
Restrict WHERE clause usage
Only users with the UNMASK permission can use masked columns in the WHERE clause of a SELECT query.
This restriction prevents malicious users from discovering clear data through exhaustive queries.
Users without the UNMASK permission receive an error when they attempt to use masked columns in WHERE clauses.
CREATE USER external_user WITH PASSWORD 'secure_pass_789';
GRANT SELECT ON TABLE customer_orders TO external_user;
LOGIN external_user
SELECT customer_email, order_date FROM customer_orders WHERE customer_email = 'user_one@shop.com' ALLOW FILTERING;
Unauthorized: Error from server: code=2100 [Unauthorized]
message="User external_user has no UNMASK nor SELECT_UNMASK permission on table k.customer_orders"
Use SELECT_MASKED permission
Grant the SELECT_MASKED permission to trusted applications that need to query masked columns in WHERE clauses while still seeing masked data in results.
This permission helps when a trusted application extracts masked data for display to untrusted end users.
CREATE USER api_service WITH PASSWORD 'secure_pass_abc';
GRANT SELECT, SELECT_MASKED ON TABLE customer_orders TO api_service;
LOGIN api_service
SELECT customer_email, order_date FROM customer_orders WHERE customer_email = 'user_one@shop.com' ALLOW FILTERING;
customer_email | order_date
-------------------+------------
user_o*********** | 1970-01-01
Create custom masking functions
You can create user-defined functions (UDFs) to implement custom masking logic.
Requirements for custom masking functions
Custom masking functions must:
-
Belong to the same keyspace as the masked table.
-
Accept at least one argument.
-
Accept the masked column’s type as the first argument.
-
Return values of the same type as the masked column.
Create and use a custom masking function
-
Create a custom masking function:
CREATE FUNCTION hide_email(input text) CALLED ON NULL INPUT RETURNS text LANGUAGE java AS 'return "[EMAIL_HIDDEN]";'; -
Create a table that uses the custom function:
CREATE TABLE customer_orders ( order_id timeuuid PRIMARY KEY, customer_email text MASKED WITH hide_email(), order_date date );
|
Creating a custom masking function creates a dependency between the table schema and the function. Drop the mask from the column before you drop the function. HCD removes the dependency when you drop the column, table, or keyspace. You cannot use aggregate functions as masking functions. |
Mask date values with custom functions
Use LocalDate.fromDaysSinceEpoch() to create custom date masking functions.
The UDF context does not support the fromYearMonthDay() method.
CREATE FUNCTION mask_date_to_epoch(input_date date)
CALLED ON NULL INPUT
RETURNS date
LANGUAGE java
AS 'return com.datastax.driver.core.LocalDate.fromDaysSinceEpoch(0);';
|
Use the built-in |