Setting a Dummy PDF in a Database Column for Masking
This guide describes how to set a dummy PDF in a database column. The objective is to mask data by replacing it with a predefined file. The steps use a PDF file, but the same approach works for other binary files such as JPG, PNG, or ZIP.
Overview
The process involves creating and encoding a dummy file, setting it in the database via a modification method, and integrating it into a modification set. This approach ensures data masking while maintaining database integrity.
Steps
1. Create the Dummy File
Create the file (e.g., dummy.pdf) that will replace sensitive data in the target database column.
2. Convert the File to Base64 Representation
Use a Base64 encoder to convert the dummy file into a text-based format.
This allows the file content to be embedded in a script.
This can be done with the command-line tool base64, with Notepad++, or with an online tool.
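As an alternative to the tools listed above, the encoding step can be sketched as a small standalone Groovy script. This is only an illustration, not part of XDM: the helper name encodeFileToBase64 and the tiny stand-in file are assumptions made for the example, and java.util.Base64 from the JDK is used so that the script runs without additional libraries.

```groovy
import java.util.Base64

// Hypothetical helper: reads a file and returns its content as Base64 text.
// The basic encoder emits a single line without line breaks.
String encodeFileToBase64(File source) {
    return Base64.encoder.encodeToString(source.bytes)
}

// Tiny stand-in file for the demo; replace it with your real dummy.pdf.
def dummy = File.createTempFile('dummy', '.pdf')
dummy.deleteOnExit()
dummy.bytes = '%PDF-1.4\n'.getBytes('US-ASCII')

// Encode the file and store the text next to it, ready for upload.
def encoded = encodeFileToBase64(dummy)
def target = new File(dummy.parentFile, dummy.name + '.b64')
target.deleteOnExit()
target.setText(encoded, 'US-ASCII')

println encoded   // prints JVBERi0xLjQK
```

The resulting text file contains only Base64 characters, so its content can be pasted or uploaded as plain text in the next step.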
3. Create an XDM File of Type Text
- Create an XDM file of type Text.
- Upload the Base64 representation of the dummy file into the XDM file.
4. Develop a Modification Method in Groovy
- Add a parameter to the modification method:
  - Name: fileName
  - Display Name: File Name
  - Type: String
- Use the following code in the init, apply, and close methods:
import org.apache.commons.codec.binary.Base64
import javax.sql.rowset.serial.SerialBlob
import de.ubs.xdm3.script.evaluator.ScriptValidationException

// Decoded file content, filled once in init() and reused for every row
fileContentBinary = null

/*
 * This function is called once before the processing starts.
 */
void init() {
    // Retrieve the XDM file whose display name matches the fileName parameter
    file = api.files.find { modificationMethod.fileName.equals(it.displayName) }
    if (file == null) {
        throw new ScriptValidationException("File " + modificationMethod.fileName + " could not be found.")
    } else {
        // Decode the Base64 content to binary
        fileContentBase64 = file.content
        fileContentBinary = Base64.decodeBase64(fileContentBase64)
    }
}

/*
 * This function modifies each row of the table.
 */
def apply() {
    // Convert to Blob for binary storage in database
    data[columnIndex] = new SerialBlob(fileContentBinary)
    return true
}

/*
 * This function is executed after the table modification is complete.
 */
def close() {
    // Perform any necessary cleanup (if required)
}
The SerialBlob class used in apply wraps a byte array in a Blob object.
This is necessary if the database column is represented as BLOB in the data process.
If it is represented as byte[], the wrapper is not necessary, and the decoded byte array can be assigned to the column directly.
Compare the data type mappings in Data Types Mappings to find out which representation applies.
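The difference between the two column representations can be illustrated outside of XDM with a short standalone Groovy script. This is a minimal sketch under stated assumptions: the sample Base64 string is a hypothetical stand-in for the content of the XDM file, the variable names blobValue and bytesValue are made up for the example, and java.util.Base64 replaces the commons-codec class so that the script runs on a plain JDK.

```groovy
import java.util.Base64
import javax.sql.rowset.serial.SerialBlob

// Hypothetical sample: Base64 text of the 9 bytes "%PDF-1.4\n"
String fileContentBase64 = 'JVBERi0xLjQK'

// Decode once, as init() does in the modification method
byte[] fileContentBinary = Base64.decoder.decode(fileContentBase64)

// Column represented as BLOB: wrap the bytes in a Blob
def blobValue = new SerialBlob(fileContentBinary)

// Column represented as byte[]: use the decoded array as it is
byte[] bytesValue = fileContentBinary

// Both values carry the same content (Blob positions start at 1)
byte[] fromBlob = blobValue.getBytes(1, (int) blobValue.length())
println Arrays.equals(fromBlob, bytesValue)   // prints true
```

In the modification method, only the assignment in apply differs between the two cases. The decoding in init stays the same.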