Although not typical, it is occasionally necessary to store sensitive data on a website in such a way that it is reasonably well protected from prying eyes. On a recent project, we needed to implement a method for content editors to encrypt some parts of the nodes' content.
Configuring certain fields of a content type to be fully encrypted, delegating the hard work to the Encrypt and Field Encryption modules, would have been one feasible way to go. We did not choose this path because fully encrypting a field would have prevented search indexing and practically disabled search for that field. In our case, we had a long-text field, most of which was not sensitive data. It did, however, contain some parts that we preferred not to keep in the database as plain text. At the same time, we wanted to maintain search capabilities for the rest of the text.
To address the problem, we decided to introduce a custom text markup that editors can use to designate sensitive data in the text, which in turn would be encrypted before the corresponding field is saved into the database. Conversely, when the field is viewed, an input filter would find the encrypted parts, and decrypt them for display. Assuming we choose our special markup to be double curly brackets, the following would be an example of how editors would enter content.
"The suspect approached the door and entered {{2323}} on the keypad."
The numbers, in this case, would be encrypted in the saved content, but not the rest of the text.
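To make the markup concrete, the pattern used later in this article can be tried on its own. The following is a minimal standalone sketch in plain PHP (no Drupal required). The helper name mymodule_find_marked_text() is hypothetical and exists only for this illustration.

```php
<?php

/**
 * Returns the pieces of text found between double curly brackets.
 *
 * Hypothetical helper for illustration only; it is not part of Drupal or
 * the Encrypt module.
 */
function mymodule_find_marked_text($text) {
  // The non-greedy (.*?) captures each {{...}} pair separately instead of
  // swallowing everything between the first "{{" and the last "}}".
  preg_match_all('#\{\{(.*?)\}\}#', $text, $matches);
  // Index 1 holds the captured contents without the brackets.
  return $matches[1];
}

$text = 'The suspect approached the door and entered {{2323}} on the keypad.';
print_r(mymodule_find_marked_text($text));
```

For the example sentence, the returned array contains the single string 2323. Note that the dot in this pattern does not match newlines, so a marked-up span is expected to sit on one line.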
Let's enumerate the steps we need to take to implement this functionality.
- The implementation delegates the actual encryption work to the Encrypt module, so we need to have that module installed and configured.
- We will need to find the marked-up parts and encrypt them before content is saved in each node. hook_node_presave() would be adequate for this job.
- When we edit existing content, we will need to decrypt the text before it's displayed for editing.
- We will need a text filter that can find the encrypted parts and decrypt them before the field is rendered.
Finding and encrypting marked-up text
As mentioned, this task will be performed by hook_node_presave(). This hook will find the markup we have added, encrypt the content, and replace it with the encrypted data. For the encrypted data, we will need to use a different markup so that we can later identify the encrypted parts in the text. For this purpose, we will use double square brackets (e.g. [[encrypted_text]]). Our implementation looks as follows.
/**
 * Implements hook_node_presave().
 */
function mymodule_node_presave($node) {
  // Find text marked up with double curly brackets in the body field.
  if (!empty($node->body[LANGUAGE_NONE][0]['value'])) {
    $text = $node->body[LANGUAGE_NONE][0]['value'];
    preg_match_all('#\{\{(.*?)\}\}#', $text, $matches);
    // $matches itself is never empty, so check the list of full matches.
    if (!empty($matches[0])) {
      foreach ($matches[0] as $key => $match) {
        $plaintext = $matches[1][$key];
        // Use Encrypt module for encrypting.
        $encrypted_data = encrypt($plaintext);
        // Replace the original text with the encrypted text and the new markup.
        $text = str_replace($match, '[[' . base64_encode($encrypted_data) . ']]', $text);
      }
      // Write the result back to the node so that it is what gets saved.
      $node->body[LANGUAGE_NONE][0]['value'] = $text;
    }
  }
}
The encrypt() function returns a serialized array. We encode this as base64 in order to prevent possible conflicts with other markup. Back to our previous example, the saved field data will look something like this:
"The suspect approached the door and entered [[IEkgYmV0IHNhdnZ5IGRldmVsb3BlcnMgd2lsbCB0cnkgdG8gZGVjb2RlIHRoaXM6XQ==]] on the keypad."
The new markup allows for easy identification of the encrypted parts.
Editing existing content
When we edit a node, the form normally displays the actual data that's saved in the fields, which in our case would contain encrypted text. Depending on where we are editing the field, hook_form_alter() will most likely be a suitable place to run the decryption before the content is rendered in the form. The following implementation will check if there is a form element for the body field, and find any encrypted content in it.
/**
 * Implements hook_form_alter().
 */
function mymodule_form_alter(&$form, $form_state, $form_id) {
  if (strpos($form_id, '_node_form') !== FALSE && !empty($form['body'][LANGUAGE_NONE][0]['#default_value'])) {
    preg_match_all('#\[\[(.*?)\]\]#', $form['body'][LANGUAGE_NONE][0]['#default_value'], $matches);
    if (!empty($matches[0])) {
      foreach ($matches[0] as $key => $match) {
        $cipher = $matches[1][$key];
        $plaintext = decrypt(base64_decode($cipher));
        $plaintext = '{{' . $plaintext . '}}';
        $form['body'][LANGUAGE_NONE][0]['#default_value'] = str_replace($match, $plaintext, $form['body'][LANGUAGE_NONE][0]['#default_value']);
      }
    }
  }
}
We need to add the original markup back to the text so that when the node is re-saved, the content will be encrypted again.
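The save and edit steps together form a round trip: curly-bracket markup becomes square-bracket markup on save, and is turned back on edit. The standalone sketch below illustrates that round trip outside Drupal. All demo_* names are hypothetical, and demo_encrypt() and demo_decrypt() are placeholders that merely reverse the string; they are not encryption and only stand in for the Encrypt module's encrypt() and decrypt().

```php
<?php

// Placeholder "cipher" for illustration only. It is NOT encryption; in the
// real module these calls would be Encrypt's encrypt() and decrypt().
function demo_encrypt($plaintext) {
  return strrev($plaintext);
}

function demo_decrypt($ciphertext) {
  return strrev($ciphertext);
}

// Swaps every {{plaintext}} for [[base64(ciphertext)]], as done on save.
function demo_protect($text) {
  return preg_replace_callback('#\{\{(.*?)\}\}#', function ($m) {
    return '[[' . base64_encode(demo_encrypt($m[1])) . ']]';
  }, $text);
}

// Swaps every [[base64(ciphertext)]] back to {{plaintext}}, as done on edit.
function demo_reveal($text) {
  return preg_replace_callback('#\[\[(.*?)\]\]#', function ($m) {
    return '{{' . demo_decrypt(base64_decode($m[1])) . '}}';
  }, $text);
}

$original = 'The suspect approached the door and entered {{2323}} on the keypad.';
$stored = demo_protect($original);
$editable = demo_reveal($stored);

print $stored . "\n";
print $editable . "\n";
```

The last line printed matches the original sentence, which is what lets a re-save encrypt the same parts again. Because base64 output consists only of letters, digits, "+", "/" and "=", it can never contain square brackets, so the [[...]] pattern cannot be confused by the encoded payload.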
Implementing the text filter
Our text filter will look for the double square brackets and decrypt the content in between them. The first step to introducing a new filter is declaring it to Drupal using hook_filter_info().
/**
 * Implements hook_filter_info().
 */
function mymodule_filter_info() {
  $filters = array();
  $filters['mymodule_decrypt'] = array(
    'title' => t('Decrypt markup'),
    'description' => t('Decrypts encrypted content in the text.'),
    'process callback' => 'mymodule_decrypt_filter',
    'cache' => FALSE,
  );
  return $filters;
}
The filter callback implementation is shown below.
/**
 * Callback for the custom filter.
 */
function mymodule_decrypt_filter($text, $filter, $format, $langcode, $cache, $cache_id) {
  preg_match_all('#\[\[(.*?)\]\]#', $text, $matches);
  if (!empty($matches[0])) {
    foreach ($matches[0] as $key => $match) {
      $cipher = $matches[1][$key];
      $plaintext = decrypt(base64_decode($cipher));
      $text = str_replace($match, $plaintext, $text);
    }
  }
  return $text;
}
In this case, we do not need to add any markup because this filter is for display purposes. The rendered text will look as if no extra markup has been added to the content. Be sure to enable the new filter in the corresponding text formats.
This concludes our list of items to cover. The encryption markup is ready for use in the body field.