Anonymization (sometimes called de-identification) is the process of replacing certain values with random but similar values. For example, the name “John Smith” may be replaced with “David Jones”, and the phone number 123-555-9358 may be replaced with 842-436-2042. Anonymization is useful when you want to remove PHI but the resulting document still needs to be usable for some other purpose.
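As a rough illustration of "random but similar" replacement, the following Python sketch swaps a name for another name and a phone number for random digits in the same format. The name pools and the function names anonymize_name and anonymize_phone are assumptions made for this example; they are not Philter's implementation or API.

```python
import random

# Hypothetical replacement pools; a real system would draw from far larger lists.
FIRST_NAMES = ["David", "Pete", "Maria", "Susan"]
SURNAMES = ["Jones", "Baker", "Lopez", "Nguyen"]


def anonymize_name(rng: random.Random) -> str:
    """Return a randomly assembled, realistic-looking full name."""
    return f"{rng.choice(FIRST_NAMES)} {rng.choice(SURNAMES)}"


def anonymize_phone(original: str, rng: random.Random) -> str:
    """Replace each ASCII digit with a random digit, leaving separators in place."""
    return "".join(
        str(rng.randint(0, 9)) if ch in "0123456789" else ch
        for ch in original
    )


if __name__ == "__main__":
    rng = random.Random()
    print(anonymize_name(rng))                   # a name built from the pools above
    print(anonymize_phone("123-555-9358", rng))  # same NNN-NNN-NNNN layout, new digits
```

Because the replacement keeps the shape of the original value, text that expects a name or a phone number in that position still reads naturally after filtering.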
Each filter has its own anonymization process, and anonymization can be enabled or disabled independently for each filter. This means, for example, that phone numbers can be anonymized while social security numbers are simply replaced with a placeholder such as *****.
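The per-filter behavior can be sketched in Python as follows. The FILTERS dictionary, its "anonymize" flag, and the regular expressions are hypothetical and exist only for this example; they do not reflect Philter's actual configuration format or detection logic.

```python
import random
import re

PLACEHOLDER = "*****"

# Hypothetical per-filter settings: each filter has its own pattern and its own
# switch that decides between anonymization and a fixed placeholder.
FILTERS = {
    "phone-number": {"pattern": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"), "anonymize": True},
    "ssn": {"pattern": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "anonymize": False},
}


def random_digits_like(value: str, rng: random.Random) -> str:
    """Keep the layout of value but replace every ASCII digit with a random one."""
    return "".join(
        str(rng.randint(0, 9)) if ch in "0123456789" else ch for ch in value
    )


def filter_text(text: str, rng: random.Random) -> str:
    """Apply every filter in turn, honoring each filter's own anonymize flag."""
    for settings in FILTERS.values():
        if settings["anonymize"]:
            def replace(match, rng=rng):
                return random_digits_like(match.group(0), rng)
        else:
            def replace(match):
                return PLACEHOLDER
        text = settings["pattern"].sub(replace, text)
    return text


if __name__ == "__main__":
    print(filter_text("Call 123-555-9358, SSN 123-45-6789.", random.Random()))
```

In this sketch the phone number comes back as different digits in the same format, while the social security number is replaced by the placeholder.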
Consistent anonymization means that a given value is always replaced with the same anonymized value. For example, if the name “John Smith” is randomly anonymized as “Pete Baker”, all other occurrences of “John Smith” will also be replaced by “Pete Baker”. Consistent anonymization can be applied at the document level or at the context level. When enabled at the document level, “John Smith” will be replaced by “Pete Baker” only within the same document; if “John Smith” occurs in a separate document, it will be anonymized independently of the first document. When enabled at the context level, “John Smith” will be replaced by “Pete Baker” in all documents in the same context.
Consistent anonymization at the context level requires a cache service. If a single instance of Philter is running, its internal cache service is the best choice. If multiple instances of Philter are deployed together (either behind a load balancer for the REST API or for clustered streaming), Philter requires access to a Redis cache service so that all instances share the same replacement values. See Configuration for how to set up the Redis cache.
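The need for a shared cache can be illustrated with a small Python sketch. Here an in-memory object stands in for the external cache service, and the names SharedCache, Instance, and get_or_set are hypothetical. The sketch uses no Redis client and does not show how Philter talks to Redis; it only shows why instances that do not share a cache can disagree.

```python
import random

# Hypothetical replacement pool.
NAMES = ["Pete Baker", "David Jones", "Maria Lopez", "Susan Nguyen"]


class SharedCache:
    """Stand-in for a cache service reachable by every instance."""

    def __init__(self):
        self._store: dict[str, str] = {}

    def get_or_set(self, key: str, candidate: str) -> str:
        # Store the candidate only if the key is absent, then return whatever
        # is stored. The first writer wins; later callers reuse its value.
        return self._store.setdefault(key, candidate)


class Instance:
    """One filtering instance with its own random source."""

    def __init__(self, cache: SharedCache, seed: int):
        self.cache = cache
        self.rng = random.Random(seed)

    def anonymize(self, value: str, context: str) -> str:
        candidate = self.rng.choice(NAMES)
        return self.cache.get_or_set(f"{context}:{value}", candidate)


if __name__ == "__main__":
    shared = SharedCache()
    a, b = Instance(shared, seed=1), Instance(shared, seed=2)
    # Both instances consult the same cache, so they agree on the replacement.
    print(a.anonymize("John Smith", "clinic-a") == b.anonymize("John Smith", "clinic-a"))
```

If each instance were given its own SharedCache object instead, the two calls could return different names. That is the situation a shared cache service avoids when requests for the same context are spread across instances.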