In Philter, anonymization is the process of replacing identified values with random but similar values. For example, the identified name “John Smith” may be replaced with “David Jones”, or the identified phone number 123-555-9358 may be replaced with 842-436-2042. Anonymization is useful when you want to remove PHI/PII from text without changing the meaning of the text. Anonymization can be enabled for a type of PHI/PII in the filter profile by setting the filter strategy to RANDOM_REPLACE. (See Filter Profiles for more information.)
How identified PHI/PII is anonymized depends on its type. Philter tries to generate random but realistic values for each type. For example, Philter replaces an identified first name with a first name randomly selected from an internal predefined list. A VIN is replaced with a randomly generated 17-character VIN that adheres to the VIN standard.
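The idea of type-specific replacement can be illustrated with a minimal Python sketch. This is not Philter's implementation: the type labels, the sample name list, and all function names are hypothetical. The VIN generator assumes that "adheres to the standard" means 17 characters, no letters I, O, or Q, and a valid North American check digit in position 9; real VINs carry further constraints that are ignored here.

```python
import random
import string

# Hypothetical sample list; Philter's internal name list is not shown here.
FIRST_NAMES = ["David", "Maria", "Pete", "Linda", "Omar"]

# VINs never contain the letters I, O, or Q.
VIN_ALPHABET = "ABCDEFGHJKLMNPRSTUVWXYZ0123456789"
# Position weights and letter values used by the check-digit calculation.
VIN_WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]
VIN_LETTER_VALUES = dict(zip(
    "ABCDEFGHJKLMNPRSTUVWXYZ",
    [1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 7, 9, 2, 3, 4, 5, 6, 7, 8, 9],
))


def vin_check_digit(vin):
    """Return the check digit (position 9) for a 17-character VIN."""
    total = 0
    for char, weight in zip(vin, VIN_WEIGHTS):
        value = int(char) if char.isdigit() else VIN_LETTER_VALUES[char]
        total += value * weight
    remainder = total % 11
    return "X" if remainder == 10 else str(remainder)


def random_vin(rng):
    """Generate a random 17-character VIN with a valid check digit."""
    chars = [rng.choice(VIN_ALPHABET) for _ in range(17)]
    # Position 9 has weight 0, so its placeholder does not affect the sum.
    chars[8] = vin_check_digit("".join(chars))
    return "".join(chars)


def random_phone(original, rng):
    """Replace every digit at random, keeping the original punctuation."""
    return "".join(
        rng.choice(string.digits) if c.isdigit() else c for c in original
    )


def anonymize(filter_type, value, rng):
    """Dispatch on a (hypothetical) type label to a type-specific generator."""
    if filter_type == "first-name":
        return rng.choice(FIRST_NAMES)
    if filter_type == "phone-number":
        return random_phone(value, rng)
    if filter_type == "vin":
        return random_vin(rng)
    raise ValueError(f"unsupported type: {filter_type}")


if __name__ == "__main__":
    rng = random.Random()
    print(anonymize("first-name", "John", rng))
    print(anonymize("phone-number", "123-555-9358", rng))
    print(anonymize("vin", "1M8GDM9AXKP042788", rng))
```

Each generator preserves the shape of the original value (a name stays a name, a phone number keeps its punctuation), which is what lets the surrounding text keep its meaning.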
Consistent anonymization is the process of always replacing a given value with the same replacement value. For example, if the name “John Smith” is randomly anonymized as “Pete Baker”, all other occurrences of “John Smith” will also be replaced by “Pete Baker”. Consistent anonymization can be applied at the document level or at the context level. When enabled at the document level, “John Smith” is replaced by “Pete Baker” only within the same document; if “John Smith” occurs in a separate document, it is anonymized differently. When enabled at the context level, “John Smith” is replaced by “Pete Baker” in all documents in the same context.
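The difference between the two levels comes down to which key the replacement is remembered under. The following Python sketch shows one way to model that, assuming a simple in-memory mapping; the class, its parameters, and the name list are hypothetical and do not reflect Philter's internals.

```python
import random

# Hypothetical pool of replacement names.
NAMES = ["Pete Baker", "David Jones", "Maria Lopez", "Linda Chen"]


class ConsistentAnonymizer:
    """Remembers replacements per document or per context."""

    def __init__(self, scope, rng=None):
        if scope not in ("document", "context"):
            raise ValueError("scope must be 'document' or 'context'")
        self.scope = scope
        self.rng = rng if rng is not None else random.Random()
        self.replacements = {}

    def anonymize(self, value, context, document_id):
        # Document scope: the document ID is part of the key, so each
        # document gets its own independently chosen replacement.
        # Context scope: only the context is part of the key, so every
        # document in that context shares the replacement.
        if self.scope == "document":
            key = (context, document_id, value)
        else:
            key = (context, value)
        if key not in self.replacements:
            self.replacements[key] = self.rng.choice(NAMES)
        return self.replacements[key]


if __name__ == "__main__":
    per_context = ConsistentAnonymizer("context")
    first = per_context.anonymize("John Smith", "clinic-a", "doc-1")
    second = per_context.anonymize("John Smith", "clinic-a", "doc-2")
    print(first == second)  # same context, so the replacement is shared
```

With document scope, the second call above would draw a fresh replacement for "doc-2". Because the draw is random, it can occasionally land on the same name, but it is chosen independently of "doc-1".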
Consistent anonymization at the context level requires a cache service. If a single instance of Philter is running, its internal cache service is the best choice. If multiple instances of Philter are deployed together, Philter requires access to a Redis cache service. See Configuration for how to set up the Redis cache.
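Why multiple instances need a shared cache can be seen in a small Python sketch. It assumes the cache exposes a single "return the stored replacement, or store this candidate" operation; that interface, the class names, and the replacement format are all hypothetical. An in-memory class stands in for the cache here, and a Redis-backed class could expose the same method (not shown).

```python
import random
from typing import Protocol


class ReplacementCache(Protocol):
    """Hypothetical cache interface: first writer wins for a given key."""

    def get_or_set(self, key: str, candidate: str) -> str: ...


class InMemoryCache:
    """Stand-in for a cache that lives inside one process."""

    def __init__(self):
        self._store = {}

    def get_or_set(self, key, candidate):
        # Keep the existing replacement if there is one, else store ours.
        return self._store.setdefault(key, candidate)


class AnonymizerInstance:
    """Models one running instance that anonymizes names per context."""

    def __init__(self, cache, seed=None):
        self.cache = cache
        self.rng = random.Random(seed)

    def anonymize_name(self, context, name):
        # Hypothetical replacement format, chosen only for illustration.
        candidate = f"Person-{self.rng.randrange(10**6):06d}"
        return self.cache.get_or_set(f"{context}:{name}", candidate)


if __name__ == "__main__":
    shared = InMemoryCache()
    instance_a = AnonymizerInstance(shared)
    instance_b = AnonymizerInstance(shared)
    # Both instances consult the same cache, so they agree.
    print(instance_a.anonymize_name("clinic-a", "John Smith"))
    print(instance_b.anonymize_name("clinic-a", "John Smith"))
```

If each instance had its own private cache instead, each would draw and remember its own replacement for "John Smith", and documents in the same context would no longer be anonymized consistently across instances.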