How to Anonymize Text Using Presidio in Python
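Install Presidio and the spaCy English model that the analyzer loads by default:
pip install presidio-analyzer presidio-anonymizer
python -m spacy download en_core_web_lg
Import the libraries and create the engines: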
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine
analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()
Define the anonymization function:
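def anonymize_text(text, _analyzer, _anonymizer):
    # Detect the selected PII entity types in the English text.
    results = _analyzer.analyze(
        text=text,
        entities=["PHONE_NUMBER", "EMAIL_ADDRESS", "PERSON"],
        language='en'
    )
    # Replace every detected span; the engine returns a result object.
    anonymized_text = _anonymizer.anonymize(
        text=text,
        analyzer_results=results
    )
    # Return only the anonymized string.
    return anonymized_text.text
Apply the function to the conversation column of the pandas DataFrame: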
convo_to_eval_sample['anonymized_convo'] = convo_to_eval_sample['conversation'].apply(
    lambda text: anonymize_text(text, analyzer, anonymizer)
)
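What the new column contains:
When no operators are passed, as in the function above, the anonymizer replaces each detected entity with a placeholder built from its entity type, such as <PERSON> or <PHONE_NUMBER>. The snippet below is a rough standard-library sketch of that replacement step, included only to show the shape of the output. It is not Presidio's implementation: Span and replace_spans are hypothetical names, and the offsets are written by hand rather than produced by the analyzer.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    # Hypothetical stand-in for one analyzer result:
    # an entity label plus start/end character offsets.
    entity_type: str
    start: int
    end: int


def replace_spans(text: str, spans: List[Span]) -> str:
    """Replace each span with <ENTITY_TYPE>.

    Spans are processed from right to left so that the offsets of
    spans earlier in the text stay valid after each replacement.
    Assumes the spans do not overlap.
    """
    for span in sorted(spans, key=lambda s: s.start, reverse=True):
        text = text[:span.start] + f"<{span.entity_type}>" + text[span.end:]
    return text


sample = "Call Ann at 555-0100."
# Hand-written offsets for illustration only.
spans = [Span("PERSON", 5, 8), Span("PHONE_NUMBER", 12, 20)]
print(replace_spans(sample, spans))  # Call <PERSON> at <PHONE_NUMBER>.
```

In the real pipeline the spans come from analyzer.analyze, so which parts of a conversation get replaced depends on what the recognizers detect.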