sanityze.spotters
Module Contents
Classes
Spotter | The Spotter interface to be implemented
CreditCardSpotter | The Credit Card Spotter Subclass
EmailSpotter | The Email Spotter Subclass
- class sanityze.spotters.Spotter(uid: str, hashSpotted=False)
The Spotter interface to be implemented
- uid
uid of the spotter
- Type:
str
- hashSpotted
Whether to hash the spotted sensitive information (True) or replace it with a default value (False). Defaults to False.
- Type:
bool, optional
- process(text)
Process the text according to hashSpotted: if it is True, replace the spotted information with a hash; otherwise, replace it with a default value.
Examples
Spotter is meant to be instantiated through a subclass, so the parent class has no examples.
- getSpotterUID() → str
Get the spotter uid
- Returns:
self.uid – the spotter uid
- Return type:
str
Examples
>>> sub_spotter.getSpotterUID()
"<sub class spotter UID>"
- isHashSpotted() → bool
Get the value of hashSpotted
- Returns:
self.hashSpotted – the truth value of hashSpotted
- Return type:
bool
Examples
>>> sub_spotter.isHashSpotted()
True
- process(text: str) → str
Process the given text: if hashSpotted is True, replace the spotted text with a hash; otherwise, replace it with a default value.
- Parameters:
text (str) – The text to scan and modify
- Returns:
new_text – the text with the spotted information replaced by a hash or a default value
- Return type:
str
Examples
>>> df = pd.DataFrame(data={
...     'product_name': ['laptop', 'printer foo@gaga.com', 'tablet', 'desk 5555 5555 5555 4444', 'chair'],
...     'price': [1200, 150, 300, 450, 200]})
>>> c = Cleanser()
>>> c.clean(df, verbose=False)
               product_name  price
0                    laptop   1200
1        printer EMAILADDRS    150
2                    tablet    300
3  desk 5555 5555 5555 4444    450
4                     chair    200
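The interface described above can be illustrated with a minimal sketch. It does not import sanityze: `SketchSpotter` is a stand-in that only mirrors the documented method names, and `DigitRunSpotter`, its pattern, the use of SHA-256, and the choice to use the uid as the default replacement are all assumptions made for illustration, not the library's implementation.

```python
import hashlib
import re


class SketchSpotter:
    """Stand-in mirroring the documented interface; not the library class."""

    def __init__(self, uid: str, hashSpotted: bool = False):
        self.uid = uid
        self.hashSpotted = hashSpotted

    def getSpotterUID(self) -> str:
        return self.uid

    def isHashSpotted(self) -> bool:
        return self.hashSpotted

    def process(self, text: str) -> str:
        # The parent only defines the contract; subclasses do the spotting.
        raise NotImplementedError


class DigitRunSpotter(SketchSpotter):
    """Hypothetical subclass that spots runs of six or more digits."""

    PATTERN = re.compile(r"\d{6,}")

    def process(self, text: str) -> str:
        def replacement(match):
            if self.isHashSpotted():
                # SHA-256 is an arbitrary choice for this sketch.
                return hashlib.sha256(match.group().encode("utf-8")).hexdigest()
            # Assumed default: the spotter uid stands in for the spotted text.
            return self.getSpotterUID()

        return self.PATTERN.sub(replacement, text)


masked = DigitRunSpotter("DIGITS").process("order 12345678 shipped")
hashed = DigitRunSpotter("DIGITS", hashSpotted=True).process("order 12345678 shipped")
print(masked)
print(hashed)
```

Keeping the hash-or-replace decision inside `process` means callers only ever need the three documented methods, whichever subclass they hold.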
- class sanityze.spotters.CreditCardSpotter(uid: str, hashSpotted=False)
Bases:
Spotter
The Credit Card Spotter Subclass
- uid
uid of the spotter, “CREDITCARD”
- Type:
str
- hashSpotted
Whether to hash the spotted sensitive information (True) or replace it with a default value (False). Defaults to False.
- Type:
bool, optional
- isHashSpotted()
Return whether hashSpotted is True or False
- process(text)
Process the text according to hashSpotted: if it is True, replace the spotted credit card number with a hash; otherwise, replace it with a default value.
Examples
>>> CreditCardSpotter("CREDITCARDS",True)
<sanityze.spotters.CreditCardSpotter object at 0x000001207F7B5880>
- getSpotterUID() → str
Get the credit card spotter uid
- Returns:
“CREDITCARD” – a fixed str value for CreditCardSpotter
- Return type:
str
Examples
>>> cc = CreditCardSpotter("CREDITCARDS",True)
>>> cc.getSpotterUID()
CREDITCARD
- process(text: str) → str
Process the given text: if hashSpotted is True, replace the spotted credit card number with a hash; otherwise, replace it with a default value.
- Parameters:
text (str) – The text to scan and modify
- Returns:
new_text – the text with credit card number replaced by a hash or the default string value
- Return type:
str
Examples
>>> cc = CreditCardSpotter("CREDITCARDS", False)
>>> cc.process("4556129404313766")
CREDITCARD
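The two modes of `process` can be sketched without the library, to show what "hash" versus "default value" might look like in practice. Everything below is an assumption for illustration: the helper `mask_card_numbers`, the pattern (exactly 16 contiguous digits), and SHA-256 as the hash are not taken from sanityze. Only the replacement string "CREDITCARD" comes from the example above.

```python
import hashlib
import re

# Simplifying assumption: a card number is exactly 16 contiguous digits.
CARD_PATTERN = re.compile(r"\b\d{16}\b")


def mask_card_numbers(text: str, hash_spotted: bool = False) -> str:
    """Replace each matched number with a hash or a fixed placeholder."""

    def replacement(match):
        if hash_spotted:
            # SHA-256 is an arbitrary choice for this sketch.
            return hashlib.sha256(match.group().encode("utf-8")).hexdigest()
        return "CREDITCARD"

    return CARD_PATTERN.sub(replacement, text)


replaced = mask_card_numbers("4556129404313766")
hashed_once = mask_card_numbers("paid with 4556129404313766", hash_spotted=True)
hashed_again = mask_card_numbers("paid with 4556129404313766", hash_spotted=True)
print(replaced)
print(hashed_once == hashed_again)
```

Because a hash is deterministic, the same number always maps to the same token, so hashed values can still be grouped or joined on, whereas the fixed placeholder discards that information. Under this sketch's pattern, digits grouped with spaces are left untouched.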
- class sanityze.spotters.EmailSpotter(uid: str, hashSpotted=False)
Bases:
Spotter
The Email Spotter Subclass
- uid
uid of the spotter, “EMAILADDRS”
- Type:
str
- hashSpotted
Whether to hash the spotted sensitive information (True) or replace it with a default value (False). Defaults to False.
- Type:
bool, optional
- isHashSpotted()
Return whether hashSpotted is True or False
- process(text)
Process the text according to hashSpotted: if it is True, replace the spotted email with a hash; otherwise, replace it with a default value.
- getSpotterUID() → str
Get the email spotter uid
- Returns:
“EMAILADDRS” – a fixed str value for EmailSpotter
- Return type:
str
Examples
>>> ee = EmailSpotter("EMAILS", False)
>>> ee.getSpotterUID()
EMAILADDRS
- process(text: str) → str
Process the given text: if hashSpotted is True, replace the spotted email with a hash; otherwise, replace it with a default value.
- Parameters:
text (str) – The text to scan and modify
- Returns:
new_text – the text with email replaced by a hash or the default string value
- Return type:
str
Examples
>>> ee = EmailSpotter("EMAILS", False)
>>> ee.process("abcd1234@gmail.com")
EMAILADDRS
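As a rough illustration of the replace mode, the sketch below applies an email pattern to a list of strings such as the values of a text column. It does not use sanityze: the helper `mask_emails`, the deliberately simple pattern, and SHA-256 are assumptions, and real email matching is considerably more involved. Only the placeholder "EMAILADDRS" and the sample strings come from the examples in this page.

```python
import hashlib
import re

# Simplified pattern (an assumption): local part, "@", then a dotted domain.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def mask_emails(text: str, hash_spotted: bool = False) -> str:
    """Replace each matched address with a hash or a fixed placeholder."""

    def replacement(match):
        if hash_spotted:
            # SHA-256 is an arbitrary choice for this sketch.
            return hashlib.sha256(match.group().encode("utf-8")).hexdigest()
        return "EMAILADDRS"

    return EMAIL_PATTERN.sub(replacement, text)


column = ["laptop", "printer foo@gaga.com", "tablet"]
cleaned = [mask_emails(value) for value in column]
single = mask_emails("abcd1234@gmail.com")
print(cleaned)
print(single)
```

Text around the address is preserved, so only the sensitive span changes and the rest of each value stays readable.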