How can I use the regex pattern with the exclude function for filtering out unwanted data from a dataset?

Asked by DorineHankey in Salesforce, asked on Apr 18, 2024

I am working on a text-processing project in which I need to extract specific information from large datasets using regular expressions. How can I use a regex pattern to exclude unwanted data and capture only the desired information?

Answered by Deepa bhawana

In the context of Salesforce, here is one approach.

Consider a scenario where you have a text dataset that contains email addresses and you want to extract only the addresses that do not end with ".gov". Regular expressions have no dedicated exclude function, so the exclusion is expressed with a negative lookahead assertion inside the pattern. Here is how you can do this in Java:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexExample {
    public static void main(String[] args) {
        String text = "Emails: john.doe@example.com, jane_smith@gmail.com, admin@example.gov";

        // Define the regex pattern to match email addresses not ending with ".gov".
        // (?!gov\b) is a negative lookahead: it rejects the match when the text
        // after the dot is exactly "gov". Backslashes are doubled inside a Java string.
        String regexPattern = "\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.(?!gov\\b)[A-Za-z]{2,}\\b";

        // Compile the pattern and create a matcher object
        Pattern pattern = Pattern.compile(regexPattern);
        Matcher matcher = pattern.matcher(text);

        // Find and print matching email addresses
        while (matcher.find()) {
            System.out.println("Match: " + matcher.group());
        }
    }
}

Here is the same example in Python:

import re

# Function to extract email addresses not ending with ".gov" from a given text
def extract_emails(input_text):
    # Define the regex pattern to match email addresses not ending with ".gov".
    # The dot before the lookahead is escaped so it matches a literal "."; an
    # unescaped "." matches any character and lets ".gov" addresses through.
    regex_pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.(?!gov\b)[A-Za-z]{2,}\b'

    # Find all matching email addresses
    matches = re.findall(regex_pattern, input_text)
    return matches

# Read text from a file (e.g., emails.txt)
def read_text_from_file(file_path):
    with open(file_path, 'r', encoding='utf-8') as file:
        text = file.read()
    return text

# Example usage
if __name__ == "__main__":
    # Read text from a file (you can replace 'emails.txt' with your file path)
    input_text = read_text_from_file('emails.txt')

    # Extract email addresses not ending with ".gov"
    extracted_emails = extract_emails(input_text)

    # Print the extracted email addresses
    print("Email addresses not ending with '.gov':")
    for email in extracted_emails:
        print(email)
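One caveat with a lookahead placed after a single dot: it only inspects the label that follows that dot. On a multi-level domain such as the hypothetical user@mail.example.gov, the regex engine can backtrack and return the truncated user@mail.example. The following is a minimal sketch of a two-step alternative, not a definitive implementation: match every address with a general pattern first, then drop the ones that end with the unwanted suffix. The function name extract_non_gov_emails, the excluded_suffix parameter, and the sample addresses are assumptions made for illustration.

```python
import re

# Lookahead-based pattern, shown here only to demonstrate the caveat.
LOOKAHEAD_RE = re.compile(
    r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.(?!gov\b)[A-Za-z]{2,}\b"
)

# General email pattern with no exclusion logic
# (a simplification, not a full RFC 5322 matcher).
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")


def extract_non_gov_emails(text, excluded_suffix=".gov"):
    """Return addresses found in text that do not end with excluded_suffix."""
    candidates = EMAIL_RE.findall(text)
    # Compare in lowercase so that ".GOV" is excluded as well.
    return [e for e in candidates if not e.lower().endswith(excluded_suffix)]


# Hypothetical sample data, for illustration only.
sample = (
    "john.doe@example.com, user@mail.example.gov, "
    "admin@example.GOV, jane@agency.gov.uk"
)

# The lookahead pattern returns a truncated address for the multi-level domain.
print(LOOKAHEAD_RE.findall("user@mail.example.gov"))  # ['user@mail.example']

# The two-step approach keeps only complete, non-.gov addresses.
print(extract_non_gov_emails(sample))  # ['john.doe@example.com', 'jane@agency.gov.uk']
```

Separating matching from filtering keeps the pattern simple and moves the exclusion rule into ordinary string logic, which is usually easier to test and extend.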

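Here is an example using HTML and JavaScript:

<title>Email Extraction</title>
<script>
    function extractEmails() {
        var inputText = document.getElementById('inputText').value;
        var regexPattern = /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.(?!gov\b)[A-Za-z]{2,}\b/g;
        var extractedEmails = inputText.match(regexPattern);

        // Display extracted email addresses
        var outputDiv = document.getElementById('outputDiv');
        outputDiv.innerHTML = "Email addresses not ending with '.gov':";
        if (extractedEmails && extractedEmails.length > 0) {
            for (var i = 0; i < extractedEmails.length; i++) {
                // The original loop body was lost; appending one address
                // per line is an assumed reconstruction.
                outputDiv.innerHTML += "<br>" + extractedEmails[i];
            }
        }
    }
</script>
<h1>Email Extraction</h1>
<label for="inputText">Enter text containing email addresses:</label>
<!-- The form elements below were lost from the original page; they are
     assumed from the ids that the script looks up. -->
<textarea id="inputText"></textarea>
<button onclick="extractEmails()">Extract</button>
<div id="outputDiv"></div>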
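The question also asks about excluding unwanted records from a dataset, not only extracting matches. Since regular expressions have no built-in exclude function, one common approach is to test each record against an exclusion pattern and keep the records that do not match. The following is a minimal sketch under stated assumptions: the helper name exclude_matching, the GOV_DOMAIN pattern, and the sample records are hypothetical and chosen only for illustration.

```python
import re

# Matches an address whose domain ends in ".gov".
# The negative lookahead requires that the domain does not continue after
# "gov" (so "agency.gov.uk" and "government" are not treated as ".gov").
GOV_DOMAIN = r"@[A-Za-z0-9.-]+\.gov(?![.-]?[A-Za-z0-9])"


def exclude_matching(records, exclude_pattern):
    """Keep only the records that do not match exclude_pattern anywhere."""
    exclude_re = re.compile(exclude_pattern, re.IGNORECASE)
    return [r for r in records if not exclude_re.search(r)]


# Hypothetical records, for illustration only.
records = [
    "john.doe@example.com,active",
    "admin@example.gov,active",
    "jane_smith@gmail.com,inactive",
]

kept = exclude_matching(records, GOV_DOMAIN)
print(kept)  # ['john.doe@example.com,active', 'jane_smith@gmail.com,inactive']
```

Here the pattern describes what to throw away rather than what to keep, which is often simpler than encoding the exclusion as a lookahead inside a larger extraction pattern.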


            