In your example the strings end the same way, but how do you decide that two strings are equal enough?
Do they count as equal when only one word differs, or does the rule vary from string to string?
Once you have determined the rule for "equal enough", you can set up a function that finds such strings through searching and string manipulation. You may then be able to relate the variants of an item to the base item. If the pattern is consistent, the function could, for example, remove the first word and so consolidate automatically. If there is little consistency, manual action might still be needed.
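As a rough sketch of the "remove the first word" rule, assuming every variant differs from the base item only in its first word, something like this could work. The class and method names are made up for illustration.

```java
// Hypothetical helper: assumes each variant differs only in its first word.
final class FirstWordStripper {

    // Returns the text after the first space-separated word,
    // or the trimmed input unchanged when it holds a single word.
    static String stripFirstWord(String s) {
        String trimmed = s.trim();
        int firstSpace = trimmed.indexOf(' ');
        if (firstSpace < 0) {
            return trimmed;
        }
        return trimmed.substring(firstSpace + 1).trim();
    }

    public static void main(String[] args) {
        // Both calls print: power cable - high voltage
        System.out.println(stripFirstWord("Add power cable - high voltage"));
        System.out.println(stripFirstWord("Renew power cable - high voltage"));
    }
}
```

This only holds up as long as the first word really is the only difference; any extra word in the middle of the string survives untouched.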
With Java you can find the matching words in two strings and then decide which action to take. The example below could help you get started:
import java.io.*;

class CommonWords
{
    public static void main(String args[]) throws IOException
    {
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        int i, j, l1, l2, p, x, y;
        String str1, str2;
        char ch;
        System.out.print("Enter two sentences terminated by either a '?', '.' or '!' : ");
        str1 = br.readLine();
        str2 = br.readLine();
        l1 = str1.length();
        l2 = str2.length();
        // A sentence of n characters holds at most n words, so n slots are enough
        String s1[] = new String[l1];
        x = 0;
        p = 0;
        // Store all the words of the first sentence in a string array
        for (i = 0; i < l1; i++)
        {
            ch = str1.charAt(i);
            if (ch == ' ' || ch == '?' || ch == '.' || ch == '!') {
                // Skip empty words caused by consecutive delimiters
                if (i > p) {
                    s1[x++] = str1.substring(p, i);
                }
                p = i + 1;
            }
        }
        // Size this array by the second sentence (l2), not the first
        String s2[] = new String[l2];
        y = 0;
        p = 0;
        // Store all the words of the second sentence in a string array
        for (i = 0; i < l2; i++)
        {
            ch = str2.charAt(i);
            if (ch == ' ' || ch == '?' || ch == '.' || ch == '!') {
                if (i > p) {
                    s2[y++] = str2.substring(p, i);
                }
                p = i + 1;
            }
        }
        // Now compare the words stored in the two arrays
        for (i = 0; i < x; i++) {
            for (j = 0; j < y; j++) {
                // If a match is found, print the word and store a blank space to avoid repetition
                if (s1[i].equalsIgnoreCase(s2[j]) && !s2[j].equals(" ")) {
                    System.out.println(s1[i]);
                    s2[j] = " ";
                }
            }
        }
    }
}
You could wrap this in a Java action: https://karussell.wordpress.com/2011/04/14/longest-common-substring-algorithm-in-java/
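Going by its URL, the linked post is about the longest common substring problem. Below is a minimal sketch of the standard dynamic-programming approach to that problem. It is not necessarily the implementation from the linked post, and the class and method names are my own.

```java
// Sketch of the textbook dynamic-programming longest common substring.
final class LongestCommonSubstring {

    // Returns the longest run of characters that occurs in both a and b.
    static String longestCommonSubstring(String a, String b) {
        // len[i][j] = length of the common run ending at a[i-1] and b[j-1]
        int[][] len = new int[a.length() + 1][b.length() + 1];
        int best = 0;
        int endInA = 0;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    len[i][j] = len[i - 1][j - 1] + 1;
                    if (len[i][j] > best) {
                        best = len[i][j];
                        endInA = i;
                    }
                }
            }
        }
        return a.substring(endInA - best, endInA);
    }

    public static void main(String[] args) {
        // Prints: en Kabel (LS)
        System.out.println(longestCommonSubstring("Leggen Kabel (LS)", "Verwijderen Kabel (LS)"));
    }
}
```

Note that this matches characters, not words: the two verbs happen to share the ending "en", so the result starts in the middle of a word. You would probably still want to trim the result to a word boundary before using it as the consolidated name.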
I highly recommend creating a mapping table. It is a laborious task, but simple, traceable and predictable:
Add power cable - high voltage → Output: power cable - high voltage
Renew power cable - high voltage → Output: power cable - high voltage
Leggen Kabel (LS) → Output: Kabel (LS)
Verwijderen Kabel (LS) → Output: Kabel (LS)
Vervangen Kabel (LS) → Output: Kabel (LS)
Verwijderen Kabel groen (LS) → Output: Kabel (LS)
Vervangen Kabel met een leuk patroon (LS) → Output: Kabel (LS)
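To make the idea concrete, here is a minimal Java sketch of such a mapping table as a plain lookup. In Mendix you would more likely keep it in an entity; the `ItemMapping` class and `consolidate` method are hypothetical names, and only a few of the rows above are filled in.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup table: every known input maps to exactly one output.
final class ItemMapping {

    private static final Map<String, String> TABLE = new LinkedHashMap<>();

    static {
        TABLE.put("Add power cable - high voltage", "power cable - high voltage");
        TABLE.put("Renew power cable - high voltage", "power cable - high voltage");
        TABLE.put("Leggen Kabel (LS)", "Kabel (LS)");
        TABLE.put("Verwijderen Kabel groen (LS)", "Kabel (LS)");
        TABLE.put("Vervangen Kabel met een leuk patroon (LS)", "Kabel (LS)");
    }

    // Returns the mapped output, or fails loudly instead of guessing.
    static String consolidate(String item) {
        String output = TABLE.get(item);
        if (output == null) {
            throw new IllegalArgumentException("No mapping for: " + item);
        }
        return output;
    }
}
```

Throwing on an unknown item is a deliberate choice in this sketch: an unmapped string gets flagged for manual review rather than being grouped silently.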
If you write a custom function for this, then it will be hard to ensure that it does not produce an unexpected result for one or two of the objects in your large dataset. The fact that you will be processing a large number of objects should not stop you from using a mapping table, since you need to be certain of your results. You don’t want an object grouped under the wrong WBS element, with unknown financial consequences in SAP.
If you decide to create a custom function, you should extensively test it.
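One way to approach that testing, sketched below, is to run the function against a set of inputs with known expected outputs and list every mismatch. The rule under test here (drop the first word) and all names are only assumptions for the sake of the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical check: compares a rule-based function with expected outputs.
final class ConsolidationCheck {

    // Example rule under test: drop the first word.
    static String stripFirstWord(String s) {
        int firstSpace = s.indexOf(' ');
        return firstSpace < 0 ? s : s.substring(firstSpace + 1);
    }

    // Returns every input whose rule-based output differs from the expected one.
    static List<String> mismatches(Map<String, String> expected) {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, String> e : expected.entrySet()) {
            if (!stripFirstWord(e.getKey()).equals(e.getValue())) {
                failures.add(e.getKey());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Map<String, String> expected = new LinkedHashMap<>();
        expected.put("Leggen Kabel (LS)", "Kabel (LS)");
        expected.put("Verwijderen Kabel groen (LS)", "Kabel (LS)");
        // Prints: [Verwijderen Kabel groen (LS)]
        System.out.println(mismatches(expected));
    }
}
```

The second row fails because the rule leaves "groen" in place, which is exactly the kind of surprise you want to catch before the data reaches SAP.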
I’d try something like this: create a microflow that receives a list of objects with your string attribute. Use the Java action StringSplit (in CommunityCommons) with a space as delimiter on the first object. You then have a list of SplitItems. Then loop through the other objects, apply the same StringSplit to each, and in a nested loop go through your first list of SplitItems to check whether each word also occurs in the new list. If not, remove it from the first list. In the end the first list should contain only the SplitItems (words) that occur in all input objects. Then concatenate them back into a string.
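The same logic could be sketched in plain Java roughly as follows. This uses standard collections instead of the CommunityCommons StringSplit action and SplitItems, and the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of the microflow idea: keep only the words of the
// first string that also occur in every other string.
final class CommonWordsAcrossItems {

    static String commonWords(List<String> items) {
        if (items.isEmpty()) {
            return "";
        }
        // Split the first string on whitespace; this list gets narrowed down.
        List<String> common = new ArrayList<>(Arrays.asList(items.get(0).trim().split("\\s+")));
        for (String item : items.subList(1, items.size())) {
            List<String> words = Arrays.asList(item.trim().split("\\s+"));
            // Drop every word that does not occur in the current string.
            common.retainAll(words);
        }
        // Concatenate the surviving words, in their original order.
        return String.join(" ", common);
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList(
                "Leggen Kabel (LS)",
                "Verwijderen Kabel groen (LS)",
                "Vervangen Kabel met een leuk patroon (LS)");
        // Prints: Kabel (LS)
        System.out.println(commonWords(items));
    }
}
```

The comparison here is case-sensitive, so "kabel" and "Kabel" would count as different words; normalise the case first if that matters for your data.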