Senior Member
Join Date: Nov 2011
Location: South Riding, VA
Posts: 841
|
Is there an intrinsic function to count how many times a substring appears in a given string?
|
#1 |
Senior Member
Volunteer Data File Contributor
Join Date: Jan 2010
Location: Chicago, IL (USA)
Posts: 10,729
|
Not that I am aware of. I guess you could "simulate" this by using the pos() function in a loop, cutting the string down in size with mid() after each match until pos() returns -1.
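A rough sketch of that pos()/mid() loop, written in Python rather than Hero Lab script purely to show the logic. The name `count_by_cutting` is made up for illustration, and `str.find` and slicing stand in for pos() and mid(); a real Hero Lab version would need to be checked against the scripting reference.

```python
def count_by_cutting(text: str, search: str) -> int:
    """Count non-overlapping occurrences by repeatedly trimming the string."""
    if not text or not search:
        return 0
    count = 0
    remaining = text
    while True:
        hit = remaining.find(search)  # plays the role of pos(): -1 when not found
        if hit < 0:
            return count
        count += 1
        # plays the role of mid(): keep only what follows the match
        remaining = remaining[hit + len(search):]
```

Each pass copies the tail of the string, which is where the extra cost comes from on long inputs.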
Have to say I think that would be "really" high CPU, as HL does not do great with string calculations... |
#2 |
Senior Member
Join Date: Nov 2011
Location: South Riding, VA
Posts: 841
|
That is what I was afraid of.
I am looking at whether there is a feasible way to implement something like Boyer-Moore (which is roughly linear, O(n + m), in its refined variants and, I think, a standard for literal substring searches) or other fast search algorithms using the scripting language. The one saving grace is that in HL we typically do not have long strings to search through. |
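For reference, here is a minimal sketch of the Boyer-Moore-Horspool simplification (not full Boyer-Moore), in Python rather than Hero Lab script. The function name `horspool_count` is hypothetical, and whether the shift table could be built cheaply in the scripting language is an open question.

```python
def horspool_count(text: str, pattern: str) -> int:
    """Count non-overlapping matches using the Boyer-Moore-Horspool shift rule."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return 0
    # Shift table: for every pattern character except the last one, the distance
    # from its last occurrence to the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    count = 0
    i = 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            count += 1
            i += m  # skip past the match so matches do not overlap
        else:
            # Jump based on the text character aligned with the pattern's end;
            # characters not in the table allow a full jump of m.
            i += shift.get(text[i + m - 1], m)
    return count
```

With the short strings typical in HL, the table setup may cost more than it saves, so a plain left-to-right scan could be just as good in practice.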
#3 |
Senior Member
Lone Wolf Staff
Join Date: May 2005
Posts: 13,213
|
See if you can resolve this by assigning tags to represent whatever you're testing for, and then counting the tags.
|
#4 |
Senior Member
Join Date: Nov 2011
Location: South Riding, VA
Posts: 841
|
I don't think tags would work here. I am looking to parse a string with some delimiter between items and convert that into a properly formatted list ending with the proper final conjunction, with the Oxford comma if there are more than two items.
I got something working but I am trying to optimize it. I will post the code when I am mostly happy with it. |
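To make the target output concrete, here is a small Python sketch of the formatting rule (not the Hero Lab code itself). The name `format_list` and its parameters are made up for illustration.

```python
def format_list(items_str: str, delim: str = ", ", andor: str = "and") -> str:
    """Turn a delimited string into a list ending in a conjunction."""
    items = [part for part in items_str.split(delim) if part]
    if len(items) < 2:
        return items_str  # nothing to join, leave the input alone
    if len(items) == 2:
        return f"{items[0]} {andor} {items[1]}"  # two items: no comma
    # three or more items: commas between all, Oxford comma before the conjunction
    return ", ".join(items[:-1]) + f", {andor} {items[-1]}"
```

So "Bluff, Diplomacy" would come out as "Bluff and Diplomacy", and "Bluff, Diplomacy, Stealth" as "Bluff, Diplomacy, and Stealth".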
#5 |
Senior Member
Join Date: Nov 2011
Location: South Riding, VA
Posts: 841
|
Ok, I did find a way to do it intrinsically, but it seemed to run slow, and it did not handle cases where the search string is made up of repeated characters very well. So... I went with a workaround that requires passing in the number of things that make up the list instead of figuring it out from the string itself, since these strings will be generated with things like tagname[] or foreach loops.
Here is what I have for parsing the list:
Code:
~ this procedure is used to generate a formatted list with and/or at the end
~ from an inputted delimited string
~
~ Variables
~
~ v_string is our input string which is changed by this procedure
~ v_delim is our list delimiter, unless specified we assume ", "
~ v_andor is the final conjunction used, we assume "and" if not specified
~ v_number is the number of things in our list
~
var v_string as string
var v_delim as string
var v_number as number
var v_andor as string
~
~ Internal variables
~
~ x_len - length of v_string
~ x_lastpos - last position of delimiter
~ x_frontend - after we split the string, this is the front piece
~ x_backend - after we split the string, this is the back piece
~
var x_len as number
var x_lastpos as number
var x_frontend as string
var x_backend as string

~ if we do not have a string, get out
doneif (empty(v_string) <> 0)

~ also if v_number is less than 2, get out
doneif (v_number < 2)

~ set defaults
if (empty(v_delim) <> 0) then
  v_delim = ", "
  endif
if (empty(v_andor) <> 0) then
  v_andor = "and"
  endif

~ set length and position variables
x_len = length(v_string)
x_lastpos = lastpos(v_string,v_delim)

~ if the delimiter does not appear in the string, there is nothing to split
doneif (x_lastpos < 0)

~ split our string around the last delimiter
x_frontend = left(v_string,x_lastpos)
x_backend = right(v_string,x_len - x_lastpos - length(v_delim))

~ two items get no comma, more than two get the Oxford comma
if (v_number = 2) then
  v_string = x_frontend & " " & v_andor & " " & x_backend
elseif (v_number > 2) then
  v_string = x_frontend & ", " & v_andor & " " & x_backend
  endif
Code:
~ this procedure counts how many times a given substring appears in a given string
~
~ VARIABLES
~
~ v_string is our string we are searching through
~ v_search is our substring
~ v_count is how many times v_search appears in v_string
~
var v_string as string
var v_search as string
var v_count as number

~ if we don't have either, get out
doneif (empty(v_string) <> 0)
doneif (empty(v_search) <> 0)
~
~ ALGORITHM
~
~ String S is of length n
~ Search string s is of length m
~ current search string C(i) is the piece of S starting at position i of length m
~ we search left to right taking C piecewise.
~ if we match we increment the count and take the next search string, which is C(i+m)
~ if we do not match we shift by 1 and take C(i+1)
~ this means the last possible position we could have for i is n-m
var x_n as number
var x_m as number
var x_C as string
var x_endPos as number
var x_ii as number

x_n = length(v_string)
x_m = length(v_search)

~ our search string cannot be longer than our starting string
doneif (x_m > x_n)

x_endPos = x_n - x_m
x_ii = 0

while (x_ii <= x_endPos)
  x_C = mid(v_string,x_ii,x_m)
  debug "pos: " & x_ii & " - " & x_C
  if (compare(x_C,v_search) = 0) then
    v_count += 1
    x_ii += x_m
  else
    x_ii += 1
    endif
  loop
Last edited by frumple; April 9th, 2016 at 12:39 PM. |
#6 |