I recently had to analyze URL path elements and check each word in them. However, I ran into an issue: I needed to check each camel-case word separately. For this, I've created a function that splits a string into its camel-case words.
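import (
	"unicode"
	"unicode/utf8"
)

// SplitCamelWords splits segment into camel-case words. A non-letter
// character (digit, space, ...) ends the current word and is kept as a
// prefix of the next one; a trailing run of non-letters is dropped.
func SplitCamelWords(segment string) []string {
	var words []string
	var word string
	var prevCharLetter bool
	var prevCharUpper bool
	inUpperWord := false // true while inside a run of two or more upper-case letters

	for charIndex, currentChar := range segment {
		currCharUpper := unicode.IsUpper(currentChar)
		currCharLetter := unicode.IsLetter(currentChar)

		if charIndex > 0 {
			if currCharLetter {
				if prevCharLetter {
					if prevCharUpper {
						if currCharUpper {
							inUpperWord = true
						} else if inUpperWord {
							// An upper-case run ended: its last letter starts
							// the next word, e.g. "OFLove" -> "OF", "Love".
							_, size := utf8.DecodeLastRuneInString(word)
							split := len(word) - size
							words = append(words, word[:split])
							word = word[split:]
							inUpperWord = false
						}
					} else if currCharUpper {
						// Lower case followed by upper case starts a new word.
						words = append(words, word)
						word = ""
					} else {
						inUpperWord = false
					}
				}
			} else {
				// A non-letter ends the current word.
				inUpperWord = false
				if prevCharLetter {
					words = append(words, word)
					word = ""
				}
			}
		}

		prevCharUpper = currCharUpper
		prevCharLetter = currCharLetter
		word += string(currentChar)
	}

	// Flush the last word; a trailing run of non-letters is discarded.
	if prevCharLetter {
		words = append(words, word)
	}
	return words
}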
The test output is:
A -> [A]
a -> [a]
Aaaaa -> [Aaaaa]
AAAAA -> [AAAAA]
aaaaa -> [aaaaa]
A1 -> [A]
a1 -> [a]
Aaaaa1 -> [Aaaaa]
AAAAA1 -> [AAAAA]
aaaaa1 -> [aaaaa]
aB -> [a B]
aaaaB -> [aaaa B]
AaaaB -> [Aaaa B]
AaaaBBBB -> [Aaaa BBBB]
AaaaBbbbb -> [Aaaa Bbbbb]
aB2 -> [a B]
aaaaB2 -> [aaaa B]
AaaaB2 -> [Aaaa B]
AaaaBBBB2 -> [Aaaa BBBB]
AaaaBbbbb2 -> [Aaaa Bbbbb]
Aaaa1b -> [Aaaa 1b]
Aaaa1B -> [Aaaa 1B]
Aaaa1bbb -> [Aaaa 1bbb]
Aaaa1Bbb -> [Aaaa 1Bbb]
AaBbCc -> [Aa Bb Cc]
A1B2C3 -> [A 1B 2C]
Aa1Bb2Cc3 -> [Aa 1Bb 2Cc]
AAA BBB Ccc -> [AAA BBB Ccc]
AAA1BBB Ccc -> [AAA 1BBB Ccc]
AAA1BBB Ccc -> [AAA 1BBB Ccc]
Notice that the task is not as obvious as it might appear at first glance. We cannot just split whenever we find an upper-case character; instead, we need to consider the sequence of characters.
For example: HouseOfLove would count as 3 words: House, Of, Love.
However, houseOFLove should still count as 3 words: a run of upper-case characters forms a word of its own, and the last upper-case letter of the run starts the next word. The result is house, OF, Love.