Answer:
Explanation:
Assuming the input is a string of space-separated words,
like x = "a b c d", we can use the following function:
def ngrams(input, n):
    input = input.split(' ')
    output = []
    for i in range(len(input)-n+1):
        output.append(input[i:i+n])
    return output
ngrams('a b c d', 2) # [['a', 'b'], ['b', 'c'], ['c', 'd']]
If you want those joined back into strings, you might call something like:
[' '.join(x) for x in ngrams('a b c d', 2)] # ['a b', 'b c', 'c d']
Lastly, this doesn't tally repeated n-grams. If your input were 'a a a a', you would get 'a a' three times, so count them up in a dict:
text = 'a a a a'
grams = {}  # maps each n-gram string to its count
for g in (' '.join(x) for x in ngrams(text, 2)):
    grams.setdefault(g, 0)
    grams[g] += 1
Putting it all together:
def ngrams(input, n):
    input = input.split(' ')
    output = {}
    for i in range(len(input)-n+1):
        g = ' '.join(input[i:i+n])
        output.setdefault(g, 0)
        output[g] += 1
    return output
ngrams('a a a a', 2) # {'a a': 3}
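As an alternative sketch, the same counting can be done with collections.Counter from the standard library. The name ngram_counts is made up for this example, and note one deliberate difference from the function above: it uses split() with no argument, which splits on any run of whitespace, so repeated spaces do not produce empty "words" the way split(' ') does.

```python
from collections import Counter

def ngram_counts(text, n):
    """Count word n-grams in a whitespace-separated string (illustrative helper)."""
    words = text.split()  # assumption: any whitespace run separates words
    # Zipping n staggered views of the word list yields every window of n words.
    windows = zip(*(words[i:] for i in range(n)))
    return Counter(' '.join(w) for w in windows)

print(ngram_counts('a a a a', 2))  # Counter({'a a': 3})
```

A Counter behaves like a dict, so it can be used wherever the plain dict version was, and it also offers most_common() if you want the n-grams sorted by frequency.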