It seems there is a lack of trackability when using Summarize, Expand and Tone Shift. This makes it difficult to assess the impact of Generative AI and to quality-check its use. I understand that applying, for example, a tag when one of these features is used does not necessarily mean the feature was used in the final reply - you could apply Expand and then revert to your original reply. Nonetheless, we need clear trackability here in some form; otherwise we are setting these features loose without any control or measurement mechanism in place.
As an example, I'm trying to assess the impact of the trial and I have to rely on agents remembering to use the features. Because I cannot track whether the features have been used, I cannot remind agents who are not using them. At the end of the trial I cannot say with certainty that there was an impact, because I have no way of knowing whether agents actually used the features. Why would we commit to paying a significant amount of money for this if we cannot properly assess the ROI?