AI-enabled platform for marketing analytics

Customized AI platform for analyzing and classifying social media content, fine-tuned on English and vernacular content.

About the project

Our Client is a marketing analytics firm. They track and analyze social media trends and customer sentiment for a variety of prominent brands across industries as diverse as technology, healthcare, consumer goods and banking.

The Client wanted Lattice to develop a system that analyzes social media posts in order to classify sentiment, and clusters related posts dynamically.

Technical approach

For sentiment analysis, we first evaluated the performance of content-driven versus semantics-driven machine learning models.

Content-driven models are "classic" machine learning techniques in which individual words are vectorized and then analyzed using algorithms such as random forest classification (RFC) and support vector machines (SVM). They are deterministic and computationally efficient, but have difficulty classifying out-of-vocabulary content, because a word that never appeared in the training data has no place in the vector. After evaluating a variety of algorithms, we settled on RFC.
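The out-of-vocabulary limitation can be illustrated with a minimal bag-of-words sketch. This is not the project's actual vectorizer; the function names and the sample posts are hypothetical, and the tokenization (lowercase, split on whitespace) is a simplifying assumption.

```python
from collections import Counter


def build_vocabulary(posts):
    """Map each distinct lowercase token in the training posts to a column index."""
    vocab = {}
    for post in posts:
        for token in post.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def vectorize(post, vocab):
    """Return a bag-of-words count vector; tokens outside the vocabulary are dropped."""
    counts = Counter(post.lower().split())
    vector = [0] * len(vocab)
    for token, n in counts.items():
        if token in vocab:
            vector[vocab[token]] = n
    return vector


def oov_tokens(post, vocab):
    """List the tokens of a post that the vocabulary has never seen."""
    return [t for t in post.lower().split() if t not in vocab]


# Hypothetical training posts (illustration only, not Client data).
train_posts = ["love this phone", "hate this bank app"]
vocab = build_vocabulary(train_posts)

seen = vectorize("love this app", vocab)        # every word is in the vocabulary
unseen = vectorize("totally bussin fr", vocab)  # every word is out of vocabulary
missing = oov_tokens("totally bussin fr", vocab)
```

A post made up entirely of unseen words becomes an all-zero vector, so a downstream classifier such as RFC has nothing to distinguish it from any other unknown post.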

In contrast, large language models (LLMs) understand semantics, and can handle posts with out-of-vocabulary words. We fine-tuned a foundation LLM by training it on the Client's proprietary data. The foundation model we used is released as open-source; the advantage of using it was that the Client's data stayed within the system.

RFC and the fine-tuned foundation LLM (ft-LLM) both delivered results comparable with manual data classification. However, as the test data grew and more unlabelled data was introduced to the system, the LLM was able to pick up greater nuance, especially semantics-heavy aspects such as sarcasm or humor.
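One way to quantify "comparable with manual data classification" is to score model labels against the manual labels. The sketch below assumes accuracy and Cohen's kappa as the metrics; the write-up does not say which metrics were actually used, and the labels shown are made up for illustration.

```python
from collections import Counter


def accuracy(manual, predicted):
    """Fraction of posts where the model label matches the manual label."""
    if not manual or len(manual) != len(predicted):
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(m == p for m, p in zip(manual, predicted)) / len(manual)


def cohen_kappa(manual, predicted):
    """Agreement between two labelings, corrected for chance agreement."""
    n = len(manual)
    observed = accuracy(manual, predicted)
    manual_counts = Counter(manual)
    predicted_counts = Counter(predicted)
    # Chance agreement: probability both labelings pick the same class at random.
    expected = sum(
        manual_counts[label] * predicted_counts[label] for label in manual_counts
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both labelings use a single identical class
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels (illustration only, not project results).
manual_labels = ["pos", "neg", "neg", "pos", "neu", "neg"]
model_labels = ["pos", "neg", "pos", "pos", "neu", "neg"]

acc = accuracy(manual_labels, model_labels)
kappa = cohen_kappa(manual_labels, model_labels)
```

Kappa is included alongside accuracy because sentiment classes are often imbalanced, and raw accuracy can look high even when a model mostly predicts the majority class.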

The ft-LLM was also compared with state-of-the-art commercial models released by OpenAI and Google, and performed comparably.

We wrapped the content in a login-controlled web interface, deployed on AWS. The coding pipeline used a variety of services from AWS, HuggingFace and Google Cloud Services (GCS).