In this post, we're going to look at a standard web analytics setup. This system is more complex than the standalone data processing system we looked at in the last article in this series, but it has many of the same components at its core. The largest difference between the two is that the web analytics system is loosely connected to an active web application.
If you look at the AWS reference architecture for web analytics, you'll find that the first two components generate the data our analytics processes consume; they don't play a direct role in the analysis itself. If we abstract this well, we can entirely decouple the two systems, so that the only connection between them is the log files that the application produces and the analytics system analyzes.
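To make the "log files as the only connection" idea concrete, here is a minimal sketch of the analytics side reading one line of the application's log. It assumes the web application writes access logs in Common Log Format; the post doesn't specify a format, so the pattern, the `Hit` record, and the `parse_line` helper are all hypothetical names chosen for illustration.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Assumed log layout (Common Log Format), for example:
# 203.0.113.7 - - [15/Jul/2019:10:12:01 +0000] "GET /pricing HTTP/1.1" 200 5123
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)


class Hit(NamedTuple):
    """One request, as seen by the analytics system (hypothetical record type)."""
    host: str
    timestamp: datetime
    method: str
    path: str
    status: int


def parse_line(line: str) -> Optional[Hit]:
    """Turn one log line into a Hit, or return None if the line doesn't match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    return Hit(
        host=match["host"],
        timestamp=datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z"),
        method=match["method"],
        path=match["path"],
        status=int(match["status"]),
    )
```

Nothing in this function knows anything about the web application beyond the shape of a log line, which is the point: as long as that format stays stable, either side can change without the other noticing.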
From there, our analytics system revolves around the two core components we looked at for our standalone analytics system: object storage (S3, Azure Blob Storage, etc.) and cloud compute or a managed cluster (Elastic MapReduce or HDInsight). Before, we wanted to place the results of our analysis back into an object store. With web analytics, we'll typically want to make those results available to business analysts in a data lake or data warehouse system instead. In this example, AWS provides a stand-in for that with an "analytics database".
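The last hop, from computed results into something analysts can query, might look roughly like the sketch below. It is only an illustration: an in-memory SQLite database stands in for the "analytics database", a short hard-coded list stands in for hits read from object storage, and the `page_views` table and both function names are hypothetical. In the real architecture the aggregation would run on the managed cluster rather than in a single Python process.

```python
import sqlite3
from collections import Counter


def summarize(hits):
    """Count page views per (day, path) from an iterable of (day, path) pairs."""
    return Counter(hits)


def load_summary(conn, counts):
    """Write the aggregated counts into a table analysts can query with SQL."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_views ("
        "day TEXT, path TEXT, views INTEGER, PRIMARY KEY (day, path))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO page_views (day, path, views) VALUES (?, ?, ?)",
        [(day, path, views) for (day, path), views in counts.items()],
    )
    conn.commit()


# Stand-in for parsed log records pulled from object storage.
hits = [("2019-07-15", "/"), ("2019-07-15", "/pricing"), ("2019-07-15", "/")]

conn = sqlite3.connect(":memory:")  # assumption: SQLite plays the analytics database
load_summary(conn, summarize(hits))

rows = conn.execute(
    "SELECT path, views FROM page_views ORDER BY views DESC"
).fetchall()
```

The split between `summarize` and `load_summary` mirrors the split in the architecture: the compute step produces a small aggregate, and only that aggregate is pushed into the system the analysts use.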
Ultimately, though, the web analytics system boils down to having a web application feed our base analytics system.
July 15, 2019