Anomalies in Steam Community data
In a recent post I introduced the Steam Community API, and showed how to retrieve gamer data and perform a few simple but fun analyses. While writing that post, I came across several problems with the data that's returned. If you're thinking about using Steam Community data, it's worth bearing these anomalies in mind because of the impact they'll have on downstream processing and further analysis.

Frustratingly, the quality of the data available through the Steam Community API is quite variable. In particular, there are many discrepancies between global achievement data and achievement data for individual players. I also came across several global achievement rates that were clearly invalid, and in some cases found that global achievement records for games were missing entirely. The net result: it's hard to trust the data that's returned. It is still possible to analyze returned data, but you're going to need strong