"Case study: polluted reports" shows how a system can be polluted with dummy data.
Saving data (even the HTTP referer) without validation can contaminate the system as well:
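SELECT TOP (10)
    [ContactId]
    ,[LastModified]
    ,[FacetData]
    -- Extract the Referrers array from the facet JSON
    ,JSON_QUERY(FacetData, '$.Referrers') AS [Referrers]
    -- DATALENGTH returns the size of the extracted value in bytes
    ,DATALENGTH(JSON_QUERY(FacetData, '$.Referrers')) AS [ReferrerSize]
FROM
    [xdb_collection].[ContactFacets]
WHERE
    [FacetKey] = 'InteractionsCache'
    -- Keep only facets whose Referrers array starts with a string value
    AND CHARINDEX('"Referrers":["', FacetData) > 0
ORDER BY [ReferrerSize] DESC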
The results show an astonishing 28 KB used to store a single value.
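
To illustrate the kind of validation that would keep such values out of storage, here is a minimal Python sketch. It is not Sitecore code: the function name `sanitize_referrer` and the `MAX_REFERRER_LENGTH` cap are hypothetical, and the 2048-character limit is an assumption to adjust for your own data. The idea is simply to reject a referrer that is empty, oversized, or not an absolute http(s) URL before it is ever saved.

```python
from urllib.parse import urlsplit

# Hypothetical cap (an assumption, not a Sitecore setting); tune it to your data.
MAX_REFERRER_LENGTH = 2048


def sanitize_referrer(raw):
    """Return a referrer that is reasonable to store, or None to skip it."""
    if not raw:
        return None

    value = raw.strip()
    # Reject empty and oversized values before any further parsing.
    if not value or len(value) > MAX_REFERRER_LENGTH:
        return None

    try:
        parts = urlsplit(value)
    except ValueError:
        # urlsplit raises ValueError on some malformed inputs.
        return None

    # Accept only absolute http(s) URLs that name a host.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return None

    return value
```

With a check like this in front of the write, a multi-kilobyte referrer is dropped (the function returns `None`) instead of being persisted with the contact.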

The next time you see Analytics shards of 600 GB, recall this post.