Dynamic Clustering, a new ad targeting methodology

This article is a summary of my proposed methodology for targeting web users with relevant ads, but without intrusion, i.e. without third-party cookies. The original version, including a broader depiction of the context, is a longer post in French: “Le régime sans cookies, le nouvel âge du ciblage sur internet”.

Basically, as related in many articles, including previous posts on this blog (see for instance “Giving up cookies for a new internet… The third age of targeting is at your door”), the usage of third-party cookies is bound to dwindle, if not to disappear. Some solutions have already been proposed, such as fingerprinting (still intrusive, though) or unique identifiers (but too closely tied to the major existing internet companies).

So, we need a non-intrusive contextual targeting solution, which takes privacy protection into account. This is the core idea of my proposed solution, i.e. “dynamic clustering”.


How does it work?

  1. Based on ISP and/or operator log files, browsing data will be collected and anonymized (for instance through a double-anonymization filter) so as to protect each user’s privacy;
  2. Files will be cleaned (“noise-reduction” processes), and organized at various categorization levels, so as to generate multiple dimensions, all of them rich and flexible. This will make it possible to create “outlined profiles” for each unique anonymous user;
  3. Using these dimensions, clusters will be generated, made of users with similar usage behaviors, based on each advertiser’s hypothesis. This creates a virtually unlimited number of target groups, whose volatility is an asset, as they will always match the client’s issue of the moment.
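The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual implementation: the two secret keys, the domain-to-category table, the sample log and the `min_share` threshold are all hypothetical, chaining two keyed hashes is just one possible reading of a “double-anonymization filter”, and a simple threshold on category shares stands in for a real clustering algorithm.

```python
import hashlib
import hmac
from collections import Counter, defaultdict

# Hypothetical secrets held by two separate parties (for instance the ISP
# and an independent body), so that neither can rebuild the mapping alone.
ISP_KEY = b"isp-secret"
TRUSTEE_KEY = b"trustee-secret"

# Hypothetical domain-to-category table (a single categorization level).
CATEGORIES = {
    "sport24.example": "sports",
    "football.example": "sports",
    "shop.example": "shopping",
    "news.example": "news",
}


def anonymize(subscriber_id: str) -> str:
    """Step 1: chain two keyed hashes to hide the subscriber identity."""
    first = hmac.new(ISP_KEY, subscriber_id.encode(), hashlib.sha256).digest()
    return hmac.new(TRUSTEE_KEY, first, hashlib.sha256).hexdigest()


def build_profiles(log_lines):
    """Step 2: drop noise and count visits per category for each anonymous user."""
    profiles = defaultdict(Counter)
    for subscriber_id, domain in log_lines:
        category = CATEGORIES.get(domain)
        if category is None:  # uncategorized traffic is treated as noise
            continue
        profiles[anonymize(subscriber_id)][category] += 1
    return profiles


def cluster_for(profiles, hypothesis, min_share=0.5):
    """Step 3: keep users whose share of visits in the advertiser's
    hypothesis categories reaches min_share."""
    members = set()
    for user, counts in profiles.items():
        total = sum(counts.values())
        share = sum(counts[c] for c in hypothesis) / total
        if share >= min_share:
            members.add(user)
    return members


# Hypothetical log extract: (subscriber id, visited domain).
log = [
    ("alice", "sport24.example"),
    ("alice", "football.example"),
    ("alice", "news.example"),
    ("bob", "shop.example"),
    ("bob", "cdn.example"),
    ("carol", "news.example"),
    ("carol", "sport24.example"),
]

profiles = build_profiles(log)
sports_fans = cluster_for(profiles, {"sports"})
```

Because the cluster is recomputed from the advertiser’s hypothesis each time, a different hypothesis (say `{"shopping", "news"}`) yields a different group from the same profiles, which is the “dynamic” part of the idea.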

So yes, the no-cookie diet is possible… And it goes along with a more virtuous targeting of internet users…

Convinced by this new diet? Willing to collaborate on developing the recipe? Let’s meet!

Analytics without cookies? My follow-up to #MeasureCamp IV

As mentioned in my previous post “Giving up cookies for a new internet… The third age of targeting is at your door.”, I attended the fourth Measure Camp in London (http://www.measurecamp.org) on March 29th. And my (deliberately controversial) topic was: “Web Analytics without cookies?”

The subject was introduced by the following three charts, a short introduction to what I expected to be a discussion, and a hot one it was!

Measure Camp IV (post)

Basically, the discussion revolved around three topics:

  • Are cookies really going to disappear, and if so, which ones and how?
  • Do users disapprove of cookies because of their lack of privacy, or rather because of some all-too-aggressive third-party cookie strategies?
  • Are there any solutions, and by when do we need them?

Topic number 1 definitely is the most controversial. It is already difficult to imagine how to do without what has been the basis of collection, targeting and analysis. On top of this, some valid objections were raised, such as the necessity to keep first-party cookies for a decent browsing experience, as well as the request from a fair share of users to keep ads, provided they are relevant to them. A very good follow-up was brought by James Sandoval (Twitter: @checkyourfuel) and the BrightTag team. Thanks to them for their input.

Clearly, the participants all agreed that a cookie ban would only impact third-party ones, and would occur for political reasons (probably not for another 3 to 5 years), unless a huge privacy scandal speeds up the decision process. Still, a fair amount of internet revenue would then be imperiled.

At this stage, there still remains the question of how well users accept cookies. There is a wide consensus within the digital community that people browsing the internet accept a reasonable amount of cookie intrusion in their lives, as long as this generates relevant ads. Actually, I think this view is biased, as nobody has ever asked whether people would rather browse with or without ads… The question has always been a choice between “wild” and “reasoned” ad targeting… It reminds me of an oil company asking car drivers whether they would rather fill up with diesel or unleaded, not allowing “electricity” as a valid answer…

So the question of cookie acceptance remains open in my eyes, and this may be a key driver to designing alternative solutions.

What options do we have at hand then?

The first and most obvious one is a better regulation of third-party cookies, especially the ability of users to control how, when and with whom their first-party cookies could and should be shared, in an opt-in mode. The law (in the EU) theoretically governs this (see EU rules about cookie consent here), through a warning to the user about cookies when he or she opens a new website. Still, national transpositions and varying web page implementations have made this law hard to understand, and mostly not actionable on a global basis.

A first step would then be to abide by the user’s choice, and give users the ability to manage their own cookies, sharing some, all or none of them with third parties, as they wish. A difficult task, especially when nearly 30 government bodies are to be involved… So why not investigate non-cookie options?

In London, I introduced two possible ways:

  1. Create a unique ID for each user, somewhat like Google’s unique ID, but managed by an independent body. My suggestion is that such an ID should belong to the whole community, like HTML or HTTP… A huge task.
  2. The other idea is mine… It would consist of generating anonymized profiles, based on browsing patterns. I shall develop this idea in more detail in future posts, but it is worth thinking about, especially when one considers that today’s user mood may not be tomorrow’s, which requires a very dynamic targeting methodology…

So this hot discussion on cookies has at least initiated discussions among the digital community. It also proved that such fresh (and sometimes idealistic) views as mine are necessary to keep the digital community at the cutting edge of innovation. So stay tuned, I shall go on providing food for thought so as to “shake the tree” of Measurement…

Giving up cookies for a new internet… The third age of targeting is at your door.

While preparing next week’s Measure Camp in London (http://www.measurecamp.org), I have been wondering what would be the most interesting topic in my eyes. And my question is: “How would Web Analytics work without cookies?”

Actually, last September, I read an interesting post by Laurie Sullivan on the MediaPost.com site: “Where The Next Ad-Targeting Technology Might Come From”. This has been at the core of my thoughts for the past months, so I wanted to elaborate on Laurie’s post so as to introduce my own ideas about this topic.

I personally believe that collecting information from web users by means of cookies is fading and will soon disappear. There are many reasons for this, including user privacy concerns, the cookie’s lack of context, and the multiplication of access points and devices, which makes such data collection highly unreliable.

The disappearance of cookies would have an impact on at least three areas: data collection, targeting and analytics.

  • Data collection relies heavily on cookies, especially when dealing with ad exposure and browsing habits. High impact.
  • Targeting is also based on cookies, as most tools use browsing history to identify their most likely customers. High impact.
  • Analytics also use cookies, especially for site-centric analysis as well as various page-level analyses. High impact.

Considering these high impacts, the time has come for a more contextual and more behavioral targeting. We are now entering the third age of targeting. The first age was based on sociodemographics, widely used by TV ads or direct mail. The second age has been based on using past behavior to predict potential future actions and, on the internet, relies widely on cookies to pursue this goal. The third age will be the age of context, targeting anonymous users with current common interests.

How will it work? One possible way: we would use network log files (provided by ISPs or telcos) to collect data, then organize these data with a categorization at various levels and through multiple dimensions, so as to generate rich but heterogeneous user clusters and hence allow targeting of potential customers based on ad-hoc inputs. I shall elaborate in further posts, especially regarding the process, but the main advantage is the respect of privacy, especially thanks to cookie avoidance…


So, yes, giving up cookies may be difficult; this is why I believe we ought to prepare to go on a diet as of today…

And act for alternative methodologies instead of shouting “me want cookies!”

Data Privacy, between a rock and a hard place

How are we to handle Data Privacy? Through goodwill, as original free internet promoters would like to? Or through coercive regulation measures, as government bodies are prone to? This definitely is no easy dilemma…

The Marketing Mobile Association in France put the question on the table last Wednesday (Feb 12th), on the very same day the US was holding its so-called “safer internet day”. The meeting venue was more on the goodwill side, as the event was hosted by the Mozilla Foundation in their Paris office. A nice place, by the way, see for yourself…

Mozilla Meeting Room

The discussion panel was more balanced, with Etienne Drouard, attorney at K&L Gates specialized in privacy matters, and Geoffrey Delcroix, CNIL Innovation Director (CNIL being the French data protection authority), as well as Hervé Le Jouan, CEO of Privowny, and Tristan Nitot, Principal Evangelist Mozilla Europe (a brilliant coffee brewer as well…), the whole thing being moderated by Bruno Perrin, Media & Entertainment Leader at EY.

Between tools to manage one’s own privacy (see my own selection at the bottom of this post) and various comments on the privacy laws, the main impression that remains from this panel discussion is that handling Data Privacy is like walking a tightrope…

Two opposite views are currently cleaving the internet:

  • On one side, the “libertarian” internet promoters, with their concepts based on the widest possible freedom (net neutrality, open data, open source, etc.), whose view of privacy is tied to each person’s individual right to protect his or her own privacy. A global “do-not-track” by default would certainly please them, especially if companies were forced to respect it…
  • On the other side, at the opposite end of the spectrum, we have the state bodies, willing to set more control on the internet, as this is something that they not only misunderstand, but also fear; in this respect, they wish to introduce regulations, privacy by design, control over content, etc.

And, in the middle, the so-called “new economy”, all these companies and people trying to make a sensible use of the internet… Not easy, mmh? What I understood very clearly from the panel discussion is that neither of the extreme behaviors depicted above would give the internet a chance. Setting “do-not-track” by default would simply lead companies to ignore it, and hence kill the idea. And on the other side, regulating the market by law would technically make it die, in the end. Hence, the tightrope strategy is the only one that remains, with a difficult balance between market freedom and people’s protection, between business and privacy…

So what are we left with? We can try to manage our own privacy, and ensure it does not go beyond the borders we have set. Nobody lives in a cave with no contact with the outside any more (as this would probably be the only way to fully protect one’s privacy…). But nobody wants to live constantly under the eyes of watchers, as in a personal Truman Show, especially when your information is wanted for their business… We may go on using the internet, conscious that we are watched, but managing this, and knowingly giving our consent wherever we believe it makes sense, while blocking all other unsolicited requests…

There are many tools to do so. Probably too many. I personally use five.

  1. An ad-blocker: this is not a must-have, but it may be useful, especially to speed up your browsing. I use AdBlock, a Chrome extension. The disadvantage is that most ad-blockers do not compensate for the changes in the layout of the website, making it sometimes barely readable (as for instance my favorite sport page, Sport24). And do not forget that most sites earn their money thanks to the ads… So I disable it now and then, especially when visiting sites with smaller audiences.
  2. A user/password manager: this is highly interesting, to ensure you know what and where you have been logging in, and ensure nobody is using some of your identities without you knowing it. I am using the Privowny tool bar, a very useful add-on.
  3. An identity verifier: this is for Twitter in particular. To avoid being followed (and spammed) by robots and fake followers, I am using TrueTwit, a simple (and not so expensive) tool to filter and verify any Twitter user. I have fewer followers now, but only real people…
  4. A do-not-track option: I also use, now and then, the do-not-track feature in my browser (Chrome). This I do especially when shopping or banking online, so as to minimize the amount of cookies shared by these companies that also hold very personal information of mine. I know, this is more wishful thinking than anything else, but it at least shows that I am not ready to let everything leak.
  5. A graphical cookie tracer: I have downloaded CookieViz from the CNIL website, a free piece of software to visualize your browsing, and the cookies that have been shared with third parties. At least, when you browse websites, including your favorite ones, you know where you stand… Below is a short description of this tool (currently only available for Windows, soon to come for Mac and Unix).

CookieViz example

The picture shows a session for 7 browsed sites (9 views total). The 7 websites are marked with red pentagons. At the top right is Sport24 (link provided above), and below it e-commerce website CDiscount.fr and information website LeMonde.fr.

At the bottom, from right to left, a gaming website BigPoint.com, my About.me profile and this blog’s dashboard page. In the middle, Avinash Kaushik’s blog (Occam’s Razor), showing that even the blog of a respected digital evangelist like Avinash may share third-party cookies…

The graph is, I believe, self-explanatory: the visited websites (red pentagons) generate cookies (all the round blue spots), which are kept for first-party usage (blue links) or shared with third parties (red links). To be clear, I disabled AdBlock to generate this graph, so as to avoid a partial representation.

This tool is highly interesting in my eyes. It does not block anything, but shows you everything. At least, the user knows what happens when he/she visits a website, and may decide to go on browsing, or choose alternative websites with a better sharing policy, especially regarding third-party cookies.

A better informed customer always makes better choices.

The Great Discoveries, the Enlightenment and the Internet

A few weeks ago, I was discussing cookies while working on an assignment for one of my UBC Award of Achievement in Digital Analytics modules, and I had an interesting argument about comparing one’s personal computer to a home and, beyond this, about the fact that the development of the internet is a truly revolutionary change.

Actually, this comparison makes sense, as long as you consider your home for what it is, i.e. your home base before and after any possible journey you would make, for work, leisure, shopping, vacation… Your computer is like your home only if you do not walk outside, i.e. only if you do not connect to the web. Browsing is like going outside, to shops, to leisure activities, to theaters, to restaurants, and there, people constantly collect your personal information.

Considering the privacy issue on my computer, I have especially focused my thoughts on cookie management, a difficult balance between a good browsing experience and an all-too-present advertising intrusion. It is similar to any visit or phone call to my home, as I would not want to tell or show too much, unless I have cleaned my floor, hidden what I would not want others to have a look at, or set my mind to “politically correct”…

There is a schizophrenic behavior among users, who request ever-improved speed and usability from websites, but grumble when the website adapts to their browsing preferences, in a “but how do they know so much about me?” mode. As long as I am aware that I give up some of my privacy, allowing first-party cookies to improve my own user experience through increased page loading speed and saved preferences (such as passwords on online gaming sites or pre-entered personal information on e-shopping ones), this is fine: I do accept them freely, wherever they make sense, i.e. when they offer a real service to me. One small exception: when accessing “sensitive” information (online banking or tax payments), I usually activate the “do-not-track” option, which is supposed to prevent the website from collecting cookies. So far so good for my online privacy.

The same goes for my home: I just lock it with a key, draw the curtains and lock the shutters, and my home privacy is also safe. Still, a personal computer cannot be a stand-alone object any more, with no connection to the outside world, just like no one would ever stay at home all of his/her life. A computer is very much like one’s life, in constant interaction with outside inputs and outputs, and hence everyone has to acknowledge that data are collected about oneself, one way or the other.

Beyond the debate about how to manage cookies and whether they should be more strictly regulated (I may handle this later), I believe that we are anyway at a turning point in history, a moment when the technology itself (internet, broadband, mobile, NFC, GPS…) is altering the very way we behave.

To refer precisely to my post title, I believe we are at a turning of the tide, like at the times of the Great Discoveries in the 15th and 16th century, when we got to learn about other worlds, or those of the Enlightenment, in the 18th and 19th century, when the emergence of new ideas, as well as the tremendous progress of transportation, meant no one in the world could remain hidden from others. In this respect, the internet (and its multiple technological spin-offs) brings a new era of openness to the world: one may not only be aware of new ideas or have the possibility to go and see other people and cultures but, more efficiently, anyone may now summon anything and anyone to his own couch, through the power of a connected device…

This is like my home actually: in the Middle Ages, it was totally isolated in the middle of nowhere, then someone discovered the way to it, then an enlightening road was built, and now a fully connected town is growing around my place. My behavior and my relation to others will definitely be altered, as well as the depth of the knowledge they will be able to gather about me (and me about them)…

The Great Discoveries led to the fading power of the Catholic Church in Western Europe, just as the possibility to travel enriched the revolutionary ideas of the Enlightenment… So it is with the internet revolution: it changes our relationship to the world, to others, and even to ourselves!

It goes without saying that any revolution has its drawbacks; just as the Revolutions from 1789 to 1917 led to mass killings of “Ancien Régime” people, the internet is unfortunately not only nibbling at our privacy, but also killing some traditional activities (physical cultural media, including paper, CDs or tapes, brick-and-mortar retailing…). Still, the same Revolutions brought new rights to the majority of the people (Liberty, Equality, Fraternity), just like the internet now allows equal rights and access to education, information, services, products… I do believe in the positive change that is brought by the latest technologies. Let us just remind ourselves that the internet is only a means to reach a more comfortable world…