Paul Crowley (ciphergoth) wrote in trustmetrics,
Paul Crowley

The TrustFlow algorithm

TrustFlow is a "trust metric" algorithm, which uses human-generated information about trustworthiness in a human-sized community of a few hundred people to generate guesses about trustworthiness in an Internet-sized community of millions of people; this is useful in applications like ranking search hits and preventing spam and vandalism. TrustFlow is unique among "attack resistant" trust metrics in that it can load information about who trusts who incrementally as needed; this makes it well suited for distributed use.

We apply TrustFlow to LiveJournal by treating the decision to place someone on your friends list as an assertion of trust in them. This isn't 100% true and leads to some strange artifacts, but it works well enough to produce interesting results.

One user is special: the trust root is the source from which all trust emanates.

Each user starts with a virtual bucket into which "trust juice" is poured; all buckets have one litre capacity, and they all start out empty. Trust juice pours from heaven into the bucket of the trust root. After one litre has been poured, this bucket is full to overflowing; when it starts to overflow, gutters around the edge of the bucket carry the trust juice in equal quantity to the buckets of all their friends.

Suppose they have ten friends. After eleven litres have been poured, the buckets of the trust root and of all their friends will be full, and start to overflow. At this point things get interesting. Now the juice pours from heaven into the trust root's bucket, overflows into the bucket of one of their friends, and then overflows into the buckets of the trust root's friends of friends. People who are friended by many friends of friends will get more juice per second, but the friends links of people who list few friends will carry more juice than the links of those with many.

Eventually one of the buckets of the friends of friends will fill up; they then are the first people on the list that TrustFlow displays. Their score is the number of litres of trust juice that got poured into the trust root before their bucket filled up, and any more trust juice they receive overflows to be shared among their friends. And so on, until 200 more people have filled their buckets.

One final detail. Sometimes the juice hits "dead ends". If Bob and Carol are the only people on each other's friends lists, then once their buckets are both full they will have no-one to pass trust juice on to. In this instance the juice just "backs up" - no more juice flows to either of them. Each person with a full bucket shares the juice they receive evenly between all of their friends who have got someone with a non-full bucket to pass it on to, either directly or indirectly.

That's the sketch of the workings. If you want to actually implement this, or indeed any trust metric, it helps to understand a little about graph theory. TrustFlow analyzes a digraph of trust. Each arc is an assertion of trust; for example, each vertex might be a LiveJournal user, and there is an arc from A to B when A has B on their friends list. You'll also need to know what an eigenvector is; calculating the inbound flow on each node requires finding an eigenvector for a flow matrix. I use an iterative algorithm and I usually only need one iteration to update the flow; you could probably do things even more efficiently if you only recalculated the parts that needed to be recalculated.
  • Post a new comment


    default userpic

    Your reply will be screened

    When you submit the form an invisible reCAPTCHA check will be performed.
    You must follow the Privacy Policy and Google Terms of use.